Articles for tag: selenium

Python: How to scrape edx.org with scrapy

Example code to scrape the site:

#!/usr/bin/env python3

#
# https://stackoverflow.com/a/48067671/1832058
# 

from scrapy.http import Request
from scrapy.item import Field, Item
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor
from scrapy.loader import ItemLoader
import json

class Course_spider(CrawlSpider):

    name …

read more

Python: How to scrape ef.edu with requests, BS

Example code to scrape the site:

#!/usr/bin/env python3

# date: 2020.02.26
# https://stackoverflow.com/questions/60405929/python-beautifulsoup-adding-words-from-an-html-paragraph-tag-to-list

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.ef.edu/english-resources/english-vocabulary/top-1000-words/") 

soup = BeautifulSoup(page.content, "html.parser")
para = soup.find(class_="field-item even")

second_p …

read more

Python: How to scrape epicgames.com with free games with requests

Example code to scrape the site:

#!/usr/bin/env python3

# date: 2020.05.18
# https://stackoverflow.com/questions/61876744/scraper-returns-null-result/

import requests

url = 'https://store-site-backend-static.ak.epicgames.com/freeGamesPromotions?locale=en-US&country=PL&allowCountries=PL'

r = requests.get(url)

data = r.json()

#print(r.text)

for item in data …

read more

Python: How to scrape espn.com (1) with scrapy, requests, pandas

Example code to scrape the site:

The data is in a <script> tag, between data: and queue:, in JSON format.

You can use standard string functions (e.g. find() and slicing) to cut this part out.
Then you can use the json module to convert it to a Python dictionary.
And then you only have to …
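
The steps above can be sketched roughly as follows. This is a minimal illustration, not the article's code: SAMPLE_SCRIPT is a made-up stand-in for the page's script content (the real ESPN markup may differ), and the helper name extract_json_between is hypothetical.

```python
import json

# Hypothetical stand-in for the <script> content; the real page may look different.
SAMPLE_SCRIPT = """
<script>
window.espn = {
    data: {"athletes": [{"name": "Player A", "yards": 1200}, {"name": "Player B", "yards": 950}]},
    queue: []
};
</script>
"""


def extract_json_between(text, start_marker="data:", end_marker="queue:"):
    """Cut out the text between two markers and parse it as JSON."""
    start = text.find(start_marker)
    if start == -1:
        raise ValueError("start marker not found")
    start += len(start_marker)  # skip past the marker itself

    end = text.find(end_marker, start)
    if end == -1:
        raise ValueError("end marker not found")

    # Slice out the fragment and drop surrounding whitespace and the trailing comma.
    fragment = text[start:end].strip().rstrip(",")
    return json.loads(fragment)


if __name__ == "__main__":
    data = extract_json_between(SAMPLE_SCRIPT)
    for athlete in data["athletes"]:
        print(athlete["name"], athlete["yards"])
```

With a real page you would pass the downloaded HTML (for example `requests.get(url).text`) instead of SAMPLE_SCRIPT, and the markers may need adjusting if the page layout changes.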

read more

Python: How to scrape espn.com (1) with scrapy, requests, pandas/api-requests-dataframe

Example code to scrape the site:

import requests
import pandas as pd

url = 'https://site.web.api.espn.com/apis/common/v3/sports/football/nfl/statistics/byathlete?region=us&lang=en&contentorigin=espn&isqualified=false&limit=50&category=offense%3Arushing&sort=rushing.rushingYards%3Adesc&season=2018&seasontype=2&page …

read more

Python: How to scrape espn.com (2) with requests

Example code to scrape the site:

# date: 2020.03.01
# https://stackoverflow.com/questions/60471569/turning-for-loop-into-multiprocessing-loop/

import requests
import time

def get_data(query):

    url = 'https://site.web.api.espn.com/apis/common/v3/search?region=us&lang=en&query={}&limit=5&mode=prefix&type=player'.format(query)

    r = requests …

read more

Python: How to scrape facebook.com with selenium

Example code to scrape the site:

#
# https://stackoverflow.com/a/47539575/1832058
# 

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()  # originally: webdriver.Chrome('/usr/local/bin/chromedriver')

browser.get('https://www.facebook.com/SparColruytGroup/app/300001396778554?app_data=DD722A43-C774-FC01-8823-8016BFF8F0D0')
browser.implicitly_wait(5)

# current Selenium 4 releases no longer have find_element_by_* or switch_to_frame()
iframe = browser.find_element(By.CSS_SELECTOR, '#pagelet_app_runner iframe')
browser.switch_to.frame(iframe)

iframe = browser.find_element(By.CSS_SELECTOR, '#qualifio_insert_place …

read more

Python: How to scrape fbref.com with requests

Example code to scrape the site:

#!/usr/bin/env python

# date: 2019.09.21
# 

from bs4 import BeautifulSoup as BS
import requests 

url = 'https://fbref.com/en/matches/033092ef/Northampton-Town-Lincoln-City-August-4-2018-League-Two'

response = requests.get(url)
soup = BS(response.content, 'html.parser')

stats = soup.find('div', id="team_stats")

data = []
for row …

read more

Python: How to scrape fcainfoweb.nic.in

Example code to scrape the site:

#!/usr/bin/env python3

# date: 2020.05.28
# 

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
import pandas as pd
import time

# --- functions ---

def get_data(start_date, end_date, product):

    # select `Variation Report`
    driver.find_element(By.ID, 'ctl00_MainContent_Rbl_Rpt_type_1').click()

    # select `Daily Variant`
    element_variation = driver …

read more

Python: How to scrape fileinfo.com

Example code to scrape the site:

#!/usr/bin/env python3

# date: 2020.04.19
# https://stackoverflow.com/questions/61298422/extracting-specific-elements-in-a-table-with-selenium-in-python/

import selenium.webdriver
from selenium.webdriver.common.by import By

driver = selenium.webdriver.Firefox()

# --- video ---

url = 'https://fileinfo.com/filetypes/video'
driver.get(url)

all_items = driver.find_elements(By.XPATH, '//td/a')

for item in all_items:
    print(item.text …

read more

Python: How to scrape finance.naver.com

Example code to scrape the site:

from bs4 import BeautifulSoup
import urllib.request as req

url = "https://finance.naver.com/sise/"
res = req.urlopen(url)
soup = BeautifulSoup(res, "html.parser")

rows = soup.select("#contentarea_right #trend_tab_1 tr")
for row in rows:
    cols = row.select('td')
    print("-", cols[0].text, '|', cols …

read more

Python: How to scrape finance.yahoo.com-quote-spy with requests

Example code to scrape the site:

# date: 2019.04.23

import requests
from bs4 import BeautifulSoup
import json

url = 'https://finance.yahoo.com/quote/SPY'
result = requests.get(url)

html = BeautifulSoup(result.content, 'html.parser')
script = html.find_all('script')[-3].text
data = script[112:-12]
print(data[:10], data …

read more

Python: How to scrape finance.yahoo.com with news with selenium

Example code to scrape the site:

#!/usr/bin/env python3

# author: https://blog.furas.pl
# date: 2020.07.11
# 
from selenium import webdriver
import time

#driver = webdriver.Chrome()
driver = webdriver.Firefox()

driver.get("https://finance.yahoo.com/quote/INFY/news?p=INFY")

for i in range(20):
    driver.execute_script …

read more

Python: How to scrape flashscore.com

Example code to scrape the site:

# date: 2020.06.10
# https://stackoverflow.com/questions/62293949/web-scraping-with-bs4-pyhton3-cant-find-elements/62294633#62294633

import requests
import bs4 as bs

#url = 'https://www.flashscore.com/field-hockey/netherlands/hoofdklasse/standings/'

url = 'https://d.flashscore.com/x/feed/ss_1_INmPqO86_GOMWObX1_table_overall'

headers = {
#    'User-Agent': 'Mozilla/5.0'
#    'User-Agent': 'Mozilla/5 …

read more

Python: How to scrape ford.co.uk with download-manual with Selenium + BS

Example code to scrape the site:

# https://stackoverflow.com/questions/60377798/error-while-selecting-dependent-drop-down-and-click-the-option-in-python/60378558#60378558
# Error while selecting dependent drop down and click the option In Python


# BTW: sometimes the page shows a popup window at start, but I didn't try to solve this problem

# BTW: I had to check `if …

read more

Python: How to scrape forexfactory.com

Example code to scrape the site:

#!/usr/bin/env python3 

# date: 2019.12.30
# https://stackoverflow.com/questions/59535798/python-webscraping-with-beautifulsoup-not-displaying-full-content/59536553#59536553

import requests
from bs4 import BeautifulSoup

r = requests.get("https://www.forexfactory.com/#detail=108867")
# the page uses JavaScript to redirect, so a browser may show different results …

read more

Python: How to scrape forum.toribash.com

Example code to scrape the site:

#
# https://stackoverflow.com/a/48078358/1832058
# 

import requests
from lxml import html

s = requests.session()

result = s.get("http://forum.toribash.com/tori_spy.php")
tree = html.fromstring(result.content)

for script in tree.xpath("//script"):
    if script.text and 'highestid' in script.text …

read more

Python: How to scrape fr.aliexpress.com with requests

Example code to scrape the site:

#!/usr/bin/env python3

#
# https://stackoverflow.com/a/47851923/1832058
#

import urllib.request
from bs4 import BeautifulSoup

headers = {
    #'User-Agent': 'Mozilla/5.0',

    'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:48.0) Gecko/20100101 Firefox/48.0',

    #'User-Agent': 'Mozilla/5.0 (Windows …

read more

Python: How to scrape fundamentus.com.br with requests, pandas, json

Example code to scrape the site:

# author: https://blog.furas.pl
# date: 2020.07.16
# link: https://stackoverflow.com/questions/62921395/pandas-include-key-to-json-file/

import requests
import pandas as pd
import json

url = 'http://www.fundamentus.com.br/resultado.php'

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64 …

read more

Python: How to scrape fundrazr.com

Example code to scrape the site:

#
# https://stackoverflow.com/a/47495628/1832058
#

import scrapy
import pyquery

class MySpider(scrapy.Spider):

    name = 'myspider'

    start_urls = ['https://fundrazr.com/find?category=Health']

    def parse(self, response):
        print('--- css 1 ---')
        for title in response.css('h2'):
            print('>>>', title)

        print('--- css 2 ---')
        for title …

read more
