Python: How to scrape curecity.in with selenium

Here is example code to scrape it:

#!/usr/bin/env python3 

# date: 2019.12.18
# https://stackoverflow.com/questions/59386434/selenium-webdriver-i-want-to-click-on-the-next-page-till-last-page/59387563#59387563

from selenium import webdriver
#from bs4 import BeautifulSoup as bs
import time

url = 'https://curecity.in/vendor-list.php?category=Doctor&filters_location=Jaipur&filters%5Bsubareas_global%5D=&filters_speciality='

#driver …

Python: How to scrape data.gov with requests

Here is example code to scrape it:

#
# https://api.data.gov/
# https://regulationsgov.github.io/developers/basics/
#
# https://stackoverflow.com/a/48030949/1832058
#

import requests
import json
import time

all_titles = ['EPA-HQ-OAR-2013-0602']

api_key = 'PB36zotwgisM02kED1vWwvf7BklqCObDGVoyssVE'
api_base='https://api.data.gov/regulations/v3/'

api_url = '{}docket.json?api_key={}&docketId='.format(api_base, api_key)

try:
    for …
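
The snippet above splices the query string together by hand; a minimal offline sketch of the same idea using `urllib.parse.urlencode`, which handles escaping for you. `DEMO_KEY` and `docket_url` are my placeholders, not from the original post:

```python
# Build the regulations.gov docket.json URL with urlencode instead of string
# concatenation; urlencode escapes the parameter values safely.
from urllib.parse import urlencode

api_base = 'https://api.data.gov/regulations/v3/'
api_key = 'DEMO_KEY'  # placeholder - request a real key at api.data.gov

def docket_url(docket_id):
    """Return the docket.json endpoint URL for one docket id."""
    query = urlencode({'api_key': api_key, 'docketId': docket_id})
    return '{}docket.json?{}'.format(api_base, query)

print(docket_url('EPA-HQ-OAR-2013-0602'))
```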

Python: How to scrape deezer.com with requests

Here is example code to scrape it:

import requests
from bs4 import BeautifulSoup
import json

base_url = 'https://www.deezer.com/en/profile/1589856782/loved'

r = requests.get(base_url)

soup = BeautifulSoup(r.text, 'html.parser')

all_scripts = soup.find_all('script')

data = json.loads(all_scripts[6].get_text()[27:])

print('key:', data.keys())
print …
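
The trick above is pulling JSON out of a `<script>` tag: strip the JavaScript prefix and parse the rest. An offline sketch of the same technique on an invented HTML fragment (on deezer.com the script index and prefix length are different):

```python
# Find <script> tags with BeautifulSoup, cut off the JavaScript assignment
# prefix, and parse the remainder as JSON.
import json
from bs4 import BeautifulSoup

html = '<html><body><script>window.DATA = {"songs": [{"title": "Example"}]};</script></body></html>'

soup = BeautifulSoup(html, 'html.parser')
script = soup.find_all('script')[0].get_text()

# cut the 'window.DATA = ' prefix and the trailing ';'
data = json.loads(script[len('window.DATA = '):-1])
print('key:', list(data.keys()))
```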

Python: How to scrape doctor.webmd.com with scrapy

Here is example code to scrape it:

#!/usr/bin/env python3

import scrapy

class MySpider(scrapy.Spider):

    name = 'myspider'

    #allowed_domains = ['doctor.webmd.com']
    start_urls = ['https://doctor.webmd.com/find-a-doctor/specialty/psychiatry/arizona/phoenix?pagenumber=1']

    def parse(self, response):

        doctors_urls = response.xpath('//*[@class="doctorName"]//@href').extract()

        for doctor in doctors_urls:
            doctor = response …

Python: How to scrape dps.psx.com.pk with selenium

Here is example code to scrape it:

#!/usr/bin/env python3 

# date: 2019.11.23
# https://stackoverflow.com/questions/59008770/want-to-read-a-tag-data-using-selenium

from selenium import webdriver

driver = webdriver.Firefox()
driver.get('https://dps.psx.com.pk/')

last_table = driver.find_elements_by_xpath("//table")[-1]

for row in last_table.find_elements_by_xpath(".//tr")[1:]:
    print(row.find_element_by_xpath …

Python: How to scrape drugbank.ca with requests

Here is example code to scrape it:

#
# https://stackoverflow.com/a/47716786/1832058
#
# https://stackoverflow.com/a/48116666/1832058
#

import requests
from bs4 import BeautifulSoup

def get_details(url):
    print('details:', url)

    # get subpage
    r = requests.get(url)
    soup = BeautifulSoup(r.text ,"lxml")

    # get data on subpage
    dts = soup.findAll('dt …
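
The detail pages are `<dt>`/`<dd>` definition lists; pairing the two tag lists gives a dict. An offline sketch on a made-up fragment, not the real drugbank.ca markup:

```python
# Pair <dt> labels with <dd> values by zipping the two find_all() results.
from bs4 import BeautifulSoup

html = '''
<dl>
  <dt>Name</dt><dd>Aspirin</dd>
  <dt>Type</dt><dd>Small Molecule</dd>
</dl>
'''

soup = BeautifulSoup(html, 'html.parser')
dts = soup.find_all('dt')
dds = soup.find_all('dd')

details = {dt.get_text(): dd.get_text() for dt, dd in zip(dts, dds)}
print(details)
```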

Python: How to scrape drugeye.pharorg.com with requests

Here is example code to scrape it:

# date: 2019.09.09
# link: https://stackoverflow.com/questions/57856461/python-run-search-function-on-net-web-page

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0'}

r = requests.get('http://www.drugeye.pharorg.com/', headers=headers)
soup = BeautifulSoup(r.text,'lxml')

payload = {
    'ttt': 'asd',
    'b1': 'wait...',
    'Passgenericname …
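
ASP.NET pages like this one usually also require the hidden `__VIEWSTATE`/`__EVENTVALIDATION` fields from the GET response to be posted back. An offline sketch of collecting them into the payload before adding the visible form values; the HTML is invented for the demo:

```python
# Collect every <input type="hidden"> name/value pair into the POST payload,
# then add the visible search fields on top.
from bs4 import BeautifulSoup

html = '''
<form>
  <input type="hidden" name="__VIEWSTATE" value="abc123"/>
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789"/>
  <input type="text" name="ttt"/>
</form>
'''

soup = BeautifulSoup(html, 'html.parser')

payload = {inp['name']: inp.get('value', '')
           for inp in soup.find_all('input', type='hidden')}
payload['ttt'] = 'asd'  # the search term, as in the snippet above

print(payload)
```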

Python: How to scrape e-turysta.pl with requests, BS

Here is example code to scrape it:

#!/usr/bin/env python3

# date: 2020.01.10
# https://stackoverflow.com/questions/59674049/multiple-pages-web-scraping-with-python-and-beautiful-soup/

import requests
from bs4 import BeautifulSoup # HTML data structure
import pandas as pd

def get_page_data(number):
    print('number:', number)

    url = 'https://e-turysta.pl/noclegi-krakow/?page={}'.format(number)
    response = requests …
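
The pagination pattern here is simply formatting the page number into the URL and calling `get_page_data` once per page. A tiny offline sketch of generating the URLs (`page_url` is my helper name, not from the original):

```python
# Generate one listing URL per page number; each would be fetched in turn and
# the per-page rows collected into a single list for a DataFrame.
def page_url(number):
    return 'https://e-turysta.pl/noclegi-krakow/?page={}'.format(number)

urls = [page_url(n) for n in range(1, 4)]
print(urls)
```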

Python: How to scrape ec.europa.eu with requests, BS

Here is example code to scrape it:

#!/usr/bin/env python3

# date: 2020.01.10
# https://stackoverflow.com/questions/59674921/how-can-i-scrape-image-url-from-this-website/

import requests
from bs4 import BeautifulSoup as BS

s = requests.Session()

url = 'https://ec.europa.eu/taxation_customs/dds2/ebti/ebti_consultation.jsp?Lang=en&Lang=en&refcountry=&reference=&valstartdate=&valstartdateto …

Python: How to scrape edx.org with scrapy

Here is example code to scrape it:

#!/usr/bin/env python3

#
# https://stackoverflow.com/a/48067671/1832058
# 

from scrapy.http import Request
from scrapy.item import Field, Item
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor
from scrapy.loader import ItemLoader
import json

class Course_spider(CrawlSpider):

    name …

Python: How to scrape ef.edu with requests, BS

Here is example code to scrape it:

#!/usr/bin/env python3

# date: 2020.02.26
# https://stackoverflow.com/questions/60405929/python-beautifulsoup-adding-words-from-an-html-paragraph-tag-to-list

import requests
from bs4 import BeautifulSoup

page = requests.get("https://www.ef.edu/english-resources/english-vocabulary/top-1000-words/") 

soup = BeautifulSoup(page.content, "html.parser")
para = soup.find(class_="field-item even")

second_p …
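
The step the snippet is heading for is splitting a paragraph's text into a list of words. An offline sketch on an invented fragment that mimics the `field-item even` container on ef.edu:

```python
# Find the container by class, take a <p> tag's text, and split it into words.
from bs4 import BeautifulSoup

html = '<div class="field-item even"><p>the be to of and</p></div>'

soup = BeautifulSoup(html, 'html.parser')
para = soup.find(class_='field-item even')

words = para.find('p').get_text().split()
print(words)
```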

Python: How to scrape free games on epicgames.com with requests

Here is example code to scrape it:

#!/usr/bin/env python3

# date: 2020.05.18
# https://stackoverflow.com/questions/61876744/scraper-returns-null-result/

import requests

url = 'https://store-site-backend-static.ak.epicgames.com/freeGamesPromotions?locale=en-US&country=PL&allowCountries=PL'

r = requests.get(url)

data = r.json()

#print(r.text)

for item in data …
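
Walking the `freeGamesPromotions` JSON means descending the nesting to the `elements` list. The sample dict below imitates the shape the endpoint returned at the time (`data` → `Catalog` → `searchStore` → `elements`); treat that path as an assumption, since the API may change:

```python
# Descend the nested JSON to the list of game entries and read each title.
sample = {
    'data': {'Catalog': {'searchStore': {'elements': [
        {'title': 'Example Game'},
        {'title': 'Another Game'},
    ]}}}
}

titles = [item['title']
          for item in sample['data']['Catalog']['searchStore']['elements']]
print(titles)
```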

Python: How to scrape espn.com (1) with requests, pandas

Here is example code to scrape it:

import requests
import pandas as pd

url = 'https://site.web.api.espn.com/apis/common/v3/sports/football/nfl/statistics/byathlete?region=us&lang=en&contentorigin=espn&isqualified=false&limit=50&category=offense%3Arushing&sort=rushing.rushingYards%3Adesc&season=2018&seasontype=2&page …

Python: How to scrape espn.com (2) with requests

Here is example code to scrape it:

# date: 2020.03.01
# https://stackoverflow.com/questions/60471569/turning-for-loop-into-multiprocessing-loop/

import requests
import time

def get_data(query):

    url = 'https://site.web.api.espn.com/apis/common/v3/search?region=us&lang=en&query={}&limit=5&mode=prefix&type=player'.format(query)

    r = requests …
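
The linked question is about turning that sequential loop into a parallel one. A minimal sketch using the thread-backed `multiprocessing.dummy.Pool` (same API as `multiprocessing.Pool`, no `__main__` guard needed, and threads suit network-bound requests); the body of `get_data` here is a stand-in for the real `requests` call:

```python
# Map a function over many queries in parallel with a thread pool.
from multiprocessing.dummy import Pool  # thread-based, same API as multiprocessing.Pool

def get_data(query):
    # placeholder for the requests.get(...) call in the snippet above
    return query.upper()

with Pool(2) as pool:
    results = pool.map(get_data, ['brady', 'mahomes'])

print(results)
```

For CPU-bound work, swap in `from multiprocessing import Pool` and guard the pool creation with `if __name__ == '__main__':`.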

Python: How to scrape facebook.com with selenium

Here is example code to scrape it:

#
# https://stackoverflow.com/a/47539575/1832058
# 

from selenium import webdriver

browser = webdriver.Chrome()  # or: webdriver.Chrome('/usr/local/bin/chromedriver')

browser.get('https://www.facebook.com/SparColruytGroup/app/300001396778554?app_data=DD722A43-C774-FC01-8823-8016BFF8F0D0')
browser.implicitly_wait(5)

iframe = browser.find_element_by_css_selector('#pagelet_app_runner iframe')
browser.switch_to.frame(iframe)

iframe = browser.find_element_by_css_selector('#qualifio_insert_place …

Python: How to scrape fbref.com with requests

Here is example code to scrape it:

#!/usr/bin/env python

# date: 2019.09.21
# 

from bs4 import BeautifulSoup as BS
import requests 

url = 'https://fbref.com/en/matches/033092ef/Northampton-Town-Lincoln-City-August-4-2018-League-Two'

response = requests.get(url)
soup = BS(response.content, 'html.parser')

stats = soup.find('div', id="team_stats")

data = []
for row …
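
The loop collects one list per table row, skipping the header. An offline sketch on an invented stats table, not the real fbref.com markup:

```python
# Find the stats container by id, then read each body row's cells into a list.
from bs4 import BeautifulSoup

html = '''
<div id="team_stats"><table>
  <tr><th>Stat</th><th>Home</th><th>Away</th></tr>
  <tr><td>Possession</td><td>55%</td><td>45%</td></tr>
  <tr><td>Shots</td><td>12</td><td>8</td></tr>
</table></div>
'''

soup = BeautifulSoup(html, 'html.parser')
stats = soup.find('div', id='team_stats')

data = []
for row in stats.find_all('tr')[1:]:  # skip the header row
    data.append([cell.get_text() for cell in row.find_all('td')])

print(data)
```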

Python: How to scrape fcainfoweb.nic.in with selenium

Here is example code to scrape it:

#!/usr/bin/env python3

# date: 2020.05.28
# 

from selenium import webdriver 
from selenium.webdriver.support.ui import Select
import pandas as pd
import time

# --- functions ---

def get_data(start_date, end_date, product):

    # select `Variation Report`
    driver.find_element_by_id('ctl00_MainContent_Rbl_Rpt_type_1').click()

    # select `Daily Variant`
    element_variation = driver …

Python: How to scrape fileinfo.com with selenium

Here is example code to scrape it:

#!/usr/bin/env python3

# date: 2020.04.19
# https://stackoverflow.com/questions/61298422/extracting-specific-elements-in-a-table-with-selenium-in-python/

import selenium.webdriver

driver = selenium.webdriver.Firefox()

# --- video ---

url = 'https://fileinfo.com/filetypes/video'
driver.get(url)

all_items = driver.find_elements_by_xpath('//td/a')

for item in all_items:
    print(item.text …
