Question:
Sorry for the noobish question.
I'm learning to use BeautifulSoup, and I'm trying to extract a specific string of data within a table.
The website is https://airtmrates.com/ and the exact string I'm trying to get is:
VES Bolivar Soberano Bank Value Value Value
The table doesn't have any class so I have no idea how to find and parse that string.
I've been pulling something out of my buttcheeks but I've failed miserably. Here's the last code I tried so you can have a laugh:
def airtm():
    # URLs and BS execution
    url = requests.get("https://airtmrates.com/")
    response = requests.get(url)
    html = response.content
    soup_ = soup(url, 'html.parser')
    columns = soup_.findAll('td', text = re.compile('VES'), attrs = {'::before'})
    return columns
Answer 1:
The page is dynamic, meaning it has to be rendered before you can parse it. You can do that with either Selenium or Requests-HTML.
I'm not too familiar with Requests-HTML, but I have used Selenium in the past. This should get you going. Also, whenever I'm looking to pull a <table> tag, I like to use pandas to parse it. BeautifulSoup can still be used; it just takes a little more work to iterate through the table, tr, and td tags. Pandas can do that work for you with .read_html():
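from selenium import webdriver
import pandas as pd

def airtm(url):
    # Open the page in a real browser so its JavaScript gets rendered
    # (newer Selenium releases expect the driver path via a Service object)
    driver = webdriver.Chrome("C:/chromedriver_win32/chromedriver.exe")
    driver.get(url)
    # read_html returns one DataFrame per <table> in the rendered HTML
    tables = pd.read_html(driver.page_source)
    df = tables[0]
    # Keep only the rows whose currency code is VES
    df = df[df['Code'] == 'VES']
    driver.close()
    return df

results = airtm('https://airtmrates.com/')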
Output:
print(results)
    Code              Name         Method    Rate      Buy     Sell
120  VES  Bolivar Soberano           Bank  2526.7  2687.98  2383.68
143  VES  Bolivar Soberano   Mercado Pago  2526.7  2631.98  2429.52
264  VES  Bolivar Soberano      MoneyGram  2526.7  2776.59  2339.54
455  VES  Bolivar Soberano  Western Union  2526.7  2746.41  2383.68
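If you'd rather stay with BeautifulSoup once the page has been rendered, here is a rough sketch of the manual table / tr / td iteration mentioned earlier. Treat it as an illustration only: SAMPLE_HTML is a hand-written stand-in for driver.page_source (the column layout and the two VES rows are borrowed from the output above, the USD row is made up), the real page's markup may differ, and rows_for_code is a hypothetical helper name rather than anything provided by BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Stand-in for driver.page_source: a tiny hand-written table, NOT the real page.
SAMPLE_HTML = """
<table>
  <thead>
    <tr><th>Code</th><th>Name</th><th>Method</th><th>Rate</th><th>Buy</th><th>Sell</th></tr>
  </thead>
  <tbody>
    <tr><td>USD</td><td>US Dollar</td><td>Bank</td><td>1.0</td><td>1.05</td><td>0.95</td></tr>
    <tr><td>VES</td><td>Bolivar Soberano</td><td>Bank</td><td>2526.7</td><td>2687.98</td><td>2383.68</td></tr>
    <tr><td>VES</td><td>Bolivar Soberano</td><td>Mercado Pago</td><td>2526.7</td><td>2631.98</td><td>2429.52</td></tr>
  </tbody>
</table>
"""


def rows_for_code(html, code):
    """Return the cell texts of every table row whose first cell equals `code`."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")  # the table has no class, so take the first one
    matches = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        # Header rows only contain <th> cells, so `cells` is empty for them.
        if cells and cells[0] == code:
            matches.append(cells)
    return matches


ves_rows = rows_for_code(SAMPLE_HTML, "VES")
for row in ves_rows:
    print(row)
```

In practice you would pass driver.page_source instead of SAMPLE_HTML. Matching on the first cell, rather than searching every td for text containing "VES", avoids picking up rows where the code happens to appear in another column.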