Adding new rows to a Pandas DataFrame in a loop
I am curious how to append or concatenate new data to a Pandas DataFrame on each loop iteration. I use Selenium to load the web page and BeautifulSoup to parse the HTML. From there, I get two data tables per page. I run this over multiple pages, and I want to add the data from table 1 on page 2 to table 1 from page 1, and likewise for table 2 on both pages.
I think I need to call some append function on the DataFrame, but I am not sure.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import csv
import time
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup as soup
import pandas as pd

urls = ["https://racing.hkjc.com/racing/information/English/Racing/LocalResults.aspx?RaceDate=2021/02/06","https://racing.hkjc.com/racing/information/English/Racing/LocalResults.aspx?RaceDate=2021/02/10"]

dataList_races = []    # empty list
dataList_results = []  # empty list
x = 0  # counter

for url in urls:
    driver = webdriver.Chrome()
    driver.get(url)
    html = driver.execute_script("return document.getElementsByTagName('html')[0].innerHTML")
    WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "f_fs13")))
    htmlStr = driver.page_source
    soup_level1 = soup(htmlStr, 'html.parser')
    race_soup = soup_level1.find('tbody', {'class': 'f_fs13'}).find_parent('table')
    results_soup = soup_level1.find('tbody', {'class': 'f_fs12'}).find_parent('table')
    df_races = pd.read_html(str(race_soup))[0]
    dataList_races.append(df_races)
    df_results = pd.read_html(str(results_soup))[0]
    dataList_results.append(df_results)
    print(df_results)
    driver.close()
Any insight would be great. Reading the comments and posts here, and watching YouTube videos, has left me going in circles.
In your loop, do this for any DataFrame you want to append to:
df.loc[len(df.index)] = data_element
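As a minimal, self-contained sketch of that pattern (the column names and values here are invented for illustration):

```python
import pandas as pd

# Start with an empty frame whose columns are known up front.
df = pd.DataFrame(columns=["horse", "place"])

# Each assignment to a new label enlarges the frame by one row.
for row in (["Golden Sixty", 1], ["Exultant", 2]):
    df.loc[len(df.index)] = row

print(df)
```

Assigning to `df.loc[label]` with a label that does not yet exist appends a row in place, so no reassignment of `df` is needed.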
For your case:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import csv
import time
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup as soup
import pandas as pd
urls = ["https://racing.hkjc.com/racing/information/English/Racing/LocalResults.aspx?RaceDate=2021/02/06","https://racing.hkjc.com/racing/information/English/Racing/LocalResults.aspx?RaceDate=2021/02/10"]

# These must be DataFrames, not plain lists, for .loc row-appending to work.
datalist_races = pd.DataFrame()
datalist_results = pd.DataFrame()
x = 0  # counter

for url in urls:
    driver = webdriver.Chrome()
    driver.get(url)
    html = driver.execute_script("return document.getElementsByTagName('html')[0].innerHTML")
    WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "f_fs13")))
    htmlStr = driver.page_source
    soup_level1 = soup(htmlStr, 'html.parser')
    race_soup = soup_level1.find('tbody', {'class': 'f_fs13'}).find_parent('table')
    results_soup = soup_level1.find('tbody', {'class': 'f_fs12'}).find_parent('table')
    df_races = pd.read_html(str(race_soup))[0]
    # append this page's first table row at the next integer label
    datalist_races.loc[len(datalist_races.index)] = df_races.loc[0]
    df_results = pd.read_html(str(results_soup))[0]
    datalist_results.loc[len(datalist_results.index)] = df_results.loc[0]
    print(df_results)
    driver.close()
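Since each page already yields a full DataFrame from pd.read_html, an alternative is to collect the frames in a plain list and concatenate once after the loop, which stacks all rows (not just the first) and is generally faster than appending row by row. A sketch with invented table contents:

```python
import pandas as pd

# Stand-ins for the per-page DataFrames returned by pd.read_html.
pages = [
    pd.DataFrame({"horse": ["A", "B"], "place": [1, 2]}),
    pd.DataFrame({"horse": ["C", "D"], "place": [1, 2]}),
]

# One concat at the end; ignore_index renumbers the combined rows 0..n-1.
all_races = pd.concat(pages, ignore_index=True)
print(all_races)
```

This matches the goal of adding table 1 from page 2 under table 1 from page 1, and the same list-then-concat step works for the results tables.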