Update df columns to match the order on a different column - pandas

I have a .csv file that looks like this:
Symbol,Address,Website,Price,Vol
URBT,,,0.99,12345
TSPG,,,1.99,12346
CRBO,,,2.99,12347
PVSP,,,3.99,12348
TPRP,,,4.99,12349
VMHG,,,5.99,12350
TORM,,,6.99,12351
SORT,,,7.99,12352
MRTI,,,8.99,12353
VTMC,,,9.99,12354
I want to update the Address and Website columns with this:
[{'ticker': 'VMHG','address': '555 NE 34th St. Suite 1207','website': 'http://www.VictoryYachts.com'},{'ticker': 'CRBO','address': '1700 Broadway','website': 'http://www.carbonenergycorp.com'},{'ticker': 'PVSP','address': '800 Westchester Ave.','website': 'https://www.pervasip.net'},{'ticker': 'VTMC','address': '55 W. 47 Street','website': 'http://www.vtmc.us'},{'ticker': 'SORT','address': '31 Clinton Ave','website': 'http://www.incjet.com'},{'ticker': 'URBT','address': '11705 Willake Street','website': 'https://urbt.tv'},{'ticker': 'TORM','address': '722 Burleson Street','website': 'http://www.torminerals.com'},{'ticker': 'MRTI','address': '104 Armour road','website': 'http://www.mrti.com'},{'ticker': 'TPRP','address': '1000 Walnut St.','website': 'http://www.towerproperties.com'},{'ticker': 'TSPG','address': '525 Milltown Rd,','website': 'http://www.tgipower.com/'}]
I can do it like this:
import pandas as pd

if __name__ == "__main__":
    df = pd.read_csv("tickers.csv")
    symbols = df["Symbol"].to_list()
    scraped_data = [...]  # the list of dicts shown above (elided here)
    # Sort the scraped dicts into the same order as the Symbol column.
    sorted_scraped_data = sorted(
        scraped_data, key=lambda i: symbols.index(i["ticker"])
    )
    df.loc[:, ["Address", "Website"]] = [
        [i["address"], i["website"]] for i in sorted_scraped_data
    ]
    df.to_csv("tickers.csv", index=False)
This works, but it feels like a hacky workaround for matching the order of the Symbol column to the list of dicts.
If I try to do this with pure pandas, without sorting first, for example:
import pandas as pd

if __name__ == "__main__":
    df = pd.read_csv("tickers.csv")
    symbols = df["Symbol"].to_list()
    scraped_data = [...]  # the list of dicts shown above (elided here)
    df.loc[:, ["Address", "Website"]] = [
        [i["address"], i["website"]] for i in scraped_data
        if df[(df["Symbol"] == i["ticker"])]
    ]
    df.to_csv("tickers.csv", index=False)
I get a ValueError:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
How can I update the df columns to match the order of the Symbol column?

The desired output is:
Symbol,Address,Website,Price,Vol
URBT,11705 Willake Street,https://urbt.tv,0.99,12345
TSPG,"525 Milltown Rd,",http://www.tgipower.com/,1.99,12346
CRBO,1700 Broadway,http://www.carbonenergycorp.com,2.99,12347
PVSP,800 Westchester Ave.,https://www.pervasip.net,3.99,12348
TPRP,1000 Walnut St.,http://www.towerproperties.com,4.99,12349
VMHG,555 NE 34th St. Suite 1207,http://www.VictoryYachts.com,5.99,12350
TORM,722 Burleson Street,http://www.torminerals.com,6.99,12351
SORT,31 Clinton Ave,http://www.incjet.com,7.99,12352
MRTI,104 Armour road,http://www.mrti.com,8.99,12353
VTMC,55 W. 47 Street,http://www.vtmc.us,9.99,12354
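
You get this error because, even though you are slicing the DataFrame in the following snippet:
df.loc[:, ["Address", "Website"]] = [
    [i["address"], i["website"]] for i in scraped_data
    if df[(df["Symbol"] == i["ticker"])]  # This part fails
]
the result of the slice is itself a DataFrame, and you still cannot check its truth value. You would have to actually get the first match, for example.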
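
To make the failure concrete, here is a minimal sketch using a hypothetical two-row frame (the data is made up for illustration and is not the asker's file). A boolean mask selects a sub-DataFrame, which cannot be used directly as an `if` condition; checking `.empty`, or calling `.any()` on the mask, gives an unambiguous answer instead.

```python
import pandas as pd

# Hypothetical toy frame, only for illustration.
df = pd.DataFrame({"Symbol": ["URBT", "TSPG"], "Price": [0.99, 1.99]})

mask = df["Symbol"] == "TSPG"   # boolean Series, one entry per row
subset = df[mask]               # a DataFrame, possibly empty

try:
    if subset:                  # raises ValueError: truth value is ambiguous
        pass
    raised = False
except ValueError:
    raised = True

has_match = mask.any()          # True if at least one row matches
is_empty = subset.empty         # False here, because one row matched
first_match = subset.iloc[0]    # the first matching row, as a Series
```

Even with the condition fixed this way, the list comprehension would still emit rows in the order of scraped_data rather than the order of the Symbol column, which is why the answers here rely on label-based alignment instead.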
Here is a possibly simpler alternative that first converts the scraped data into a DataFrame:
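# Index the scraped rows by ticker so they can be looked up by label.
scraped_df = pd.DataFrame(scraped_data).set_index('ticker')
# Select the scraped rows in the order of df.Symbol and assign them positionally.
df[['Address', 'Website']] = scraped_df.loc[df.Symbol, ['address', 'website']].values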
Output:
   Symbol  Address                     Website                          Price  Vol
0  URBT    11705 Willake Street        https://urbt.tv                  0.99   12345
1  TSPG    525 Milltown Rd,            http://www.tgipower.com/         1.99   12346
2  CRBO    1700 Broadway               http://www.carbonenergycorp.com  2.99   12347
3  PVSP    800 Westchester Ave.        https://www.pervasip.net         3.99   12348
4  TPRP    1000 Walnut St.             http://www.towerproperties.com   4.99   12349
5  VMHG    555 NE 34th St. Suite 1207  http://www.VictoryYachts.com     5.99   12350
6  TORM    722 Burleson Street         http://www.torminerals.com       6.99   12351
7  SORT    31 Clinton Ave              http://www.incjet.com            7.99   12352
8  MRTI    104 Armour road             http://www.mrti.com              8.99   12353
9  VTMC    55 W. 47 Street             http://www.vtmc.us               9.99   12354
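
One caveat with the `.loc` lookup above: selecting with a list of labels raises a KeyError if any Symbol has no scraped entry. The following is a minimal sketch of a more tolerant variant using `reindex`, which yields NaN for missing labels. The toy data, including the ticker "ZZZZ", is hypothetical and only serves to show the missing-label case.

```python
import pandas as pd

# Hypothetical toy data: "ZZZZ" has no scraped entry.
df = pd.DataFrame({
    "Symbol": ["URBT", "ZZZZ", "TSPG"],
    "Address": [None, None, None],
    "Website": [None, None, None],
})
scraped_data = [
    {"ticker": "TSPG", "address": "525 Milltown Rd,", "website": "http://www.tgipower.com/"},
    {"ticker": "URBT", "address": "11705 Willake Street", "website": "https://urbt.tv"},
]

scraped_df = pd.DataFrame(scraped_data).set_index("ticker")

# reindex returns NaN rows for labels that are missing from scraped_df,
# whereas .loc with a list of labels raises KeyError for them.
aligned = scraped_df.reindex(df["Symbol"])[["address", "website"]]

# Rows of `aligned` are already in df's order, so assign them positionally.
df[["Address", "Website"]] = aligned.to_numpy()
```

Whether a silent NaN or a loud KeyError is preferable depends on how much you trust the scraped data to be complete.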
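
Alternatively: first replace the blanks with NaN, then normalize the JSON into a new df, and finally use combine_first to merge the original df with the normalized one. Note that combine_first aligns on the index, so with the default integer index the rows are paired by position rather than by Symbol, as the output shows. The combined code is as follows: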
import numpy as np
import pandas as pd

s = pd.json_normalize(j).rename(columns={"ticker": "Symbol", "address": "Address", "website": "Website"})
df1 = s.combine_first(df.replace('', np.nan))
df1 = df1[['Symbol', 'Address', 'Website', 'Price', 'Vol']]
where
j=[{'ticker': 'VMHG','address': '555 NE 34th St. Suite 1207','website': 'http://www.VictoryYachts.com'},{'ticker': 'CRBO','address': '1700 Broadway','website': 'http://www.carbonenergycorp.com'},{'ticker': 'PVSP','address': '800 Westchester Ave.','website': 'https://www.pervasip.net'},{'ticker': 'VTMC','address': '55 W. 47 Street','website': 'http://www.vtmc.us'},{'ticker': 'SORT','address': '31 Clinton Ave','website': 'http://www.incjet.com'},{'ticker': 'URBT','address': '11705 Willake Street','website': 'https://urbt.tv'},{'ticker': 'TORM','address': '722 Burleson Street','website': 'http://www.torminerals.com'},{'ticker': 'MRTI','address': '104 Armour road','website': 'http://www.mrti.com'},{'ticker': 'TPRP','address': '1000 Walnut St.','website': 'http://www.towerproperties.com'},{'ticker': 'TSPG','address': '525 Milltown Rd,','website': 'http://www.tgipower.com/'}]
Output:
   Symbol  Address                     Website                          Price  Vol
0  VMHG    555 NE 34th St. Suite 1207  http://www.VictoryYachts.com     0.99   12345.0
1  CRBO    1700 Broadway               http://www.carbonenergycorp.com  1.99   12346.0
2  PVSP    800 Westchester Ave.        https://www.pervasip.net         2.99   12347.0
3  VTMC    55 W. 47 Street             http://www.vtmc.us               3.99   12348.0
4  SORT    31 Clinton Ave              http://www.incjet.com            4.99   12349.0
5  URBT    11705 Willake Street        https://urbt.tv                  5.99   12350.0
6  TORM    722 Burleson Street         http://www.torminerals.com       6.99   12351.0
7  MRTI    104 Armour road             http://www.mrti.com              7.99   12352.0
8  TPRP    1000 Walnut St.             http://www.towerproperties.com   8.99   12353.0
9  TSPG    525 Milltown Rd,            http://www.tgipower.com/         9.99   12354.0
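
In the output above, the rows follow the order of the scraped list and Price and Vol are attached by position (VMHG ends up with 0.99 instead of 5.99). A minimal sketch of an index-aligned variant is shown below; it assumes tickers are unique on both sides and uses a hypothetical three-row subset of the data. Setting Symbol as the index before calling combine_first lets pandas match rows by ticker.

```python
import pandas as pd

# Hypothetical toy data; the scraped rows arrive in a different order.
df = pd.DataFrame({
    "Symbol": ["URBT", "TSPG", "CRBO"],
    "Address": [None, None, None],
    "Website": [None, None, None],
    "Price": [0.99, 1.99, 2.99],
    "Vol": [12345, 12346, 12347],
})
j = [
    {"ticker": "CRBO", "address": "1700 Broadway", "website": "http://www.carbonenergycorp.com"},
    {"ticker": "URBT", "address": "11705 Willake Street", "website": "https://urbt.tv"},
    {"ticker": "TSPG", "address": "525 Milltown Rd,", "website": "http://www.tgipower.com/"},
]

s = (
    pd.json_normalize(j)
    .rename(columns={"ticker": "Symbol", "address": "Address", "website": "Website"})
    .set_index("Symbol")
)

# With Symbol as the index on both sides, combine_first fills df's missing
# Address/Website values from the row that has the same ticker.
combined = df.set_index("Symbol").combine_first(s)

# combine_first may reorder rows and columns, so restore the original order.
df1 = (
    combined.reindex(df["Symbol"].to_list())
    .rename_axis("Symbol")
    .reset_index()[["Symbol", "Address", "Website", "Price", "Vol"]]
)
```

The explicit reindex and column selection at the end are there because the union of two differently ordered indexes is not guaranteed to keep df's ordering.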

df_scraped is the data you want to join onto df. When you assign column values, pandas performs a join on the index, which is why I set Symbol and ticker as the indexes:
scraped_data = [...]  # the list of dicts shown above (elided here)
df_scraped = pd.DataFrame(scraped_data).set_index("ticker")
df = df.set_index("Symbol")
# Column assignment aligns on the index, so the row order does not matter.
df[["Address", "Website"]] = df_scraped[["address", "website"]]
df = df.reset_index()
Output df:
   Symbol  Address                     Website                          Price  Vol
0  URBT    11705 Willake Street        https://urbt.tv                  0.99   12345
1  TSPG    525 Milltown Rd,            http://www.tgipower.com/         1.99   12346
2  CRBO    1700 Broadway               http://www.carbonenergycorp.com  2.99   12347
3  PVSP    800 Westchester Ave.        https://www.pervasip.net         3.99   12348
4  TPRP    1000 Walnut St.             http://www.towerproperties.com   4.99   12349
5  VMHG    555 NE 34th St. Suite 1207  http://www.VictoryYachts.com     5.99   12350
6  TORM    722 Burleson Street         http://www.torminerals.com       6.99   12351
7  SORT    31 Clinton Ave              http://www.incjet.com            7.99   12352
8  MRTI    104 Armour road             http://www.mrti.com              8.99   12353
9  VTMC    55 W. 47 Street             http://www.vtmc.us               9.99   12354
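
The same label-based lookup can also be sketched without changing df's index, by mapping the Symbol column through the scraped data. This is an illustrative variant rather than part of the answer above; it assumes each ticker appears only once in the scraped list, and the three-row data and the name `lookup` are hypothetical.

```python
import pandas as pd

# Hypothetical three-row subset of the data, for illustration only.
df = pd.DataFrame({"Symbol": ["URBT", "TSPG", "CRBO"], "Price": [0.99, 1.99, 2.99]})
scraped_data = [
    {"ticker": "CRBO", "address": "1700 Broadway", "website": "http://www.carbonenergycorp.com"},
    {"ticker": "URBT", "address": "11705 Willake Street", "website": "https://urbt.tv"},
    {"ticker": "TSPG", "address": "525 Milltown Rd,", "website": "http://www.tgipower.com/"},
]

# Index the scraped rows by ticker so each column acts as a lookup table.
lookup = pd.DataFrame(scraped_data).set_index("ticker")

# Series.map looks up every Symbol in the index of the given Series;
# symbols without a scraped entry would become NaN.
df["Address"] = df["Symbol"].map(lookup["address"])
df["Website"] = df["Symbol"].map(lookup["website"])
```

This keeps df's original integer index untouched, at the cost of one `map` call per column.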