Get all rows where the "message" column contains at least one word from an array
df = pd.read_csv('...csv')
array = [.....,....,....]
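results = df[df.message.isin(array).fillna(False)]
The message column contains multiple words, so isin, which compares the whole cell against each array element, does not match anything here.
How can we get all rows where at least one word in message is in array?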
Example:
Client    message                            City    Phone
Jackson   I will back soon                   Rome    1111
Cole      Please try to be patient           Cairo   2222
Rains     Sure anything you want,anything    Paris   3333
Array = ['try','anything','patient']
Result:
Cole      Please try to be patient           Cairo   2222
Rains     Sure anything you want,anything    Paris   3333
Perhaps something like this (a one-liner with no loop):
import pandas as pd
data = [['Client','message','City','Phone'],['Jackson','I will BACk soon','Rome',1111],['Cole','Please try to be patient','Cairo',2222 ],['Rains','Sure anything you want,anything','Paris',3333 ]]
Array = ['try','anything','patient']
df = pd.DataFrame(data[1:],columns=data[0])
print(df[df['message'].str.contains('|'.join(Array))])
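One caveat with joining the words directly: `str.contains` treats the pattern as a regular expression and matches substrings, so 'try' would also match 'country'. Below is a minimal sketch, assuming whole-word matching is what is wanted. It escapes each word with `re.escape` and wraps the alternation in `\b` word boundaries. The extra "Smith" row is a hypothetical addition to show the difference.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {
        "Client": ["Jackson", "Cole", "Rains", "Smith"],
        "message": [
            "I will back soon",
            "Please try to be patient",
            "Sure anything you want,anything",
            "Lovely country",  # hypothetical row: contains "try" only as a substring
        ],
    }
)
Array = ["try", "anything", "patient"]

# Escape regex metacharacters and require a word boundary on both sides.
pattern = r"\b(?:" + "|".join(re.escape(word) for word in Array) + r")\b"

# na=False treats missing messages as non-matching.
mask = df["message"].str.contains(pattern, regex=True, na=False)
results = df[mask]
print(results)
```

With the plain `'|'.join(Array)` pattern the "Smith" row would be selected as well. Whether that is desirable depends on how "word" is meant in the question.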
Inspired by "How to test if a string contains one of the substrings in a list, in pandas?" and "Select by partial string from a pandas DataFrame", something like this could solve your problem:
Array = ['try','patient']

def find_words(element):
    # Return True as soon as any word from Array occurs in the message.
    for word in Array:
        if word in element:
            return True
    return False

results = df[df["message"].apply(find_words)]
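Note that `word in element` is a substring test on the whole message. If "word" should mean a token of the message, one alternative sketch is to split each message into words and check for overlap with a set. The tokenizing regex `[A-Za-z']+`, the lowercasing, and the helper name `has_any_word` are assumptions made for illustration and are not part of the original answer.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {
        "Client": ["Jackson", "Cole", "Rains"],
        "message": [
            "I will back soon",
            "Please try to be patient",
            "Sure anything you want,anything",
        ],
    }
)
wanted = {"try", "patient"}


def has_any_word(message):
    """Return True if any token of the message is in `wanted`."""
    if not isinstance(message, str):
        return False  # treat NaN or other non-string cells as non-matching
    tokens = set(re.findall(r"[A-Za-z']+", message.lower()))
    return not wanted.isdisjoint(tokens)


results = df[df["message"].apply(has_any_word)]
print(results)
```

Using a set keeps each lookup cheap when the word list grows, at the cost of tokenizing every message.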
Edit:
In addition to the above, we can also pull this off in a single line. It is not the most elegant solution, but it works :)
Array = ['try','patient']
results = df[df["message"].apply(lambda x: any(word in x for word in Array))]
import string

import numpy as np
import pandas as pd

# Assumption: `letters` was not defined in the original; lowercase ASCII is used here.
letters = list(string.ascii_lowercase)

def generate_sample_data(words_to_be_detected):
    ## Generate random words
    num_random_words = 20
    random_words = [''.join(np.random.choice(letters, np.random.randint(2, 5)))
                    for _ in range(np.random.randint(num_random_words))]
    ## Combine lists
    list_of_possible_words = random_words + words_to_be_detected
    ## Define number of rows to be generated
    df_num_rows = 5
    ## Generate sample data: each message is a list of 0 to 4 words
    data_sample_dict = [
        {"message": list(np.random.choice(list_of_possible_words, np.random.randint(0, 5)))}
        for _ in range(df_num_rows)
    ]
    return pd.DataFrame(data_sample_dict)

## Define words to be found in the Series
words_to_be_detected = ['XXXX', 'YYYY']
## Generate synthetic data
df = generate_sample_data(words_to_be_detected)
## astype(str) turns each list into its string form so str.contains can search it
contains_str_mask = df.message.astype(str).str.contains('|'.join(words_to_be_detected))
print(df)
print('only the ones we are looking for')
print(df[contains_str_mask])
In summary:
df.message.astype(str).str.contains('|'.join(words_to_be_detected))
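When the column holds lists of words rather than strings, as in the generated sample data, `astype(str)` searches the printed form of each list, which is again a substring match. A minimal sketch of an exact-token alternative, assuming every cell is a list, is shown here. The small frame is hypothetical and only mirrors the shape of the sample data.

```python
import pandas as pd

# Hypothetical list-valued column, mirroring the shape of the generated sample data.
df = pd.DataFrame({"message": [["ab", "XXXX"], [], ["cd", "ef"], ["YYYY", "YYYY"]]})
words_to_be_detected = ["XXXX", "YYYY"]

wanted = set(words_to_be_detected)

# A row matches when its list shares at least one element with `wanted`.
mask = df["message"].apply(lambda words: not wanted.isdisjoint(words))
print(df[mask])
```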