Find intersection of two columns in Python Pandas -> list of strings
I believe you need the length of the set.intersection result, computed per row in a list comprehension:
df['C'] = [len(set(a).intersection(b)) for a, b in zip(df.A, df.B)]
Or:
df['C'] = [len(set(a) & set(b)) for a, b in zip(df.A, df.B)]
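Both forms give the same count. The practical difference is that set.intersection accepts any iterable as its argument, while the & operator needs a set on both sides. A minimal sketch with plain lists (the variable names here are only for illustration):

```python
a = ['car', 'passenger', 'truck']
b = ['car', 'house', 'flower', 'truck']

# Method form: only the left side has to be a set
n_method = len(set(a).intersection(b))

# Operator form: both sides have to be sets
n_operator = len(set(a) & set(b))

# Mixing a set with a list under & raises TypeError
try:
    set(a) & b
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
```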
Sample:
import pandas as pd

df = pd.DataFrame(data={'A':[['car', 'passenger', 'truck'], ['car', 'truck']],
                        'B':[['car', 'house', 'flower', 'truck'], ['car', 'house']]})
print (df)
                         A                            B
0  [car, passenger, truck]  [car, house, flower, truck]
1             [car, truck]                 [car, house]

df['C'] = [len(set(a).intersection(b)) for a, b in zip(df.A, df.B)]
print (df)
                         A                            B  C
0  [car, passenger, truck]  [car, house, flower, truck]  2
1             [car, truck]                 [car, house]  1
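If the shared strings themselves are needed rather than only their count, the same comprehension can keep the intersection as a list. This is a sketch under the assumption that each cell holds a list of hashable values; the column name `common` is made up for the example, and sorted() is used only to get a stable order, since sets are unordered.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [['car', 'passenger', 'truck'], ['car', 'truck']],
    'B': [['car', 'house', 'flower', 'truck'], ['car', 'house']],
})

# Keep the shared items per row; sorted() turns the set into an ordered list
df['common'] = [sorted(set(a).intersection(b)) for a, b in zip(df.A, df.B)]

# The count can then be derived from the stored lists
df['C'] = df['common'].map(len)
```

Note that converting to a set drops duplicates, so repeated strings within one cell are counted once.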