Selecting columns by list (and columns are subset of list)
I think you need Index.intersection:
    import pandas as pd

    df = pd.DataFrame({'A':[1,2,3], 'B':[4,5,6], 'C':[7,8,9], 'D':[1,3,5], 'E':[5,3,6], 'F':[7,4,3]})
    print (df)
       A  B  C  D  E  F
    0  1  4  7  1  5  7
    1  2  5  8  3  3  4
    2  3  6  9  5  6  3

    lst = ['A','R','B']

    # Only the labels present in both df.columns and lst are kept ('R' is dropped)
    print (df.columns.intersection(lst))
    Index(['A', 'B'], dtype='object')

    data = df[df.columns.intersection(lst)]
    print (data)
       A  B
    0  1  4
    1  2  5
    2  3  6
Another solution with numpy.intersect1d:
    import numpy as np

    data = df[np.intersect1d(df.columns, lst)]
    print (data)
       A  B
    0  1  4
    1  2  5
    2  3  6
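One thing to keep in mind when choosing between these approaches: numpy.intersect1d returns the common values sorted and de-duplicated, whereas filtering df.columns directly keeps the DataFrame's own column order. The sketch below illustrates the difference; the frame df_unsorted and the list wanted are made up for this example and are not part of the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose columns are deliberately not in alphabetical order.
df_unsorted = pd.DataFrame({'C': [7, 8, 9], 'A': [1, 2, 3], 'B': [4, 5, 6]})
wanted = ['B', 'R', 'C']  # 'R' is not a column, so it is dropped by both approaches

# np.intersect1d returns the sorted, unique values common to both inputs.
sorted_cols = list(np.intersect1d(df_unsorted.columns, wanted))

# A list comprehension over df.columns keeps the DataFrame's column order.
ordered_cols = [c for c in df_unsorted.columns if c in wanted]

print(sorted_cols)
print(ordered_cols)
print(df_unsorted[ordered_cols])
```

If the order of the selected columns matters downstream, the list comprehension is probably the safer choice.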
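A few other ways. Building the column list with a list comprehension is much faster than the set-based methods (see Timings). Note that df.columns & lst relies on older pandas behaviour: using & for set intersection on an Index was deprecated in pandas 1.x and removed in 2.0, so use Index.intersection on recent versions.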
    In [1357]: df[df.columns & lst]
    Out[1357]:
       A  B
    0  1  4
    1  2  5
    2  3  6

    In [1358]: df[[c for c in df.columns if c in lst]]
    Out[1358]:
       A  B
    0  1  4
    1  2  5
    2  3  6
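The list comprehension can be wrapped in a small helper if it is needed in several places. This is only a sketch: the name select_existing is hypothetical, and converting the list to a set is an assumption that should pay off mainly when the list of wanted names is long.

```python
import pandas as pd


def select_existing(df, wanted):
    """Return df restricted to the columns named in `wanted` that actually exist."""
    wanted_set = set(wanted)  # set membership checks are O(1) on average
    # Iterating over df.columns preserves the DataFrame's column order.
    return df[[c for c in df.columns if c in wanted_set]]


df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
data = select_existing(df, ['A', 'R', 'B'])  # 'R' is silently ignored
print(data)
```

If none of the requested names exist, the helper returns a DataFrame with the original index and no columns rather than raising a KeyError.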
Timings
    In [1360]: %timeit [c for c in df.columns if c in lst]
    100000 loops, best of 3: 2.54 µs per loop

    In [1359]: %timeit df.columns & lst
    1000 loops, best of 3: 231 µs per loop

    In [1362]: %timeit df.columns.intersection(lst)
    1000 loops, best of 3: 236 µs per loop

    In [1363]: %timeit np.intersect1d(df.columns, lst)
    10000 loops, best of 3: 26.6 µs per loop
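These timings measure only how long it takes to build the list of column labels, not the DataFrame indexing itself. To reproduce them outside IPython, something along these lines with the standard timeit module should work. It is a rough sketch: the number of repetitions is an arbitrary choice, the & variant is left out because it no longer performs a set intersection on recent pandas, and the absolute numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9],
                   'D': [1, 3, 5], 'E': [5, 3, 6], 'F': [7, 4, 3]})
lst = ['A', 'R', 'B']

# Each candidate only builds the column labels; none of them indexes df.
candidates = {
    'list comprehension': lambda: [c for c in df.columns if c in lst],
    'Index.intersection': lambda: df.columns.intersection(lst),
    'np.intersect1d': lambda: np.intersect1d(df.columns, lst),
}

n_calls = 1000  # arbitrary repetition count for this sketch
# timeit.timeit returns the total time in seconds for n_calls invocations.
results = {name: timeit.timeit(fn, number=n_calls) for name, fn in candidates.items()}

for name, seconds in results.items():
    print(f'{name}: {seconds / n_calls * 1e6:.2f} us per call')
```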
Details
    In [1365]: df
    Out[1365]:
       A  B  C  D  E  F
    0  1  4  7  1  5  7
    1  2  5  8  3  3  4
    2  3  6  9  5  6  3

    In [1366]: lst
    Out[1366]: ['A', 'R', 'B']