Python: how to combine two columns of a DataFrame into a single list?
The underlying NumPy array is organized as array([[row1], [row2], ..., [rowN]]), so we can ravel it, which should be very fast. Because ravel flattens row by row, the values of the two columns come out interleaved.
df[['data1', 'data2']].to_numpy().ravel().tolist()
# [20, 120, 30, 456, 40, 34]
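As a minimal sketch of what the one-liner does, here is a self-contained version. The column values are an assumption, chosen so that they reproduce the output shown in the comment; the original DataFrame is not given here.

```python
import pandas as pd

# Hypothetical data, inferred from the expected output in the answer.
df = pd.DataFrame({'data1': [20, 30, 40], 'data2': [120, 456, 34]})

# to_numpy() gives a 2-D array of shape (3, 2): one row per DataFrame row.
arr = df[['data1', 'data2']].to_numpy()

# ravel() defaults to row-major order, so the two columns are interleaved.
result = arr.ravel().tolist()
print(result)  # [20, 120, 30, 456, 40, 34]

# order='F' flattens column by column instead, if that layout is wanted.
by_column = arr.ravel(order='F').tolist()
print(by_column)  # [20, 30, 40, 120, 456, 34]
```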
Because I was interested: here are all the proposed methods, plus another one using chain, with timings for producing your output from two columns as a function of the length of the DataFrame.
import perfplot
import pandas as pd
import numpy as np
from itertools import chain

perfplot.show(
    setup=lambda n: pd.DataFrame(np.random.randint(1, 10, (n, 2))),
    kernels=[
        lambda df: df[[0, 1]].to_numpy().ravel().tolist(),
        lambda df: [x for i in zip(df[0], df[1]) for x in i],
        lambda df: [*chain.from_iterable(df[[0, 1]].to_numpy())],
        lambda df: df[[0, 1]].stack().tolist(),  # proposed by @anky_91
    ],
    labels=['ravel', 'zip', 'chain', 'stack'],
    n_range=[2 ** k for k in range(20)],
    equality_check=np.allclose,
    xlabel="len(df)",
)
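If you only want to confirm that the four approaches return the same values, without installing perfplot, a small sketch along these lines should do. The sample frame and the names `methods` and `results` are assumptions made for illustration; they are not part of the benchmark.

```python
from itertools import chain

import numpy as np
import pandas as pd

# Small hypothetical frame with integer column labels 0 and 1,
# mirroring the shape produced by the benchmark's setup function.
df = pd.DataFrame(np.arange(1, 7).reshape(3, 2))

# The same four kernels as in the benchmark, keyed by label.
methods = {
    'ravel': lambda df: df[[0, 1]].to_numpy().ravel().tolist(),
    'zip': lambda df: [x for i in zip(df[0], df[1]) for x in i],
    'chain': lambda df: [*chain.from_iterable(df[[0, 1]].to_numpy())],
    'stack': lambda df: df[[0, 1]].stack().tolist(),
}

# Run each kernel once and print its result for comparison.
results = {name: fn(df) for name, fn in methods.items()}
for name, out in results.items():
    print(name, out)
```

This checks only that the outputs agree on a tiny input; it says nothing about speed, which is what the perfplot run measures.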