Pandas DataFrame stack multiple column values into single column


You can melt your dataframe:

>>> keys = [c for c in df if c.startswith('key.')]
>>> pd.melt(df, id_vars='topic', value_vars=keys, value_name='key')
   topic variable  key
0      8    key.0  abc
1      9    key.0  xab
2      8    key.1  def
3      9    key.1  xcd
4      8    key.2  ghi
5      9    key.2  xef

The variable column also tells you which source column each key came from.
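For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is an assumption reconstructed from the output above (the question's original data is not shown here), and the sort and drop at the end are optional extras, not part of the answer itself.

```python
import pandas as pd

# Sample frame reconstructed from the output shown above (an assumption).
df = pd.DataFrame({
    "topic": [8, 9],
    "key.0": ["abc", "xab"],
    "key.1": ["def", "xcd"],
    "key.2": ["ghi", "xef"],
})

# Collect the wide columns, then melt them into a single 'key' column.
keys = [c for c in df if c.startswith("key.")]
melted = pd.melt(df, id_vars="topic", value_vars=keys, value_name="key")

# Optional: group each topic's keys together and drop the source column.
result = (
    melted.sort_values(["topic", "variable"])
          .drop(columns="variable")
          .reset_index(drop=True)
)
print(result)
```

Keeping the variable column until after the sort is what makes it possible to order the keys by their original column.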


Since pandas 0.20, melt is also available as a method of pd.DataFrame:

>>> df.melt('topic', value_name='key').drop(columns='variable')
   topic  key
0      8  abc
1      9  xab
2      8  def
3      9  xcd
4      8  ghi
5      9  xef


Since another post has been marked as a duplicate of this question, I will answer here.

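Using wide_to_long (sep='.' is needed when the columns are named key.0, key.1, ...; 'age' is just a throwaway name for the suffix column):

pd.wide_to_long(df, ['key'], 'topic', 'age', sep='.').reset_index().drop(columns='age')
Out[123]:
   topic  key
0      8  abc
1      9  xab
2      8  def
3      9  xcd
4      8  ghi
5      9  xef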
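A self-contained sketch of the wide_to_long approach, in case it helps. The sample frame is an assumption reconstructed from the outputs in this thread, `idx` is an arbitrary name I picked for the suffix column, and `sep="."` assumes the wide columns are named key.0, key.1, and so on. Note that wide_to_long expects the `i` column (here topic) to identify each row uniquely.

```python
import pandas as pd

# Sample frame reconstructed from the outputs in this thread (an assumption).
df = pd.DataFrame({
    "topic": [8, 9],
    "key.0": ["abc", "xab"],
    "key.1": ["def", "xcd"],
    "key.2": ["ghi", "xef"],
})

# stubnames="key" with sep="." matches key.0, key.1, key.2;
# the numeric suffix ends up in the index level named "idx".
long_df = pd.wide_to_long(df, stubnames="key", i="topic", j="idx", sep=".")

# Move the index back into columns and discard the suffix.
result = long_df.reset_index().drop(columns="idx")
print(result)
```

Row order may differ between pandas versions, so sort explicitly if the order matters.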


After trying various ways, I find the following is more or less intuitive, provided stack's magic is understood:

# keep topic as index, stack other columns 'against' it
stacked = df.set_index('topic').stack()
# set the name of the new series created
df = stacked.reset_index(name='key')
# drop the 'source' level (key.*)
df.drop('level_1', axis=1, inplace=True)

The resulting dataframe is as required:

   topic  key
0      8  abc
1      8  def
2      8  ghi
3      9  xab
4      9  xcd
5      9  xef

You may want to print the intermediate results to understand the process in full. If you don't mind having more columns than needed, the key steps are set_index('topic'), stack() and reset_index(name='key').
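To make the intermediate results concrete, here is a small sketch that keeps each step in its own variable and prints it. The sample frame is an assumption reconstructed from the output above; the variable names `indexed`, `tidy` and `result` are mine.

```python
import pandas as pd

# Sample frame reconstructed from the output above (an assumption).
df = pd.DataFrame({
    "topic": [8, 9],
    "key.0": ["abc", "xab"],
    "key.1": ["def", "xcd"],
    "key.2": ["ghi", "xef"],
})

# Step 1: topic becomes the index, so only the key.* columns get stacked.
indexed = df.set_index("topic")

# Step 2: stack() pivots the columns into an inner index level,
# giving a Series indexed by (topic, source column).
stacked = indexed.stack()

# Step 3: reset_index turns both levels into columns; the unnamed
# inner level is called 'level_1', and the values are named 'key'.
tidy = stacked.reset_index(name="key")

# Step 4: drop the source column if it is not needed.
result = tidy.drop(columns="level_1")

for step in (indexed, stacked, tidy, result):
    print(step, end="\n\n")
```

Using drop(columns=...) rather than inplace=True leaves `tidy` untouched, which is handy when inspecting each stage.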