Union of two pandas DataFrames


Merge with an indicator argument, and remap the result:

    m = {'left_only': 'df1', 'right_only': 'df2', 'both': 'df1, df2'}
    result = df1.merge(df2, on=['A'], how='outer', indicator='B')
    result['B'] = result['B'].map(m)
    result

       A         B
    0  a  df1, df2
    1  b       df1
    2  c       df2
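For a version that runs on its own, here is a minimal sketch. It assumes `df1` and `df2` are the small single-column frames used elsewhere in this thread; the names `raw` and `labels` are only illustrative. It casts the indicator column to plain strings before remapping, so the result holds ordinary strings rather than categories.

```python
import pandas as pd

# Sample frames (assumed; the same shape as the ones used elsewhere in this thread).
df1 = pd.DataFrame({'A': ['a', 'b']})
df2 = pd.DataFrame({'A': ['a', 'c']})

# indicator='B' adds a categorical column B whose values are
# 'left_only', 'right_only', or 'both'.
raw = df1.merge(df2, on='A', how='outer', indicator='B')

# Illustrative mapping from merge indicator values to source labels.
labels = {'left_only': 'df1', 'right_only': 'df2', 'both': 'df1, df2'}

# Cast to str first so the remapped column holds ordinary strings.
result = raw.assign(B=raw['B'].astype(str).map(labels))
print(result)
```

If `A` contains duplicate values in either frame, the merge emits one row per matching pair, so you may need to drop duplicates first.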


An outer join also solves this. Tag each frame with its own label column, merge, and concatenate the labels:

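    import pandas as pd

    df1 = pd.DataFrame({'A': ['a', 'b']})
    df2 = pd.DataFrame({'A': ['a', 'c']})
    df1['col1'] = 'df1'
    df2['col2'] = 'df2'
    # Rows missing from one side get '' instead of NaN
    df = pd.merge(df1, df2, on=['A'], how="outer").fillna('')
    df['B'] = df['col1'] + ',' + df['col2']
    # Drop the leading/trailing comma left when one side is empty
    df['B'] = df['B'].str.strip(',')
    df = df[['A', 'B']]
    df

       A        B
    0  a  df1,df2
    1  b      df1
    2  c      df2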


Concatenate the two frames, tagging each with its source, then group by A:

    df3 = pd.concat([df1.assign(source='df1'), df2.assign(source='df2')]) \
        .groupby('A') \
        .aggregate(list) \
        .reset_index()

The result will be:

       A      source
    0  a  [df1, df2]
    1  b       [df1]
    2  c       [df2]

assign adds a column named source, with the value df1 or df2, to each DataFrame. groupby collapses rows with the same A value into a single row. aggregate describes how to combine the other columns (here, source) for each group. I used list as the aggregation function so that source becomes the list of values that share the same A.
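If you would rather have a comma-separated string than a list, one possible variation is to aggregate with a string join. This is a sketch, not part of the original answer: the sample frames are assumed, and the `tagged` name and the `B` output column are chosen here only to match the other answers.

```python
import pandas as pd

# Sample frames (assumed; the same shape as the ones used elsewhere in this thread).
df1 = pd.DataFrame({'A': ['a', 'b']})
df2 = pd.DataFrame({'A': ['a', 'c']})

# Stack both frames, labelling every row with the frame it came from.
tagged = pd.concat([df1.assign(source='df1'), df2.assign(source='df2')])

# Join each group's labels into one string instead of collecting a list.
df3 = (
    tagged.groupby('A')['source']
    .agg(', '.join)
    .reset_index(name='B')
)
print(df3)
```

Because concat keeps df1's rows ahead of df2's, the labels within each group come out in that order.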