How to sort a dataframe by the first occurrences of each unique element in a column?

Tags: pandas


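factorize + argsort

factorize numbers each value in fehmi by order of first appearance, and argsort on those codes gives the row order. kind='stable' keeps the rows within each group in their original order, which the default sort kind does not guarantee.

import numpy as np

df.iloc[np.argsort(df['fehmi'].factorize()[0], kind='stable')]

   necmi     fehmi
0      0     trial
3     15     trial
1      3     error
6      8     error
2     14  manifest
4      2        no
7      2        no
8     -1        no
5     71      only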
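To make the two steps visible, here is a minimal sketch that prints the intermediate codes. The sample frame is an assumption: it is reconstructed from the output shown above, not taken from the question, and the names codes, uniques and order are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the printed output (assumed input).
df = pd.DataFrame({
    "necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
    "fehmi": ["trial", "error", "manifest", "trial", "no", "only",
              "error", "no", "no"],
})

# factorize labels each value with an integer in order of first appearance.
codes, uniques = pd.factorize(df["fehmi"])
print(codes.tolist())    # one integer code per row
print(list(uniques))     # distinct values in first-seen order

# A stable argsort groups equal codes while keeping original row order.
order = np.argsort(codes, kind="stable")
result = df.iloc[order]
print(result)
```

Since the codes are plain integers, this approach leaves the dtype of the fehmi column untouched.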


A solution with an ordered Categorical whose categories follow the order in which values first appear in the fehmi column, which makes it possible to use DataFrame.sort_values:

df['fehmi'] = pd.Categorical(df['fehmi'], ordered=True, categories=df['fehmi'].unique())
df = df.sort_values('fehmi')
print(df)

   necmi     fehmi
0      0     trial
3     15     trial
1      3     error
6      8     error
2     14  manifest
4      2        no
7      2        no
8     -1        no
5     71      only
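The snippet above overwrites fehmi with a categorical column. If that side effect is unwanted, one possible variation is to sort a temporary categorical copy and reuse its index. This is only a sketch: the sample frame is reconstructed from the printed output, and cat_type and sorted_index are illustrative names.

```python
import pandas as pd

# Sample frame reconstructed from the printed output (assumed input).
df = pd.DataFrame({
    "necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
    "fehmi": ["trial", "error", "manifest", "trial", "no", "only",
              "error", "no", "no"],
})

# Ordered categorical dtype whose categories follow first appearance.
cat_type = pd.CategoricalDtype(categories=df["fehmi"].unique(), ordered=True)

# Sort a temporary categorical copy of the column; df itself is not modified.
sorted_index = df["fehmi"].astype(cat_type).sort_values(kind="stable").index

# Reorder the original rows by that index (assumes the index is unique).
result = df.loc[sorted_index]
print(result)
```

The kind="stable" argument is there so that rows with the same fehmi value stay in their original relative order.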


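Not the most elegant solution, but it works by recording the index of each value's first occurrence and merging it back onto the original df. Because the sort also uses necmi as a second key, rows within each group come out ordered by necmi rather than in their original order.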

ordering_df = df.reset_index().groupby("fehmi").first().reset_index()[["fehmi", "index"]]
result_df = df.merge(ordering_df, how="left", on="fehmi").sort_values(["index", "necmi"])[["necmi", "fehmi"]]

necmi   fehmi
0       trial
15      trial
3       error
8       error
14      manifest
-1      no
2       no
2       no
71      only
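The merge-based snippet above sorts on necmi as a second key. If the original row order within each group should be kept instead, a variation along these lines may work. It is a sketch under assumptions: the sample frame is reconstructed from the printed output, df is assumed to have a default RangeIndex, and first_pos and first_index are illustrative names.

```python
import pandas as pd

# Sample frame reconstructed from the printed output (assumed input).
df = pd.DataFrame({
    "necmi": [0, 3, 14, 15, 2, 71, 8, 2, -1],
    "fehmi": ["trial", "error", "manifest", "trial", "no", "only",
              "error", "no", "no"],
})

# Record the row label at which each fehmi value first appears.
first_pos = (
    df.reset_index()
      .drop_duplicates("fehmi")  # keeps the first row per value
      .rename(columns={"index": "first_index"})[["fehmi", "first_index"]]
)

# Attach the first-occurrence label to every row, then sort stably on it alone.
merged = df.merge(first_pos, how="left", on="fehmi")
result = merged.sort_values("first_index", kind="stable")[["necmi", "fehmi"]]
print(result)
```

Here drop_duplicates stands in for groupby(...).first() only to avoid the key sorting that groupby performs by default; either way the lookup table has one row per distinct value.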