Move column by name to front of table in pandas


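We can use ix to reorder by passing a list of column names (ix is only available in older pandas versions; the loc form further down works the same way):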

In [27]:
# get a list of columns
cols = list(df)
# move the column to head of list using index, pop and insert
cols.insert(0, cols.pop(cols.index('Mid')))
cols

Out[27]:
['Mid', 'Net', 'Upper', 'Lower', 'Zsore']

In [28]:
# use ix to reorder
df = df.ix[:, cols]
df

Out[28]:
                      Mid Net  Upper   Lower  Zsore
Answer_option
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

Another method is to take a reference to the column and reinsert it at the front:

In [39]:
mid = df['Mid']
df.drop(labels=['Mid'], axis=1, inplace=True)
df.insert(0, 'Mid', mid)
df

Out[39]:
                      Mid Net  Upper   Lower  Zsore
Answer_option
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

You can also use loc to achieve the same result. ix was deprecated in pandas 0.20.0 (and removed in 1.0), so loc is the safer choice:

df = df.loc[:, cols]
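For anyone who wants to try the loc variant end to end, here is a minimal sketch. The values are copied from the output shown earlier; the starting column order (with Mid somewhere in the middle) is an assumption, since the thread only shows the frame after reordering.

```python
import pandas as pd

# Sample frame rebuilt from the table in this thread.
# Assumption: the original column order was Net, Upper, Lower, Mid, Zsore.
df = pd.DataFrame(
    {
        "Net": ["0%", "0%", "2%", "1%"],
        "Upper": ["0.22%", "0.32%", "2.45%", "1.63%"],
        "Lower": ["-0.12%", "-0.19%", "1.10%", "-0.40%"],
        "Mid": [2, 3, 4, 6],
        "Zsore": [65, 45, 78, 65],
    },
    index=pd.Index(
        ["More_than_once_a_day", "Once_a_day", "Several_times_a_week", "Once_a_week"],
        name="Answer_option",
    ),
)

# Build the new column order: pop 'Mid' out of the list and put it first.
cols = list(df)
cols.insert(0, cols.pop(cols.index("Mid")))

# Select all rows and the columns in the new order; df itself is not modified.
reordered = df.loc[:, cols]
print(reordered)
```

Note that this returns a new DataFrame, so you need to assign the result if you want to keep it.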


Maybe I'm missing something, but a lot of these answers seem overly complicated. You should be able to just set the columns within a single list:

Column to the front:

df = df[ ['Mid'] + [ col for col in df.columns if col != 'Mid' ] ]

Or, if you instead want to move it to the back:

df = df[ [ col for col in df.columns if col != 'Mid' ] + ['Mid'] ]

Or if you wanted to move more than one column:

cols_to_move = ['Mid', 'Zsore']
df = df[ cols_to_move + [ col for col in df.columns if col not in cols_to_move ] ]
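The three one-liners above can be folded into a single helper. This is just a sketch: the name reorder_columns and the to_front flag are made up for illustration, and it assumes the column names are unique.

```python
import pandas as pd


def reorder_columns(df, cols_to_move, to_front=True):
    """Return a new DataFrame with cols_to_move placed first (or last).

    Hypothetical helper, not part of pandas. Assumes unique column names.
    """
    missing = [c for c in cols_to_move if c not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    # Everything that is not being moved keeps its relative order.
    rest = [c for c in df.columns if c not in cols_to_move]
    moved = list(cols_to_move)
    new_order = moved + rest if to_front else rest + moved
    return df[new_order]


demo = pd.DataFrame({"Net": [1], "Upper": [2], "Lower": [3], "Mid": [4], "Zsore": [5]})
front = reorder_columns(demo, ["Mid", "Zsore"])
back = reorder_columns(demo, ["Net"], to_front=False)
print(list(front.columns))
print(list(back.columns))
```

The explicit check for missing names is optional; without it pandas raises its own KeyError at the df[new_order] step.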


I prefer this solution:

col = df.pop("Mid")
df.insert(0, col.name, col)

It's simpler to read and faster than other suggested answers.

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)
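A quick usage sketch of move_column_inplace on a made-up four-column frame (the frame and its column names are only for illustration). One thing to keep in mind: pos is interpreted after the column has been popped, so it refers to a position among the remaining columns.

```python
import pandas as pd


def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)


demo = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# Move 'd' to the front.
move_column_inplace(demo, "d", 0)
order_front = list(demo.columns)

# Move 'd' to position 2: after the pop the columns are a, b, c,
# so inserting at 2 puts 'd' between 'b' and 'c'.
move_column_inplace(demo, "d", 2)
order_middle = list(demo.columns)

# Move 'a' to the back: the last valid position is len(columns) - 1,
# evaluated here before the pop happens.
move_column_inplace(demo, "a", len(demo.columns) - 1)
order_back = list(demo.columns)

print(order_front, order_middle, order_back)
```

Because the frame is modified in place, the function returns None; do not assign its result back to df.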

Performance assessment:

For this test, the currently last column is moved to the front in each repetition. In-place methods generally perform better. While citynorman's solution can be made in-place, Ed Chum's method based on .loc and sachinnm's method based on reindex cannot.

While other methods are generic, citynorman's solution is limited to pos=0. I didn't observe any performance difference between df.loc[:, cols] and df[cols], which is why I didn't include some other suggestions.

I tested with Python 3.6.8 and pandas 0.24.2 on a MacBook Pro (Mid 2015).

import numpy as np
import pandas as pd

n_cols = 11
df = pd.DataFrame(np.random.randn(200000, n_cols),
                  columns=range(n_cols))

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

def move_to_front_normanius_inplace(df, col):
    move_column_inplace(df, col, 0)
    return df

def move_to_front_chum(df, col):
    cols = list(df)
    cols.insert(0, cols.pop(cols.index(col)))
    return df.loc[:, cols]

def move_to_front_chum_inplace(df, col):
    col = df[col]
    df.drop(col.name, axis=1, inplace=True)
    df.insert(0, col.name, col)
    return df

def move_to_front_elpastor(df, col):
    cols = [col] + [ c for c in df.columns if c!=col ]
    return df[cols] # or df.loc[:, cols]

def move_to_front_sachinmm(df, col):
    cols = df.columns.tolist()
    cols.insert(0, cols.pop(cols.index(col)))
    df = df.reindex(columns=cols, copy=False)
    return df

def move_to_front_citynorman_inplace(df, col):
    # This approach exploits that reset_index() moves the index
    # at the first position of the data frame.
    df.set_index(col, inplace=True)
    df.reset_index(inplace=True)
    return df

def test(method, df):
    col = np.random.randint(0, n_cols)
    method(df, col)

col = np.random.randint(0, n_cols)
ret_mine = move_to_front_normanius_inplace(df.copy(), col)
ret_chum1 = move_to_front_chum(df.copy(), col)
ret_chum2 = move_to_front_chum_inplace(df.copy(), col)
ret_elpas = move_to_front_elpastor(df.copy(), col)
ret_sach = move_to_front_sachinmm(df.copy(), col)
ret_city = move_to_front_citynorman_inplace(df.copy(), col)

# Assert equivalence of solutions.
assert ret_mine.equals(ret_chum1)
assert ret_mine.equals(ret_chum2)
assert ret_mine.equals(ret_elpas)
assert ret_mine.equals(ret_sach)
assert ret_mine.equals(ret_city)

Results:

# For n_cols = 11:
%timeit test(move_to_front_normanius_inplace, df)
# 1.05 ms ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.68 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 3.24 ms ± 96.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 3.84 ms ± 114 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 3.85 ms ± 58.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 9.67 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# For n_cols = 31:
%timeit test(move_to_front_normanius_inplace, df)
# 1.26 ms ± 31.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.95 ms ± 260 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_sachinmm, df)
# 10.7 ms ± 348 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 11.5 ms ± 869 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 11.4 ms ± 598 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 31.4 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
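%timeit only works inside IPython/Jupyter. If you want to repeat a comparison like this from a plain Python script, something along these lines should work with the standard timeit module. It is a rough sketch, not the setup used for the numbers above: the helper time_method, the smaller frame size, and the repeat counts are my own choices, and only two of the methods are included.

```python
import timeit

import numpy as np
import pandas as pd


def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)


def move_to_front_listcomp(df, col):
    # Non-in-place variant: returns a reordered copy, df is left untouched.
    return df[[col] + [c for c in df.columns if c != col]]


def time_method(method, df, repeats=5, number=20):
    """Best-of-`repeats` average time per call, in seconds (hypothetical helper)."""
    rng = np.random.default_rng(0)

    def run():
        # Pick a random column label on each call, as in the test above.
        method(df, int(rng.integers(0, df.shape[1])))

    return min(timeit.repeat(run, repeat=repeats, number=number)) / number


# Assumption: a smaller frame than in the original test, to keep the run short.
frame = pd.DataFrame(np.random.randn(20000, 11), columns=range(11))

timings = {
    "pop/insert (in place)": time_method(
        lambda d, c: move_column_inplace(d, c, 0), frame
    ),
    "list comprehension": time_method(move_to_front_listcomp, frame),
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms per call")
```

Absolute numbers will differ by machine and pandas version, so treat the output only as a relative comparison on your own setup.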