How to access multi-level index in pandas data frame?


You can use MultiIndex slicing (inside the tuple, use slice(None) in place of a bare colon):

df = df.loc[(slice(None), 'one'), :]

Result:

                0         1         2         3
bar one -0.424972  0.567020  0.276232 -1.087401
baz one  0.404705  0.577046 -1.715002 -1.039268
foo one  1.075770 -0.109050  1.643563 -1.469388
qux one -1.294524  0.413738  0.276662 -0.472035
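For readers who want to try this without the original data, here is a minimal sketch. The frame below is a hypothetical stand-in (integer values instead of the random floats above), and `pd.IndexSlice` is shown as an equivalent spelling that allows a colon:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame: 4 outer x 2 inner labels.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]]
)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# slice(None) keeps every label on level 0; 'one' filters level 1.
by_slice = df.loc[(slice(None), "one"), :]

# pd.IndexSlice allows the same selection to be written with a colon.
by_index_slice = df.loc[pd.IndexSlice[:, "one"], :]

print(by_slice)
```

Both selections keep the two index levels, so the result still has a MultiIndex.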

Finally, you can drop the first index level:

df.index = df.index.droplevel(0)

Result:

            0         1         2         3
one -0.424972  0.567020  0.276232 -1.087401
one  0.404705  0.577046 -1.715002 -1.039268
one  1.075770 -0.109050  1.643563 -1.469388
one -1.294524  0.413738  0.276662 -0.472035
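As a sketch of the same step on hypothetical data, the index assignment above can be compared with `DataFrame.droplevel`, which returns a new frame instead of modifying the existing one. The frame and variable names here are assumptions for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]]
)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

selected = df.loc[(slice(None), "one"), :]

# Variant 1: replace the index on a copy, as in the answer above.
assigned = selected.copy()
assigned.index = assigned.index.droplevel(0)

# Variant 2: DataFrame.droplevel returns a new frame and leaves
# `selected` untouched.
flat = selected.droplevel(0)

print(flat)
```

After dropping level 0, every remaining row is labelled 'one', so the index no longer identifies rows uniquely.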


Use DataFrame.xs, and if you need to keep both levels, add drop_level=False:

df1 = df.xs('one', level=1, drop_level=False)
print (df1)
                0         1         2         3
bar one -0.424972  0.567020  0.276232 -1.087401
baz one  0.404705  0.577046 -1.715002 -1.039268
foo one  1.075770 -0.109050  1.643563 -1.469388
qux one -1.294524  0.413738  0.276662 -0.472035

For the second output, remove the first level with DataFrame.reset_index and drop=True, which makes it possible to select by label with DataFrame.loc:

df2 = df.reset_index(level=0, drop=True).loc['one']
#alternative
#df2 = df.xs('one', level=1, drop_level=False).reset_index(level=0, drop=True)
print (df2)
            0         1         2         3
one -0.424972  0.567020  0.276232 -1.087401
one  0.404705  0.577046 -1.715002 -1.039268
one  1.075770 -0.109050  1.643563 -1.469388
one -1.294524  0.413738  0.276662 -0.472035
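The two variants above should give the same frame. A minimal check, assuming a hypothetical stand-in frame with integer values:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]]
)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# Drop level 0 first, then select the rows labelled 'one'.
via_reset = df.reset_index(level=0, drop=True).loc["one"]

# Select with xs while keeping both levels, then drop level 0.
via_xs = df.xs("one", level=1, drop_level=False).reset_index(level=0, drop=True)

print(via_reset.equals(via_xs))
```

The first variant relies on .loc returning all rows that share a repeated label; the second filters before the index is flattened.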

It is more common to use xs without keeping the selected level, so after selecting 'one' that level is removed:

df3 = df.xs('one', level=1)
print (df3)
            0         1         2         3
bar -0.424972  0.567020  0.276232 -1.087401
baz  0.404705  0.577046 -1.715002 -1.039268
foo  1.075770 -0.109050  1.643563 -1.469388
qux -1.294524  0.413738  0.276662 -0.472035
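To see the effect of drop_level side by side, here is a small sketch on hypothetical data. The level names "first" and "second" are assumptions added for illustration; the original frame has unnamed levels, which is why the answer uses level=1:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in frame; the level names are invented for this example.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]],
    names=["first", "second"],
)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# Default behaviour: the level used for selection is removed.
dropped = df.xs("one", level="second")

# drop_level=False keeps the full MultiIndex on the result.
kept = df.xs("one", level="second", drop_level=False)

print(dropped.index.tolist())
print(kept.index.nlevels)
```

Referring to a level by name rather than position can make the selection easier to read when the levels are named.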


The question involves a MultiIndex whose levels are ordered with 'bar' first and then 'one', which can be verified by inspecting df.index:

MultiIndex([('bar', 'one'),
            ('bar', 'two'),
            ('baz', 'one'),
            ('baz', 'two'),
            ('foo', 'one'),
            ('foo', 'two'),
            ('qux', 'one'),
            ('qux', 'two')],
           )

The row you are looking for can be accessed by passing the full key as a tuple: df.loc[('bar','one')]

The output it produces is:

0    0.162693
1    0.420518
2   -0.152041
3   -1.039439
Name: (bar, one), dtype: float64
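A full tuple key returns a Series, as the output above shows. The sketch below, on a hypothetical stand-in frame, contrasts that with wrapping the tuple in a list, which keeps the result as a one-row DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame.
index = pd.MultiIndex.from_product(
    [["bar", "baz", "foo", "qux"], ["one", "two"]]
)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=index)

# A complete key (one label per level) selects a single row as a Series.
# The trailing ':' makes it explicit that the tuple addresses rows only.
row = df.loc[("bar", "one"), :]

# A list containing the tuple keeps the DataFrame shape and the MultiIndex.
row_frame = df.loc[[("bar", "one")]]

print(type(row).__name__, row.name)
print(row_frame.shape)
```

Which form is preferable depends on whether later code expects a Series or a DataFrame.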