
selecting from multi-index pandas


One way is to use the get_level_values Index method:

In [11]: df
Out[11]:
     0
A B
1 4  1
2 5  2
3 6  3

In [12]: df.iloc[df.index.get_level_values('A') == 1]
Out[12]:
     0
A B
1 4  1
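For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. The way the frame is built (MultiIndex.from_arrays with these values) is an assumption chosen to match the output above, and .loc is used in place of .iloc, which should behave the same for a boolean array:

```python
import pandas as pd

# Assumed construction: a frame with index levels A and B and a single column 0.
index = pd.MultiIndex.from_arrays([[1, 2, 3], [4, 5, 6]], names=['A', 'B'])
df = pd.DataFrame({0: [1, 2, 3]}, index=index)

# Boolean array: True where level A equals 1.
mask = df.index.get_level_values('A') == 1

# Both index levels are kept in the result.
result = df.loc[mask]
print(result)
```

Because the mask is a plain boolean array, it can be combined with other conditions using & and | before indexing.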

From pandas 0.13 onwards you can also use xs with the drop_level argument:

df.xs(1, level='A', drop_level=False) # axis=1 if columns
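A short sketch of what drop_level changes, assuming the same small frame as above (the construction is illustrative, not taken from the question):

```python
import pandas as pd

# Assumed example frame with index levels A and B.
index = pd.MultiIndex.from_arrays([[1, 2, 3], [4, 5, 6]], names=['A', 'B'])
df = pd.DataFrame({0: [1, 2, 3]}, index=index)

# Default behaviour: the selected level A is dropped from the result.
dropped = df.xs(1, level='A')

# drop_level=False keeps the full MultiIndex on the result.
kept = df.xs(1, level='A', drop_level=False)

print(dropped.index.names)
print(kept.index.names)
```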

Note: if this were column MultiIndex rather than index, you could use the same technique:

In [21]: df1 = df.T

In [22]: df1.iloc[:, df1.columns.get_level_values('A') == 1]
Out[22]:
A  1
B  4
0  1
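The column case can be sketched the same way. The frame below is an assumed reconstruction of df, transposed so that A and B become column levels:

```python
import pandas as pd

# Assumed example frame, transposed so the MultiIndex sits on the columns.
index = pd.MultiIndex.from_arrays([[1, 2, 3], [4, 5, 6]], names=['A', 'B'])
df1 = pd.DataFrame({0: [1, 2, 3]}, index=index).T

# Boolean array over the columns: True where column level A equals 1.
col_mask = df1.columns.get_level_values('A') == 1

# Select all rows and only the matching columns.
selected = df1.loc[:, col_mask]
print(selected)
```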


You can also use query, which in my opinion is very readable and straightforward to use:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [10, 20, 50, 80], 'C': [6, 7, 8, 9]})
df = df.set_index(['A', 'B'])

      C
A B
1 10  6
2 20  7
3 50  8
4 80  9

For what you had in mind you can now simply do:

df.query('A == 1')

      C
A B
1 10  6

You can also have more complex queries using and

df.query('A >= 1 and B >= 50')

      C
A B
3 50  8
4 80  9

and or

df.query('A == 1 or B >= 50')

      C
A B
1 10  6
3 50  8
4 80  9

You can also combine index levels and regular columns in the same query (here A is an index level and C is a column), e.g.

df.query('A == 1 or C >= 8')

will return

      C
A B
1 10  6
3 50  8
4 80  9
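To relate this to the get_level_values approach, the sketch below builds the same selection with an explicit boolean mask and checks that both give the same rows. The variable names via_query, via_mask and level_a are just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [10, 20, 50, 80], 'C': [6, 7, 8, 9]})
df = df.set_index(['A', 'B'])

# query resolves the name A as an index level and C as a column.
via_query = df.query('A == 1 or C >= 8')

# Equivalent explicit mask: column condition OR index-level condition.
level_a = df.index.get_level_values('A')
via_mask = df[(df['C'] >= 8) | (level_a == 1)]

print(via_query.equals(via_mask))
```

The query string is shorter, while the explicit mask makes it obvious which names are index levels and which are columns.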

If you want to use variables inside your query, you can use @:

b_threshold = 20
c_threshold = 8
df.query('B >= @b_threshold and C <= @c_threshold')

      C
A B
2 20  7
3 50  8
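The @ prefix also picks up local variables inside a function, which is handy for reusable filters. The helper name filter_by_thresholds below is hypothetical, only there to illustrate the idea:

```python
import pandas as pd

def filter_by_thresholds(frame, b_min, c_max):
    """Hypothetical helper: keep rows with B >= b_min and C <= c_max."""
    # b_min and c_max are function locals; @ makes query look them up.
    return frame.query('B >= @b_min and C <= @c_max')

df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [10, 20, 50, 80], 'C': [6, 7, 8, 9]})
df = df.set_index(['A', 'B'])

result = filter_by_thresholds(df, b_min=20, c_max=8)
print(result)
```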


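You can use DataFrame.xs(). The session below assumes that numpy has been imported as np and that DataFrame has been imported from pandas: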

In [36]: df = DataFrame(np.random.randn(10, 4))

In [37]: df.columns = [np.random.choice(['a', 'b'], size=4).tolist(), np.random.choice(['c', 'd'], size=4)]

In [38]: df.columns.names = ['A', 'B']

In [39]: df
Out[39]:
A      b             a
B      d      d      d      d
0 -1.406  0.548 -0.635  0.576
1 -0.212 -0.583  1.012 -1.377
2  0.951 -0.349 -0.477 -1.230
3  0.451 -0.168  0.949  0.545
4 -0.362 -0.855  1.676 -2.881
5  1.283  1.027  0.085 -1.282
6  0.583 -1.406  0.327 -0.146
7 -0.518 -0.480  0.139  0.851
8 -0.030 -0.630 -1.534  0.534
9  0.246 -1.558 -1.885 -1.543

In [40]: df.xs('a', level='A', axis=1)
Out[40]:
B      d      d
0 -0.635  0.576
1  1.012 -1.377
2 -0.477 -1.230
3  0.949  0.545
4  1.676 -2.881
5  0.085 -1.282
6  0.327 -0.146
7  0.139  0.851
8 -1.534  0.534
9 -1.885 -1.543

If you want to keep the A level (the drop_level keyword argument is only available starting from v0.13.0):

In [42]: df.xs('a', level='A', axis=1, drop_level=False)
Out[42]:
A      a
B      d      d
0 -0.635  0.576
1  1.012 -1.377
2 -0.477 -1.230
3  0.949  0.545
4  1.676 -2.881
5  0.085 -1.282
6  0.327 -0.146
7  0.139  0.851
8 -1.534  0.534
9 -1.885 -1.543
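Since the session above relies on random data, here is a small deterministic sketch of the same two calls. The column labels and values are assumptions picked so the result is easy to check by eye:

```python
import numpy as np
import pandas as pd

# Assumed example: 2 rows, 4 columns, with column levels A and B.
columns = pd.MultiIndex.from_tuples(
    [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')], names=['A', 'B']
)
df = pd.DataFrame(np.arange(8).reshape(2, 4), columns=columns)

# Select the columns under A == 'a'; level A is dropped by default.
dropped = df.xs('a', level='A', axis=1)

# Same selection, but keep level A in the column MultiIndex.
kept = df.xs('a', level='A', axis=1, drop_level=False)

print(dropped)
print(kept)
```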