How to take column-slices of dataframe in pandas

2017 Answer - pandas 0.20: .ix is deprecated. Use .loc

.loc uses label-based indexing to select both rows and columns. The labels are the values of the index or the columns. Unlike ordinary Python slicing, slicing with .loc includes the last element.
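To make the inclusive endpoint concrete, here is a minimal sketch comparing .loc with position-based .iloc. The toy frame, its column names and the variable names are invented for illustration.

```python
import pandas as pd

# Toy frame with arbitrary values; the column names are placeholders.
df = pd.DataFrame([[1, 2, 3, 4]], columns=['a', 'b', 'c', 'd'])

# Label-based slice: the stop label 'c' is included.
by_label = df.loc[:, 'a':'c']

# Position-based slice: the stop position 2 is excluded, as with Python lists.
by_position = df.iloc[:, 0:2]

print(list(by_label.columns))     # ['a', 'b', 'c']
print(list(by_position.columns))  # ['a', 'b']
```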

Let's assume we have a DataFrame with the following columns:
foo, bar, quz, ant, cat, sat, dat.

# selects all rows and all columns beginning at 'foo' up to and including 'sat'
df.loc[:, 'foo':'sat']
# foo bar quz ant cat sat

.loc accepts the same slice notation that Python lists do, for both rows and columns. The notation is start:stop:step.

# slice from 'foo' to 'cat' by every 2nd column
df.loc[:, 'foo':'cat':2]
# foo quz cat

# slice from the beginning to 'bar'
df.loc[:, :'bar']
# foo bar

# slice from 'quz' to the end by 3
df.loc[:, 'quz'::3]
# quz sat

# attempt from 'sat' to 'bar'
df.loc[:, 'sat':'bar']
# no columns returned

# slice from 'sat' to 'bar'
df.loc[:, 'sat':'bar':-1]
# sat cat ant quz bar

# slice notation is syntactic sugar for the slice function
# slice from 'quz' to the end by 2 with slice function
df.loc[:, slice('quz',None, 2)]
# quz cat dat

# select specific columns with a list
# select columns foo, bar and dat
df.loc[:, ['foo','bar','dat']]
# foo bar dat
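The slices above can be tried on a throwaway frame. The following is a sketch only: the integer data is a placeholder, and just the column names come from this answer.

```python
import numpy as np
import pandas as pd

cols = ['foo', 'bar', 'quz', 'ant', 'cat', 'sat', 'dat']
# Placeholder data: 3 rows of consecutive integers.
df = pd.DataFrame(np.arange(21).reshape(3, 7), columns=cols)

every_second = df.loc[:, 'foo':'cat':2]            # step of 2
backwards_no_step = df.loc[:, 'sat':'bar']         # no negative step given
reversed_cols = df.loc[:, 'sat':'bar':-1]          # negative step walks backwards
with_slice_obj = df.loc[:, slice('quz', None, 2)]  # same as 'quz'::2

print(list(every_second.columns))       # ['foo', 'quz', 'cat']
print(list(backwards_no_step.columns))  # []
print(list(reversed_cols.columns))      # ['sat', 'cat', 'ant', 'quz', 'bar']
print(list(with_slice_obj.columns))     # ['quz', 'cat', 'dat']
```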

You can slice rows and columns at the same time. For instance, if you have 5 rows with labels v, w, x, y, z:

# slice from 'w' to 'y' and 'foo' to 'ant' by 3
df.loc['w':'y', 'foo':'ant':3]
#    foo ant
# w
# x
# y
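A runnable version of the combined row and column slice might look like this. It is a sketch that assumes placeholder integer data; only the row and column labels are taken from the answer.

```python
import numpy as np
import pandas as pd

# Placeholder values 0..34 arranged in 5 rows and 7 columns.
df = pd.DataFrame(
    np.arange(35).reshape(5, 7),
    index=list('vwxyz'),
    columns=['foo', 'bar', 'quz', 'ant', 'cat', 'sat', 'dat'],
)

# Rows 'w' through 'y' (inclusive), every 3rd column from 'foo' to 'ant'.
subset = df.loc['w':'y', 'foo':'ant':3]
print(subset)
```

With this data, the result has rows w, x, y and columns foo, ant.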

python pandas numpy dataframe slice

Note: .ix has been deprecated since Pandas v0.20 and was removed in pandas 1.0. You should instead use .loc or .iloc, as appropriate.

The DataFrame.ix index is what you want to be accessing. It's a little confusing (I agree that Pandas indexing is perplexing at times!), but the following seems to do what you want:

>>> import numpy as np
>>> from pandas import DataFrame
>>> df = DataFrame(np.random.rand(4,5), columns = list('abcde'))
>>> df.ix[:,'b':]
          b         c         d         e
0  0.418762  0.042369  0.869203  0.972314
1  0.991058  0.510228  0.594784  0.534366
2  0.407472  0.259811  0.396664  0.894202
3  0.726168  0.139531  0.324932  0.906575

where .ix[row slice, column slice] is what is being interpreted. More on Pandas indexing here: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-advanced
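On pandas versions without .ix, the same column slice can be written with .loc (by label) or .iloc (by position). This is a minimal sketch; the variable names are illustrative and the random values will differ on every run.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(4, 5), columns=list('abcde'))

# Label-based: all rows, every column from 'b' to the end.
by_label = df.loc[:, 'b':]

# Position-based: all rows, every column from position 1 to the end.
by_position = df.iloc[:, 1:]

print(list(by_label.columns))  # ['b', 'c', 'd', 'e']
```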


Let's use the titanic dataset from the seaborn package as an example:

# Load dataset (pip install seaborn)
>> import seaborn as sns
>> titanic = sns.load_dataset('titanic')

using the column names

>> titanic.loc[:,['sex','age','fare']]

using the column indices

>> titanic.iloc[:,[2,3,6]]
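If the positions are not known in advance, they can be looked up from the column names. The sketch below uses a one-row stand-in frame rather than the real dataset. Its column order is an assumption about how the titanic data is laid out, and `titanic_like` is a made-up name.

```python
import pandas as pd

# Stand-in frame: assumed leading column names, one made-up row.
cols = ['survived', 'pclass', 'sex', 'age', 'sibsp', 'parch', 'fare']
titanic_like = pd.DataFrame([[0, 3, 'male', 22.0, 1, 0, 7.25]], columns=cols)

# Index.get_indexer maps labels to integer positions.
positions = titanic_like.columns.get_indexer(['sex', 'age', 'fare'])
print(positions.tolist())  # [2, 3, 6]

# The positions can then be passed to .iloc.
subset = titanic_like.iloc[:, positions]
```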

using ix (only for pandas versions older than 0.20)

>> titanic.ix[:,['sex','age','fare']]

>> titanic.ix[:,[2,3,6]]

using the reindex method

>> titanic.reindex(columns=['sex','age','fare'])
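One practical difference between reindex and .loc shows up when a requested column does not exist. The sketch below uses a small made-up frame, not the real titanic data, and 'cabin' is chosen only because it is absent from that frame.

```python
import pandas as pd

# Small stand-in frame with invented values.
people = pd.DataFrame({
    'sex': ['male', 'female'],
    'age': [22.0, 38.0],
    'fare': [7.25, 71.28],
})

# reindex accepts a label that does not exist and fills that column with NaN.
reindexed = people.reindex(columns=['sex', 'age', 'cabin'])

# .loc with a list raises KeyError for a missing label (pandas 1.0 and later).
try:
    people.loc[:, ['sex', 'age', 'cabin']]
    loc_raised = False
except KeyError:
    loc_raised = True

print(list(reindexed.columns))  # ['sex', 'age', 'cabin']
print(loc_raised)               # True
```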

