
Pandas: Subindexing dataframes: Copies vs views


Your answer lies in the pandas docs: returning-a-view-versus-a-copy.

Whenever an array of labels or a boolean vector are involved in the indexing operation, the result will be a copy. With single label / scalar indexing and slicing, e.g. df.ix[3:6] or df.ix[:, 'A'], a view will be returned.
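One way to check this rule on your own installation is to ask NumPy whether two objects share memory. This is only a minimal sketch: it assumes a single-dtype frame and a pandas version that provides DataFrame.to_numpy() (0.24 or later), and the helper name shares_data is made up for illustration. The result for the slice is printed but not asserted, because it can differ between pandas versions.

```python
import numpy as np
import pandas as pd


def shares_data(a, b):
    """Hypothetical helper: True if the frames' underlying arrays overlap in memory."""
    return bool(np.shares_memory(a.to_numpy(), b.to_numpy()))


foo = pd.DataFrame(np.random.random((10, 5)))

# Plain slice: may be a view onto foo's data.
sliced = foo.iloc[3:6]

# Boolean vector (every other row): the selected rows are copied into a new object.
mask = foo.index % 2 == 0
masked = foo[mask]

print("slice shares memory:", shares_data(foo, sliced))
print("mask shares memory: ", shares_data(foo, masked))
```

A False here means the two frames cannot affect each other; a True only says the memory overlaps, not that writes will propagate in every pandas version.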

In your example, bar is a view onto a slice of foo, so modifying bar also modifies foo. If you want an independent copy, call the copy method on the result, e.g. foo.iloc[3:5, 1:4].copy(). pandas does not appear to have a copy-on-write mechanism, at least in the version used here (0.12).
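To make the copy route concrete, here is a small sketch with a deterministic frame (the values come from np.arange purely so the numbers are easy to follow; they are not from the original question). After .copy(), writes to bar never reach foo.

```python
import numpy as np
import pandas as pd

# 10x5 frame holding 0.0 .. 49.0, so row 3 / column 1 is 16.0.
foo = pd.DataFrame(np.arange(50, dtype=float).reshape(10, 5))

# Take the sub-frame and copy it explicitly so it owns its data.
bar = foo.iloc[3:5, 1:4].copy()

# Write to the copy; the original is left alone.
bar.loc[3, 1] = -1.0

print(bar.loc[3, 1])  # -1.0
print(foo.loc[3, 1])  # 16.0
```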

See my code example below to illustrate:

    In [1]: import pandas as pd
       ...: import numpy as np
       ...: foo = pd.DataFrame(np.random.random((10,5)))
       ...:

    In [2]: pd.__version__
    Out[2]: '0.12.0.dev-35312e4'

    In [3]: np.__version__
    Out[3]: '1.7.1'

    In [4]: # DataFrame has copy method
       ...: foo_copy = foo.copy()

    In [5]: bar = foo.iloc[3:5,1:4]

    In [6]: bar == foo.iloc[3:5,1:4] == foo_copy.iloc[3:5,1:4]
    Out[6]:
          1     2     3
    3  True  True  True
    4  True  True  True

    In [7]: # Changing the view
       ...: bar.ix[3,1] = 5

    In [8]: # View and DataFrame still equal
       ...: bar == foo.iloc[3:5,1:4]
    Out[8]:
          1     2     3
    3  True  True  True
    4  True  True  True

    In [9]: # It is now different from a copy of original
       ...: bar == foo_copy.iloc[3:5,1:4]
    Out[9]:
           1     2     3
    3  False  True  True
    4   True  True  True
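The session above uses .ix, which was deprecated in pandas 0.20 and removed in 1.0. A rough equivalent for newer releases, written with .iloc and .loc, might look like the sketch below. Whether the write to bar reaches foo depends on the pandas version and its copy-on-write setting, so the script reports what it observes instead of assuming an outcome; silencing warnings is a choice made here only to keep the output short.

```python
import warnings

import numpy as np
import pandas as pd

foo = pd.DataFrame(np.random.random((10, 5)))
foo_copy = foo.copy()        # independent snapshot of the original

bar = foo.iloc[3:5, 1:4]     # rows 3-4, columns 1-3

with warnings.catch_warnings():
    # Some pandas versions emit SettingWithCopyWarning on this write.
    warnings.simplefilter("ignore")
    bar.loc[3, 1] = 5        # label-based write, standing in for bar.ix[3,1] = 5

# True if the write reached foo (view behaviour), False if bar was detached.
foo_changed = bool(foo.loc[3, 1] == 5)
print("foo changed through bar:", foo_changed)

# The explicit copy keeps its original random value in [0, 1).
print("snapshot untouched:", bool(foo_copy.loc[3, 1] != 5))
```

If the first line prints False on your installation, bar was detached from foo on write, and the only way to change foo is to assign into it directly, e.g. foo.loc[3, 1] = 5.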