
Pandas uses substantially more memory for storage than asked for


So I make an 8000-byte array:

In [248]: x=np.ones(1000)
In [249]: df=pd.DataFrame({'MyCol': x}, dtype=float)
In [250]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 999
Data columns (total 1 columns):
MyCol    1000 non-null float64
dtypes: float64(1)
memory usage: 15.6 KB

So that's 8k for the data and 8k for the index.
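One way to check that split directly is to ask for the per-component byte counts. This is only a sketch: it assumes a pandas version that provides `DataFrame.memory_usage`, and the explicit int64 index is my own addition to mimic the `Int64Index` in the output above (newer pandas defaults to a `RangeIndex`, which takes far less space).

```python
import numpy as np
import pandas as pd

x = np.ones(1000)

# Assumption: build an explicit int64 index so the index stores one
# 8-byte integer per row, like the Int64Index shown in df.info().
idx = pd.Index(np.arange(1000, dtype=np.int64))
df = pd.DataFrame({'MyCol': x}, index=idx)

# Bytes used by the index and by each column, as a Series.
usage = df.memory_usage(index=True)
print(usage)

# Total in KB, comparable to the "memory usage" line of df.info().
print(usage.sum() / 1024)
```

The `Index` entry is the index cost and the `MyCol` entry is the data cost, so the two halves of the total can be read off separately.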

I add a column - usage increases by the size of x:

In [251]: df['col2']=x
In [252]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 1000 entries, 0 to 999
Data columns (total 2 columns):
MyCol    1000 non-null float64
col2     1000 non-null float64
dtypes: float64(2)
memory usage: 23.4 KB
In [253]: x.nbytes
Out[253]: 8000
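The jump from 15.6 KB to 23.4 KB is 7.8 KB, which is 8000 / 1024. As a rough sketch of the same check in bytes rather than rounded KB (again assuming `DataFrame.memory_usage` is available; the variable names `before`, `after` and `delta` are just for illustration):

```python
import numpy as np
import pandas as pd

x = np.ones(1000)
df = pd.DataFrame({'MyCol': x}, dtype=float)

# Total bytes (index + columns) before adding the second column.
before = df.memory_usage(index=True).sum()

df['col2'] = x

# Total bytes after adding the second column.
after = df.memory_usage(index=True).sum()

delta = after - before
print(delta, x.nbytes)         # growth in bytes vs. size of x
print(round(delta / 1024, 1))  # growth in KB, as df.info() would round it
```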