Pandas vs. Numpy Dataframes

pandas focuses on tabular data structures, and when it performs arithmetic (addition, subtraction, etc.) it aligns the operands by their labels, not by their positions.

Consider the following DataFrame:

df = pd.DataFrame(np.random.randn(5, 3), index=list('abcde'), columns=list('xyz'))

Here, df[1:] is:

df[1:]
Out:
          x         y         z
b  1.003035  0.172960  1.160033
c  0.117608 -1.114294 -0.557413
d -1.312315  1.171520 -1.034012
e -0.380719 -0.422896  1.073535

And df[:-1] is:

df[:-1]
Out:
          x         y         z
a  1.367916  1.087607 -0.625777
b  1.003035  0.172960  1.160033
c  0.117608 -1.114294 -0.557413
d -1.312315  1.171520 -1.034012

If you compute df[1:] / df[:-1], pandas divides row b by row b, row c by row c, and row d by row d. Rows a and e each exist in only one of the two frames, so pandas cannot find a matching row in the other frame and returns NaN for them:

df[1:] / df[:-1]
Out:
     x    y    z
a  NaN  NaN  NaN
b  1.0  1.0  1.0
c  1.0  1.0  1.0
d  1.0  1.0  1.0
e  NaN  NaN  NaN
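The alignment behaviour can be reproduced with fixed numbers instead of random ones. This is only a minimal sketch: the values from np.arange and the names tail, head, aligned and only_one_side are illustrative assumptions, not part of the example above.

```python
import numpy as np
import pandas as pd

# Small deterministic frame (illustrative values, not the random data above)
df = pd.DataFrame(
    np.arange(1.0, 16.0).reshape(5, 3),
    index=list("abcde"),
    columns=list("xyz"),
)

tail = df[1:]    # rows b..e
head = df[:-1]   # rows a..d

# Division aligns on index and column labels, so each row is divided
# by the row with the same label in the other frame.
aligned = tail / head

# Rows whose label exists in only one operand come back as all-NaN
only_one_side = aligned.index[aligned.isna().all(axis=1)].tolist()

print(aligned)
print(only_one_side)
```

Rows b, c and d are divided by themselves, so they should come out as 1.0, while a and e have no partner and are filled with NaN.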

If you just want element-wise division by position, ignoring the labels, access the underlying NumPy array of one of the frames with .values. This tells pandas to ignore that frame's labels: since NumPy arrays don't have labels, pandas just does a positional, element-wise operation:

df[1:] / df[:-1].values
Out:
            x         y         z
b   0.733258  0.159028 -1.853749
c   0.117252 -6.442482 -0.480515
d -11.158359 -1.051357  1.855018
e   0.290112 -0.360981 -1.038223
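A hedged sketch of the same positional division on fixed data follows. It uses .to_numpy(), which the pandas documentation recommends over .values in recent versions; the data and the names ratio and ratio_shift are assumptions made for illustration. The shift-based variant is an alternative spelling that stays label-aware, not something the answer above proposes.

```python
import numpy as np
import pandas as pd

# Illustrative deterministic frame (not the random data above)
df = pd.DataFrame(
    np.arange(1.0, 16.0).reshape(5, 3),
    index=list("abcde"),
    columns=list("xyz"),
)

# Converting the right operand to a NumPy array drops its labels,
# so the division is positional: b/a, c/b, d/c, e/d.
ratio = df[1:] / df[:-1].to_numpy()

# The result keeps the labels of the left operand (b..e)
print(ratio)

# Alternative that keeps labels throughout: shift(1) moves every row
# down by one position, so each row is divided by the previous row.
ratio_shift = (df / df.shift(1)).iloc[1:]
print(ratio_shift)
```

Both spellings should give the same numbers here; the array version requires the two operands to have matching shapes, while the shift version works on a single frame.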
