How do I convert a pandas Series or index to a Numpy array? [duplicate]

python pandas

To get a NumPy array, you should use the values attribute:

    In [1]: df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c']); df
       A  B
    a  1  4
    b  2  5
    c  3  6

    In [2]: df.index.values
    Out[2]: array(['a', 'b', 'c'], dtype=object)

This accesses how the data is already stored, so there's no need for a conversion.
Note: This attribute is also available on many other pandas objects, such as a Series:

    In [3]: df['A'].values
    Out[3]: array([1, 2, 3])

To get the index as a list, call tolist:

    In [4]: df.index.tolist()
    Out[4]: ['a', 'b', 'c']

The same works for columns, via df.columns.tolist().


You can use df.index to access the index object and then get the values in a list using df.index.tolist(). Similarly, you can use df['col'].tolist() for Series.
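As a quick illustration of the list conversions mentioned so far, here is a minimal sketch. It reuses the three-row frame from the first answer; the variable names are only there for illustration.

```python
import pandas as pd

# Same three-row frame as in the first answer.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c'])

index_labels = df.index.tolist()     # row labels as a plain Python list
column_labels = df.columns.tolist()  # column labels work the same way
a_values = df['A'].tolist()          # Series values as Python ints

print(index_labels)   # ['a', 'b', 'c']
print(column_labels)  # ['A', 'B']
print(a_values)       # [1, 2, 3]
```

Note that tolist() gives back Python scalars rather than NumPy scalars, which is usually what you want when the list leaves pandas/NumPy territory.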


pandas >= 0.24

Deprecate your usage of `.values` in favour of these methods!

From v0.24.0 onwards, there are two brand new, preferred ways of getting the underlying data out of Index and Series objects: the to_numpy() method and the .array attribute (to_numpy() also works on DataFrames). Regarding usage, the docs mention:

We haven’t removed or deprecated Series.values or DataFrame.values, but we highly recommend using .array or .to_numpy() instead.

See this section of the v0.24.0 release notes for more information.

to_numpy() Method

    # df is the two-row frame built in the setup under "array Attribute".
    df.index.to_numpy()
    # array(['a', 'b'], dtype=object)

    df['A'].to_numpy()
    # array([1, 4])

By default (copy=False), a view is returned where possible, so any modifications made will affect the original.

    v = df.index.to_numpy()
    v[0] = -1
    df
        A  B
    -1  1  2
    b   4  5

If you need a copy instead, use to_numpy(copy=True):

    v = df.index.to_numpy(copy=True)
    v[-1] = -123
    df
       A  B
    a  1  2
    b  4  5
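One way to check whether to_numpy() handed back a view or a copy, without mutating the original, is np.shares_memory. This is only a sketch under assumptions: it uses a plain int64 Series (called s here for illustration), and copy=False is a request rather than a guarantee, so other dtypes may still produce a copy.

```python
import numpy as np
import pandas as pd

# A NumPy-backed int64 Series, matching column 'A' of the two-row example.
s = pd.Series([1, 4], index=['a', 'b'], name='A')

view = s.to_numpy()           # default copy=False: no copy for this Series
copy = s.to_numpy(copy=True)  # always a fresh array

shares_with_view = np.shares_memory(view, s.to_numpy())
shares_with_copy = np.shares_memory(copy, s.to_numpy())

print(shares_with_view)  # True
print(shares_with_copy)  # False

# Writing to the copy never touches the Series.
copy[0] = -1
print(s.tolist())  # [1, 4]
```

In newer pandas versions with Copy-on-Write enabled, the array returned by the default call may be read-only, so asking for copy=True is the safer route whenever you intend to write to the result.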

Note that this function also works for DataFrames (while .array does not).
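To make the DataFrame case concrete, here is a small sketch using the same two-row frame as the rest of this answer. The names matrix and has_array_attr are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5]], columns=['A', 'B'], index=['a', 'b'])

matrix = df.to_numpy()                 # 2-D ndarray, one row per DataFrame row
has_array_attr = hasattr(df, 'array')  # DataFrame does not expose .array

print(matrix.shape)     # (2, 2)
print(matrix.tolist())  # [[1, 2], [4, 5]]
print(has_array_attr)   # False
```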

array Attribute
This attribute returns an ExtensionArray object that backs the Index/Series.

    pd.__version__
    # '0.24.0rc1'

    # Setup.
    df = pd.DataFrame([[1, 2], [4, 5]], columns=['A', 'B'], index=['a', 'b'])
    df

       A  B
    a  1  2
    b  4  5

    df.index.array
    # <PandasArray>
    # ['a', 'b']
    # Length: 2, dtype: object

    df['A'].array
    # <PandasArray>
    # [1, 4]
    # Length: 2, dtype: int64

From here, it is possible to get a list using list:

    list(df.index.array)
    # ['a', 'b']

    list(df['A'].array)
    # [1, 4]

or, just directly call .tolist():

    df.index.tolist()
    # ['a', 'b']

    df['A'].tolist()
    # [1, 4]

Regarding what is returned, the docs mention:

For Series and Indexes backed by normal NumPy arrays, Series.array will return a new arrays.PandasArray, which is a thin (no-copy) wrapper around a numpy.ndarray. arrays.PandasArray isn’t especially useful on its own, but it does provide the same interface as any extension array defined in pandas or by a third-party library.

So, to summarise, .array will return either

1. the existing ExtensionArray backing the Index/Series, or
2. if a NumPy array backs the Index/Series, a new ExtensionArray object created as a thin wrapper over the underlying array.
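The two cases in this summary can be seen side by side in a short sketch. The Series names are made up for illustration. The wrapper class is called PandasArray in 0.24 and, as far as I know, was renamed in later releases, so the sketch prints the type name instead of hard-coding it.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray

numeric = pd.Series([1, 4])                                  # backed by a NumPy array
categorical = pd.Series(['x', 'y', 'x'], dtype='category')   # backed by a Categorical

numeric_backing = numeric.array          # thin wrapper around the ndarray
categorical_backing = categorical.array  # the Categorical that already backs the Series

# Type names depend on the installed pandas version, so they are printed as-is.
print(type(numeric_backing).__name__)
print(type(categorical_backing).__name__)

# Both results implement the common ExtensionArray interface.
print(isinstance(numeric_backing, ExtensionArray))
print(isinstance(categorical_backing, ExtensionArray))
```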

Rationale for adding TWO new methods
These functions were added as a result of discussions under two GitHub issues, GH19954 and GH23623.

Specifically, the docs mention the rationale:

[...] with .values it was unclear whether the returned value would be the actual array, some transformation of it, or one of pandas custom arrays (like Categorical). For example, with PeriodIndex, .values generates a new ndarray of period objects each time. [...]
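The PeriodIndex case from the quote can be reproduced with a small sketch. The index below is an arbitrary monthly range picked for illustration; the point is only the contrast between what .values and .array hand back.

```python
import pandas as pd

# Three monthly periods, chosen arbitrarily for the demonstration.
periods = pd.period_range('2000-01', periods=3, freq='M')

as_objects = periods.values  # object ndarray of Period instances, built on access
backing = periods.array      # the PeriodArray that actually backs the index

print(as_objects.dtype)              # object
print(type(as_objects[0]).__name__)  # Period
print(type(backing).__name__)        # PeriodArray

# Whether two separate accesses of .values return the very same object.
print(periods.values is periods.values)
```

With .array you get the real backing store, while .values has to materialise Period objects, which is the ambiguity the quoted rationale describes.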

These two functions aim to improve the consistency of the API, which is a major step in the right direction.

Lastly, .values is not deprecated in the current version, but I expect this may happen at some point in the future, so I would urge users to migrate to the newer API as soon as they can.
