# How do I convert a pandas Series or index to a NumPy array? [duplicate]

To get a NumPy array, use the `values` attribute:

```
In [1]: df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c']); df
   A  B
a  1  4
b  2  5
c  3  6

In [2]: df.index.values
Out[2]: array(['a', 'b', 'c'], dtype=object)
```

*This accesses how the data is already stored, so there's no need for a conversion. Note: this attribute is also available on many other pandas objects.*

```
In [3]: df['A'].values
Out[3]: array([1, 2, 3])
```

To get the index as a list, call `tolist`:

```
In [4]: df.index.tolist()
Out[4]: ['a', 'b', 'c']
```

And similarly, for columns.
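
As a minimal sketch of the "similarly, for columns" remark, assuming the same `df` as in the session above (the names `a_values` and `col_list` are illustrative only, not from the original answer):

```python
import pandas as pd

# Same frame as in the IPython session above.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]}, index=['a', 'b', 'c'])

# A single column's data as a NumPy array.
a_values = df['A'].values

# The column labels as a plain Python list.
col_list = df.columns.tolist()

print(a_values)
print(col_list)  # ['A', 'B']
```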

## pandas >= 0.24

### Deprecate your usage of `.values` in favour of these methods!

From v0.24.0 onwards, there are two brand spanking new, preferred ways of obtaining NumPy arrays from `Index`, `Series`, and `DataFrame` objects: **`to_numpy()`** and **`.array`**. Regarding usage, the docs mention:

> We haven't removed or deprecated `Series.values` or `DataFrame.values`, but we highly recommend using `.array` or `.to_numpy()` instead.

See this section of the v0.24.0 release notes for more information.

```
df.index.to_numpy()
# array(['a', 'b'], dtype=object)

df['A'].to_numpy()
# array([1, 4])
```

By default, a view is returned. Any modifications made will affect the original.

```
v = df.index.to_numpy()
v[0] = -1

df
    A  B
-1  1  2
b   4  5
```

If you need a copy instead, use `to_numpy(copy=True)`:

```
v = df.index.to_numpy(copy=True)
v[-1] = -123

df
   A  B
a  1  2
b  4  5
```

Note that this function also works for DataFrames (while `.array` does not).
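
The view-versus-copy distinction can also be checked without mutating anything. The sketch below is illustrative rather than definitive: it assumes an integer column, and it uses `np.shares_memory` instead of writing to the result, because whether the default result is writeable may depend on your pandas version and copy-on-write settings. It also shows `to_numpy()` on a whole DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5]], columns=['A', 'B'], index=['a', 'b'])
s = df['A']

default_result = s.to_numpy()          # no copy requested
copied_result = s.to_numpy(copy=True)  # explicit copy

# Compare each result against a fresh no-copy conversion of the same Series.
shares_default = np.shares_memory(default_result, s.to_numpy())
shares_copy = np.shares_memory(copied_result, s.to_numpy())

# to_numpy() also works on the DataFrame itself and gives a 2-D array.
frame_arr = df.to_numpy()

print(shares_default, shares_copy)
print(frame_arr.shape)
```

If `shares_default` is true and `shares_copy` is false, the default result is backed by the Series' own data while the `copy=True` result is independent of it.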

**`array` Attribute**

This attribute returns an `ExtensionArray` object that backs the Index/Series.

```
pd.__version__
# '0.24.0rc1'

# Setup.
df = pd.DataFrame([[1, 2], [4, 5]], columns=['A', 'B'], index=['a', 'b'])
df

   A  B
a  1  2
b  4  5
```

```
df.index.array
# <PandasArray>
# ['a', 'b']
# Length: 2, dtype: object

df['A'].array
# <PandasArray>
# [1, 4]
# Length: 2, dtype: int64
```

From here, it is possible to get a list using `list`:

```
list(df.index.array)
# ['a', 'b']

list(df['A'].array)
# [1, 4]
```

or just call `.tolist()` directly:

```
df.index.tolist()
# ['a', 'b']

df['A'].tolist()
# [1, 4]
```

Regarding what is returned, the docs mention:

> For `Series` and `Index`es backed by normal NumPy arrays, `Series.array` will return a new `arrays.PandasArray`, which is a thin (no-copy) wrapper around a `numpy.ndarray`. `arrays.PandasArray` isn't especially useful on its own, but it does provide the same interface as any extension array defined in pandas or by a third-party library.

So, to summarise, `.array` will return either

- The existing `ExtensionArray` backing the Index/Series, or
- If there is a NumPy array backing the series, a new `ExtensionArray` object created as a thin wrapper over the underlying array.
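
The two cases in this summary can be sketched as follows. This is a rough illustration, not the library's internals: the Series names are made up for the example, and the wrapper class's printed name is left unasserted because, as far as I know, newer pandas releases call it `NumpyExtensionArray` rather than `PandasArray`.

```python
import pandas as pd
from pandas.api.extensions import ExtensionArray

numeric = pd.Series([1, 4])                                 # NumPy-backed
categorical = pd.Series(['x', 'y', 'x'], dtype='category')  # extension-backed

numeric_arr = numeric.array          # thin wrapper around the ndarray
categorical_arr = categorical.array  # the existing Categorical

# Both results expose the same ExtensionArray interface.
print(isinstance(numeric_arr, ExtensionArray))
print(isinstance(categorical_arr, ExtensionArray))
print(type(categorical_arr).__name__)  # Categorical
```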

**Rationale for adding TWO new methods**

These functions were added as a result of discussions under two GitHub issues, GH19954 and GH23623.

Specifically, the docs mention the rationale:

> [...] with `.values` it was unclear whether the returned value would be the actual array, some transformation of it, or one of pandas custom arrays (like `Categorical`). For example, with `PeriodIndex`, `.values` generates a new `ndarray` of period objects each time. [...]

These two functions aim to improve the consistency of the API, which is a major step in the right direction.
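
To make the `PeriodIndex` case from the quote concrete, here is a small sketch. The date range and variable names are arbitrary choices for illustration.

```python
import pandas as pd

periods = pd.period_range('2000-01', periods=3, freq='M')

as_values = periods.values     # object ndarray built from Period objects
as_array = periods.array       # the PeriodArray that backs the index
as_numpy = periods.to_numpy()  # explicit request for a NumPy array

print(type(as_values).__name__, as_values.dtype)
print(type(as_array).__name__)
print(type(as_numpy).__name__, as_numpy.dtype)
```

With the newer API the intent is explicit: `.array` asks for the backing pandas array, and `to_numpy()` asks for a NumPy conversion.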

Lastly, `.values` will not be deprecated in the current version, but I expect this may happen at some point in the future, so I would urge users to migrate to the newer API as soon as they can.