
Convert a numpy array of lists to a numpy array


Going by way of lists (tolist) is faster than going by way of vstack:

In [1617]: timeit np.array(arr[:,1].tolist())
      ...:
100000 loops, best of 3: 11.5 µs per loop

In [1618]: timeit np.vstack(arr[:,1])
      ...:
10000 loops, best of 3: 54.1 µs per loop

vstack is doing:

np.concatenate([np.atleast_2d(a) for a in arr[:,1]],axis=0)
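To check that equivalence on the sample array, here is a small sketch. The way arr is built (filling an empty object array) and the names col, via_vstack and via_concat are my own choices for the illustration, not something from the question.

```python
import numpy as np

# Build a 2x2 object array like the one in the question.
# (Filling an empty object array is an assumption made for this sketch.)
arr = np.empty((2, 2), dtype=object)
for i in range(2):
    arr[i, 0] = i + 1
    arr[i, 1] = ['a', 'b', 'c']

col = arr[:, 1]  # 1d object array whose elements are Python lists

via_vstack = np.vstack(col)
# The spelled-out version: make each list a (1, n) array, then join along axis 0.
via_concat = np.concatenate([np.atleast_2d(a) for a in col], axis=0)

print(via_vstack.shape)
print(np.array_equal(via_vstack, via_concat))
```

The per-element atleast_2d call is a plausible reason for the extra overhead compared with a single np.array call on a nested list.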

Some alternatives:

In [1627]: timeit np.array([a for a in arr[:,1]])
100000 loops, best of 3: 18.6 µs per loop

In [1629]: timeit np.stack(arr[:,1],axis=0)
10000 loops, best of 3: 60.2 µs per loop
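If you want to repeat the comparison outside IPython, something like the following should do. It is only a sketch: the candidates dict, the labels and the loop count are assumptions of mine, and the absolute numbers will depend on your machine and NumPy version.

```python
import timeit
import numpy as np

# Sample object array (the construction is an assumption for this sketch).
arr = np.empty((2, 2), dtype=object)
for i in range(2):
    arr[i, 0] = i + 1
    arr[i, 1] = ['a', 'b', 'c']
col = arr[:, 1]

# The four conversions timed in this answer, under hypothetical labels.
candidates = {
    "tolist": lambda: np.array(col.tolist()),
    "listcomp": lambda: np.array([a for a in col]),
    "vstack": lambda: np.vstack(col),
    "stack": lambda: np.stack(col, axis=0),
}

reference = candidates["tolist"]()
n_loops = 10000
timings = {}
for name, func in candidates.items():
    # Every candidate should produce the same 2d array.
    assert np.array_equal(func(), reference)
    timings[name] = timeit.timeit(func, number=n_loops) / n_loops * 1e6
    print(f"{name:10s} {timings[name]:.1f} µs per call")
```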

Keep in mind that the object array just contains pointers to the lists, which are elsewhere in memory. While the 2d nature of arr makes it easy to select the 2nd column, arr[:,1] is effectively a list of lists, and most operations on it treat it as such. Things like reshape don't cross that object boundary.
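A quick way to see that boundary (again a sketch, with arr built by hand as an assumption): the column has 2 elements as far as numpy is concerned, so asking reshape for 2x3 fails.

```python
import numpy as np

# Sample object array (the construction is an assumption for this sketch).
arr = np.empty((2, 2), dtype=object)
for i in range(2):
    arr[i, 0] = i + 1
    arr[i, 1] = ['a', 'b', 'c']

col = arr[:, 1]
print(col.shape, col.dtype)  # numpy sees 2 objects, not 6 strings
print(type(col[0]))          # each element is an ordinary Python list

try:
    col.reshape(2, 3)
    reshape_failed = False
except ValueError as err:
    # reshape only rearranges the 2 object slots; it cannot look inside the lists.
    reshape_failed = True
    print("reshape failed:", err)
```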


One way would be to use stacking operations with something like np.vstack -

np.vstack(arr[:, 1])

Sample run -

In [234]: arr
Out[234]:
array([[1, ['a', 'b', 'c']],
       [2, ['a', 'b', 'c']]], dtype=object)

In [235]: arr[:,1]
Out[235]: array([['a', 'b', 'c'], ['a', 'b', 'c']], dtype=object)

In [236]: np.vstack(arr[:, 1])
Out[236]:
array([['a', 'b', 'c'],
       ['a', 'b', 'c']],
      dtype='|S1')

I believe np.vstack would internally use np.concatenate. So, to directly use it, we would have -

np.concatenate(arr[:, 1]).reshape(len(arr),-1)
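Here is a runnable sketch of the concatenate route, checked against vstack. The construction of arr and the names flat and out are assumptions for the illustration. Note that the reshape step relies on every inner list having the same length.

```python
import numpy as np

# Sample object array (the construction is an assumption for this sketch).
arr = np.empty((2, 2), dtype=object)
for i in range(2):
    arr[i, 0] = i + 1
    arr[i, 1] = ['a', 'b', 'c']

# Joining the lists end to end gives one flat 1d array ...
flat = np.concatenate(arr[:, 1])
# ... which is then folded back into one row per original list.
out = flat.reshape(len(arr), -1)

print(flat.shape)
print(out.shape)
print(np.array_equal(out, np.vstack(arr[:, 1])))
```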