How to convert a Numpy 2D array with object dtype to a regular 2D array of floats


Nasty little problem... I have been fooling around with this toy example:

>>> arr = np.array([['one', [1, 2, 3]],['two', [4, 5, 6]]], dtype=np.object)
>>> arr
array([['one', [1, 2, 3]],
       ['two', [4, 5, 6]]], dtype=object)

My first guess was:

>>> np.array(arr[:, 1])
array([[1, 2, 3], [4, 5, 6]], dtype=object)

But that keeps the object dtype, so perhaps then:

>>> np.array(arr[:, 1], dtype=np.float)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.
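To see why the float conversion fails, it helps to look at what arr[:, 1] actually holds. The sketch below rebuilds the toy array and inspects the column. It uses the builtin object and float in place of np.object and np.float (those aliases were removed in NumPy 1.24, and the builtins are equivalent), and the name col is introduced here only for illustration.

```python
import numpy as np

# Rebuild the toy array; builtin `object` stands in for the removed `np.object` alias.
arr = np.array([['one', [1, 2, 3]], ['two', [4, 5, 6]]], dtype=object)

col = arr[:, 1]          # hypothetical name for the second column
print(col.shape)         # a single dimension: each element is a whole list
print(col.dtype)         # object
print(type(col[0]))      # the elements are plain Python lists

# Asking for floats makes NumPy try to fit a 3-element list into one scalar slot.
try:
    np.array(col, dtype=float)
except (ValueError, TypeError) as exc:
    print(type(exc).__name__, exc)
```

So the column is a 1D array of list objects rather than a 2D array of numbers, which is why a dtype change alone is not enough.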

You can normally work around this by doing the following:

>>> np.array(arr[:, 1], dtype=[('', np.float)]*3).view(np.float).reshape(-1, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: expected a readable buffer object

Not here though, which was kind of puzzling. Apparently it is the fact that the objects in your array are lists that throws this off, as replacing the lists with tuples works:

>>> np.array([tuple(j) for j in arr[:, 1]],
...          dtype=[('', np.float)]*3).view(np.float).reshape(-1, 3)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])

Since there doesn't seem to be any entirely satisfactory solution, the easiest is probably to go with:

>>> np.array(list(arr[:, 1]), dtype=np.float)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
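For reference, here is the same conversion as a standalone script. It is a minimal sketch that assumes every list in the column has the same length, and it uses the builtin float and object because NumPy 1.24 removed the np.float and np.object aliases.

```python
import numpy as np

arr = np.array([['one', [1, 2, 3]], ['two', [4, 5, 6]]], dtype=object)

# list() turns the 1D object column into a list of lists, which NumPy
# can then parse as an ordinary rectangular 2D array.
float_arr = np.array(list(arr[:, 1]), dtype=float)

print(float_arr.shape)
print(float_arr.dtype)
```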

That will not be very efficient, though, so it is probably better to go with something like:

>>> np.fromiter((tuple(j) for j in arr[:, 1]), dtype=[('', np.float)]*3,
...             count=len(arr)).view(np.float).reshape(-1, 3)
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.]])
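The fromiter approach can also be wrapped in a small helper so that the row width is not hard-coded to 3. This is only a sketch: the function name object_column_to_float is hypothetical, and it assumes a non-empty column in which every row has the same length.

```python
import numpy as np


def object_column_to_float(col):
    """Convert a 1D object array of equal-length sequences to a 2D float array.

    Hypothetical helper for illustration; assumes `col` is non-empty and
    that every element has the same length.
    """
    width = len(col[0])
    # One float field per value in a row, so each row becomes one record.
    record_dtype = np.dtype([('', float)] * width)
    records = np.fromiter((tuple(row) for row in col),
                          dtype=record_dtype, count=len(col))
    # Reinterpret the packed records as plain floats and restore the 2D shape.
    return records.view(float).reshape(-1, width)


arr = np.array([['one', [1, 2, 3]], ['two', [4, 5, 6]]], dtype=object)
float_arr = object_column_to_float(arr[:, 1])
print(float_arr)
```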


Based on Jaime's toy example I think you can do this very simply using np.vstack():

import numpy as np
arr = np.array([['one', [1, 2, 3]], ['two', [4, 5, 6]]], dtype=object)
# Stack the per-row sequences into one 2D array, then cast to float
float_arr = np.vstack(arr[:, 1]).astype(float)

This will work regardless of whether the 'numeric' elements in your object array are 1D numpy arrays, lists or tuples.
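A quick way to check that claim is to build an object column that mixes all three container types. The sketch below does so by assigning into an empty object array; the name mixed and the values are illustrative only.

```python
import numpy as np

# Fill a 1D object array element by element so NumPy keeps each
# container as-is instead of trying to merge them into one array.
mixed = np.empty(3, dtype=object)
mixed[0] = [1, 2, 3]                 # list
mixed[1] = (4, 5, 6)                 # tuple
mixed[2] = np.array([7, 8, 9])       # 1D ndarray

# vstack treats each element as one row of the result.
float_arr = np.vstack(mixed).astype(float)
print(float_arr.shape)
print(float_arr.dtype)
```

All rows still need the same length; np.vstack cannot stack rows of different widths.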


This works great on your array arr for converting from an object array to an array of floats. Number processing is extremely easy afterwards. Thanks for that last post! I just modified it to take every column, so that it handles a DataFrame of any size:

float_arr = np.vstack(arr[:, :]).astype(float)
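Slicing all columns this way only makes sense when every cell is numeric, as in an object array taken from an all-numeric DataFrame; in the toy arr above, the string column cannot be cast to float. Here is a minimal sketch with made-up values (obj is hypothetical stand-in data), which also suggests that a plain astype gives the same result in this case:

```python
import numpy as np

# Hypothetical stand-in for an all-numeric object array, e.g. DataFrame values.
obj = np.array([[1, 2.5, 3], [4, 5, 6.5]], dtype=object)

via_vstack = np.vstack(obj[:, :]).astype(float)
via_astype = obj.astype(float)       # same result without the vstack step

print(via_vstack.dtype)
print(np.array_equal(via_vstack, via_astype))
```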