Pandas: create a dataframe from 2D numpy arrays preserving their sequential order


I think the simplest approach is flattening the arrays by using ravel:

    df = pd.DataFrame({'lat': lat.ravel(), 'long': long.ravel(), 'val': val.ravel()})
    print(df)

       lat  long  val
    0   10   100   17
    1   20   102    2
    2   30   103   11
    3   20   105   86
    4   11   101   84
    5   33   102    1
    6   21   100    9
    7   20   102    5
    8   10   103   10
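For anyone who wants to run this end to end, here is a minimal self-contained sketch. The three 3x3 input arrays are not shown in this answer; the ones below are an assumption, reconstructed from the printed output so that the result matches it.

```python
import numpy as np
import pandas as pd

# Sample 3x3 inputs, reconstructed from the printed output (an assumption).
lat = np.array([[10, 20, 30], [20, 11, 33], [21, 20, 10]])
long = np.array([[100, 102, 103], [105, 101, 102], [100, 102, 103]])
val = np.array([[17, 2, 11], [86, 84, 1], [9, 5, 10]])

# ravel() flattens in row-major (C) order by default, so element [i, j]
# of each array ends up in row i * ncols + j of the dataframe.
df = pd.DataFrame({'lat': lat.ravel(), 'long': long.ravel(), 'val': val.ravel()})
print(df)
```

Because all three arrays are flattened in the same order, each row of the dataframe pairs up the values that shared a position in the original grids.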


You could also stack the flattened arrays as columns and build the dataframe from the result, something like this:

    # Create stacked array
    In [100]: arr = np.column_stack((lat.ravel(),long.ravel(),val.ravel()))

    # Create dataframe from it and assign column names
    In [101]: pd.DataFrame(arr,columns=('lat','long','val'))
    Out[101]:
       lat  long  val
    0   10   100   17
    1   20   102    2
    2   30   103   11
    3   20   105   86
    4   11   101   84
    5   33   102    1
    6   21   100    9
    7   20   102    5
    8   10   103   10
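One thing worth knowing about the stacked-array route: `np.column_stack` produces a single homogeneous array, so columns with different dtypes get upcast to a common one. The sketch below illustrates this with hypothetical inputs (integer coordinates, float values) that are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: integer coordinates, floating-point values.
lat = np.array([[10, 20], [30, 40]])
long = np.array([[100, 102], [103, 105]])
val = np.array([[1.5, 2.5], [3.5, 4.5]])

# column_stack builds one homogeneous 2D array, so the integer columns
# are upcast to float64 to match val.
arr = np.column_stack((lat.ravel(), long.ravel(), val.ravel()))
df_stacked = pd.DataFrame(arr, columns=['lat', 'long', 'val'])

# The dict-based constructor keeps a separate dtype per column.
df_dict = pd.DataFrame({'lat': lat.ravel(), 'long': long.ravel(), 'val': val.ravel()})

print(df_stacked.dtypes)
print(df_dict.dtypes)
```

If the per-column dtypes matter, the dict-based constructor is probably the safer choice; when everything already shares one dtype, the two approaches give the same frame.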

Runtime test -

    In [103]: lat = np.random.rand(30,30)
    In [104]: long = np.random.rand(30,30)
    In [105]: val = np.random.rand(30,30)

    In [106]: %timeit pd.DataFrame({'lat': lat.ravel(), 'long': long.ravel(), 'val': val.ravel()})
    1000 loops, best of 3: 452 µs per loop

    In [107]: arr = np.column_stack((lat.ravel(),long.ravel(),val.ravel()))

    In [108]: %timeit np.column_stack((lat.ravel(),long.ravel(),val.ravel()))
    100000 loops, best of 3: 12.4 µs per loop

    In [109]: %timeit pd.DataFrame(arr,columns=('lat','long','val'))
    1000 loops, best of 3: 217 µs per loop
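To repeat the comparison outside IPython, a rough sketch using the standard `timeit` module could look like this. The helper names `from_dict` and `from_stack` and the loop counts are my own choices for illustration, and the absolute numbers will vary with hardware and library versions.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
lat, long, val = (rng.random((30, 30)) for _ in range(3))

def from_dict():
    # Dict of flattened arrays, one entry per column.
    return pd.DataFrame({'lat': lat.ravel(), 'long': long.ravel(), 'val': val.ravel()})

def from_stack():
    # Stack flattened arrays as columns, then wrap in a dataframe.
    arr = np.column_stack((lat.ravel(), long.ravel(), val.ravel()))
    return pd.DataFrame(arr, columns=['lat', 'long', 'val'])

n = 200  # calls per timing run (arbitrary)
t_dict = min(timeit.repeat(from_dict, number=n, repeat=3)) / n
t_stack = min(timeit.repeat(from_stack, number=n, repeat=3)) / n
print(f"dict of raveled arrays:   {t_dict * 1e6:.1f} µs per call")
print(f"column_stack + DataFrame: {t_stack * 1e6:.1f} µs per call")
```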


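If your arrays are already 1D, there is no need to ravel first. You can just stack and go. (For 2D inputs, np.stack returns a 3D array, which pd.DataFrame does not accept, so they still need to be flattened or reshaped.)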

    lat, long, val = np.arange(5), np.arange(5), np.arange(5)
    arr = np.stack((lat, long, val), axis=1)
    cols = ['lat', 'long', 'val']
    df = pd.DataFrame(arr, columns=cols)

       lat  long  val
    0    0     0    0
    1    1     1    1
    2    2     2    2
    3    3     3    3
    4    4     4    4
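The example above uses 1D arrays. For 2D arrays like those in the question, one possible variant (a sketch, not part of the original answer) is to stack along a new last axis and then reshape, which avoids the explicit `ravel` calls while keeping the same row-major order. The 2x3 inputs below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical 2D inputs (2x3 grids), chosen only for illustration.
lat = np.array([[10, 20, 30], [20, 11, 33]])
long = np.array([[100, 102, 103], [105, 101, 102]])
val = np.array([[17, 2, 11], [86, 84, 1]])

# Stacking 2D arrays along a new last axis gives shape (2, 3, 3):
# stacked[i, j] holds (lat[i, j], long[i, j], val[i, j]).
stacked = np.stack((lat, long, val), axis=-1)
print(stacked.shape)

# Collapse the two grid axes so each grid cell becomes one row,
# in row-major order.
arr = stacked.reshape(-1, 3)
df = pd.DataFrame(arr, columns=['lat', 'long', 'val'])
print(df)
```

The reshape is what brings the array back down to the two dimensions that `pd.DataFrame` expects.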