Neural Network LSTM input shape from dataframe


Below is an example that sets up time series data to train an LSTM. The model's output is nonsense, as the example is only meant to demonstrate how to build the model.

import pandas as pd
import numpy as np

# Get some time series data
df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/timeseries.csv")
df.head()

Time series dataframe:

         Date      A       B       C      D      E      F      G
0  2008-03-18  24.68  164.93  114.73  26.27  19.21  28.87  63.44
1  2008-03-19  24.18  164.89  114.75  26.22  19.07  27.76  59.98
2  2008-03-20  23.99  164.63  115.04  25.78  19.01  27.04  59.61
3  2008-03-25  24.14  163.92  114.85  27.41  19.61  27.84  59.41
4  2008-03-26  24.44  163.45  114.84  26.86  19.53  28.02  60.09

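The steps below reference input_cols and output_cols, which this excerpt never defines. As an assumption (based on the shapes printed further down, where X_train comes out as (11, 11, 6)), columns A through F are the inputs and G is the output:

# Assumed column split (not shown in the original answer):
# six input features and one output target
input_cols = ['A', 'B', 'C', 'D', 'E', 'F']
output_cols = ['G']
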
You can put your inputs into a single vector per row and then use the pandas .cumsum() function to build the sequence for the time series:

# Put your inputs into a single list
df['single_input_vector'] = df[input_cols].apply(tuple, axis=1).apply(list)

# Double-encapsulate list so that you can sum it in the next step and keep time steps as separate elements
df['single_input_vector'] = df.single_input_vector.apply(lambda x: [list(x)])

# Use .cumsum() to include previous row vectors in the current row list of vectors
df['cumulative_input_vectors'] = df.single_input_vector.cumsum()

The output can be set up in a similar way, but it will be a single vector instead of a sequence:

# If your output is multi-dimensional, you need to capture those dimensions in one object
# If your output is a single dimension, this step may be unnecessary
df['output_vector'] = df[output_cols].apply(tuple, axis=1).apply(list)

The input sequences have to be the same length to run them through the model, so you need to pad them to the max length of your cumulative vectors:

# Pad your sequences so they are the same length
from keras.preprocessing.sequence import pad_sequences

max_sequence_length = df.cumulative_input_vectors.apply(len).max()

# Save it as a list
padded_sequences = pad_sequences(df.cumulative_input_vectors.tolist(), max_sequence_length).tolist()
df['padded_input_vectors'] = pd.Series(padded_sequences).apply(np.asarray)

Training data can be pulled from the dataframe and put into NumPy arrays. Note that the input data that comes out of the dataframe will not make a 3D array; it makes an array of arrays, which is not the same thing.

You can use hstack and reshape to build a 3D input array.

# Extract your training data
X_train_init = np.asarray(df.padded_input_vectors)

# Use hstack and reshape to make the inputs a 3D array
X_train = np.hstack(X_train_init).reshape(len(df), max_sequence_length, len(input_cols))
y_train = np.hstack(np.asarray(df.output_vector)).reshape(len(df), len(output_cols))

To prove it:

>>> print(X_train_init.shape)
(11,)
>>> print(X_train.shape)
(11, 11, 6)
>>> print(X_train == X_train_init)
False

Once you have training data, you can define the dimensions of your input and output layers.

# Get your input dimensions
# Input length is the length of one input sequence (i.e. the number of rows for your sample)
# Input dim is the number of dimensions in one input vector (i.e. the number of input columns)
input_length = X_train.shape[1]
input_dim = X_train.shape[2]

# Output dim is the shape of a single output vector
# In this case it's just 1, but it could be more
output_dim = len(y_train[0])

Build the model:

from keras.models import Model, Sequential
from keras.layers import LSTM, Dense

# Build the model
model = Sequential()

# I arbitrarily picked the output dimensions as 4
model.add(LSTM(4, input_dim=input_dim, input_length=input_length))

# The max output value is > 1 so relu is used as the final activation
model.add(Dense(output_dim, activation='relu'))

model.compile(loss='mean_squared_error',
              optimizer='sgd',
              metrics=['accuracy'])

Finally, you can train the model and save the training log as history:

# Set batch_size to 7 to show that it doesn't have to be a factor or multiple of your sample size
history = model.fit(X_train, y_train,
                    batch_size=7, nb_epoch=3,
                    verbose=1)

Output:

Epoch 1/3
11/11 [==============================] - 0s - loss: 3498.5756 - acc: 0.0000e+00
Epoch 2/3
11/11 [==============================] - 0s - loss: 3498.5755 - acc: 0.0000e+00
Epoch 3/3
11/11 [==============================] - 0s - loss: 3498.5757 - acc: 0.0000e+00

That's it. Use model.predict(X), where X has the same format as X_train (apart from the number of samples), to make predictions from the model.
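
For instance, a minimal sketch that simply reuses the training inputs (any array with the same timesteps and input_dim per sample would work):

# Predict on the training set itself just to demonstrate the call;
# the output has shape (number of samples, output_dim)
predictions = model.predict(X_train)
print(predictions.shape)  # (11, 1) for this example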


Tensor shape

You're right that Keras is expecting a 3D tensor for an LSTM neural network, but I think the piece you are missing is that Keras expects that each observation can have multiple dimensions.

For example, in Keras I have used word vectors to represent documents for natural language processing. Each word in the document is represented by an n-dimensional numerical vector (so if n = 2 the word 'cat' would be represented by something like [0.31, 0.65]). To represent a single document, the word vectors are lined up in sequence (e.g. 'The cat sat.' = [[0.12, 0.99], [0.31, 0.65], [0.94, 0.04]]). A document would be a single sample in a Keras LSTM.

This is analogous to your time series observations. A document is like a time series, and a word is like a single observation in your time series; in your case, the representation of each observation has just n = 1 dimension.

Because of that, I think your tensor should be something like [[[a1], [a2], ... , [aT]], [[b1], [b2], ..., [bT]], ..., [[x1], [x2], ..., [xT]]], where x corresponds to nb_samples, timesteps = T, and input_dim = 1, because each of your observations is only one number.
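
As a minimal sketch of that layout (the numbers are made up for illustration), here are three univariate series of length T = 4 reshaped into the (nb_samples, timesteps, input_dim) tensor the LSTM expects:

import numpy as np

# Three made-up univariate time series, each with T = 4 observations
series = np.array([[1.0, 2.0, 3.0, 4.0],
                   [5.0, 6.0, 7.0, 8.0],
                   [9.0, 10.0, 11.0, 12.0]])

# Reshape to (nb_samples, timesteps, input_dim), with input_dim = 1
X = series.reshape(3, 4, 1)
print(X.shape)  # (3, 4, 1)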

Batch size

Batch size should be set to maximize throughput without exceeding the memory capacity of your machine, per this Cross Validated post. As far as I know, your input does not need to be a multiple of your batch size, neither when training the model nor when making predictions from it.

Examples

If you're looking for sample code, the Keras GitHub has a number of examples using LSTMs and other network types with sequenced input.