Understanding Keras LSTMs: Role of Batch-size and Statefulness


Let me explain it via an example:

Say you have the following series: 1, 2, 3, 4, 5, 6, ..., 100. You have to decide how many timesteps your LSTM will look at per sample, and reshape your data accordingly.

If you decide on time_steps = 5, you have to reshape your time series into a matrix of samples like this:

1,2,3,4,5 -> sample1

2,3,4,5,6 -> sample2

3,4,5,6,7 -> sample3

etc...

By doing so, you will end up with a matrix of shape (96 samples x 5 timesteps).
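The windowing above can be sketched with NumPy as follows. This is only a minimal illustration; the names `series`, `time_steps` and `samples` are my own and not part of any Keras API.

```python
import numpy as np

series = np.arange(1, 101)  # 1, 2, ..., 100
time_steps = 5

# One row per window: series[i], ..., series[i + time_steps - 1]
n_samples = len(series) - time_steps + 1
samples = np.array([series[i:i + time_steps] for i in range(n_samples)])

print(samples.shape)  # (96, 5)
print(samples[:3])    # the first three windows listed above
```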

This matrix should be reshaped to (96 x 5 x 1), telling Keras that you have just 1 time series. If you have more time series in parallel (as in your case), you do the same operation on each time series, so you will end up with n matrices (one for each time series), each of shape (96 samples x 5 timesteps).
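As a rough sketch of that reshape, assuming NumPy: `make_windows` is a hypothetical helper written for this example, and the three parallel series are made-up multiples of the same base series.

```python
import numpy as np

def make_windows(series, time_steps):
    """Return a (n_windows, time_steps) matrix of overlapping windows."""
    n = len(series) - time_steps + 1
    return np.array([series[i:i + time_steps] for i in range(n)])

base = np.arange(1, 101)

# One series: add a trailing axis of size 1 for the single feature.
single = make_windows(base, 5)[..., np.newaxis]
print(single.shape)  # (96, 5, 1)

# Several parallel series (3 made-up ones here): window each one,
# then stack the resulting matrices along a new last axis.
matrices = [make_windows(k * base, 5) for k in (1, 2, 3)]
stacked = np.stack(matrices, axis=-1)
print(stacked.shape)  # (96, 5, 3)
```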

For the sake of argument, let's say you have 3 time series. You should stack all three matrices into one single tensor of shape (96 samples x 5 timesteps x 3 time series). The first layer of your LSTM for this example would be:

    from keras.models import Sequential
    from keras.layers import LSTM

    model = Sequential()
    model.add(LSTM(32, input_shape=(5, 3)))

The 32 as the first parameter is the number of units, and it is totally up to you. It means that at each point in time, your 3 time series are mapped to 32 different variables in the output space. It is easiest to think of each timestep as a fully connected layer with 3 inputs and 32 outputs, but with a different computation than an FC layer (an LSTM also feeds in its state from the previous timestep).

If you are stacking multiple LSTM layers, pass return_sequences=True to every LSTM layer that feeds another LSTM layer, so that it outputs the whole sequence (one vector per timestep) rather than just the last value.

Your target should be the next value in the series you want to predict.

Putting it all together, let's say you have the following time series:

Time series 1 (master): 1,2,3,4,5,6,..., 100

Time series 2 (support): 2,4,6,8,10,12,..., 200

Time series 3 (support): 3,6,9,12,15,18,..., 300

Create the input and target tensors. The last window (96,...,100) has no next value to predict, so 95 input/target pairs remain:

x     -> y

1,2,3,4,5 -> 6

2,3,4,5,6 -> 7

3,4,5,6,7 -> 8

Window the other two time series in the same way, but skip the targets, since you don't want to predict those series.
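One possible way to build both tensors with NumPy is to put the three series side by side and window all columns at once. The variable names (`master`, `data`, `x`, `y`) are assumptions chosen for this sketch.

```python
import numpy as np

time_steps = 5
master = np.arange(1, 101)  # time series 1 (the one to predict)
data = np.column_stack([master, 2 * master, 3 * master])  # shape (100, 3)

# Each input window needs a following value as its target,
# so there are len(master) - time_steps = 95 usable samples.
n_samples = len(master) - time_steps
x = np.stack([data[i:i + time_steps] for i in range(n_samples)])
y = master[time_steps:]  # next value of the master series only

print(x.shape, y.shape)        # (95, 5, 3) (95,)
print(x[0, :, 0], "->", y[0])  # [1 2 3 4 5] -> 6
```

Only the master series contributes to `y`; the two support series appear in `x` as extra features.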

Create your model

    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    model = Sequential()
    # Input is (5 timesteps x 3 time series); output is (5 timesteps x 32 variables)
    # because return_sequences=True.
    model.add(LSTM(32, input_shape=(5, 3), return_sequences=True))
    # Output is a single vector of 8 variables (last timestep only)
    # because return_sequences=False.
    model.add(LSTM(8))
    # Output is 1 value per sample; this is what gets compared with the target.
    model.add(Dense(1, activation='linear'))

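Compile it and train. A batch size of 32 is a reasonable starting point. The batch size is the number of samples processed together in each gradient update: your sample matrix is split into chunks of that size for faster computation. Just don't use stateful=True. The windows here are independent samples, and a stateful LSTM carries its state over from one batch to the next, which only makes sense when consecutive batches continue the same sequences.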
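For illustration, here is a minimal end-to-end sketch of compiling and training on the toy series. The optimizer, loss and epoch count are my own assumptions rather than recommendations, and the data is left unscaled, so the loss values themselves are not meaningful.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Toy data: master series plus two support series, windowed into
# 95 samples of shape (5 timesteps x 3 time series).
time_steps = 5
master = np.arange(1, 101, dtype="float32")
data = np.column_stack([master, 2 * master, 3 * master])
n_samples = len(master) - time_steps
x = np.stack([data[i:i + time_steps] for i in range(n_samples)])
y = master[time_steps:]

model = Sequential()
model.add(LSTM(32, input_shape=(5, 3), return_sequences=True))
model.add(LSTM(8))
model.add(Dense(1, activation="linear"))

# Assumed settings for the sketch: Adam optimizer, mean squared error.
model.compile(optimizer="adam", loss="mse")

# stateful is left at its default (False).
# 95 samples with batch_size=32 means 3 updates per epoch (32 + 32 + 31).
history = model.fit(x, y, batch_size=32, epochs=2, verbose=0)

pred = model.predict(x[:1], verbose=0)
print(pred.shape)  # (1, 1): one predicted value for one input window
```

In practice you would probably scale the inputs and targets before training, since raw values up to 300 are far outside the range where the LSTM's activations behave well.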