Keras model.summary() result - Understanding the # of Parameters


The number of parameters is 7850 because every output unit has 784 input weights plus one weight for the connection to the bias. So each unit contributes 785 parameters, and with 10 units that sums to 7850.

The role of this additional bias term is really important. It significantly increases the capacity of your model. You can read more about it in, e.g., Role of Bias in Neural Networks.
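As a quick sanity check of the arithmetic above (a minimal sketch, independent of Keras):

```python
# 784 input weights + 1 bias weight per output unit
inputs, units = 784, 10
params = units * (inputs + 1)
assert params == 7850
```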


I feed a 514-dimensional real-valued input to a Sequential model in Keras. My model is constructed in the following way:

    predictivemodel = Sequential()
    predictivemodel.add(Dense(514, input_dim=514, W_regularizer=WeightRegularizer(l1=0.000001, l2=0.000001), init='normal'))
    predictivemodel.add(Dense(257, W_regularizer=WeightRegularizer(l1=0.000001, l2=0.000001), init='normal'))
    predictivemodel.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

When I print model.summary() I get the following result:

    Layer (type)     Output Shape  Param #   Connected to
    ================================================================
    dense_1 (Dense)  (None, 514)   264710    dense_input_1[0][0]
    ________________________________________________________________
    activation_1     (None, 514)   0         dense_1[0][0]
    ________________________________________________________________
    dense_2 (Dense)  (None, 257)   132355    activation_1[0][0]
    ================================================================
    Total params: 397065
    ________________________________________________________________

For the dense_1 layer, the number of params is 264710. This is obtained as: 514 (input values) * 514 (neurons in the first layer) + 514 (bias values)

For the dense_2 layer, the number of params is 132355. This is obtained as: 514 (input values) * 257 (neurons in the second layer) + 257 (bias values for neurons in the second layer)
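A quick check that these two counts also add up to the total reported by model.summary():

```python
# weights * neurons + biases, for each Dense layer
dense_1 = 514 * 514 + 514
dense_2 = 514 * 257 + 257
assert dense_1 == 264710
assert dense_2 == 132355
assert dense_1 + dense_2 == 397065  # "Total params" in the summary
```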


For Dense Layers:

output_size * (input_size + 1) == number_parameters 

For Conv Layers:

output_channels * (input_channels * window_size + 1) == number_parameters

Consider the following example:

    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        Conv2D(64, (3, 3), activation='relu'),
        Conv2D(128, (3, 3), activation='relu'),
        Dense(num_classes, activation='softmax')
    ])
    model.summary()

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    conv2d_1 (Conv2D)            (None, 222, 222, 32)      896
    _________________________________________________________________
    conv2d_2 (Conv2D)            (None, 220, 220, 64)      18496
    _________________________________________________________________
    conv2d_3 (Conv2D)            (None, 218, 218, 128)     73856
    _________________________________________________________________
    dense_9 (Dense)              (None, 218, 218, 10)      1290
    =================================================================

Calculating the params:

    assert 32 * (3 * (3*3) + 1) == 896
    assert 64 * (32 * (3*3) + 1) == 18496
    assert 128 * (64 * (3*3) + 1) == 73856
    assert num_classes * (128 + 1) == 1290
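The two formulas above can be wrapped in small helpers (a sketch; the function names are my own, not Keras API) and checked against the summary:

```python
def dense_params(input_size, output_size):
    # each output unit: input_size weights + 1 bias
    return output_size * (input_size + 1)

def conv_params(in_channels, out_channels, window_size):
    # each output channel: one kernel of in_channels * window_size weights + 1 bias
    return out_channels * (in_channels * window_size + 1)

num_classes = 10
assert conv_params(3, 32, 3 * 3) == 896
assert conv_params(32, 64, 3 * 3) == 18496
assert conv_params(64, 128, 3 * 3) == 73856
assert dense_params(128, num_classes) == 1290
```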