Clarification in the Theano tutorial

python numpy theano gradient-descent deep-learning

Initializing param_update with theano.shared(.) only tells Theano to allocate a persistent variable that Theano functions can read and modify. This initialization code runs only once; it is not executed again later to reset the value of param_update to 0.

The actual value of param_update will be updated according to the last line

updates.append((param_update, momentum*param_update + (1. - momentum)*T.grad(cost, param)))

whenever the train function is called; that function was constructed by passing these updates as an argument ([23] in the tutorial):

train = theano.function([mlp_input, mlp_target], cost,
                        updates=gradient_updates_momentum(cost, mlp.params, learning_rate, momentum))

Each time train is called, Theano computes the gradient of the cost w.r.t. param and sets param_update to a new update direction according to the momentum rule. param itself is updated by stepping along the direction stored in param_update, scaled by learning_rate.
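To make the "initialized once, updated on every call" behavior concrete, here is a minimal plain-Python sketch that imitates it without Theano. Everything in it is an assumption for illustration: make_train and state are hypothetical names, the cost is a toy (param - 3)^2, and the param step is assumed to be param - learning_rate*param_update. It also assumes that both new values are computed from the old state before either is assigned, which is how Theano applies the entries of updates.

```python
# Plain-Python stand-in for a Theano shared variable and its update rules.
# make_train and state are hypothetical names; the cost is a toy (param - 3)^2.

def make_train(learning_rate=0.1, momentum=0.9):
    # Runs exactly once, like theano.shared(...): allocate persistent state.
    state = {"param": 0.0, "param_update": 0.0}

    def train():
        # Cost and gradient are evaluated at the current (old) param.
        cost = (state["param"] - 3.0) ** 2
        grad = 2.0 * (state["param"] - 3.0)
        # Both new values are computed from the old state, then assigned.
        new_param = state["param"] - learning_rate * state["param_update"]
        new_update = momentum * state["param_update"] + (1.0 - momentum) * grad
        state["param"], state["param_update"] = new_param, new_update
        return cost

    return train, state


if __name__ == "__main__":
    train, state = make_train()
    for step in range(3):
        cost = train()
        # param_update carries over between calls; it is never reset to 0.
        print(step, cost, state["param"], state["param_update"])
```

Calling train repeatedly shows param_update accumulating gradient information from one call to the next, since the line that sets it to 0 sits outside train and is never reached again.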
