
Neural Network to predict nth square


Following Filip Malczak's and Seanny123's suggestions and comments, I implemented a neural network in tensorflow to check what happens when we try to teach it to predict (and interpolate) the square function y = x^2.

Training on continuous interval

I trained the network on the interval [-7,7] (sampling 300 points inside this interval, so that the data is effectively continuous), and then tested it on the interval [-30,30]. The activation functions are ReLU, and the network has 3 hidden layers, each of size 50. epochs=500. The result is depicted in the figure below.

[Figure: network output vs. y = x^2, trained on [-7, 7] and tested on [-30, 30]]

So basically, inside (and also close to) the interval [-7,7], the fit is nearly perfect, and outside it the output continues more or less linearly. It is nice to see that, at least initially, the slope of the network's output tries to "match" the slope of x^2. If we increase the test interval, the two graphs diverge quite a lot, as one can see in the figure below:

[Figure: network output vs. y = x^2 on a larger test interval, where the two curves diverge]

Training on even numbers

Finally, if instead I train the network on the set of all even integers in the interval [-100,100], and apply it on the set of all integers (even and odd) in this interval, I get:

[Figure: network trained on the even integers in [-100, 100], evaluated on all integers in that interval]

When training the network to produce the image above, I increased the epochs to 2500 to get better accuracy. The rest of the parameters stayed unchanged. So it seems that interpolating "inside" the training interval works quite well (except perhaps for the area around 0, where the fit is a bit worse).

Here is the code that I used for the first figure:

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.python.framework.ops import reset_default_graph

# preparing training data
train_x = np.linspace(-7, 7, 300).reshape(-1, 1)
train_y = train_x**2

# setting network features
dimensions = [50, 50, 50, 1]
epochs = 500
batch_size = 5

reset_default_graph()
X = tf.placeholder(tf.float32, shape=[None, 1])
Y = tf.placeholder(tf.float32, shape=[None, 1])

weights = []
biases = []
n_inputs = 1

# initializing variables
for i, n_outputs in enumerate(dimensions):
    with tf.variable_scope("layer_{}".format(i)):
        w = tf.get_variable(name="W", shape=[n_inputs, n_outputs],
                            initializer=tf.random_normal_initializer(mean=0.0, stddev=0.02, seed=42))
        b = tf.get_variable(name="b", shape=[n_outputs], initializer=tf.zeros_initializer())
        weights.append(w)
        biases.append(b)
        n_inputs = n_outputs

def forward_pass(X, weights, biases):
    h = X
    for i in range(len(weights)):
        h = tf.add(tf.matmul(h, weights[i]), biases[i])
        h = tf.nn.relu(h)
    return h

output_layer = forward_pass(X, weights, biases)
cost = tf.reduce_mean(tf.squared_difference(output_layer, Y), 1)
cost = tf.reduce_sum(cost)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # train the network
    for i in range(epochs):
        idx = np.arange(len(train_x))
        np.random.shuffle(idx)
        for j in range(len(train_x)//batch_size):
            cur_idx = idx[batch_size*j:batch_size*(j+1)]
            sess.run(optimizer, feed_dict={X: train_x[cur_idx], Y: train_y[cur_idx]})
        #current_cost=sess.run(cost,feed_dict={X:train_x,Y:train_y})
        #print(current_cost)
    # apply the network on the test data
    test_x = np.linspace(-30, 30, 300)
    network_output = sess.run(output_layer, feed_dict={X: test_x.reshape(-1, 1)})

plt.plot(test_x, test_x**2, color='r', label='y=x^2')
plt.plot(test_x, network_output, color='b', label='network output')
plt.legend(loc='center')
plt.show()
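For the second experiment (training on even integers only), only the data and the number of epochs need to change; the rest of the script stays the same. A minimal sketch of that data preparation (the exact dtype and reshape choices here are my own, not from the original post):

# even integers in [-100, 100] as training inputs (replaces the np.linspace training data above)
train_x = np.arange(-100, 101, 2).astype(np.float32).reshape(-1, 1)
train_y = train_x**2
epochs = 2500  # increased as mentioned above

# all integers (even and odd) in the same interval as test inputs (replaces the test np.linspace)
test_x = np.arange(-100, 101).astype(np.float32)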


I checked the docs for neurolab - newff creates a NN with a sigmoid transfer function in all neurons by default. A sigmoid's value is always in the (-1, 1) range, so your output will never leave this range.

The second square (4) is already outside this range, so your code doesn't match your problem at all.

Try using other activation functions (I'd propose SoftPlus or ReLU). They work quite well with feed-forward networks, allow for backpropagation training (as they are differentiable almost everywhere) and have values in the range [0, ∞), just as you need.

Also: the first param to newff defines the ranges for the input data - you're using [0, 99], which matches all the training data, but doesn't match the values you've tried while testing (since 100 and 101 are bigger than 99). Change this value to something way bigger, so the values you test on are not "special" (meaning "at the end of the range") - I'd propose something like [-300, 300].
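To make that concrete, here is a rough sketch of a newff call with a much wider input range and explicitly chosen transfer functions. As far as I know, neurolab does not ship ReLU or SoftPlus transfer functions out of the box, so this sketch instead uses a linear output layer (PureLin) to lift the (-1, 1) bound on the output; the hidden-layer size and training parameters are placeholders, not tuned values.

import numpy as np
import neurolab as nl

# training data: n -> n^2 for n = 0..99
train_in = np.arange(100, dtype=float).reshape(-1, 1)
train_out = train_in**2

# input range widened to [-300, 300] so test inputs such as 100 or 101 are not at the edge;
# one hidden layer with TanSig, a PureLin output layer so the output is not confined to (-1, 1)
net = nl.net.newff([[-300, 300]], [10, 1], [nl.trans.TanSig(), nl.trans.PureLin()])

# in practice, scaling the targets (e.g. dividing by 100**2) may be needed for training to converge
net.train(train_in, train_out, epochs=500, show=100, goal=0.01)
print(net.sim([[100.0], [101.0]]))  # ask for 100^2 and 101^2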

Besides, as stated by Seanny123 in a comment, I don't think it's going to work at all, but with the current setup I can be sure it won't. Good luck. Let me know (for example in the comments) if you succeed.

Last, but not least - what you're trying to do is extrapolation (figuring out values outside some range based on values inside that range). NNs are better suited for interpolation (figuring out values within the range based on samples from that range), as they are supposed to generalize the data used in training. Try teaching it the squares of, for example, every 3rd number (so 1, 16, 49, ...) and then testing by asking for the squares of the rest (for example, the square of 2 or 8).
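A quick sketch of how such an interpolation experiment could be set up (only the data split is shown here; the network and training code stay whatever you already use):

import numpy as np

# train on the squares of every 3rd number: 1, 4, 7, ... -> 1, 16, 49, ...
train_in = np.arange(1, 100, 3, dtype=float).reshape(-1, 1)
train_out = train_in**2

# test on the numbers that were left out (2, 3, 5, 6, 8, ...)
all_n = np.arange(1, 100, dtype=float)
test_in = np.setdiff1d(all_n, train_in.ravel()).reshape(-1, 1)
test_out = test_in**2  # ground truth to compare the network's predictions against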