Why doesn't my custom made linear regression model match sklearn?

python numpy machine-learning scikit-learn gradient-descent

I think you are missing the 1/m term (where m is the size of y) in the gradient descent update. After including the 1/m term, I get a predicted value similar to the one from your sklearn code.

See the modified loop below:

....
weights = np.ones((3,))
m = y.size
for boom in range(100):
  currentCost = cost(normalizedX, weights, y)
  if boom % 1 == 0:
    print(boom, 'iteration', weights[0], weights[1], weights[2])
    print('Cost', currentCost)
  for i in range(47):
    errorDiff = h(normalizedX[i], weights) - y[i]
    weights[0] = weights[0] - alpha * (1/m) * (errorDiff) * normalizedX[i][0]
    weights[1] = weights[1] - alpha * (1/m) * (errorDiff) * normalizedX[i][1]
    weights[2] = weights[2] - alpha * (1/m) * (errorDiff) * normalizedX[i][2]
...

This gives a first prediction of 355242.
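To illustrate what the 1/m factor does, here is a minimal sketch of a vectorized batch update, which is a variant of the per-sample loop above and not the exact same procedure. The function name `batch_gradient_descent`, the synthetic data, and the values of `alpha` and `iterations` are all assumptions made for illustration; the housing data is not reproduced here. Dividing by m averages the gradient over the samples, so without it the effective step is m times larger and can overshoot.

```python
import numpy as np

def batch_gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Fit linear-regression weights with batch gradient descent.

    X is expected to already contain a leading column of ones
    for the intercept (hypothetical helper, not from the question).
    """
    m = y.size
    weights = np.ones(X.shape[1])
    for _ in range(iterations):
        errors = X @ weights - y        # h(x) - y for every sample at once
        gradient = (X.T @ errors) / m   # the 1/m factor averages over samples
        weights -= alpha * gradient
    return weights

# Synthetic, noise-free data standing in for the normalized features.
rng = np.random.default_rng(0)
features = rng.normal(size=(47, 2))
X = np.column_stack([np.ones(47), features])
true_weights = np.array([3.0, 2.0, -1.0])
y = X @ true_weights

fitted = batch_gradient_descent(X, y)
print(fitted)
```

On this noise-free data the fitted weights should land very close to `true_weights`.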

This agrees well with the LinearRegression model, even though that model does not use gradient descent.
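The agreement is expected: as far as I know, LinearRegression solves the ordinary least-squares problem directly, and gradient descent minimizes the same convex cost, so both should end up at (nearly) the same weights. A small sketch of that comparison, using `numpy.linalg.lstsq` as a stand-in for the direct solve and synthetic data (both are assumptions, not the setup from the question):

```python
import numpy as np

# Synthetic data with a little noise (hypothetical, not the housing data).
rng = np.random.default_rng(1)
features = rng.normal(size=(47, 2))
X = np.column_stack([np.ones(47), features])
y = X @ np.array([5.0, 1.5, -0.5]) + rng.normal(scale=0.1, size=47)

# Direct least-squares solve, no iterations involved.
closed_form, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gradient descent on the same cost, including the 1/m factor.
m = y.size
weights = np.ones(3)
for _ in range(2000):
    weights -= 0.1 * (X.T @ (X @ weights - y)) / m

print('least squares   ', closed_form)
print('gradient descent', weights)
```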

I also tried SGDRegressor (which uses stochastic gradient descent) in sklearn, and it too seems to give a value close to both the LinearRegression model and your model. See the code below:

import numpy
import matplotlib.pyplot as plot
import pandas
import sklearn
import sklearn.preprocessing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression, SGDRegressor

dataset = pandas.read_csv('Housing.csv', header=None)
x = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 2].values

# penalty=None turns regularization off; the original used the string 'none',
# which newer scikit-learn releases may reject
sgdRegressor = SGDRegressor(penalty=None, learning_rate='constant', eta0=0.1, max_iter=1000, tol=1E-6)

# Standardize the features and keep the mean/std to scale the query point
xnorm = sklearn.preprocessing.scale(x)
scaleCoef = sklearn.preprocessing.StandardScaler().fit(x)
mean = scaleCoef.mean_
std = numpy.sqrt(scaleCoef.var_)
print('std')
print(std)

yPrediction = []
# Scale the query point (2100, 3) with the training mean and std
predictedX = [[(2100 - mean[0]) / std[0], (3 - mean[1]) / std[1]]]
print('predictedX', predictedX)

# SGD is stochastic, so average the prediction over 10 separate fits
for trials in range(10):
    stuff = sgdRegressor.fit(xnorm, y)
    yPrediction.extend(sgdRegressor.predict(predictedX))
print('predict', numpy.mean(yPrediction))

results in

predict 355533.10119985335
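As a side note, the manual `(value - mean) / std` scaling of the query point in the code above can also be done with the fitted scaler's `transform` method. A minimal sketch, using a few made-up rows in place of Housing.csv (the numbers are hypothetical stand-ins):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in rows of (size, bedrooms); not the real data set.
x = np.array([
    [2104.0, 3.0],
    [1600.0, 3.0],
    [2400.0, 3.0],
    [1416.0, 2.0],
    [3000.0, 4.0],
])

scaler = StandardScaler().fit(x)
query = np.array([[2100.0, 3.0]])

# Manual scaling, as in the answer's code.
manual = (query - scaler.mean_) / np.sqrt(scaler.var_)

# Same scaling via the fitted scaler.
via_transform = scaler.transform(query)

print(manual)
print(via_transform)
```

Both arrays should match, so `scaler.transform` can replace the hand-written expression when building `predictedX`.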

CodeHunter
