Shape not aligned error in OLS Regression python


The model in the line model = sm.OLS(y_train, X_train[:,[0,1,2,3,4,6]]) is fitted on 6 columns, because the column at index 5 (the sixth column) of X_train is left out. The data passed to predict must therefore have the same 6 columns. y_pred = result.predict(X_test) fails because X_test still has all 7 columns, so its shape does not line up with the 6 fitted coefficients. The fix is to select the same columns from the test data:

y_pred = result.predict(X_test[:, [0,1,2,3,4,6]])
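
To see the mismatch concretely, here is a minimal sketch that uses only NumPy. It assumes random data with 7 feature columns, and it uses np.linalg.lstsq as a stand-in for sm.OLS(...).fit(). As far as I know, predict on a linear model boils down to np.dot(exog, params), which is where the "not aligned" message comes from. The array sizes and the names keep and mismatch_error are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 7))   # 7 feature columns (illustrative data)
X_test = rng.normal(size=(5, 7))
y_train = rng.normal(size=20)

keep = [0, 1, 2, 3, 4, 6]            # every column except index 5

# Least-squares fit on the 6 kept columns; stands in for sm.OLS(...).fit()
params, *_ = np.linalg.lstsq(X_train[:, keep], y_train, rcond=None)

# Predicting with all 7 test columns fails: 7 columns vs. 6 coefficients
try:
    np.dot(X_test, params)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)

# Selecting the same columns as in training makes the shapes line up
y_pred = np.dot(X_test[:, keep], params)
```

Keeping the column list in one variable (keep here) and reusing it for both training and test data avoids the two selections drifting apart.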

BONUS

I see you are using the pandas library. A cleaner way to drop columns is .drop, so instead of

newdf.loc[:, newdf.columns != 'V-9'].values

you can use

newdf.drop('V-9', axis=1) # axis=1 makes sure cols are dropped, not rows

likewise instead of

X_train[:,[0,1,2,3,4,6]]

you can use

X_train.drop(X_train.columns[5], axis=1) # drops the column at position 5 (the sixth column); X_train must be a DataFrame here, not a NumPy array

This is more readable and easier to write, especially if you have 50 columns instead of 7.
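
As a rough check that the .drop version selects the same data as the index list, here is a small sketch. It assumes a DataFrame with seven feature columns plus the target 'V-9'. The feature names f0..f6 and the random values are hypothetical, since the real column names are not shown in the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
cols = [f"f{i}" for i in range(7)] + ["V-9"]   # hypothetical feature names
newdf = pd.DataFrame(rng.normal(size=(10, 8)), columns=cols)

X = newdf.drop("V-9", axis=1)   # features: every column except the target
y = newdf["V-9"]                # target column

# Drop the column at position 5 by label instead of listing the other indices
X_reduced = X.drop(X.columns[5], axis=1)

# Same values as the positional selection on the underlying array
same = np.array_equal(X_reduced.to_numpy(), X.to_numpy()[:, [0, 1, 2, 3, 4, 6]])
```

Because drop returns a new DataFrame, the same call has to be applied to the test features as well before calling predict.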

I am glad it helps!