PCA inverse transform manually
1) transform is not data * pca.components_.
Firstly, * is not the dot product for numpy arrays. It is element-wise multiplication. To compute a dot product, you need to use np.dot.
Secondly, the shape of pca.components_ is (n_components, n_features), while the shape of the data to transform is (n_samples, n_features), so you need to transpose pca.components_ before taking the dot product.
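To make the two points above concrete, here is a small sketch. The arrays are made-up toy values, and `components` is just a stand-in with the same shape as pca.components_, not a fitted model.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[5.0, 6.0], [7.0, 8.0]])

elementwise = a * b             # multiplies matching entries
matrix_product = np.dot(a, b)   # rows of a times columns of b

# Shapes as in PCA: data is (n_samples, n_features),
# components is (n_components, n_features) -- toy stand-in values
data = np.ones((10, 4))
components = np.ones((3, 4))
projected = np.dot(data, components.T)  # (10, 4) x (4, 3) -> (10, 3)

print(elementwise)
print(matrix_product)
print(projected.shape)
```

Without the transpose, np.dot(data, components) would raise a ValueError here because the inner dimensions (4 and 3) do not match.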
Moreover, the first step of transform is to subtract the mean, so if you do it manually, you also need to subtract the mean first.
The correct way to transform is:
data_reduced = np.dot(data - pca.mean_, pca.components_.T)
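If you want to convince yourself, you can compare the manual result with pca.transform. This is only a sketch: the random data and the seed are made up for illustration, and it assumes the default whiten=False (with whitening enabled the outputs are additionally rescaled and this comparison would not hold).

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up data with a non-zero mean: 50 samples, 4 features
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(50, 4))

pca = PCA(n_components=3)  # default whiten=False
pca.fit(data)

# Manual transform: center, then project onto the components
manual = np.dot(data - pca.mean_, pca.components_.T)
reference = pca.transform(data)

print(manual.shape)                    # (n_samples, n_components)
print(np.allclose(manual, reference))
```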
2) inverse_transform is just the inverse process of transform:
data_original = np.dot(data_reduced, pca.components_) + pca.mean_
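The same kind of check works for the inverse direction. Again a sketch on made-up random data with the default whiten=False. Note that when n_components is smaller than n_features the reconstruction is only an approximation of the input; it matches the original data (up to floating-point error) only when all components are kept.

```python
import numpy as np
from sklearn.decomposition import PCA

# Made-up data: 50 samples, 4 features
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(50, 4))

# Reduced case: 3 of 4 components kept
pca = PCA(n_components=3).fit(data)
reduced = pca.transform(data)
manual_inverse = np.dot(reduced, pca.components_) + pca.mean_
matches_sklearn = np.allclose(manual_inverse, pca.inverse_transform(reduced))

# Full case: all 4 components kept, so nothing is discarded
pca_full = PCA(n_components=4).fit(data)
reduced_full = pca_full.transform(data)
restored_full = np.dot(reduced_full, pca_full.components_) + pca_full.mean_
exact_when_full = np.allclose(restored_full, data)

print(matches_sklearn, exact_when_full)
```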
If your data already has zero mean in each column, you can ignore the pca.mean_ above, for example:
import numpy as np
from sklearn.decomposition import PCA

pca = PCA(n_components=3)
pca.fit(data)
data_reduced = np.dot(data, pca.components_.T)  # transform
data_original = np.dot(data_reduced, pca.components_)  # inverse_transform