Why my PCA is not invariant to rotation and axis swap? Why my PCA is not invariant to rotation and axis swap? numpy numpy

Why my PCA is not invariant to rotation and axis swap?


Firstly, your pca function is not correct, it should be

def pca(x):        x -= np.mean(x,axis=0)    cov = np.cov(x.T)    e_values, e_vectors = np.linalg.eig(cov)    order = np.argsort(e_values)[::-1]    v = e_vectors[:,order]        return x @ v

You shouldn't transpose the e_vectors[:,order] because we want each column of the v array is an eigenvector, therefore, x @ v will be projections of x on those eigenvectors.

Secondly, I think you misunderstand the meaning of rotation. It is not voxel1 that should be rotated, but the basis1. If you rotate (by taking transposition) voxel1, what you really do is to rearrange the indices of grid points, while the coordinates of the points basis1 are not changed.

In order to rotate the points (around the z axis for example), you can first define a function to calculate the rotation matrix given an angle

rotate_mat = lambda theta: np.array([[np.cos(theta),-np.sin(theta),0.],[np.sin(theta),np.cos(theta),0.],[0.,0.,1.]])

with the rotation matrix generated by this function, you can rotate the array basis1 to create another array basis2

basis2 = basis1 @ rotate_mat(np.deg2rad(angle))

Now it comes to the title of your question "Why my PCA is not invariant to rotation and axis swap?", from this post, the PCA result is not unique, you can actually run a test to see this

import numpy as npnp.random.seed(10)rotate_mat = lambda theta: np.array([[np.cos(theta),-np.sin(theta),0.],[np.sin(theta),np.cos(theta),0.],[0.,0.,1.]])def pca(x):        x -= np.mean(x,axis=0)    cov = np.cov(x.T)    e_values, e_vectors = np.linalg.eig(cov)    order = np.argsort(e_values)[::-1]    v = e_vectors[:,order]    return x @ vangles = np.linspace(0,90,91)    res = []for angle in angles:    voxel1 = np.random.normal(size=(3,3,3))    voxel2 = voxel1.copy()    voxel1 = voxel1.reshape(27,1)    voxel2 = voxel2.reshape(27,1)        basis1 = np.array([[i+1,j+1,k+1] for k in range(3) for j in range(3) for i in range(3)])    # basis2 = np.hstack((-basis1[:,1][:,None],basis1[:,0][:,None],-basis1[:,2][:,None]))    basis2 = basis1 @ rotate_mat(np.deg2rad(angle))    voxel1 = voxel1*basis1    voxel2 = voxel2*basis2    print(angle,np.all(np.abs(pca(voxel1) - pca(voxel2) < 1e-6)))    res.append(np.all(np.abs(pca(voxel1) - pca(voxel2) < 1e-6)))    print()print(np.sum(res) / len(angles))

After you run this script, you will see that in only 21% of times the two PCA results are the same.


UPDATE

I think instead of focusing on the eigenvectors of the principal components, you can instead focus on the projections. For two clouds of points, even though they are essentially the same, the eigenvectors can be drastically different. Therefore, hardcoding in order to somehow let the two sets of eigenvectors to be the same is a very difficult task.

However, based on this post, for the same cloud of points, two sets of eigenvectors can be different only up to a minus sign. Therefore, the projections upon the two sets of eigenvectors are also different only up to a minus sign. This actually offers us an elegant solution, for the projections along an eigenvector (principal axis), all we need to do is to switch the sign of the projections so that the projection with the largest absolute value along that principal axis is positive.

import numpy as npfrom numpy.random import randn#np.random.seed(20)rotmat_z = lambda theta: np.array([[np.cos(theta),-np.sin(theta),0.],[np.sin(theta),np.cos(theta),0.],[0.,0.,1.]])rotmat_y = lambda theta: np.array([[np.cos(theta),0.,np.sin(theta)],[0.,1.,0.],[-np.sin(theta),0.,np.cos(theta)]])rotmat_x = lambda theta: np.array([[1.,0.,0.],[0.,np.cos(theta),-np.sin(theta)],[0.,np.sin(theta),np.cos(theta)]])# based on https://en.wikipedia.org/wiki/Rotation_matrixrot_mat = lambda alpha,beta,gamma: rotmat_z(alpha) @ rotmat_y(beta) @ rotmat_x(gamma)deg2rad = lambda alpha,beta,gamma: [np.deg2rad(alpha),np.deg2rad(beta),np.deg2rad(gamma)]def pca(feat, x):        # pca with attemt to create convention on sign changes    x_c = x - np.mean(x,axis=0)    x_f = feat * x    x_f -= np.mean(x_f,axis=0)    cov = np.cov(x_f.T)    e_values, e_vectors = np.linalg.eig(cov)    order = np.argsort(e_values)[::-1]    v = e_vectors[:,order]        # here is the solution, we switch the sign of the projections    # so that the projection with the largest absolute value along a principal axis is positive    proj = x_f @ v    asign = np.sign(proj)    max_ind = np.argmax(np.abs(proj),axis=0)[None,:]    sign = np.take_along_axis(asign,max_ind,axis=0)    proj = proj * sign        return projref_angles = np.linspace(0.0,90.0,10)angles = [[alpha,beta,gamma] for alpha in ref_angles for beta in ref_angles for gamma in ref_angles]voxel1 = np.random.normal(size=(3,3,3))res = []for angle in angles:    voxel2 = voxel1.copy()    voxel1 = voxel1.reshape(27,1)    voxel2 = voxel2.reshape(27,1)    basis1 = np.array([[i+1,j+1,k+1] for k in range(3) for j in range(3) for i in range(3)]).astype(np.double)    basis1 = basis1 + 1e-4 * randn(27,3) # perturbation    basis2 = basis1 @ rot_mat(*deg2rad(*angle))       original_diff = np.sum(np.abs(basis1-basis2))    gg= np.abs(pca(voxel1, basis1) - pca(voxel1, basis2))    ss= np.sum(np.ravel(gg))    bl= np.all(gg<1e-4)    print('difference before pca %.3f,' % original_diff, 'difference after pca %.3f' % ss,bl)    res.append(bl)    del basis1, basis2print('correct for %.1f percent of time' % (100*(np.sum(res) / len(res))))

As you can see by running this script, the projections on the principal axis are the same, this means we have resolved the issue of PCA results being not unique.


Reply to EDIT 3

As for the new issue you raised, I think you missed an important point, it is the projections of the cloud of points onto the principal axes that are invariant, not anything else. Therefore, if you rotate voxel1 and obtain voxel2, they are the same in the sense that their own respective projections onto the principal axes of the cloud of points are the same, it actually does not make too much sense to compare pca(voxel1,basis1) with pca(voxel2,basis1).

Furthermore, the method rotate of scipy.ndimage actually changes information, as you can see by running this script

image1 = np.linspace(1,100,100).reshape(10,10)image2 = scipy.ndimage.rotate(image1, 45, mode='nearest', axes=(0, 1), reshape=False)image3 = scipy.ndimage.rotate(image2, -45, mode='nearest', axes=(0, 1), reshape=False)fig,ax = plt.subplots(nrows=1,ncols=3,figsize=(12,4))ax[0].imshow(image1)ax[1].imshow(image2)ax[2].imshow(image3)

The output image is

As you can see the matrix after rotation is not the same as the original one, some information of the original matrix is changed.

output


Reply to EDIT 4

Actually, we are almost there, the two pca results are different because we are comparing pca components for different points.

indices = np.arange(27)indices3d = indices.reshape((3,3,3))# apply rotations to the indices, it is not weights yetgen = rotations24(indices3d)# construct the weightsvoxel10 = np.random.normal(size=(3,3,3))res = []count = 0for ind2 in gen:    count += 1    # ind2 is the array of indices after rotation    # reindex the weights with the indices after rotation     voxel1 = voxel10.copy().reshape(27,1)     voxel2 = voxel1[ind2.reshape(27,)]    # basis1 is the array of coordinates where the points are    basis1 = np.array([[i+1,j+1,k+1] for k in range(3) for j in range(3) for i in range(3)]).astype(np.double)    basis1 += 1e-4*np.random.normal(size=(27, 1))    # reindex the coordinates with the indices after rotation    basis2 = basis1[ind2.reshape(27,)]    # add a slight modification to pca, return the axes also     pca1,v1 = pca(voxel1,basis1)    pca2,v2 = pca(voxel2,basis2)    # sort the principal components before comparing them     pca1 = np.sort(pca1,axis=0)    pca2 = np.sort(pca2,axis=0)        gg= np.abs(pca1 - pca2)    ss= np.sum(np.ravel(gg))    bl= np.all(gg<1e-4)    print('difference after pca %.3f' % ss,bl)    res.append(bl)        del basis1, basis2print('correct for %.1f percent of time' % (100*(np.sum(res) / len(res))))

Running this script, you will find, for each rotation, the two sets of principal axes are different only up to a minus sign. The two sets of pca results are different because the indices of the cloud of points before and after rotation are different (since you apply rotation to the indices). If you sort the pca results before comparing them, you will find the two pca results are exactly the same.


Summary

The answer to this question can be divided into two parts. In the first part, the rotation is applied to the basis (the coordinates of points), while the indices and the corresponding weights are unchanged. In the second part, the rotation is applied to the indices, then the weights and the basis are rearranged with the new indices. For both of the two parts, the solution pca function is the same

def pca(feat, x):    # pca with attemt to create convention on sign changes    x_c = x - np.mean(x,axis=0)    x_f = feat * x    x_f -= np.mean(x_f,axis=0)    cov = np.cov(x_f.T)    e_values, e_vectors = np.linalg.eig(cov)    order = np.argsort(e_values)[::-1]    v = e_vectors[:,order]    # here is the solution, we switch the sign of the projections    # so that the projection with the largest absolute value along a principal axis is positive    proj = x_f @ v    asign = np.sign(proj)    max_ind = np.argmax(np.abs(proj),axis=0)[None,:]    sign = np.take_along_axis(asign,max_ind,axis=0)    proj = proj * sign    return proj

The idea of this function is, instead of matching the principal axes, we can match the principal components since it is the principal components that are rotationally invariant after all.

Based on this function pca, the first part of this answer is easy to understand, since the indices of the points are unchanged while we only rotate the basis. In order to understand the second part of this answer (Reply to EDIT 5), we must first understand the function rotations24. This function rotates the indices rather than the coordinates of the points, therefore, if we stay at the same position observing the points, we will feel that the positions of the points are changed.

plot

With this in mind, it is not hard to understand Reply to EDIT 5.

Actually, the function pca in this answer can be applied to more general cases, for example (we rotate the indices)

num_of_points_per_dim = 10num_of_points = num_of_points_per_dim ** 3indices = np.arange(num_of_points)indices3d = indices.reshape((num_of_points_per_dim,num_of_points_per_dim,num_of_points_per_dim))voxel10 = 100*np.random.normal(size=(num_of_points_per_dim,num_of_points_per_dim,num_of_points_per_dim))gen = rotations24(indices3d)res = []for ind2 in gen:    voxel1 = voxel10.copy().reshape(num_of_points,1)    voxel2 = voxel1[ind2.reshape(num_of_points,)]    basis1 = 100*np.random.rand(num_of_points,3)    basis2 = basis1[ind2.reshape(num_of_points,)]    pc1 = np.sort(pca(voxel1, basis1),axis=0)    pc2 = np.sort(pca(voxel2, basis2),axis=0)        gg= np.abs(pc1-pc2)    ss= np.sum(np.ravel(gg))    bl= np.all(gg<1e-4)    print('difference after pca %.3f' % ss,bl)    res.append(bl)    del basis1, basis2print('correct for %.1f percent of time' % (100*(np.sum(res) / len(res))))