How do I standardize a matrix? [numpy]


The following subtracts the mean of A from each element (the new mean is 0), then normalizes the result by the standard deviation.

    import numpy as np
    A = (A - np.mean(A)) / np.std(A)

The above standardizes the entire matrix as a whole. If A has more than one dimension and you want to standardize each column individually, specify the axis:

    import numpy as np
    A = (A - np.mean(A, axis=0)) / np.std(A, axis=0)
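To make the difference between the two one-liners concrete, here is a minimal sketch on a small made-up matrix (the values are only an illustration, not from the question). It checks the resulting mean and standard deviation for both variants.

```python
import numpy as np

# Hypothetical 3x2 example matrix, chosen only for illustration.
A = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

# Whole-matrix standardization: one mean and one std for all elements.
whole = (A - np.mean(A)) / np.std(A)

# Per-column standardization: one mean and one std per column.
per_column = (A - np.mean(A, axis=0)) / np.std(A, axis=0)

# The whole matrix has mean close to 0 and std close to 1 ...
print(whole.mean(), whole.std())
# ... while here each column individually has mean close to 0 and std close to 1.
print(per_column.mean(axis=0), per_column.std(axis=0))
```

Note that `np.std` uses the population standard deviation (`ddof=0`) by default; pass `ddof=1` if you want the sample standard deviation instead.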

Always verify by hand what these one-liners are doing before integrating them into your code. A simple change in the orientation or dimension of the array can silently and drastically change which operations numpy performs.
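One way such a silent change can happen, sketched on a made-up square matrix: switching from per-column (`axis=0`) to per-row (`axis=1`) standardization without `keepdims=True`. The names `wrong` and `right` are just illustrative. On a square matrix the shapes still broadcast, so there is no error, only a wrong result.

```python
import numpy as np

# Hypothetical square matrix, chosen only for illustration.
A = np.array([[1.0, 2.0, 3.0],
              [10.0, 20.0, 60.0],
              [5.0, 5.0, 8.0]])

# Looks like per-row standardization, but the row means have shape (3,)
# and are broadcast across the columns instead of down the rows.
wrong = (A - A.mean(axis=1)) / A.std(axis=1)

# keepdims=True keeps the statistics as shape (3, 1), so each row is
# paired with its own mean and standard deviation.
right = (A - A.mean(axis=1, keepdims=True)) / A.std(axis=1, keepdims=True)

# A quick sanity check exposes the difference.
print(np.allclose(right.mean(axis=1), 0))  # True
print(np.allclose(wrong.mean(axis=1), 0))  # False
```

On a non-square matrix the first version would usually raise a broadcasting error instead, which is easier to notice. Checking the means and standard deviations after the fact, as above, catches both cases.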


    import numpy as np
    import scipy.stats as ss
    A = np.array(ss.zscore(A))
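As a rough check of what `zscore` does here, the sketch below compares it with the plain numpy formulas on a made-up matrix. By default `scipy.stats.zscore` works along `axis=0` (per column) with `ddof=0`; passing `axis=None` standardizes the array as a whole.

```python
import numpy as np
import scipy.stats as ss

# Hypothetical example matrix, chosen only for illustration.
A = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

# Default behaviour: standardize each column (axis=0, ddof=0).
by_column = ss.zscore(A)
manual_by_column = (A - A.mean(axis=0)) / A.std(axis=0)
print(np.allclose(by_column, manual_by_column))  # True

# axis=None: standardize the matrix as a whole.
whole = ss.zscore(A, axis=None)
manual_whole = (A - A.mean()) / A.std()
print(np.allclose(whole, manual_whole))  # True
```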


    from sklearn.preprocessing import StandardScaler
    standardized_data = StandardScaler().fit_transform(your_data)

Example:

    >>> import numpy as np
    >>> from sklearn.preprocessing import StandardScaler
    >>> data = np.random.randint(25, size=(4, 4))
    >>> data
    array([[17, 12,  4, 17],
           [ 1, 16, 19,  1],
           [ 7,  8, 10,  4],
           [22,  4,  2,  8]])
    >>> standardized_data = StandardScaler().fit_transform(data)
    >>> standardized_data
    array([[ 0.63812398,  0.4472136 , -0.718646  ,  1.57786412],
           [-1.30663482,  1.34164079,  1.55076242, -1.07959124],
           [-0.57735027, -0.4472136 ,  0.18911737, -0.58131836],
           [ 1.24586111, -1.34164079, -1.02123379,  0.08304548]])
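For comparison with the numpy one-liners, here is a small sketch, assuming made-up `train` and `new` arrays that are not part of the example above. `StandardScaler` standardizes each column using the population standard deviation, and the fitted scaler can be reused to apply the same column means and scales to other data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data, chosen only for illustration.
train = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 60.0]])
new = np.array([[2.0, 30.0]])

scaler = StandardScaler().fit(train)
scaled_train = scaler.transform(train)

# Same result as the per-column numpy formula (ddof=0).
manual = (train - train.mean(axis=0)) / train.std(axis=0)
print(np.allclose(scaled_train, manual))  # True

# New rows are scaled with the statistics learned from `train`.
scaled_new = scaler.transform(new)
print(np.allclose(scaled_new, (new - scaler.mean_) / scaler.scale_))  # True
```

Reusing the fitted scaler is the main practical difference from the one-liners: the statistics are computed once and then applied consistently to later data.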

Works well on large datasets.