Efficient 1D linear regression for each element of 3D numpy array

python numpy linear-regression

If i understand it correctly, you want do many linear regression y = k * x + b, there is only one x, but many y, for every y you want to calculate the k and b.

If the shape of x is 50, y is (50, 1000), you can use numpy.linalg.lstsq, here is some demo:

import numpy as npx = np.random.rand(50)k = np.random.rand(1000)b = np.random.rand(1000)y = np.outer(x, k) + b + np.random.normal(size=(50, 1000), scale=1e-10)r = np.linalg.lstsq(np.c_[x, np.ones_like(x)], y)[0]print np.allclose(r[0], k)print np.allclose(r[1], b)

for the y with shape (50, 2000, 2000), you can reshape it to (50, 2000*2000).

Edit

Here the my solution for masked array. The formula is:

enter image description here

Prepare Y as 2-dim array with shape (1889*1566, 57), X as 2-dim array with shape (57, 2). mask as a bool array with the same shape as Y, True means the value in Y is available.

Calculate array a with shape (1889*1566, 2, 2), b with shape (1889*1566, 2), then call numpy.linalg.solve(a, b), here is some demo code:

import numpy as npN = 50M = 1000x = np.random.rand(N)X = np.c_[x, np.ones_like(x)]beta = np.random.rand(M, 2)Y = np.dot(beta, X.T)Y += np.random.normal(scale=0.1, size=Y.shape)mask = np.random.randint(0, 2, size=Y.shape).astype(np.bool)a = np.swapaxes(np.dot(X.T, (X[None, :, :] * mask[:, :, None])), 0, 1)b = np.dot(X.T, (mask*Y).T)beta2 = np.linalg.solve(a, b.T)i = 123print "real:", beta[i]print "by solve:", beta2[i]m = mask[i]x2 = X[m]y2 = Y[i, m]print "by lstsq:", np.linalg.lstsq(x2, y2)[0]

output 123th coefficient:

real: [ 0.35813131  0.29736779]by solve: [ 0.38088499  0.30382547]by lstsq: [ 0.38088499  0.30382547]

You can also calculate the a array by following code, I think it will use less memory than the method above:

a2 = np.empty((M, 2, 2))xm = mask * xa2[:, 0, 0] = (xm*xm).sum(1)a2[:, 1, 0] = (xm*mask).sum(1)a2[:, 0, 1] = a2[:, 1, 0]a2[:, 1, 1] = (mask).sum(1)print np.allclose(a2, a)

CodeHunter

Efficient 1D linear regression for each element of 3D numpy array

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last