Most memory-efficient way to compute abs()**2 of complex numpy ndarray Most memory-efficient way to compute abs()**2 of complex numpy ndarray python python

Most memory-efficient way to compute abs()**2 of complex numpy ndarray


Thanks to numba.vectorize in recent versions of numba, creating a numpy universal function for the task is very easy:

@numba.vectorize([numba.float64(numba.complex128),numba.float32(numba.complex64)])def abs2(x):    return x.real**2 + x.imag**2

On my machine, I find a threefold speedup compared to a pure-numpy version that creates intermediate arrays:

>>> x = np.random.randn(10000).view('c16')>>> y = abs2(x)>>> np.all(y == x.real**2 + x.imag**2)   # exactly equal, being the same operationTrue>>> %timeit np.abs(x)**210000 loops, best of 3: 81.4 µs per loop>>> %timeit x.real**2 + x.imag**2100000 loops, best of 3: 12.7 µs per loop>>> %timeit abs2(x)100000 loops, best of 3: 4.6 µs per loop


EDIT: this solution has twice the minimum memory requirement, and is just marginally faster. The discussion in the comments is good for reference however.

Here's a faster solution, with the result stored in res:

import numpy as npres = arr.conjugate()np.multiply(arr,res,out=res)

where we exploited the property of the abs of a complex number, i.e. abs(z) = sqrt(z*z.conjugate), so that abs(z)**2 = z*z.conjugate


If your primary goal is to conserve memory, NumPy's ufuncs take an optional out parameter that lets you direct the output to an array of your choosing. It can be useful when you want to perform operations in place.

If you make this minor modification to your first method, then you can perform the operation on arr completely in place:

np.abs(arr, out=arr)arr **= 2

One convoluted way that only uses a little extra memory could be to modify arr in place, compute the new array of real values and then restore arr.

This means storing information about the signs (unless you know that your complex numbers all have positive real and imaginary parts). Only a single bit is needed for the sign of each real or imaginary value, so this uses 1/16 + 1/16 == 1/8 the memory of arr (in addition to the new array of floats you create).

>>> signs_real = np.signbit(arr.real) # store information about the signs>>> signs_imag = np.signbit(arr.imag)>>> arr.real **= 2 # square the real and imaginary values>>> arr.imag **= 2>>> result = arr.real + arr.imag>>> arr.real **= 0.5 # positive square roots of real and imaginary values>>> arr.imag **= 0.5>>> arr.real[signs_real] *= -1 # restore the signs of the real and imagary values>>> arr.imag[signs_imag] *= -1

At the expense of storing signbits, arr is unchanged and result holds the values we want.