Is there a list of big O complexities for the numpy library?
python

Is there a list of big O complexities for the numpy library?


Big-O complexity is not often used with Python and numpy. It's a measure of how the code scales with problem size. That's useful in a compiled language like C. But here the code is a mix of interpreted Python and compiled code. Both can have the same big-O, but the interpreted version will be orders of magnitude slower. That's why most of the SO questions about improving numpy speed talk about 'removing loops' and 'vectorizing'.

Also, few operations are pure O(n); most are a mix. There's a setup cost plus a per-element cost. If the per-element cost is small, the setup cost dominates.
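For illustration, here's a rough sketch of measuring that (the array sizes and repeat counts are arbitrary choices, not from the original answer):

import timeit
import numpy as np

small = np.ones(10)
large = np.ones(100_000)

t_small = timeit.timeit(lambda: np.sin(small), number=1_000)
t_large = timeit.timeit(lambda: np.sin(large), number=1_000)

# The small-array time is dominated by the fixed per-call setup cost,
# so it does not shrink 10,000x even though the array is 10,000x smaller.
print(t_small, t_large)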

If starting with lists, it's often faster to iterate on the list, because converting a list to an array has a substantial overhead (O(n)).
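A minimal sketch of that trade-off (sizes arbitrary): for data that starts as a list, plain Python can win because np.array(alist) itself costs O(n).

import timeit
import numpy as np

alist = list(range(1000))

t_list  = timeit.timeit(lambda: sum(alist), number=10_000)
t_array = timeit.timeit(lambda: np.array(alist).sum(), number=10_000)

# The conversion overhead is usually larger than the saving from the
# compiled sum, so t_array often exceeds t_list at this size.
print(t_list, t_array)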

If you already have arrays, then avoid (Python-level) iteration where possible. Iteration is part of most calculations, but numpy lets you do a lot of it in compiled code (the same O(n), but with a much smaller constant factor).
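A sketch of what 'removing loops' looks like (the particular calculation is just an example): both versions are O(n), but the vectorized one pushes the per-element work into compiled code.

import numpy as np

arr = np.arange(100_000, dtype=float)

# Python-level loop: interpreted per-element cost.
out_loop = np.empty_like(arr)
for i in range(arr.size):
    out_loop[i] = arr[i] * 2.0 + 1.0

# Vectorized: one compiled pass over the data.
out_vec = arr * 2.0 + 1.0

assert np.allclose(out_loop, out_vec)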

At some point you have to understand how numpy stores its arrays. The distinction between view and copy is important. A view is in effect O(1), a copy O(n).
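A quick sketch of the difference: a basic slice is a view (effectively O(1), no data moved), while fancy indexing or .copy() makes a copy (O(n)).

import numpy as np

arr = np.arange(1_000_000)

view = arr[::2]            # new array object, same underlying buffer
copied = arr[::2].copy()   # new buffer, every element written

print(np.shares_memory(arr, view))    # True  -> view
print(np.shares_memory(arr, copied))  # False -> copy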

Often you'll see SO answers do timeit speed comparisons. I often add the caution that results might vary with problem size. The better answers will time various problem sizes and show the results on a nice plot. The results are often a mix of straight lines (O(n)) and curves (varying blends of O(1) and O(n) components).
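Here's one way to do that kind of comparison (the sizes, repeat count, and plot style are just one choice):

import timeit
import numpy as np
import matplotlib.pyplot as plt

sizes = [10, 100, 1_000, 10_000, 100_000]
times = []
for n in sizes:
    alist = list(range(n))
    times.append(timeit.timeit(lambda: np.array(alist), number=1_000))

# On a log-log plot, a straight line with slope ~1 at large n suggests O(n);
# flattening at small n reflects the fixed setup cost.
plt.loglog(sizes, times, 'o-')
plt.xlabel('problem size n')
plt.ylabel('time for 1000 calls (s)')
plt.show()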


You asked specifically about np.array. Here are some sample timings:

In [134]: %%timeit alist = list(range(1000))
     ...: np.array(alist)
67.9 µs ± 839 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [135]: %%timeit alist = list(range(10))
     ...: np.array(alist)
2.19 µs ± 9.88 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [136]: %%timeit alist = list(range(2000))
     ...: np.array(alist)
134 µs ± 1.98 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

copy an array:

In [137]: %%timeit alist = list(range(2000)); arr=np.array(alist)
     ...: np.array(arr)
1.77 µs ± 24.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

no copy:

In [138]: %%timeit alist = list(range(2000)); arr=np.array(alist)
     ...: np.array(arr, copy=False)
237 ns ± 1.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

from a list of strings:

In [139]: %%timeit alist = [str(i) for i in range(2000)]
     ...: np.array(alist, dtype=int)
286 µs ± 4.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Almost all calculations in numpy are O(n). If a calculation involves each element of an array, speed will depend on the size of the array. Some array manipulations are O(1), such as reshaping, because they don't actually do anything with the data; they just change properties like shape and strides.
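A small sketch of that contrast (array size arbitrary): reshape only adjusts the shape/strides metadata, while an element-wise calculation has to touch every value.

import numpy as np

arr = np.arange(1_000_000)

reshaped = arr.reshape(1000, 1000)        # no data copied, effectively O(1)
print(np.shares_memory(arr, reshaped))    # True

total = arr.sum()                         # must visit every element, O(n)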

Search problems often grow faster than O(n); numpy is usually not the best choice for that kind of problem. Smart use of Python lists and dictionaries can be faster.
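For example (a minimal sketch, with arbitrary sizes): membership tests are one place where plain Python containers shine, since a set or dict lookup is O(1) on average while searching an unsorted array is O(n).

import numpy as np

data = np.arange(1_000_000)
as_set = set(data.tolist())

target = 987_654
found_np  = (data == target).any()   # scans the whole array
found_set = target in as_set         # single hash lookup

print(found_np, found_set)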


For the specific example np.array(my_array): since it needs to run through all the elements of my_array, allocate memory, and initialize the values, it runs in linear time.

There is a Python module, big_O, that can be used to estimate the complexity of a function from its execution time.
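A minimal sketch of how that might look, assuming the big_O package from PyPI (pip install big_O) and its documented big_o.big_o(func, data_generator, ...) interface; the size range here is an arbitrary choice:

import numpy as np
import big_o

# Data generator: build a Python list of length n for each measurement.
make_list = lambda n: list(range(n))

best_fit, _ = big_o.big_o(np.array, make_list, min_n=100, max_n=100_000)
print(best_fit)   # typically reports a linear fit for np.array on a list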

Refer to this link for more information