Why is `arr.take(idx)` faster than `arr[idx]`


The answer is very low level and has to do with the C compiler and CPU cache optimizations. Please see the active discussion with Sebastian Berg and Max Bolingbroke (both NumPy contributors) on this numpy issue.

Fancy indexing tries to be "smart" about the order in which memory is read and written (C order vs. F order), while `.take` always uses C order. As a result, fancy indexing will usually be much faster for F-ordered arrays, and should be faster in any case for huge arrays. However, NumPy picks the "smart" order without considering the size of the array or the hardware it is running on. For smaller arrays, the "wrong" memory order can therefore perform better, thanks to better use of the CPU cache when reading.
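If you want to check which one wins on your own machine, here is a minimal timing sketch. The helper name `compare`, the array shapes, and the number of indices are my own choices for illustration and are not taken from the linked discussion. It uses `take(idx, axis=0)` so that both calls select the same rows of a 2-D array. The numbers depend heavily on hardware and NumPy version, so treat them as a rough indication only.

```python
import timeit

import numpy as np


def compare(shape, order, n_idx=1000, repeat=5, number=20, seed=0):
    """Time arr[idx] against arr.take(idx, axis=0) for one array layout.

    Returns (seconds per call for fancy indexing, seconds per call for take).
    """
    rng = np.random.default_rng(seed)
    # Build the array with the requested memory layout ("C" or "F").
    arr = np.array(rng.random(shape), order=order)
    # Random row indices, with repeats allowed.
    idx = rng.integers(0, shape[0], size=n_idx)

    # Both expressions must select the same rows before timing means anything.
    assert np.array_equal(arr[idx], arr.take(idx, axis=0))

    t_fancy = min(timeit.repeat(lambda: arr[idx],
                                repeat=repeat, number=number)) / number
    t_take = min(timeit.repeat(lambda: arr.take(idx, axis=0),
                               repeat=repeat, number=number)) / number
    return t_fancy, t_take


if __name__ == "__main__":
    # Illustrative sizes only: one small and one larger array, in both layouts.
    for shape in [(1_000, 64), (50_000, 64)]:
        for order in ["C", "F"]:
            t_fancy, t_take = compare(shape, order)
            print(f"shape={shape} order={order} "
                  f"arr[idx]: {t_fancy * 1e6:.1f} us  "
                  f"take: {t_take * 1e6:.1f} us")
```

Taking the minimum over several repeats reduces noise from other processes. If the reasoning above holds on your setup, you would expect `take` to do relatively better on the small arrays and fancy indexing to do relatively better on the F-ordered and larger ones, but I would not rely on that without measuring.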