Numpy dot product very slow using ints

Very interesting. I was curious to see how it was implemented, so I did:

>>> import inspect
>>> import numpy as np
>>> inspect.getmodule(np.dot)
<module 'numpy.core._dotblas' from '/Library/Python/2.6/site-packages/numpy-1.6.1-py2.6-macosx-10.6-universal.egg/numpy/core/'>

So it looks like it's using the BLAS library.


>>> help(np.core._dotblas)

from which I found this:

When Numpy is built with an accelerated BLAS like ATLAS, these functions are replaced to make use of the faster implementations. The faster implementations only affect float32, float64, complex64, and complex128 arrays. Furthermore, the BLAS API only includes matrix-matrix, matrix-vector, and vector-vector products. Products of arrays with larger dimensionalities use the built in functions and are not accelerated.

So it looks like ATLAS fine-tunes certain functions, but that only applies to certain data types. Very interesting.
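As a rough illustration of the two conditions in that docstring (dtype and dimensionality), here is a small sketch. The helper name `docstring_says_accelerated` and the constant `ACCELERATED_DTYPES` are made up for this example. It only restates the docstring and does not inspect what NumPy actually dispatches to. The docstring doesn't say how mixed dtypes are handled, so this takes the conservative reading that both operands must qualify.

```python
import numpy as np

# The four dtypes the quoted docstring lists as benefiting from BLAS.
ACCELERATED_DTYPES = (np.float32, np.float64, np.complex64, np.complex128)

def docstring_says_accelerated(a, b):
    """Hypothetical helper: True if both operands meet the docstring's
    conditions (a listed dtype, and at most two dimensions)."""
    return all(
        x.dtype.type in ACCELERATED_DTYPES and x.ndim <= 2
        for x in (a, b)
    )

ints = np.ones((3, 3), dtype=np.int64)
floats = np.ones((3, 3), dtype=np.float64)
cube = np.ones((2, 3, 3), dtype=np.float64)

print(docstring_says_accelerated(ints, ints))      # False: integer dtype
print(docstring_says_accelerated(floats, floats))  # True
print(docstring_says_accelerated(cube, floats))    # False: more than 2-D
```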

So yeah, it looks like I'll be using floats more often ...

Using int vs float data types causes different code paths to be executed:

The stack trace for float looks like this:

(gdb) backtr
#0  0x007865a0 in dgemm_ () from /usr/lib/
#1  0x007559d5 in cblas_dgemm () from /usr/lib/
#2  0x00744108 in dotblas_matrixproduct (__NPY_UNUSED_TAGGEDdummy=0x0, args=(<numpy.ndarray at remote 0x85d9090>, <numpy.ndarray at remote 0x85d9090>), kwargs=0x0) at numpy/core/blasdot/_dotblas.c:798
#3  0x08088ba1 in PyEval_EvalFrameEx ()
...

...while the stack trace for int looks like this:

(gdb) backtr
#0  LONG_dot (ip1=0xb700a280 "\t", is1=4, ip2=0xb737dc64 "\a", is2=4000, op=0xb6496fc4 "", n=1000, __NPY_UNUSED_TAGGEDignore=0x85fa960)
    at numpy/core/src/multiarray/arraytypes.c.src:3076
#1  0x00659d9d in PyArray_MatrixProduct2 (op1=<numpy.ndarray at remote 0x85dd628>, op2=<numpy.ndarray at remote 0x85dd628>, out=0x0)
    at numpy/core/src/multiarray/multiarraymodule.c:847
#2  0x00742b93 in dotblas_matrixproduct (__NPY_UNUSED_TAGGEDdummy=0x0, args=(<numpy.ndarray at remote 0x85dd628>, <numpy.ndarray at remote 0x85dd628>), kwargs=0x0) at numpy/core/blasdot/_dotblas.c:254
#3  0x08088ba1 in PyEval_EvalFrameEx ()
...

Both calls lead to dotblas_matrixproduct, but it appears that the float call stays in the BLAS library (probably accessing some well-optimized code), while the int call gets kicked back out to numpy's PyArray_MatrixProduct2.

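So this is either a bug, or BLAS just doesn't support integer types in matrixproduct. It is the latter: the standard BLAS routines are defined only for single- and double-precision real and complex types, so integer arrays have to fall back to numpy's own loop.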
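To see what the two code paths cost in practice, here is a rough timing sketch. The function name `time_dot`, the matrix size, and the repeat count are my own choices, not from the traces above. The size of the gap depends on your NumPy build and which BLAS it links against, so treat the numbers as indicative only.

```python
import timeit
import numpy as np

def time_dot(dtype, n=200, repeats=3):
    """Best wall-clock time in seconds for one n-by-n np.dot with `dtype`."""
    # Same small integer values for every dtype, so only the dtype differs.
    a = np.random.RandomState(0).randint(0, 10, size=(n, n)).astype(dtype)
    return min(timeit.repeat(lambda: np.dot(a, a), number=1, repeat=repeats))

timings = {
    "int64": time_dot(np.int64),
    "float64": time_dot(np.float64),
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.5f} s")
```

Increase `n` to make the difference easier to see; the integer version grows slow quickly.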

Here's an easy and inexpensive workaround:

af = a.astype(float)
np.dot(af, af).astype(int)
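One caveat with this workaround: float64 only represents integers exactly up to 2**53, so very large values can silently lose precision on the round trip. Below is a minimal sketch of the same idea with a guard added. The name `int_dot_via_float` is hypothetical, it assumes non-empty 2-D integer inputs, and the bound it checks is deliberately conservative (it may reject products that would in fact be exact).

```python
import numpy as np

def int_dot_via_float(a, b):
    """Hypothetical wrapper: compute np.dot on integer arrays through
    float64, then cast back to the dtype of `a`."""
    af = a.astype(np.float64)
    bf = b.astype(np.float64)
    # Conservative bound on any partial sum: max|a| * max|b| * inner length.
    # Below 2**53 every intermediate value is an exactly representable integer.
    bound = np.abs(af).max() * np.abs(bf).max() * a.shape[-1]
    if bound >= 2.0 ** 53:
        raise OverflowError("values too large to be exact in float64")
    return np.dot(af, bf).astype(a.dtype)

a = np.arange(12, dtype=np.int64).reshape(3, 4)
b = np.arange(8, dtype=np.int64).reshape(4, 2)
result = int_dot_via_float(a, b)
print(np.array_equal(result, np.dot(a, b)))  # True
```

If the guard trips, the plain integer `np.dot(a, b)` is still available as the slow but direct path.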