
Compiling numpy with OpenBLAS integration


I just compiled numpy inside a virtualenv with OpenBLAS integration, and it seems to be working OK.

This was my process:

  1. Compile OpenBLAS:

    $ git clone https://github.com/xianyi/OpenBLAS
    $ cd OpenBLAS && make FC=gfortran
    $ sudo make PREFIX=/opt/OpenBLAS install

    If you don't have admin rights, set PREFIX= to a directory where you have write privileges (and modify the corresponding paths in the steps below accordingly).

  2. Make sure that the directory containing libopenblas.so is in your shared library search path.

    • To do this locally, you could edit your ~/.bashrc file to contain the line

      export LD_LIBRARY_PATH=/opt/OpenBLAS/lib:$LD_LIBRARY_PATH

      The LD_LIBRARY_PATH environment variable will be updated when you start a new terminal session (use $ source ~/.bashrc to force an update within the same session).

    • Another option that will work for multiple users is to create a .conf file in /etc/ld.so.conf.d/ containing the line /opt/OpenBLAS/lib, e.g.:

      $ sudo sh -c "echo '/opt/OpenBLAS/lib' > /etc/ld.so.conf.d/openblas.conf"

    Once you are done with either option, run

    $ sudo ldconfig
  3. Grab the numpy source code:

    $ git clone https://github.com/numpy/numpy
    $ cd numpy
  4. Copy site.cfg.example to site.cfg and edit the copy:

    $ cp site.cfg.example site.cfg
    $ nano site.cfg

    Uncomment these lines:

    ...
    [openblas]
    libraries = openblas
    library_dirs = /opt/OpenBLAS/lib
    include_dirs = /opt/OpenBLAS/include
    ...
  5. Check configuration, build, install (optionally inside a virtualenv)

    $ python setup.py config

    The output should look something like this:

    ...
    openblas_info:
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
      FOUND:
        libraries = ['openblas', 'openblas']
        library_dirs = ['/opt/OpenBLAS/lib']
        language = c
        define_macros = [('HAVE_CBLAS', None)]
    ...

    Installing with pip is preferable to using python setup.py install, since pip will keep track of the package metadata and allow you to easily uninstall or upgrade numpy in the future.

    $ pip install .
  6. Optional: you can use this script to test performance for different thread counts.

    $ OMP_NUM_THREADS=1 python build/test_numpy.py
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    dot: 0.099796795845 sec

    $ OMP_NUM_THREADS=8 python build/test_numpy.py
    version: 1.10.0.dev0+8e026a2
    maxint:  9223372036854775807
    BLAS info:
     * libraries ['openblas', 'openblas']
     * library_dirs ['/opt/OpenBLAS/lib']
     * define_macros [('HAVE_CBLAS', None)]
     * language c
    dot: 0.0439578056335 sec

There seems to be a noticeable improvement in performance for higher thread counts. However, I haven't tested this very systematically, and it's likely that for smaller matrices the additional overhead would outweigh the performance benefit from a higher thread count.
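If you want a quick standalone check, here is a minimal timing sketch along the same lines (it is not the build/test_numpy.py script referenced above). Run it with different OMP_NUM_THREADS / OPENBLAS_NUM_THREADS values to compare thread counts:

```python
# Minimal timing sketch: time one matrix product.
# n is kept small here; raise it to ~2000 for a more realistic benchmark.
import time
import numpy as np
import numpy.random as npr

n = 500
A = npr.randn(n, n)
B = npr.randn(n, n)

t0 = time.time()
C = np.dot(A, B)
elapsed = time.time() - t0
print("dot of two (%d,%d) matrices: %.1f ms" % (n, n, 1e3 * elapsed))
```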


Just in case you are using Ubuntu or Mint, you can easily get an OpenBLAS-linked numpy by installing both numpy and OpenBLAS via apt-get:

sudo apt-get install python3-numpy libopenblas-dev

On a fresh Ubuntu Docker image, I tested the following script, copied from the blog post "Installing Numpy and OpenBLAS":

import numpy as np
import numpy.random as npr
import time

# --- Test 1
N = 1
n = 1000
A = npr.randn(n, n)
B = npr.randn(n, n)
t = time.time()
for i in range(N):
    C = np.dot(A, B)
td = time.time() - t
print("dotted two (%d,%d) matrices in %0.1f ms" % (n, n, 1e3 * td / N))

# --- Test 2
N = 100
n = 4000
A = npr.randn(n)
B = npr.randn(n)
t = time.time()
for i in range(N):
    C = np.dot(A, B)
td = time.time() - t
print("dotted two (%d) vectors in %0.2f us" % (n, 1e6 * td / N))

# --- Test 3
m, n = (2000, 1000)
A = npr.randn(m, n)
t = time.time()
[U, s, V] = np.linalg.svd(A, full_matrices=False)
td = time.time() - t
print("SVD of (%d,%d) matrix in %0.3f s" % (m, n, td))

# --- Test 4
n = 1500
A = npr.randn(n, n)
t = time.time()
w, v = np.linalg.eig(A)
td = time.time() - t
print("Eigendecomp of (%d,%d) matrix in %0.3f s" % (n, n, td))

Without openblas the result is:

dotted two (1000,1000) matrices in 563.8 ms
dotted two (4000) vectors in 5.16 us
SVD of (2000,1000) matrix in 6.084 s
Eigendecomp of (1500,1500) matrix in 14.605 s

After I installed OpenBLAS with apt install libopenblas-dev, I checked the numpy linkage with

import numpy as np
np.__config__.show()

and the information is

atlas_threads_info:
  NOT AVAILABLE
openblas_info:
  NOT AVAILABLE
atlas_blas_info:
  NOT AVAILABLE
atlas_3_10_threads_info:
  NOT AVAILABLE
blas_info:
    library_dirs = ['/usr/lib']
    libraries = ['blas', 'blas']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
mkl_info:
  NOT AVAILABLE
atlas_3_10_blas_threads_info:
  NOT AVAILABLE
atlas_3_10_blas_info:
  NOT AVAILABLE
openblas_lapack_info:
  NOT AVAILABLE
lapack_opt_info:
    library_dirs = ['/usr/lib']
    libraries = ['lapack', 'lapack', 'blas', 'blas']
    language = c
    define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
blas_opt_info:
    library_dirs = ['/usr/lib']
    libraries = ['blas', 'blas']
    language = c
    define_macros = [('NO_ATLAS_INFO', 1), ('HAVE_CBLAS', None)]
atlas_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
lapack_mkl_info:
  NOT AVAILABLE
atlas_3_10_info:
  NOT AVAILABLE
lapack_info:
    library_dirs = ['/usr/lib']
    libraries = ['lapack', 'lapack']
    language = f77
atlas_blas_threads_info:
  NOT AVAILABLE

It doesn't show linkage to openblas; numpy was built against the generic libblas, which on Ubuntu is an update-alternatives symlink that now points at OpenBLAS. The new result of the script confirms that numpy must have used OpenBLAS:

dotted two (1000,1000) matrices in 15.2 ms
dotted two (4000) vectors in 2.64 us
SVD of (2000,1000) matrix in 0.469 s
Eigendecomp of (1500,1500) matrix in 2.794 s
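
A more direct way to see which BLAS actually got loaded, regardless of what np.__config__ reports, is to look at the shared objects mapped into the running process after importing numpy. This is a Linux-specific sketch that reads /proc:

```python
# List BLAS/LAPACK shared objects mapped into this process (Linux-only).
# Importing numpy loads its BLAS backend, so an OpenBLAS actually in use
# will show up here even when np.__config__ only mentions generic blas.
import numpy as np

def loaded_blas_libs():
    libs = set()
    with open("/proc/self/maps") as f:
        for line in f:
            path = line.rstrip("\n").split()[-1]
            if "blas" in path.lower() or "lapack" in path.lower():
                libs.add(path)
    return sorted(libs)

print(loaded_blas_libs())
```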


Here's a simpler approach than @ali_m's answer, and it works on macOS.

  1. Install a gfortran compiler if you don't have one. E.g. using homebrew on macOS:

    $ brew install gcc
  2. Compile OpenBLAS from source [or use a package manager], either getting the source repo or downloading a release:

    $ git clone https://github.com/xianyi/OpenBLAS
    $ cd OpenBLAS && make FC=gfortran
    $ sudo make PREFIX=/opt/OpenBLAS install

    If you don't/can't sudo, set PREFIX= to another directory and modify the path in the next step.

    OpenBLAS does not need to be on the compiler include path or the linker library path.

  3. Create a ~/.numpy-site.cfg file containing the PREFIX path you used in step 2:

    [openblas]
    libraries = openblas
    library_dirs = /opt/OpenBLAS/lib
    runtime_library_dirs = /opt/OpenBLAS/lib
    include_dirs = /opt/OpenBLAS/include

    include_dirs is for the compiler. library_dirs is for the linker. runtime_library_dirs is for the loader, and might not be needed.

  4. pip-install numpy and scipy from source (preferably into a virtualenv) without manually downloading them [you can also specify the release versions]:

    pip install numpy scipy --no-binary numpy,scipy
  5. In my experience, setting OPENBLAS_NUM_THREADS at runtime makes OpenBLAS faster, not slower, especially when multiple CPU processes are using it at the same time:

     export OPENBLAS_NUM_THREADS=1

    (Alternatively, you can compile OpenBLAS with make FC=gfortran USE_THREAD=0.)

See the other answers for ways to test it.
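
One thing to keep in mind about the OPENBLAS_NUM_THREADS setting in step 5: OpenBLAS reads it when it initializes, so it must be in the environment before numpy is first imported. A small sketch:

```python
# OPENBLAS_NUM_THREADS is read when OpenBLAS starts up, so set it
# before the first numpy import; exporting it afterwards in the same
# process has no effect.
import os
os.environ["OPENBLAS_NUM_THREADS"] = "1"

import numpy as np

a = np.random.randn(200, 200)
print((a @ a).shape)  # this matrix product runs single-threaded
```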