
Prototyping with Python code before compiling


Finally, a question that I can really give a valuable answer to :).

I have investigated f2py, boost.python, swig, cython and pyrex for my work (PhD in optical measurement techniques). I used swig extensively, boost.python somewhat, and pyrex and cython a lot. I also used ctypes. This is my breakdown:

Disclaimer: This is my personal experience. I am not involved with any of these projects.

swig: does not play well with C++. It should, but name-mangling problems in the linking step were a major headache for me on Linux & Mac OS X. If you have C code and want it interfaced to Python, it is a good solution. I wrapped GTS for my needs and essentially had to write a C shared library which I could connect to. I would not recommend it.

Ctypes: I wrote a libdc1394 (IEEE 1394 camera library) wrapper using ctypes and it was a very straightforward experience. You can find the code at https://launchpad.net/pydc1394. It is a lot of work to convert headers to Python code, but then everything works reliably. This is a good way if you want to interface an external library. ctypes is also in the stdlib of Python, so everyone can use your code right away. It is also a good way to play around with a new lib in Python quickly. I can recommend it for interfacing with external libs.
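To give a feel for the header-conversion work involved, here is a minimal sketch (not taken from pydc1394; the library and function names are made up for illustration):

    import ctypes

    # Load a hypothetical C shared library (the name is illustrative only).
    lib = ctypes.CDLL("libexample.so")

    # Declare the C signature by hand; this is the bulk of the work of
    # converting headers to Python code:
    #     double scale(double value, int factor);
    lib.scale.argtypes = [ctypes.c_double, ctypes.c_int]
    lib.scale.restype = ctypes.c_double

    print(lib.scale(1.5, 4))   # calls straight into the compiled library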

Boost.Python: Very enjoyable. If you already have C++ code of your own that you want to use in Python, go for this. It is very easy to translate C++ class structures into Python class structures this way. I recommend it if you have C++ code that you need in Python.

Pyrex/Cython: Use Cython, not Pyrex. Period. Cython is more advanced and more enjoyable to use. Nowadays, I do everything with Cython that I used to do with SWIG or ctypes. It is also the best way if you have Python code that runs too slowly. The process is absolutely fantastic: you convert your Python modules into Cython modules, build them, and keep profiling and optimizing as if it were still Python (no change of tools needed). You can then mix in as much (or as little) C code with your Python code as you like. This is by far faster than having to rewrite whole parts of your application in C; you only rewrite the inner loop.
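The profiling step in that workflow is plain Python; a minimal sketch of that first step, with a hypothetical slow function standing in for the real hot spot:

    import cProfile
    import pstats

    def inner_loop(data):
        # Hypothetical hot spot; in the workflow above, this is the function
        # you would later move, unchanged at first, into a Cython module.
        total = 0.0
        for x in data:
            total += x * x
        return total

    cProfile.run("inner_loop(list(range(1000000)))", "stats.out")
    pstats.Stats("stats.out").sort_stats("cumulative").print_stats(5)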

Timings: ctypes has the highest call overhead (~700 ns), followed by boost.python (322 ns), then closely by swig (290 ns). Cython has the lowest call overhead (124 ns) and the best feedback on where the time is spent (cProfile support!). The numbers are from my box, calling a trivial function that returns an integer from an interactive shell; module import overhead is therefore not timed, only function call overhead is. It is therefore easiest and most productive to get Python code fast by profiling and using Cython.
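For anyone who wants to reproduce this kind of measurement on their own box, something like the following isolates per-call overhead (a sketch; "mymodule" and "trivial" are stand-ins for whichever wrapped module you are timing):

    import timeit

    # The import happens in setup, outside the timed loop, so only the
    # per-call overhead of the wrapped function is measured.
    n = 1_000_000
    seconds = timeit.timeit("trivial()",
                            setup="from mymodule import trivial",
                            number=n)
    print("%.0f ns per call" % (seconds / n * 1e9))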

Summary: For your problem, use Cython ;). I hope this rundown will be useful to some people. I'll gladly answer any remaining questions.


Edit: I forgot to mention: for numerical purposes (that is, connecting to NumPy), use Cython; it has support for this (because they basically develop Cython for this purpose). So this should be another +1 for your decision.


I haven't used SWIG or SIP, but I find boost.python very powerful and relatively easy to use for writing Python wrappers.

I'm not clear on what your requirements are for passing types between C/C++ and Python, but you can do that easily either by exposing a C++ type to Python or by using a generic boost::python::object argument in your C++ API. You can also register converters to automatically convert Python types to C++ types and vice versa.

If you plan to use boost.python, the tutorial is a good place to start.

I have implemented something somewhat similar to what you need. I have a C++ function that accepts a python function and an image as arguments, and applies the python function to each pixel in the image.

    Image* unary(boost::python::object op, Image& im)
    {
        Image* out = new Image(im.width(), im.height(), im.channels());
        for(unsigned int i = 0; i < im.size(); i++)
        {
            // extract<float> converts the Python return value back to a C++ float
            (*out)[i] = extract<float>(op(im[i]));
        }
        return out;
    }

In this case, Image is a C++ object exposed to Python (an image with float pixels), and op is a Python-defined function (or really any Python object with a __call__ attribute). You can then use this function as follows (assuming unary lives in a module called image that also contains Image and a load function):

    import image
    im = image.load('somefile.tiff')
    double_im = image.unary(lambda x: 2.0*x, im)

As for using arrays with boost, I personally haven't done this, but I know the functionality to expose arrays to python using boost is available - this might be helpful.


The best way to plan for an eventual transition to compiled code is to write the performance-sensitive portions as a module of simple functions in a functional style (stateless and without side effects), which accept and return basic data types.

This will provide a one-to-one mapping from your Python prototype code to the eventual compiled code, and will let you use ctypes easily and avoid a whole bunch of headaches.
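To make that concrete, here is a minimal sketch of what such a module might look like (the function and its signature are illustrative, not from the question):

    # peaks.py - performance-sensitive code kept stateless, with basic types.
    # A C routine with the signature
    #     double centroid(const double *values, int n);
    # would map onto this function one-to-one and could be swapped in
    # via ctypes later, leaving the rest of the prototype untouched.

    def centroid(values):
        """Return the intensity-weighted centroid index of a 1-D profile."""
        total = 0.0
        weighted = 0.0
        for i, v in enumerate(values):
            total += v
            weighted += i * v
        return weighted / total if total else 0.0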

For peak fitting, you'll almost certainly need to use arrays, which will complicate things a little, but is still very doable with ctypes.
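One common pattern for getting arrays across the boundary is NumPy's ctypes support (a sketch; the library and function names are hypothetical):

    import ctypes
    import numpy as np
    import numpy.ctypeslib as npct

    # Hypothetical compiled routine (illustrative only):
    #     void smooth(double *data, int n);
    lib = npct.load_library("libpeaks", ".")   # name and path are made up
    lib.smooth.argtypes = [
        npct.ndpointer(dtype=np.float64, flags="C_CONTIGUOUS"),
        ctypes.c_int,
    ]
    lib.smooth.restype = None

    data = np.ascontiguousarray(np.random.rand(1024))
    lib.smooth(data, int(data.size))   # the C code modifies the array in place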

If you really want to use more complicated data structures, or modify the passed arguments, SWIG or Python's standard C-extension interface will let you do what you want, but with some amount of hassle.

For what you're doing, you may also want to check out NumPy, which might do some of the work you would want to push to C, as well as offering some additional help in moving data back and forth between Python and C.
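For example, an inner loop like the centroid sketch above can often disappear entirely into vectorized NumPy calls, which may already be fast enough (again a sketch, not from the original answer):

    import numpy as np

    def centroid(values):
        """Vectorized equivalent of the per-element Python loop; NumPy runs
        the sums in compiled code, so hand-written C may not be needed."""
        values = np.asarray(values, dtype=np.float64)
        total = values.sum()
        if total == 0.0:
            return 0.0
        return float(np.dot(np.arange(values.size), values) / total)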