Safer way to expose a C-allocated memory buffer using numpy/ctypes?


You have to keep a reference to your wrapper for as long as any NumPy array that uses its memory exists. The easiest way to achieve this is to store the reference in an attribute of the ctypes buffer object:

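    import ctypes as C
    import ctypes.util
    import numpy as np

    libc = C.CDLL(ctypes.util.find_library('c'))
    # Without these declarations ctypes treats the pointer as a C int,
    # which can truncate 64-bit addresses
    libc.malloc.restype = C.c_void_p
    libc.malloc.argtypes = [C.c_size_t]
    libc.free.argtypes = [C.c_void_p]

    class MyWrapper(object):
        def __init__(self, n=10):
            # buffer allocated by external library
            self.size = n
            self.addr = libc.malloc(C.sizeof(C.c_int) * n)

        def __del__(self):
            # buffer freed by external library
            libc.free(self.addr)

        @property
        def buffer(self):
            buf = (C.c_int * self.size).from_address(self.addr)
            # the ctypes buffer keeps the wrapper alive
            buf._wrapper = self
            return np.ctypeslib.as_array(buf)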

This way your wrapper is freed automatically when the last reference to it, e.g. the last NumPy array, is garbage collected.
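The reference chain can be checked without NumPy or a real external library. The following is only a sketch: the `Owner` class, the `freed` list and the ctypes array standing in for library-owned memory are all made up for illustration, and the immediate cleanup relies on CPython's reference counting.

```python
import ctypes

freed = []  # hypothetical log of "frees"; stands in for calls to libc.free


class Owner:
    def __init__(self, n):
        self.size = n
        # Stand-in for memory allocated by the external library (assumption)
        self._backing = (ctypes.c_int * n)(*range(n))
        self.addr = ctypes.addressof(self._backing)

    def __del__(self):
        # A real wrapper would hand self.addr back to the library here
        freed.append(self.addr)

    def view(self):
        buf = (ctypes.c_int * self.size).from_address(self.addr)
        buf._owner = self          # the ctypes buffer keeps the owner alive
        return memoryview(buf)     # the memoryview keeps the ctypes buffer alive


owner = Owner(4)
view = owner.view()

del owner                          # the view still holds the chain view -> buf -> owner
alive_after_del = (freed == [])
first_values = list(view.obj)      # view.obj is the ctypes array

del view                           # last reference gone, so Owner.__del__ runs
freed_after_view = (len(freed) == 1)
```

An array returned by `np.ctypeslib.as_array(buf)` holds on to `buf` in the same way the `memoryview` does here, which is why the wrapper outlives every array created from it.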


It's a proprietary library written by a third party and distributed as a binary. I could call the same library functions from C rather than Python, but that wouldn't help much since I still don't have any access to the code that actually allocates and frees the buffers. I can't, for example, allocate the buffers myself and then pass them to the library as pointers.

You could, however, wrap the buffer in a Python extension type. That way you can expose only the interface you want to be available and let the extension type free the buffer automatically, so code using the Python API cannot read or write memory that has already been freed.


mybuffer.c

    #include <python3.3/Python.h>

    // Hardcoded values
    // N.B. Most of these are only needed for defining the view in the Python
    // buffer protocol
    static Py_ssize_t external_buffer_size = 32;          // Size of buffer in bytes
    static Py_ssize_t external_buffer_shape[] = { 32 };   // Number of items for each dimension
    static Py_ssize_t external_buffer_strides[] = { 1 };  // Size of item for each dimension

    //----------------------------------------------------------------------------
    // Code to simulate the third-party library
    //----------------------------------------------------------------------------

    // Allocate a new buffer
    static void* external_buffer_allocate()
    {
        // Allocate the memory
        void* ptr = malloc(external_buffer_size);

        // Debug
        printf("external_buffer_allocate() = 0x%lx\n", (long) ptr);

        // Fill buffer with a recognizable pattern
        int i;
        for (i = 0; i < external_buffer_size; ++i)
        {
            *((char*) ptr + i) = i;
        }

        // Done
        return ptr;
    }

    // Free an existing buffer
    static void external_buffer_free(void* ptr)
    {
        // Debug
        printf("external_buffer_free(0x%lx)\n", (long) ptr);

        // Release the memory
        free(ptr);
    }

    //----------------------------------------------------------------------------
    // Define a new Python instance object for the external buffer
    // See: https://docs.python.org/3/extending/newtypes.html
    //----------------------------------------------------------------------------

    typedef struct
    {
        // Python macro to include standard members, like reference count
        PyObject_HEAD

        // Base address of allocated memory
        void* ptr;
    } BufferObject;

    //----------------------------------------------------------------------------
    // Define the instance methods for the new object
    //----------------------------------------------------------------------------

    // Called when there are no more references to the object
    static void BufferObject_dealloc(BufferObject* self)
    {
        external_buffer_free(self->ptr);

        // Release the memory of the Python object itself
        Py_TYPE(self)->tp_free((PyObject*) self);
    }

    // Called when we want a new view of the buffer, using the buffer protocol
    // See: https://docs.python.org/3/c-api/buffer.html
    static int BufferObject_getbuffer(BufferObject *self, Py_buffer *view, int flags)
    {
        // Set the view info
        view->obj = (PyObject*) self;
        view->buf = self->ptr;                      // Base pointer
        view->len = external_buffer_size;           // Length
        view->readonly = 0;
        view->itemsize = 1;
        view->format = "B";                         // unsigned byte
        view->ndim = 1;
        view->shape = external_buffer_shape;
        view->strides = external_buffer_strides;
        view->suboffsets = NULL;
        view->internal = NULL;

        // We need to increase the reference count of our buffer object here, but
        // Python will automatically decrease it when the view goes out of scope
        Py_INCREF(self);

        // Done
        return 0;
    }

    //----------------------------------------------------------------------------
    // Define the struct required to implement the buffer protocol
    //----------------------------------------------------------------------------

    static PyBufferProcs BufferObject_as_buffer =
    {
        // Create new view
        (getbufferproc) BufferObject_getbuffer,

        // Release an existing view
        (releasebufferproc) 0,
    };

    //----------------------------------------------------------------------------
    // Define a new Python type object for the external buffer
    //----------------------------------------------------------------------------

    static PyTypeObject BufferType =
    {
        PyVarObject_HEAD_INIT(NULL, 0)
        "external buffer",                  /* tp_name */
        sizeof(BufferObject),               /* tp_basicsize */
        0,                                  /* tp_itemsize */
        (destructor) BufferObject_dealloc,  /* tp_dealloc */
        0,                                  /* tp_print */
        0,                                  /* tp_getattr */
        0,                                  /* tp_setattr */
        0,                                  /* tp_reserved */
        0,                                  /* tp_repr */
        0,                                  /* tp_as_number */
        0,                                  /* tp_as_sequence */
        0,                                  /* tp_as_mapping */
        0,                                  /* tp_hash  */
        0,                                  /* tp_call */
        0,                                  /* tp_str */
        0,                                  /* tp_getattro */
        0,                                  /* tp_setattro */
        &BufferObject_as_buffer,            /* tp_as_buffer */
        Py_TPFLAGS_DEFAULT,                 /* tp_flags */
        "External buffer",                  /* tp_doc */
        0,                                  /* tp_traverse */
        0,                                  /* tp_clear */
        0,                                  /* tp_richcompare */
        0,                                  /* tp_weaklistoffset */
        0,                                  /* tp_iter */
        0,                                  /* tp_iternext */
        0,                                  /* tp_methods */
        0,                                  /* tp_members */
        0,                                  /* tp_getset */
        0,                                  /* tp_base */
        0,                                  /* tp_dict */
        0,                                  /* tp_descr_get */
        0,                                  /* tp_descr_set */
        0,                                  /* tp_dictoffset */
        (initproc) 0,                       /* tp_init */
        0,                                  /* tp_alloc */
        0,                                  /* tp_new */
    };

    //----------------------------------------------------------------------------
    // Define a Python function to put in the module which creates a new buffer
    //----------------------------------------------------------------------------

    static PyObject* mybuffer_create(PyObject *self, PyObject *args)
    {
        BufferObject* buf = (BufferObject*)(&BufferType)->tp_alloc(&BufferType, 0);
        if (buf == NULL)
            return NULL;
        buf->ptr = external_buffer_allocate();
        return (PyObject*) buf;
    }

    //----------------------------------------------------------------------------
    // Define the set of all methods which will be exposed in the module
    //----------------------------------------------------------------------------

    static PyMethodDef mybufferMethods[] =
    {
        {"create", mybuffer_create, METH_VARARGS, "Create a buffer"},
        {NULL, NULL, 0, NULL}        /* Sentinel */
    };

    //----------------------------------------------------------------------------
    // Define the module
    //----------------------------------------------------------------------------

    static PyModuleDef mybuffermodule = {
        PyModuleDef_HEAD_INIT,
        "mybuffer",
        "Example module that creates an extension type.",
        -1,
        mybufferMethods
        //NULL, NULL, NULL, NULL, NULL
    };

    //----------------------------------------------------------------------------
    // Define the module's entry point
    //----------------------------------------------------------------------------

    PyMODINIT_FUNC PyInit_mybuffer(void)
    {
        PyObject* m;

        if (PyType_Ready(&BufferType) < 0)
            return NULL;

        m = PyModule_Create(&mybuffermodule);
        if (m == NULL)
            return NULL;

        return m;
    }

test.py

    #!/usr/bin/env python3

    import numpy as np
    import mybuffer

    def test():

        print('Create buffer')
        b = mybuffer.create()

        print('Print buffer')
        print(b)

        print('Create memoryview')
        m = memoryview(b)

        print('Print memoryview shape')
        print(m.shape)

        print('Print memoryview format')
        print(m.format)

        print('Create numpy array')
        a = np.asarray(b)

        print('Print numpy array')
        print(repr(a))

        print('Change every other byte in numpy')
        a[::2] += 10

        print('Print numpy array')
        print(repr(a))

        print('Change first byte in memory view')
        m[0] = 42

        print('Print numpy array')
        print(repr(a))

        print('Delete buffer')
        del b

        print('Delete memoryview')
        del m

        print('Delete numpy array - this is the last ref, so should free memory')
        del a

        print('Memory should be free before this line')

    if __name__ == '__main__':
        test()

Example

    $ gcc -fPIC -shared -o mybuffer.so mybuffer.c -lpython3.3m
    $ ./test.py
    Create buffer
    external_buffer_allocate() = 0x290fae0
    Print buffer
    <external buffer object at 0x7f7231a2cc60>
    Create memoryview
    Print memoryview shape
    (32,)
    Print memoryview format
    B
    Create numpy array
    Print numpy array
    array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
           17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31], dtype=uint8)
    Change every other byte in numpy
    Print numpy array
    array([10,  1, 12,  3, 14,  5, 16,  7, 18,  9, 20, 11, 22, 13, 24, 15, 26,
           17, 28, 19, 30, 21, 32, 23, 34, 25, 36, 27, 38, 29, 40, 31], dtype=uint8)
    Change first byte in memory view
    Print numpy array
    array([42,  1, 12,  3, 14,  5, 16,  7, 18,  9, 20, 11, 22, 13, 24, 15, 26,
           17, 28, 19, 30, 21, 32, 23, 34, 25, 36, 27, 38, 29, 40, 31], dtype=uint8)
    Delete buffer
    Delete memoryview
    Delete numpy array - this is the last ref, so should free memory
    external_buffer_free(0x290fae0)
    Memory should be free before this line


I liked @Vikas's approach, but when I tried it, I only got a Numpy object-array of a single FreeOnDel object. The following is much simpler and works:

    import numpy

    class FreeOnDel(object):
        def __init__(self, data, shape, dtype, readonly=False):
            self.__array_interface__ = {"version": 3,
                                        "typestr": numpy.dtype(dtype).str,
                                        "data": (data, readonly),
                                        "shape": shape}

        def __del__(self):
            data = self.__array_interface__["data"][0]      # integer ptr
            print("do what you want with the data at {}".format(data))

    view = numpy.array(FreeOnDel(ptr, shape, dtype), copy=False)

where ptr is a pointer to the data as an integer (e.g. ctypes.addressof(...)).

This __array_interface__ attribute is sufficient to tell Numpy how to cast a region of memory as an array, and then the FreeOnDel object becomes that array's base. When the array is deleted, the deletion is propagated to the FreeOnDel object, where you can call libc.free.
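To see the ownership chain end to end, here is a small sketch under stated assumptions. The class is named `BufferOwner` as suggested for this role, a ctypes array stands in for memory that the external library would own, and the `keepalive` parameter and `released` list exist only for this demo. A real `__del__` would call the library's free function instead of appending to a list.

```python
import ctypes
import numpy as np

released = []  # hypothetical log of "frees"; stands in for calls to libc.free


class BufferOwner:
    """Tracks ownership of a raw buffer and exposes it via __array_interface__."""

    def __init__(self, ptr, shape, dtype, readonly=False, keepalive=None):
        # keepalive is only needed because this demo fakes the allocation
        self._keepalive = keepalive
        self.__array_interface__ = {
            "version": 3,
            "typestr": np.dtype(dtype).str,
            "data": (ptr, readonly),
            "shape": shape,
        }

    def __del__(self):
        # A real owner would free the memory at this address here
        released.append(self.__array_interface__["data"][0])


# Stand-in for memory allocated by the external library (assumption)
backing = (ctypes.c_int32 * 6)(*range(6))
ptr = ctypes.addressof(backing)

arr = np.asarray(BufferOwner(ptr, (2, 3), np.int32, keepalive=backing))
del backing

row_view = arr[1]                 # a view; its base chain leads back to the owner
del arr
still_alive = (released == [])    # the owner survives while any view exists
values = row_view.tolist()

del row_view                      # last array gone, so BufferOwner.__del__ runs
released_after = (released == [ptr])
```

Slices and other views of the array keep the owner alive too, so the memory is released only after the last of them is deleted.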

I might even call this FreeOnDel class "BufferOwner", because that's its role: to track ownership.