Add metadata comment to Numpy ndarray


TobiasR's comment is the simplest way, but you could also subclass ndarray. See the NumPy documentation on subclassing ndarray, or this question.

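    import numpy as np

    class MetaArray(np.ndarray):
        """Array with metadata."""

        def __new__(cls, array, dtype=None, order=None, **kwargs):
            # View the input as a MetaArray and attach the keyword arguments.
            obj = np.asarray(array, dtype=dtype, order=order).view(cls)
            obj.metadata = kwargs
            return obj

        def __array_finalize__(self, obj):
            # Called for views and slices too; carry the metadata over from the parent.
            if obj is None:
                return
            self.metadata = getattr(obj, 'metadata', None)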

Example usage:

    >>> a = MetaArray([1,2,3], comment='/Documents/Data/foobar.txt')
    >>> a.metadata
    {'comment': '/Documents/Data/foobar.txt'}
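To make the role of `__array_finalize__` more concrete, here is a small sketch. The class is repeated so the snippet runs on its own, and the variable names and the `units` key are made up for illustration. It shows that slices inherit the metadata, and that they share the same dict object rather than getting a copy, which is easy to trip over.

```python
import numpy as np


class MetaArray(np.ndarray):
    """Array with metadata (same class as above, repeated for a standalone run)."""

    def __new__(cls, array, dtype=None, order=None, **kwargs):
        obj = np.asarray(array, dtype=dtype, order=order).view(cls)
        obj.metadata = kwargs
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.metadata = getattr(obj, 'metadata', None)


a = MetaArray([1, 2, 3], comment='/Documents/Data/foobar.txt')

# Slicing creates a view, so __array_finalize__ hands the metadata reference over.
b = a[1:]
print(type(b).__name__, b.metadata)

# The dict is shared, not copied: a change made through b is visible through a.
b.metadata['units'] = 'counts'  # 'units' is a made-up key for illustration
print(a.metadata)

# Viewing a plain ndarray as MetaArray finds no metadata to inherit.
c = np.arange(3).view(MetaArray)
print(c.metadata)
```

If independent metadata per array matters, copying the dict inside `__array_finalize__` is one option, at the cost of an extra copy on every view.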


It sounds like you may be interested in storing metadata in a persistent way along with your array. If so, HDF5 is an excellent option to use as a storage container.

For example, let's create an array and save it to an HDF file with some metadata using h5py:

    import numpy as np
    import h5py

    some_data = np.random.random((100, 100))

    with h5py.File('data.hdf', 'w') as outfile:
        dataset = outfile.create_dataset('my data', data=some_data)
        dataset.attrs['an arbitrary key'] = 'arbitrary values'
        dataset.attrs['foo'] = 10.2

We can then read it back in:

    import h5py

    with h5py.File('data.hdf', 'r') as infile:
        dataset = infile['my data']
        some_data = dataset[...]  # Load it into memory. Could also slice a subset.
        print(dataset.attrs['an arbitrary key'])
        print(dataset.attrs['foo'])
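If you do this often, the write and read steps can be wrapped in a pair of helpers. This is only a sketch: `save_with_metadata` and `load_with_metadata` are hypothetical names, not part of h5py, and it assumes every metadata value is something an HDF5 attribute can hold (numbers, strings, small arrays). It writes to a temporary directory so it does not touch your real files.

```python
import os
import tempfile

import h5py
import numpy as np


def save_with_metadata(path, name, data, **metadata):
    """Write `data` as dataset `name` and attach each keyword as an attribute."""
    with h5py.File(path, 'w') as outfile:
        dataset = outfile.create_dataset(name, data=data)
        for key, value in metadata.items():
            dataset.attrs[key] = value


def load_with_metadata(path, name):
    """Return (array, metadata dict) for dataset `name`."""
    with h5py.File(path, 'r') as infile:
        dataset = infile[name]
        # Copy everything out before the file is closed.
        return dataset[...], dict(dataset.attrs)


some_data = np.random.random((100, 100))

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'data.hdf')
    save_with_metadata(path, 'my data', some_data,
                       foo=10.2, comment='arbitrary values')
    loaded, metadata = load_with_metadata(path, 'my data')

print(loaded.shape, metadata)
```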

As others have mentioned, if you are only concerned with storing the data + metadata in memory, a better option is a dict or simple wrapper class. For example:

    class Container:
        def __init__(self, data, **kwargs):
            self.data = data
            self.metadata = kwargs

Of course, this won't behave like a numpy array directly, but it's usually a bad idea to subclass ndarrays. (You can, but it's easy to do incorrectly. You're almost always better off designing a class that stores the array as an attribute.)
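As a rough illustration of the wrapper approach, the sketch below keeps the array as an attribute and passes `.data` to NumPy explicitly. The `apply` method is invented for this example and is not part of the answer above. The point is only that metadata survives an operation because the class decides how to carry it forward.

```python
import numpy as np


class Container:
    def __init__(self, data, **kwargs):
        self.data = np.asarray(data)
        self.metadata = kwargs

    def apply(self, func, *args, **kwargs):
        """Run func on the array and wrap the result with the same metadata.

        Hypothetical helper for illustration. Unpacking with ** builds a new
        dict, so the result gets a shallow copy of the metadata.
        """
        result = func(self.data, *args, **kwargs)
        return type(self)(result, **self.metadata)


c = Container([1.0, 4.0, 9.0], comment='/Documents/Data/foobar.txt')

# NumPy functions get the raw array explicitly.
print(np.mean(c.data))

# Operations that should keep the metadata go through the wrapper.
roots = c.apply(np.sqrt)
print(roots.data, roots.metadata)
```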

Better yet, make any operations you're doing methods of a similar class to the example above. For example:

    import scipy.signal
    import numpy as np

    class SeismicCube(object):
        def __init__(self, data, bounds, metadata=None):
            self.data = data
            self.x0, self.x1, self.y0, self.y1, self.z0, self.z1 = bounds
            self.bounds = bounds
            self.metadata = {} if metadata is None else metadata

        def inside(self, x, y, z):
            """Test if a point is inside the cube."""
            # Assumes x0 <= x1, y0 <= y1 and z0 <= z1.
            inx = self.x0 <= x <= self.x1
            iny = self.y0 <= y <= self.y1
            inz = self.z0 <= z <= self.z1
            return inx and iny and inz

        def inst_amp(self):
            """Calculate instantaneous amplitude and return a new SeismicCube."""
            hilb = scipy.signal.hilbert(self.data, axis=2)
            data = np.hypot(hilb.real, hilb.imag)
            return type(self)(data, self.bounds, self.metadata)
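For a sense of what `inst_amp` computes, here is a standalone sketch on synthetic data. The array shape and the 5-cycle cosine are made up for the example. The magnitude of the analytic signal returned by `scipy.signal.hilbert` is the envelope of each trace, and `np.hypot(real, imag)` is the same quantity as `np.abs` of the complex result.

```python
import numpy as np
import scipy.signal

# Synthetic "cube": 2 x 3 traces, each a cosine with exactly 5 cycles
# over 200 samples along the last axis.
n = 200
t = np.arange(n) / n
trace = np.cos(2 * np.pi * 5 * t)
data = np.broadcast_to(trace, (2, 3, n)).copy()

hilb = scipy.signal.hilbert(data, axis=2)  # analytic signal along each trace
amp = np.hypot(hilb.real, hilb.imag)       # same as np.abs(hilb)

print(amp.shape)
print(np.allclose(amp, np.abs(hilb)))
print(amp.min(), amp.max())
```

Because the cosine has constant amplitude, the envelope should stay flat along each trace, which makes this a handy sanity check before running the method on real data.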