Add metadata comment to Numpy ndarray
TobiasR's comment is the simplest way, but you could also subclass ndarray. See the NumPy documentation or this question.
    import numpy as np

    class MetaArray(np.ndarray):
        """Array with metadata."""

        def __new__(cls, array, dtype=None, order=None, **kwargs):
            # View the input as this subclass, then attach the metadata dict.
            obj = np.asarray(array, dtype=dtype, order=order).view(cls)
            obj.metadata = kwargs
            return obj

        def __array_finalize__(self, obj):
            # Called for views and slices too; inherit metadata from the source.
            if obj is None:
                return
            self.metadata = getattr(obj, 'metadata', None)
Example usage:
    >>> a = MetaArray([1,2,3], comment='/Documents/Data/foobar.txt')
    >>> a.metadata
    {'comment': '/Documents/Data/foobar.txt'}
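As a rough illustration of what `__array_finalize__` buys you, the sketch below repeats the class so it runs on its own and checks that a slice inherits the metadata. The names `tail` and `plain_view` are made up for this example.

```python
import numpy as np

class MetaArray(np.ndarray):
    """Array with metadata (same class as above, repeated so this runs alone)."""

    def __new__(cls, array, dtype=None, order=None, **kwargs):
        obj = np.asarray(array, dtype=dtype, order=order).view(cls)
        obj.metadata = kwargs
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.metadata = getattr(obj, 'metadata', None)

a = MetaArray([1, 2, 3], comment='/Documents/Data/foobar.txt')

# Slicing builds a new MetaArray from `a`, so __array_finalize__ runs with
# obj=a and the slice picks up a reference to the same metadata dict.
tail = a[1:]
print(type(tail).__name__, tail.metadata)

# Viewing a plain ndarray as MetaArray has nothing to inherit,
# so the getattr default (None) is used.
plain_view = np.arange(3).view(MetaArray)
print(plain_view.metadata)
```

Note that the slice shares the same dict object as the original rather than a copy, so mutating `tail.metadata` also changes `a.metadata`. Copy the dict in `__array_finalize__` if you need them to be independent.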
It sounds like you may be interested in storing metadata in a persistent way along with your array. If so, HDF5 is an excellent option to use as a storage container.
For example, let's create an array and save it to an HDF file with some metadata using h5py:
    import numpy as np
    import h5py

    some_data = np.random.random((100, 100))

    with h5py.File('data.hdf', 'w') as outfile:
        dataset = outfile.create_dataset('my data', data=some_data)
        dataset.attrs['an arbitrary key'] = 'arbitrary values'
        dataset.attrs['foo'] = 10.2
We can then read it back in:
    import h5py

    with h5py.File('data.hdf', 'r') as infile:
        dataset = infile['my data']
        some_data = dataset[...]  # Load it into memory. Could also slice a subset.
        print(dataset.attrs['an arbitrary key'])
        print(dataset.attrs['foo'])
As others have mentioned, if you are only concerned with storing the data + metadata in memory, a better option is a dict or simple wrapper class. For example:
    class Container:
        def __init__(self, data, **kwargs):
            self.data = data
            self.metadata = kwargs
Of course, this won't behave like a numpy array directly, but it's usually a bad idea to subclass ndarrays. (You can, but it's easy to do incorrectly. You're almost always better off designing a class that stores the array as an attribute.)
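Here is a minimal sketch of the wrapper approach in use. It repeats the `Container` class so it runs on its own; the array contents and the names `c`, `total`, and `scaled` are made up for illustration.

```python
import numpy as np

class Container:
    def __init__(self, data, **kwargs):
        self.data = data
        self.metadata = kwargs

# Wrap an array together with a free-form comment.
c = Container(np.arange(6).reshape(2, 3), comment='/Documents/Data/foobar.txt')

# NumPy operations go through the .data attribute explicitly.
total = c.data.sum()

# Results are plain ndarrays, so metadata has to be carried over by hand.
scaled = Container(c.data * 2, **c.metadata)
print(total, scaled.metadata)
```

The explicit `.data` access is a little more typing than a subclass, but there is no hidden machinery deciding when metadata is or isn't propagated.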
Better yet, make any operations you're doing methods of a class similar to the example above. For example:
    import scipy.signal
    import numpy as np

    class SeismicCube(object):
        def __init__(self, data, bounds, metadata=None):
            self.data = data
            # bounds is (x0, x1, y0, y1, z0, z1), assumed ordered (min, max) per axis.
            self.x0, self.x1, self.y0, self.y1, self.z0, self.z1 = bounds
            self.bounds = bounds
            self.metadata = {} if metadata is None else metadata

        def inside(self, x, y, z):
            """Test if a point is inside the cube."""
            inx = self.x0 <= x <= self.x1
            iny = self.y0 <= y <= self.y1
            inz = self.z0 <= z <= self.z1
            return inx and iny and inz

        def inst_amp(self):
            """Calculate instantaneous amplitude and return a new SeismicCube."""
            hilb = scipy.signal.hilbert(self.data, axis=2)
            data = np.hypot(hilb.real, hilb.imag)
            return type(self)(data, self.bounds, self.metadata)