How to store dictionary in HDF5 dataset


I found two ways to do this:

I) transform datetime object to string and use it as dataset name

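    import h5py
    import numpy as np

    # 'a' opens the file read/write and creates it if it does not exist
    h = h5py.File('myfile.hdf5', 'a')
    for k, v in d.items():
        h.create_dataset(k.strftime('%Y-%m-%dT%H:%M:%SZ'), data=np.array(v, dtype=np.int8))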

where the data can be accessed by querying the key strings (the dataset names). For example:

    for ds in h.keys():
        if '2012-04' in ds:
            print(h[ds][()])  # [()] reads the whole dataset; .value was removed in h5py 3.0

II) transform datetime object to dataset subgroups

    h = h5py.File('myfile.hdf5', 'a')
    for k, v in d.items():
        h.create_dataset(k.strftime('%Y/%m/%d/%H:%M'), data=np.array(v, dtype=np.int8))

Notice the forward slashes in the strftime string, which create the corresponding subgroups in the HDF5 file. Data can be accessed directly, as in h['2012']['04']['05']['23:30'][()], by iterating with the iterators h5py provides, or with a custom function passed to visititems().
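As a rough sketch of the subgroup layout and the visititems() route, the snippet below writes a few datasets and then collects everything stored under April 2012. The sample dictionary, the temporary file path, and the collect() callback are made up for illustration; only create_dataset, visititems, and h5py.Dataset come from h5py itself.

```python
import os
import tempfile
from datetime import datetime

import h5py
import numpy as np

# Hypothetical sample data: datetime keys mapped to small integer lists.
d = {
    datetime(2012, 4, 5, 23, 30): [1, 2, 3],
    datetime(2012, 4, 6, 0, 0): [4, 5],
    datetime(2012, 5, 1, 12, 15): [6],
}

# Throwaway location so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "myfile.hdf5")

# Write: every '/' in the name becomes an intermediate group.
with h5py.File(path, "w") as h:
    for k, v in d.items():
        h.create_dataset(k.strftime("%Y/%m/%d/%H:%M"),
                         data=np.array(v, dtype=np.int8))

found = {}

def collect(name, obj):
    # Called for every group and dataset below the starting group;
    # 'name' is the path relative to that group.
    if isinstance(obj, h5py.Dataset):
        found[name] = obj[()].tolist()

# Read: walk only the 2012/04 subtree, then fetch one dataset directly.
with h5py.File(path, "r") as h:
    h["2012/04"].visititems(collect)
    direct = h["2012"]["04"]["05"]["23:30"][()].tolist()

print(found)
print(direct)
```

Starting the walk at h["2012/04"] rather than at the file root is what makes the month filter cheap: the callback never sees the May data.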

For simplicity I chose the first option.


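This question relates to the more general question of storing any kind of dictionary in HDF5. First, convert the dictionary to a string with str(). Then, to recover the dictionary, parse the string with ast.literal_eval from the standard-library ast module (import ast). This only works when the keys and values are plain Python literals such as strings, numbers, lists, tuples, and dicts. The following code gives an example.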

    >>> import ast
    >>> d = {1:"a",2:"b"}
    >>> s = str(d)
    >>> s
    "{1: 'a', 2: 'b'}"
    >>> ast.literal_eval(s)
    {1: 'a', 2: 'b'}
    >>> type(ast.literal_eval(s))
    <class 'dict'>
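One caveat for the datetime-keyed dictionary in the question: str() renders a datetime key as a constructor call, which ast.literal_eval refuses to evaluate. A possible workaround, sketched here with hypothetical helper names (dict_to_literal, literal_to_dict) and an assumed timestamp format, is to turn the keys into strings before serializing and parse them back afterwards.

```python
import ast
from datetime import datetime

# Assumed timestamp format for the keys; any unambiguous format would do.
FMT = "%Y-%m-%dT%H:%M:%S"

# Hypothetical sample data with datetime keys.
d = {
    datetime(2012, 4, 5, 23, 30): [1, 2, 3],
    datetime(2012, 4, 6, 0, 0): [4, 5],
}

def dict_to_literal(mapping):
    """Return a string containing only literals, so literal_eval can read it."""
    return repr({k.strftime(FMT): v for k, v in mapping.items()})

def literal_to_dict(text):
    """Parse the string and turn the keys back into datetime objects."""
    parsed = ast.literal_eval(text)
    return {datetime.strptime(k, FMT): v for k, v in parsed.items()}

s = dict_to_literal(d)
restored = literal_to_dict(s)

# The direct route fails: str(d) contains 'datetime.datetime(...)' calls,
# which are not literals.
try:
    ast.literal_eval(str(d))
    naive_ok = True
except ValueError:
    naive_ok = False

print(s)
print(restored == d, naive_ok)
```

The string s is what would then be written to the HDF5 file, for example as a string attribute or a scalar string dataset.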


I would serialize the object into JSON or YAML and store the resulting string as an attribute in the appropriate object (HDF5 group or dataset).
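A minimal sketch of the JSON variant with h5py might look like this. The dataset name "readings", the attribute name "meta_json", the sample metadata, and the temporary path are all placeholders chosen for the example.

```python
import json
import os
import tempfile

import h5py
import numpy as np

# Hypothetical metadata. JSON object keys must be strings, so a datetime
# key is written out as text beforehand.
meta = {"2012-04-05T23:30:00Z": [1, 2, 3], "source": "sensor-a"}

# Throwaway location so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "meta.hdf5")

# Write: attach the serialized dictionary to a dataset as a string attribute.
with h5py.File(path, "w") as h:
    dset = h.create_dataset("readings", data=np.arange(5, dtype=np.int8))
    dset.attrs["meta_json"] = json.dumps(meta)

# Read: load the attribute and decode it back into a dictionary.
with h5py.File(path, "r") as h:
    restored = json.loads(h["readings"].attrs["meta_json"])

print(restored)
```

Attributes are meant for small pieces of metadata, so this fits a dictionary of modest size; large arrays are better kept as datasets.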

I'm not sure why you're using the datetime as a dataset name, however, unless you absolutely need to look up your dataset directly by datetime.

P.S. For what it's worth, I find PyTables a lot easier to use than the lower-level h5py.