How to read HDF5 files in Python
Read HDF5
import h5py

filename = "file.hdf5"

with h5py.File(filename, "r") as f:
    # List all groups
    print("Keys: %s" % f.keys())
    a_group_key = list(f.keys())[0]

    # Get the data
    data = list(f[a_group_key])
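The snippet above only looks at the top-level keys. As a rough sketch of how you could list everything in a file and pull one dataset into a NumPy array, the following builds a small throwaway file first so that it runs on its own. The names "measurements" and "temperature" and the temporary path are made up for illustration and are not part of the original answer.

```python
import os
import tempfile

import h5py
import numpy as np

# Build a small throwaway file so the example is self-contained.
# "measurements" and "temperature" are hypothetical names.
path = os.path.join(tempfile.mkdtemp(), "example.hdf5")
with h5py.File(path, "w") as f:
    grp = f.create_group("measurements")
    grp.create_dataset("temperature", data=np.arange(6.0).reshape(2, 3))

found = []


def collect(name, obj):
    # visititems calls this for every group and dataset below the root.
    kind = "dataset" if isinstance(obj, h5py.Dataset) else "group"
    found.append((name, kind))


with h5py.File(path, "r") as f:
    f.visititems(collect)
    # Indexing with [()] reads the whole dataset into a NumPy array.
    temperature = f["measurements/temperature"][()]

print(found)
print(temperature.shape)
```

Reading with `[()]` gives you an array that stays usable after the file is closed, whereas a dataset object itself is only valid while the file is open.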
Write HDF5
import h5py

# Create random data
import numpy as np

data_matrix = np.random.uniform(-1, 1, size=(10, 3))

# Write data to HDF5
with h5py.File("file.hdf5", "w") as data_file:
    data_file.create_dataset("group_name", data=data_matrix)
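If file size matters, a possible variation is to ask h5py for gzip compression and to store a short description next to the data as an attribute. This is only a sketch: the temporary path, the compression level 4 and the "description" attribute are assumptions chosen for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical output location so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "compressed.hdf5")
data_matrix = np.random.uniform(-1, 1, size=(10, 3))

with h5py.File(path, "w") as data_file:
    dset = data_file.create_dataset(
        "group_name",
        data=data_matrix,
        compression="gzip",   # built-in lossless filter
        compression_opts=4,   # gzip level from 0 to 9
    )
    # Attributes are small pieces of metadata attached to a dataset.
    dset.attrs["description"] = "random values in [-1, 1)"

# Read everything back to check the round trip.
with h5py.File(path, "r") as data_file:
    restored = data_file["group_name"][()]
    description = data_file["group_name"].attrs["description"]

print(np.array_equal(restored, data_matrix), description)
```

For a matrix this small the compression will not save much; it tends to pay off on larger, more regular data.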
See h5py docs for more information.
Alternatives
- JSON: Nice for writing human-readable data; VERY commonly used (read & write)
- CSV: Super simple format (read & write)
- pickle: A Python serialization format (read & write)
- MessagePack (Python package): More compact representation (read & write)
- HDF5 (Python package): Nice for matrices (read & write)
- XML: exists too *sigh* (read & write)
For your application, the following might be important:
- Support by other programming languages
- Reading / writing performance
- Compactness (file size)
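To get a feeling for the last two points, you could time a few of the formats on your own data. The sketch below uses only the standard library and compares JSON, CSV and pickle on a made-up table of 1000 rows of random floats; the helper names `to_json`, `to_csv` and `to_pickle` are hypothetical, and the numbers will differ from machine to machine.

```python
import csv
import io
import json
import pickle
import random
import time

random.seed(0)
# Made-up sample data: 1000 rows with 3 floats each.
rows = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(1000)]


def to_json(data):
    return json.dumps(data).encode("utf-8")


def to_csv(data):
    buffer = io.StringIO()
    csv.writer(buffer).writerows(data)
    return buffer.getvalue().encode("utf-8")


def to_pickle(data):
    return pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)


results = {}
for name, serialize in [("json", to_json), ("csv", to_csv), ("pickle", to_pickle)]:
    start = time.perf_counter()
    payload = serialize(rows)
    elapsed = time.perf_counter() - start
    # Store the size in bytes and the time needed to serialize.
    results[name] = (len(payload), elapsed)

for name, (size, elapsed) in results.items():
    print(f"{name}: {size} bytes, {elapsed * 1000:.2f} ms")
```

A single run like this is noisy, so repeat the measurement and use data that resembles your real workload before drawing conclusions.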
See also: Comparison of data serialization formats
If you are instead looking for a way to write configuration files, you might want to read my short article Configuration files in Python.
Reading the file
import h5py

f = h5py.File(file_name, mode)
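The `mode` argument decides whether the file must already exist and whether it may be overwritten. Here is a small sketch of the common modes as I understand them, run against a temporary file; the path and the dataset name "x" are made up for illustration.

```python
import os
import tempfile

import h5py

# Hypothetical path inside a fresh temporary directory.
file_name = os.path.join(tempfile.mkdtemp(), "modes.hdf5")

# "r" is read-only and needs an existing file.
try:
    h5py.File(file_name, "r")
    missing_file_error = False
except OSError:
    missing_file_error = True

# "a" opens for reading and writing and creates the file if it is missing.
with h5py.File(file_name, "a") as f:
    f.create_dataset("x", data=[1, 2, 3])

# "x" (same as "w-") creates a new file and refuses to overwrite one.
try:
    h5py.File(file_name, "x")
    overwrite_refused = False
except OSError:
    overwrite_refused = True

# "w" creates the file and truncates it if it already exists.
with h5py.File(file_name, "w") as f:
    keys_after_truncate = list(f.keys())

print(missing_file_error, overwrite_refused, keys_after_truncate)
```

There is also "r+" for reading and writing a file that must already exist. When you only want to read, "r" is the safe choice because it cannot change the file.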
Studying the structure of the file by printing what HDF5 groups are present
for key in f.keys():
    print(key)  # Names of the groups in HDF5 file.
Extracting the data
# Get the HDF5 group
group = f[key]

# Check out what keys are inside that group.
for key in group.keys():
    print(key)

data = group[some_key_inside_the_group][()]
# Do whatever you want with data

# After you are done
f.close()
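The same steps can also be written with a `with` block, so the file is closed even if something goes wrong in between. This is a sketch under assumptions: it first writes a tiny file with a hypothetical group "run_1" and dataset "values", then loads every dataset that sits one level below a group.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file with one group and one dataset, created for the example.
file_name = os.path.join(tempfile.mkdtemp(), "nested.hdf5")
with h5py.File(file_name, "w") as f:
    f.create_group("run_1").create_dataset("values", data=np.array([1, 2, 3]))

loaded = {}
with h5py.File(file_name, "r") as f:
    for group_name, group in f.items():
        if not isinstance(group, h5py.Group):
            continue  # skip top-level datasets, they have no members
        for dataset_name, dataset in group.items():
            if isinstance(dataset, h5py.Dataset):
                # [()] copies the data into memory as a NumPy array.
                loaded[f"{group_name}/{dataset_name}"] = dataset[()]

# The file is closed here, but the arrays in `loaded` are still usable.
print(loaded)
```

Checking for `h5py.Group` and `h5py.Dataset` avoids errors when a file mixes groups and datasets at the same level.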