Write a pandas data frame to HDF5
It's difficult to give you a good answer to this rather generic question.
It's not clear how you are going to use (read) your HDF5 files - do you want to select data conditionally (using the where
parameter)?
First of all, you need to open a store object:
store = pd.HDFStore('/path/to/filename.h5')
Now you can write (or append) to the store. Here I'm using blosc
compression, which is pretty fast and efficient. I will also use the data_columns
parameter to specify the columns that must be indexed, so that you can use these columns in the where
parameter later when you read your HDF5 file:
for f in files:
    # read or process each file in/into a separate `df`
    store.append('df_identifier_AKA_key', df, data_columns=[list_of_indexed_cols], complevel=5, complib='blosc')
store.close()
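To make the round trip concrete, here is a minimal, self-contained sketch of the same idea. The key 'df', the column names id and value, the two in-memory chunks and the temporary file path are all made up for illustration (they stand in for your real files), and it assumes PyTables is installed. It appends two chunks with an indexed data column and then reads back only the matching rows with where:

```python
import os
import tempfile

import numpy as np
import pandas as pd


def write_and_query(path):
    """Append two chunks to an HDF5 store, then read back a filtered subset."""
    # Hypothetical data standing in for the per-file DataFrames.
    chunks = [
        pd.DataFrame({"id": np.arange(0, 5), "value": np.arange(0, 5) * 1.5}),
        pd.DataFrame({"id": np.arange(5, 10), "value": np.arange(5, 10) * 1.5}),
    ]
    # The context manager closes the store even if an append fails.
    with pd.HDFStore(path, mode="w") as store:
        for df in chunks:
            # "id" is declared as a data column so it can be used in `where`.
            store.append("df", df, data_columns=["id"], complevel=5, complib="blosc")
    # Only rows satisfying the condition are read from disk.
    return pd.read_hdf(path, "df", where="id >= 7")


path = os.path.join(tempfile.mkdtemp(), "example.h5")
result = write_and_query(path)
print(result)
```

Only columns listed in data_columns can be used in the where expression, so list every column you expect to filter on when you first create the table.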