Caffe: Reading LMDB from Python
Here's the working code I figured out:
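    from __future__ import print_function  # so print() behaves the same on Python 2 and 3

    import caffe
    import lmdb

    lmdb_env = lmdb.open('directory_containing_mdb')
    lmdb_txn = lmdb_env.begin()
    lmdb_cursor = lmdb_txn.cursor()
    datum = caffe.proto.caffe_pb2.Datum()

    for key, value in lmdb_cursor:
        # Each value is a serialized Datum protobuf message
        datum.ParseFromString(value)
        label = datum.label  # a single integer per record, so it cannot be zipped
        data = caffe.io.datum_to_array(datum)  # array shaped (channels, height, width)
        print(label, data)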
If you have encoded images in the LMDB, you'll probably see this error when using @ytrewq's code:
ValueError: total size of new array must be unchanged
Use this function instead:
    import caffe
    import lmdb
    import PIL.Image
    from io import BytesIO
    import numpy as np

    def read_lmdb(lmdb_file):
        cursor = lmdb.open(lmdb_file, readonly=True).begin().cursor()
        datum = caffe.proto.caffe_pb2.Datum()
        for _, value in cursor:
            datum.ParseFromString(value)
            # datum.data holds the encoded image bytes, so wrap them in a
            # binary buffer (BytesIO, not StringIO) for PIL to decode
            s = BytesIO(datum.data)
            yield np.array(PIL.Image.open(s)), datum.label
Example:
    lmdb_dir = '/save/jobs/20160613-125532-958f/train_db/'
    for im, label in read_lmdb(lmdb_dir):
        print(label, im)
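
As far as I can tell, the ValueError comes from datum_to_array trying to reshape the byte string into (channels, height, width), which only works when the payload has exactly channels * height * width bytes. A compressed JPEG or PNG almost never does. If you're not sure which kind of database you have, here is a rough sketch for checking a payload before choosing a reader. The name classify_payload is my own (it's not part of Caffe), it needs only the standard library, and it is a heuristic based on file signatures and sizes rather than anything authoritative. If whoever wrote the database set the Datum's encoded flag, checking that is probably more reliable.

```python
# Hypothetical helper, not part of Caffe: guess whether a Datum-style byte
# payload is a raw uint8 pixel buffer or a compressed image.

JPEG_MAGIC = b"\xff\xd8\xff"        # first bytes of a JPEG file
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"    # 8-byte PNG signature


def classify_payload(data, channels, height, width):
    """Return 'jpeg', 'png', 'raw', or 'unknown' for the given bytes."""
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    if data.startswith(PNG_MAGIC):
        return "png"
    # A raw uint8 buffer holds exactly one byte per (channel, row, column);
    # any other length is what makes the reshape fail.
    if len(data) == channels * height * width:
        return "raw"
    return "unknown"


if __name__ == "__main__":
    # A 3-channel 2x2 image stored raw is 12 bytes long.
    raw_pixels = bytes(bytearray(3 * 2 * 2))
    print(classify_payload(raw_pixels, 3, 2, 2))
```

Inside the loops above you would call it as classify_payload(datum.data, datum.channels, datum.height, datum.width), then use datum_to_array for 'raw' and the PIL route for 'jpeg' or 'png'.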