
Caffe: Reading LMDB from Python


Here's the working code I figured out:

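    import caffe
    import lmdb

    # Open the LMDB directory (the one containing data.mdb) read-only
    lmdb_env = lmdb.open('directory_containing_mdb', readonly=True)
    lmdb_txn = lmdb_env.begin()
    lmdb_cursor = lmdb_txn.cursor()
    datum = caffe.proto.caffe_pb2.Datum()

    for key, value in lmdb_cursor:
        # Each value is a serialized Datum protobuf
        datum.ParseFromString(value)
        label = datum.label  # a single integer, so it cannot be zipped with the data
        data = caffe.io.datum_to_array(datum)  # array shaped channels x height x width
        print(label, data)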


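If the LMDB stores encoded images (e.g. JPEG or PNG) rather than raw pixel arrays, you'll probably see this error when using @ytrewq's code, because caffe.io.datum_to_array tries to reshape the stored bytes to channels x height x width: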

ValueError: total size of new array must be unchanged

Use this function instead:

    import caffe
    import lmdb
    import numpy as np
    import PIL.Image
    from io import BytesIO

    def read_lmdb(lmdb_file):
        """Yield (image array, label) pairs from an LMDB of encoded images."""
        cursor = lmdb.open(lmdb_file, readonly=True).begin().cursor()
        datum = caffe.proto.caffe_pb2.Datum()
        for _, value in cursor:
            datum.ParseFromString(value)
            # datum.data holds the encoded image bytes, so it needs a
            # binary buffer (BytesIO), not a text buffer (StringIO)
            s = BytesIO(datum.data)
            yield np.array(PIL.Image.open(s)), datum.label

Example:

    lmdb_dir = '/save/jobs/20160613-125532-958f/train_db/'
    for im, label in read_lmdb(lmdb_dir):
        print(label, im)
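
If you don't know in advance whether a database holds raw or encoded images, a quick check on the payload can tell you which of the two readers above to use. The following is only a rough sketch: `payload_kind` and the two constants are hypothetical names, not part of Caffe or lmdb, and the check is a heuristic based on the PNG/JPEG file signatures and the declared Datum dimensions. Datum also has an `encoded` flag, which may be a simpler test if whatever wrote the database set it.

```python
# Hypothetical helper (not part of Caffe): classify a Datum payload using
# only its byte string and the dimensions declared in the Datum.

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"   # 8-byte PNG file signature
JPEG_MAGIC = b"\xff\xd8\xff"       # start-of-image marker that opens a JPEG


def payload_kind(data, channels, height, width):
    """Return 'png', 'jpeg', 'raw', or 'unknown' for a Datum's data bytes."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(JPEG_MAGIC):
        return "jpeg"
    # Raw uint8 pixels must fill the declared shape exactly; this is the
    # size requirement behind the ValueError seen with encoded payloads.
    if len(data) == channels * height * width:
        return "raw"
    return "unknown"


# Assumed usage inside the cursor loop, after datum.ParseFromString(value):
#     kind = payload_kind(datum.data, datum.channels, datum.height, datum.width)
#     if kind == "raw":
#         image = caffe.io.datum_to_array(datum)
#     elif kind in ("png", "jpeg"):
#         image = np.array(PIL.Image.open(BytesIO(datum.data)))
```

Note that "unknown" also covers Datums that keep their values in float_data and leave data empty, so treat that result as "inspect by hand" rather than as an error.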