Key/Value database for storing binary data


As previously pointed out, Berkeley DB does support opaque keys and values, but I would suggest a better alternative: LevelDB.

LevelDB:

Google is your friend :), so much so that they even provide an embedded database. LevelDB is a fast, lightweight key/value database library by Google.

Features:

  • Keys and values are arbitrary byte arrays.
  • Data is stored sorted by key.
  • Callers can provide a custom comparison function to override the sort order.
  • The basic operations are Put(key,value), Get(key), Delete(key).
  • Multiple changes can be made in one atomic batch.
  • Users can create a transient snapshot to get a consistent view of data.
  • Forward and backward iteration is supported over the data.
  • Data is automatically compressed using the Snappy compression library.
  • External activity (file system operations etc.) is relayed through a virtual interface so users can customize the operating system interactions.
  • Detailed documentation about how to use the library is included with the source code.
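To make the ordering features above concrete, here is a minimal sketch in plain C of how byte-array keys can be ordered. It does not call LevelDB at all: `byte_key`, `compare_bytewise` and `compare_reverse` are hypothetical names made up for illustration. The first comparator follows the usual bytewise rule (compare the shared prefix, then let the shorter key sort first), which to my understanding is how LevelDB orders keys by default. The second shows the kind of override a custom comparison function allows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key type: a pointer plus an explicit length, so a key may
   contain any bytes, including '\0'. Not part of the LevelDB API. */
typedef struct {
    const unsigned char *data;
    size_t size;
} byte_key;

/* Bytewise ordering: compare the shared prefix, then the shorter key
   sorts first. Returns <0, 0 or >0, like memcmp. */
int compare_bytewise(const byte_key *a, const byte_key *b)
{
    assert(a != NULL && b != NULL);
    size_t min_len = a->size < b->size ? a->size : b->size;
    int r = min_len ? memcmp(a->data, b->data, min_len) : 0;
    if (r != 0)
        return r;
    if (a->size < b->size)
        return -1;
    return a->size > b->size ? 1 : 0;
}

/* Example of a custom order: the same rule, reversed. */
int compare_reverse(const byte_key *a, const byte_key *b)
{
    return compare_bytewise(b, a);
}
```

Because the length travels with the pointer, a key such as the two bytes 0x00 0x01 is perfectly valid and sorts before 0x00 0x01 0x00. A comparison based on strlen or strcmp would stop at the first zero byte and treat both as empty strings.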


What makes you think Berkeley DB cannot store binary data? From their docs:

Key and content arguments are objects described by the datum typedef. A datum specifies a string of dsize bytes pointed to by dptr. Arbitrary binary data, as well as normal text strings, are allowed.

Also see their examples:

    money = 122.45;
    key.data = &money;
    key.size = sizeof(float);
    ...
    ret = my_database->put(my_database, NULL, &key, &data, DB_NOOVERWRITE);
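The point of that example is that a key or value is just a pointer plus a byte count, so any in-memory object can be stored without converting it to text. Here is a small sketch of the same idea in plain C, without linking against Berkeley DB. The `blob` struct and the two helper functions are hypothetical stand-ins invented for illustration, not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a key/value descriptor: a pointer plus a byte
   count, the same shape as the key.data / key.size pair used above. */
typedef struct {
    void  *data;
    size_t size;
} blob;

/* Describe a float by its raw bytes; no text conversion is involved.
   The blob only borrows the pointer, so *value must outlive it. */
blob blob_from_float(float *value)
{
    assert(value != NULL);
    blob b = { value, sizeof *value };
    return b;
}

/* Copy the raw bytes back into a float. Returns 0 on success, or -1 if
   the blob does not have the size of a float. */
int blob_to_float(const blob *b, float *out)
{
    assert(b != NULL && out != NULL);
    if (b->size != sizeof *out)
        return -1;
    memcpy(out, b->data, sizeof *out);
    return 0;
}
```

One caveat with storing raw bytes like this: the representation depends on the platform (byte order, float format), so a database file written this way is only safely readable on machines that use the same representation.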


If you don't need multiple writer processes (only multiple concurrent readers are supported), want something small, and want something that is available on nearly every Linux system, you might want to take a look at gdbm. It is like Berkeley DB, but much simpler, and possibly not as fast.

In nearly the same area are things like Tokyo Cabinet, QDBM, and the already mentioned LevelDB.

Berkeley DB and SQLite are ahead of those because they support multiple writers. That said, Berkeley DB is sometimes a versioning disaster.

The major pros of gdbm: it is already installed on nearly every Linux system, it has no versioning issues, and it is small.