
Using S3 as a database vs. database (e.g. MongoDB)


You are "considering using AWS S3 bucket instead of a NoSQL database", but the fact is that Amazon S3 effectively is a NoSQL database.

It is a very large Key-Value store. The Key is the filename, the Value is the contents of the file.

If your needs are simply "Store a value with this key" and "Retrieve a value with this key", then it would work just fine!
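As a rough illustration of the "store a value with this key" / "retrieve a value with this key" pattern, here is a minimal Python sketch. It assumes values are JSON-serialisable and that the client exposes `put_object` and `get_object` with the same keyword arguments as the boto3 S3 client. `S3KeyValueStore`, `InMemoryS3`, the bucket name, and the key are hypothetical names made up for this example; `InMemoryS3` is only a stand-in so the snippet runs without AWS credentials.

```python
import io
import json


class InMemoryS3:
    """Hypothetical stand-in for a boto3 S3 client (only the two calls used below)."""

    def __init__(self):
        self._objects = {}

    def put_object(self, Bucket, Key, Body):
        self._objects[(Bucket, Key)] = Body
        return {}

    def get_object(self, Bucket, Key):
        # Simplification: a real boto3 client raises a NoSuchKey ClientError
        # for a missing key, whereas this stand-in raises KeyError.
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


class S3KeyValueStore:
    """Treats a bucket as a key-value store: key = object name, value = JSON body."""

    def __init__(self, client, bucket):
        self._client = client
        self._bucket = bucket

    def put(self, key, value):
        body = json.dumps(value).encode("utf-8")
        self._client.put_object(Bucket=self._bucket, Key=key, Body=body)

    def get(self, key):
        response = self._client.get_object(Bucket=self._bucket, Key=key)
        return json.loads(response["Body"].read().decode("utf-8"))


# Usage: swap InMemoryS3() for boto3.client("s3") to talk to a real bucket.
store = S3KeyValueStore(InMemoryS3(), "example-bucket")
store.put("orders/1001.json", {"status": "shipped"})
print(store.get("orders/1001.json"))
```

Anything beyond these two operations (filtering, secondary indexes, partial updates) is where a real database starts to earn its keep.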

In fact, old orders on Amazon.com (more than a year old) are apparently archived to Amazon S3 since they are read-only (no returns, no changes).

While slower than DynamoDB, Amazon S3 certainly costs significantly less for storage!


Context: we use S3 for some "database" (lit. key/value structured storage).

It should be noted that S3 does actually have some search capability: depending on how you structure your data, you can run SQL-like queries against objects with S3 Select (and, if you have the time, Athena).

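However, the biggest disadvantage/architectural challenge is that S3 has historically been eventually consistent (which is actually the reason why you cannot "update" a file). AWS announced strong read-after-write consistency for S3 in December 2020, so check the current consistency model before relying on the details below. Under eventual consistency, your architecture will need to tolerate behaviours such as: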

  • Operations are cached by key, so if you attempt to get an object that doesn't exist and then create it, for a period of time* any gets on that object will return that it does not exist.
  • There is no global cache, so you can get two different versions of the same object for a period of time* after it has been overwritten.
  • List operations provide a semi-unstable iterator. If you list a large number of objects in a bucket that is being updated, chances are you will not have visited all the objects by the end of the iterator.

*The period of time is purposely undefined by AWS; however, from observation, it is rarely more than a minute.
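One common way to tolerate the "created but not yet visible" behaviour described above is to retry reads with a backoff. The sketch below is only an illustration under stated assumptions: `get_with_retry` and `LaggingStore` are hypothetical names, a missing object is modelled as a `KeyError` (with boto3 it would surface as a NoSuchKey `ClientError` instead), and the attempt count and delays are arbitrary.

```python
import time


def get_with_retry(fetch, key, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call fetch(key), retrying with exponential backoff while it raises KeyError."""
    for attempt in range(attempts):
        try:
            return fetch(key)
        except KeyError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the miss
            sleep(base_delay * 2 ** attempt)


class LaggingStore:
    """Hypothetical store that reports every key as missing for the first few reads."""

    def __init__(self, data, stale_reads):
        self._data = data
        self._stale_reads = stale_reads

    def get(self, key):
        if self._stale_reads > 0:
            self._stale_reads -= 1
            raise KeyError(key)
        return self._data[key]


# Usage: the first two reads miss, the third succeeds.
lagging = LaggingStore({"orders/1001.json": b"{}"}, stale_reads=2)
print(get_with_retry(lagging.get, "orders/1001.json"))
```

Retrying only papers over the missing-object case. It does not help with reading a stale version after an overwrite, which usually needs a design that never overwrites keys (for example, writing each version under a new key).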