Database solution for static time-series data


Quite often, people come to NoSQL databases having heard that there's no schema and life's all good. IMHO this is a really wrong notion.

When dealing with NoSQL, you have to think in terms of "aggregates". Typically an aggregate is an entity that can be operated on as a single unit. In your case, one possible (but not that efficient) way is to model a user and his/her data as a single aggregate. This ensures that your user aggregate is data centre / shard agnostic. But if the data is going to grow, loading a user will also load all the related data and be a memory hog. (Mongo as such is a bit greedy on memory.)

Another option is to store the recordings as their own aggregate, "linked" back to the user with an id. This can be a synthetic key that you create, such as a GUID. Even though this superficially looks like a join, it's just a "look up by property", since there's no real referential integrity here. This may be the approach I'd take if files are going to get added constantly.
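To make the "linked" shape concrete, here is a minimal sketch in Python that uses plain dictionaries as stand-ins for documents. The field names (`user_id`, `start_time`, `values`, and so on) and the helper functions are assumptions for illustration only, not something taken from your setup.

```python
import uuid
from datetime import datetime, timezone


def new_user(gender, expertise):
    """Build a user document keyed by a synthetic GUID (hypothetical fields)."""
    return {"_id": str(uuid.uuid4()), "gender": gender, "expertise": expertise}


def new_recording(user_id, start_time, values):
    """Build a recording document that is its own aggregate.

    user_id is only a look-up property: nothing enforces that a
    matching user document exists (no referential integrity).
    """
    return {
        "_id": str(uuid.uuid4()),
        "user_id": user_id,
        "start_time": start_time,
        "values": list(values),
    }


user = new_user("f", "novice")
rec = new_recording(
    user["_id"],
    datetime(2013, 1, 1, 12, 0, tzinfo=timezone.utc),
    [0.1, 0.2, 0.3],
)
```

The user document stays small no matter how many recordings pile up, because each recording only carries the user's id.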

The place where MongoDB shines is ad hoc queries by a property in the document (you will want to create an index on this property if you don't want to lose hair later down the road). You will not go wrong choosing Mongo for time series data storage. You can, for example, extract data that matches an id within a date range without doing any major stunts.
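As a rough sketch of such an "id plus date range" query, the snippet below only builds the filter document and a compound index specification in Python. The field names `user_id` and `start_time` are assumed, and the function name is hypothetical. With a driver such as pymongo, these would typically be passed to `collection.create_index(...)` and `collection.find(...)`.

```python
from datetime import datetime, timezone

# Compound index on the look-up property and the timestamp (1 = ascending).
# Assumed field names; with pymongo: collection.create_index(INDEX_KEYS)
INDEX_KEYS = [("user_id", 1), ("start_time", 1)]


def recordings_filter(user_id, start, end):
    """Filter for one user's recordings with start <= start_time < end."""
    return {"user_id": user_id, "start_time": {"$gte": start, "$lt": end}}


flt = recordings_filter(
    "some-guid",
    datetime(2013, 1, 1, tzinfo=timezone.utc),
    datetime(2013, 2, 1, tzinfo=timezone.utc),
)
# With pymongo: collection.find(flt)
```

Using a half-open range (`$gte` / `$lt`) avoids counting a recording twice when consecutive ranges are queried.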

Please do ensure that you have replica sets whichever approach you take, and diligently choose your sharding approach early on - sharding later is no fun.


I feel like this may not answer the right question, but here is what I would probably go for (using SQL Server):

User (table)

  • UserId
  • Gender
  • Expertise
  • etc...

Sample (table)

  • SampleId
  • UserId
  • StartTime
  • Duration
  • Order
  • etc...

Series (table)

  • SampleId
  • SecondNumber (about 1-90)
  • Values (string with values)

I think this should give you fairly flexible access, as well as reasonable memory efficiency. As the values are stored in string format, you cannot do analysis on the time series in SQL (they will need to be parsed first), but I don't think that should be a problem. Of course, you can also use MeasurementNumber and Value columns instead, with one row per measurement; then you have complete freedom.
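For illustration, the three tables could be sketched as below. This uses Python's built-in `sqlite3` as a stand-in for SQL Server, so the column types are SQLite's, and it assumes the values string is comma-separated; both are my assumptions. `Order` and `Values` are quoted because they are SQL keywords, and `load_series` is a hypothetical helper.

```python
import sqlite3

SCHEMA = """
CREATE TABLE User (
    UserId    INTEGER PRIMARY KEY,
    Gender    TEXT,
    Expertise TEXT
);
CREATE TABLE Sample (
    SampleId  INTEGER PRIMARY KEY,
    UserId    INTEGER NOT NULL REFERENCES User(UserId),
    StartTime TEXT,
    Duration  INTEGER,
    "Order"   INTEGER
);
CREATE TABLE Series (
    SampleId     INTEGER NOT NULL REFERENCES Sample(SampleId),
    SecondNumber INTEGER NOT NULL,
    "Values"     TEXT NOT NULL,
    PRIMARY KEY (SampleId, SecondNumber)
);
"""


def load_series(conn, sample_id):
    """Return {SecondNumber: [floats]} for one sample, parsing the strings."""
    rows = conn.execute(
        'SELECT SecondNumber, "Values" FROM Series '
        "WHERE SampleId = ? ORDER BY SecondNumber",
        (sample_id,),
    )
    return {second: [float(v) for v in values.split(",")] for second, values in rows}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO User VALUES (1, 'f', 'novice')")
conn.execute("INSERT INTO Sample VALUES (1, 1, '2013-01-01T12:00:00', 90, 1)")
conn.executemany(
    "INSERT INTO Series VALUES (?, ?, ?)",
    [(1, 1, "0.1,0.2"), (1, 2, "0.3,0.4")],
)
series = load_series(conn, 1)
```

The parsing happens on the client side, which is the trade-off mentioned above: the database only stores and retrieves the strings.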

Of course this is not as complete as your MongoDB setup but the gaps should be fairly easy to fill.


You should really investigate LDAP and its data model. There is clearly a strong hierarchical character to your data, and LDAP is already commonly used to store attributes about people. It's a mature, standardized network protocol, so you can choose from a variety of implementations, as opposed to being locked into a particular NoSQL flavor-of-the-month choice. LDAP is designed for distributed access, provides a security model for authentication (and authorization/access control as well), and is extremely efficient, more so than any of these HTTP-based protocols.