MongoDB - Querying between a time range of hours

mongodb datetime time pymongo

Well, the best way to solve this is to store the minutes separately as well. But you can get around this with the aggregation framework, although that is not going to be very fast:

db.so.aggregate([
  { $project: {
    loc: 1,
    vid: 1,
    datetime_recorded: 1,
    minutes: { $add: [
      { $multiply: [ { $hour: '$datetime_recorded' }, 60 ] },
      { $minute: '$datetime_recorded' }
    ] }
  } },
  { $match: { 'minutes': { $gte: 12 * 60, $lt: 16 * 60 } } }
]);

In the first step $project, we calculate the minutes from hour * 60 + min which we then match against in the second step: $match.
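To see the same arithmetic outside of the pipeline, here is a minimal sketch in plain JavaScript. The helper names `minutesOfDay` and `inRange` and the sample date are made up for illustration, and the UTC getters are used on the assumption that `$hour` and `$minute` are evaluated in UTC, which is their default.

```javascript
// Hypothetical helper mirroring the $project stage: hour * 60 + minute, in UTC.
function minutesOfDay(date) {
  return date.getUTCHours() * 60 + date.getUTCMinutes();
}

// Hypothetical helper mirroring the $match stage:
// startHour is inclusive ($gte), endHour is exclusive ($lt).
function inRange(date, startHour, endHour) {
  const minutes = minutesOfDay(date);
  return minutes >= startHour * 60 && minutes < endHour * 60;
}

const sample = new Date("2013-03-04T13:30:00Z"); // made-up sample value
console.log(minutesOfDay(sample));    // 810
console.log(inRange(sample, 12, 16)); // true
```

A document recorded at exactly 16:00 falls outside the range, since the upper bound uses `$lt`.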

Adding an answer since I disagree with the other answers: even though there are great things you can do with the aggregation framework, it really is not an optimal way to perform this type of query.

If your identified application usage pattern is that you rely on querying for "hours" or other times of the day without wanting to look at the "date" part, then you are far better off storing that as a numeric value in the document. Something like "milliseconds from start of day" would be granular enough for as many purposes as a BSON Date, but of course gives better performance without the need to compute for every document.

Set Up

This does require some set-up in that you need to add the new fields to your existing documents and make sure you add these on all new documents within your code. A simple conversion process might be:

MongoDB 4.2 and upwards

This can actually be done in a single request due to aggregation operations being allowed in "update" statements now.

db.collection.updateMany(
  {},
  [{ "$set": {
    "timeOfDay": {
      "$mod": [
        { "$toLong": "$datetime_recorded" },
        1000 * 60 * 60 * 24
      ]
    }
  }}]
)

Older MongoDB

var batch = [];

db.collection.find({ "timeOfDay": { "$exists": false } }).forEach(doc => {
  batch.push({
    "updateOne": {
      "filter": { "_id": doc._id },
      "update": {
        "$set": {
          "timeOfDay": doc.datetime_recorded.valueOf() % (60 * 60 * 24 * 1000)
        }
      }
    }
  });

  // write once only per reasonable batch size
  if ( batch.length >= 1000 ) {
    db.collection.bulkWrite(batch);
    batch = [];
  }
});

if ( batch.length > 0 ) {
  db.collection.bulkWrite(batch);
  batch = [];
}

If you can afford to write to a new collection, then looping and rewriting would not be required:

db.collection.aggregate([
  { "$addFields": {
    "timeOfDay": {
      "$mod": [
        // new Date(0) is the epoch; subtracting it from a date yields milliseconds
        { "$subtract": [ "$datetime_recorded", new Date(0) ] },
        1000 * 60 * 60 * 24
      ]
    }
  }},
  { "$out": "newcollection" }
])

Or with MongoDB 4.0 and upwards:

db.collection.aggregate([
  { "$addFields": {
    "timeOfDay": {
      "$mod": [
        { "$toLong": "$datetime_recorded" },
        1000 * 60 * 60 * 24
      ]
    }
  }},
  { "$out": "newcollection" }
])

All of these use the same basic conversion:

1000 milliseconds in a second
60 seconds in a minute
60 minutes in an hour
24 hours a day

A BSON Date is stored internally as the number of milliseconds since the epoch, so taking that value modulo the number of milliseconds in a day is the simple way to extract the milliseconds elapsed in the current day (in UTC).
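For new documents written from application code, the same modulo can be applied before the insert. This is only a sketch in plain JavaScript: `timeOfDayMs` and `withTimeOfDay` are hypothetical helper names, and the sample document is made up.

```javascript
const MS_PER_DAY = 1000 * 60 * 60 * 24;

// Hypothetical helper: milliseconds since midnight UTC for a given Date.
// Assumes dates on or after 1970-01-01; earlier dates give a negative remainder.
function timeOfDayMs(date) {
  return date.getTime() % MS_PER_DAY;
}

// Hypothetical helper: returns a copy of a new document with timeOfDay set,
// so the field is already populated when the document is inserted.
function withTimeOfDay(doc) {
  return { ...doc, timeOfDay: timeOfDayMs(doc.datetime_recorded) };
}

const doc = withTimeOfDay({
  vid: 1,
  datetime_recorded: new Date("2013-03-04T13:30:00.250Z") // made-up sample value
});
console.log(doc.timeOfDay); // 48600250
```

The result is 13.5 hours plus 250 milliseconds, expressed in milliseconds, which is the same value the `$mod` expression would store for that date.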

Query

Querying is then really simple, and as per the question example:

db.collection.find({
  "timeOfDay": {
    "$gte": 12 * 60 * 60 * 1000, "$lt": 16 * 60 * 60 * 1000
  }
})

This of course uses the same conversion from hours into milliseconds to match the stored format. But just like before, you can make this whatever scale you actually need.
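If the hour bounds vary, the filter can be built by a small function rather than written out by hand. A minimal sketch, where `hourRangeFilter` is a hypothetical name and the range is assumed not to wrap past midnight:

```javascript
const MS_PER_HOUR = 60 * 60 * 1000;

// Hypothetical helper: builds a filter on "timeOfDay" for the hours
// from startHour (inclusive) to endHour (exclusive).
// Assumes startHour < endHour, i.e. the range does not cross midnight.
function hourRangeFilter(startHour, endHour) {
  return {
    timeOfDay: { $gte: startHour * MS_PER_HOUR, $lt: endHour * MS_PER_HOUR }
  };
}

console.log(JSON.stringify(hourRangeFilter(12, 16)));
// {"timeOfDay":{"$gte":43200000,"$lt":57600000}}
```

The returned object can be passed straight to `find()`, for example `db.collection.find(hourRangeFilter(12, 16))`.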

Most importantly, as real document properties which don't rely on computation at run-time, you can place an index on this:

db.collection.createIndex({ "timeOfDay": 1 })

So not only does this remove the run-time overhead of calculating, but with an index you can also avoid collection scans, as outlined in the MongoDB documentation on indexing.

For optimal performance you never want to calculate such things. At any real-world scale it simply takes an order of magnitude longer to process all documents in the collection just to work out which ones you want than to reference an index and fetch only those documents.
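As a rough illustration of the difference, here is a toy model in plain JavaScript. It is not how MongoDB implements indexes: a sorted array simply stands in for an index on "timeOfDay", and all function names are made up.

```javascript
// Toy model only: returns the first position in a sorted array
// whose value is >= target, using binary search.
function lowerBound(sorted, target) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// "Collection scan": every value is examined to decide whether it matches.
function scanRange(values, gte, lt) {
  return values.filter(v => v >= gte && v < lt);
}

// "Index lookup": two binary searches locate the matching slice,
// and only that slice is read.
function indexRange(sorted, gte, lt) {
  return sorted.slice(lowerBound(sorted, gte), lowerBound(sorted, lt));
}
```

Both functions return the same values for sorted input, but the scan touches every element while the lookup does a logarithmic number of comparisons plus the matching slice.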

The aggregation framework may just be able to help you rewrite the documents here, but it really should not be used as a production system method of returning such data. Store the times separately.
