Best way to implement sorting over a large number of record in Mongo? [closed]


This is more of a system design question, to be honest. Even with Elasticsearch you have to configure analyzers for specific fields, so it is not something that works on everything out of the box; you have to define it that way.

With MongoDB, the best you can do is create indexes that support the sort. Without one, the server has to pull the matching documents into the WiredTiger cache (WiredTiger is the storage engine) and then sort them in memory. Imagine the travesty that will cause :D
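As a rough illustration of an index-backed sort, here is a minimal sketch written against PyMongo-style objects. The field name `score` is an assumption made up for this example; `create_index`, `find`, `sort`, and `limit` are the usual PyMongo collection and cursor methods.

```python
DESCENDING = -1  # same value as pymongo.DESCENDING


def ensure_score_index(collection):
    """Create a descending index on the hypothetical `score` field.

    Meant to be run once (for example at deploy time), not on every request.
    """
    return collection.create_index([("score", DESCENDING)])


def top_posts(collection, limit=20):
    """Return the `limit` highest-scoring documents.

    With the index in place, the server can walk the index in order
    instead of loading all documents and sorting them in memory.
    """
    cursor = collection.find({}).sort("score", DESCENDING).limit(limit)
    return list(cursor)
```

Against a real deployment this would be called with something like `MongoClient()["blog"]["posts"]`, where the database and collection names are placeholders as well.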

Most companies keep more granular control over this kind of thing. Based on expected access patterns, most results are precomputed, for example per tag in the case of Twitter. Once that has run, you don't need to sort the whole thing again.

Say I have sorted a dataset on field A. Do I need to sort all of it again for a new request? No: just slot in the new entries. How you make this adjustment depends on what you want to show to the user.
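To make the "just adjust the new entries" idea concrete, here is a small in-memory sketch using Python's standard `bisect` module. The record IDs and values are invented for illustration; a real system would keep this ordering in an index or cache rather than in a plain list.

```python
import bisect

# Records already sorted by field A (ascending), stored as (a_value, record_id).
sorted_by_a = [(3, "r1"), (8, "r2"), (15, "r3")]


def add_record(index, a_value, record_id):
    """Insert one new record at its sorted position.

    The existing entries keep their order; nothing is re-sorted.
    """
    bisect.insort(index, (a_value, record_id))


# New entries arrive later and are slotted into place one by one.
for a_value, record_id in [(10, "r4"), (1, "r5")]:
    add_record(sorted_by_a, a_value, record_id)
```

Finding the insertion point is a binary search, so each new entry costs far less than sorting the whole dataset again.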

All in all, an interesting problem to solve, but it depends heavily on the use case and the exact access pattern. Having said that, Elasticsearch sounds like a good candidate, but it has its limitations too. Focus on the exact access patterns, as I already mentioned.

Edit as requested by OP.

So, how do I fetch top trending posts?

This doesn't depend entirely on sorting your results; it depends more on how explosive a topic is, where the rate matters more.

Check this article here by Gilad.

Think of it as tracking the rate at which tags and words appear, and maintaining counts on that rate basis.
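A toy version of rate-based counting could look like the sketch below. The scoring formula (the count in the current window divided by the count in the previous window plus one) is a simple stand-in chosen for this example, not the method from the linked article, and the tag names are made up.

```python
from collections import Counter


def trending_tags(previous_window, current_window, top_n=3):
    """Rank tags by how fast they grew between two time windows.

    Each window is a list of tag occurrences, one entry per use.
    """
    prev = Counter(previous_window)
    curr = Counter(current_window)
    # Growth ratio; the +1 avoids division by zero for brand-new tags.
    scores = {tag: curr[tag] / (prev[tag] + 1) for tag in curr}
    # Highest growth first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda tag: (-scores[tag], tag))[:top_n]


previous = ["python"] * 50 + ["mongo"] * 2
current = ["python"] * 55 + ["mongo"] * 20 + ["elastic"] * 5
print(trending_tags(previous, current))
```

Here `python` has the largest raw count but barely grew, so the faster-growing `mongo` and `elastic` rank ahead of it.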

Similarly for your category: whatever your algorithm is, keep this piece isolated from simply querying all the posts.

Amazon is not ranking products on the fly for a category across its entire dataset, is it? Think about it.

Pre-rank things, keep the part that handles new additions dynamic, and merge the two.

For example, for category x, I have the top 500 ready based on my algorithm. For new data that has come in today, I use the algorithm to get relative ranks, then merge the top 500 with today's ranked content and display the results.
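The merge step described above might be sketched like this, assuming both lists are already sorted by score, highest first. The `(score, item)` layout and the sample values are assumptions for illustration; `heapq.merge` and `itertools.islice` come from the standard library.

```python
import heapq
from itertools import islice


def merge_rankings(pre_ranked, todays_ranked, limit=500):
    """Merge two score-descending lists of (score, item) and keep the top `limit`.

    Neither input is re-sorted; the merge only compares the heads of the lists.
    """
    merged = heapq.merge(
        pre_ranked,
        todays_ranked,
        key=lambda pair: pair[0],
        reverse=True,  # inputs are sorted from highest to lowest score
    )
    return list(islice(merged, limit))


pre_ranked = [(9.5, "a"), (7.0, "b"), (4.2, "c")]
todays_ranked = [(8.1, "x"), (5.0, "y")]
top = merge_rankings(pre_ranked, todays_ranked, limit=4)
```

Because the merge is lazy, it stops as soon as `limit` results have been produced, so the cost is driven by the page size rather than by the size of the full dataset.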