Implementing twitter and facebook like hashtags


One way to start with MongoDB is to parse each message for the hashtags it contains and store them in an array field of the document. Take this status update as an example:

Peter

April 29th 2014 12:28:34

Hello friends, I visited the #tradeshow in #washington and drank a delicious #coffee

This message would look like this in MongoDB:

{
    author: "Peter",
    date: ISODate("2014-04-29 12:28:34"),
    text: "Hello friends, I visited the #tradeshow in #washington and drank a delicious #coffee",
    hashtags: [
        "tradeshow",
        "washington",
        "coffee"
    ]
}
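As a rough sketch of the parsing step, the hashtags could be pulled out of the text with a regular expression before the document is saved. The helper names `extractHashtags` and `buildMessage` are made up for this example, and lowercasing the tags, de-duplicating them and treating the timestamp as UTC are assumptions rather than requirements. The pattern is deliberately simple and would also pick up things like URL fragments.

```javascript
// Hypothetical helper: collect the hashtags used in a message text.
function extractHashtags(text) {
  // "#" followed by letters, digits or underscores (Unicode-aware via the "u" flag).
  const matches = text.match(/#[\p{L}\p{N}_]+/gu) || [];
  // Drop the leading "#", lowercase (assumption), and de-duplicate in first-seen order.
  const tags = matches.map((m) => m.slice(1).toLowerCase());
  return [...new Set(tags)];
}

// Hypothetical helper: build the document to store for one status update.
function buildMessage(author, date, text) {
  return { author, date, text, hashtags: extractHashtags(text) };
}

const message = buildMessage(
  "Peter",
  new Date("2014-04-29T12:28:34Z"), // UTC is assumed here
  "Hello friends, I visited the #tradeshow in #washington and drank a delicious #coffee"
);
```

The resulting `message` object has the same shape as the document shown above and could be passed to an insert call as-is.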

When you then create an index on the hashtags field (MongoDB builds a multikey index for array fields, with one entry per array element), you can quickly find all messages which include a given hashtag. You will likely want to order the results by date and limit them so the user sees the most recent results first. Making it a compound index on hashtags and date speeds that up as well, because the sort can then be served from the index.
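A minimal sketch of the index and the query could look like this. It assumes `collection` is a collection handle that offers `createIndex` and `find(...).sort(...).limit(...)`, as both the mongo shell and the MongoDB Node.js driver do. The function names and the default limit of 20 are made up for this example.

```javascript
// Compound index: hashtags first (equality match), then date descending (sort order).
function ensureHashtagIndex(collection) {
  return collection.createIndex({ hashtags: 1, date: -1 });
}

// Cursor over the newest messages that contain the given hashtag.
// Matching a single value against an array field matches any element of the array.
function recentMessagesForHashtag(collection, hashtag, limit = 20) {
  return collection
    .find({ hashtags: hashtag })
    .sort({ date: -1 })
    .limit(limit);
}
```

In the mongo shell this would be called as, for example, `recentMessagesForHashtag(db.messages, "coffee", 10)`, where `messages` is a placeholder collection name.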

How to implement "trending" topics is a much harder question, and a subjective one: it depends on what you consider "trending". The exact algorithms Twitter and Facebook use to decide which topics are trending are not public. According to various social media analysts they also change frequently, so we can assume that they are quite complex by now.

That means we cannot help you come up with an algorithm of your own. But if you already have an algorithm in mind for calculating the "trendiness" of a hashtag, we can help you find a good implementation.
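Purely as an illustration of what such an implementation might look like: if you decided that "trending" simply means "used in the most messages during the last N minutes", the counting could be sketched as below. This definition is an assumption made up for the example, not what Twitter or Facebook do, and the function names are hypothetical. It works on an in-memory array of documents shaped like the one above.

```javascript
// Naive, assumed definition of trendiness: number of messages using a hashtag
// within the last `windowMs` milliseconds before `now`.
function countRecentHashtags(messages, now, windowMs) {
  const counts = new Map();
  for (const msg of messages) {
    const age = now.getTime() - msg.date.getTime();
    if (age < 0 || age > windowMs) continue; // skip messages outside the window
    for (const tag of msg.hashtags) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  return counts;
}

// The n most used hashtags in the window; ties are broken alphabetically.
function topHashtags(messages, now, windowMs, n) {
  return [...countRecentHashtags(messages, now, windowMs)]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n)
    .map(([tag, count]) => ({ tag, count }));
}
```

A real scoring function would probably also compare against each hashtag's usual volume, but that is exactly the subjective part you would need to define first.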