
Twitter streaming with multiple tweets that have the same id


Flume does not add any kind of id to the data it stores, and neither does HDFS. They simply work together to move the generated data and store it.

If you are storing tweets with identical ids, it is because you are receiving the data with those ids, or because you are interpreting the data incorrectly.
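One way to tell these two cases apart is to count the ids in the stored data yourself. The following is only a minimal sketch, assuming the tweets are stored as one JSON object per line and use the classic tweet field names (`id_str` or `id` at the top level, with separate ids nested under `user` and `retweeted_status`). The function name `duplicate_tweet_ids` and the sample records are made up for illustration.

```python
import json
from collections import Counter


def duplicate_tweet_ids(lines):
    """Return {tweet_id: count} for top-level tweet ids seen more than once."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Read the tweet's own top-level id. Nested ids such as user.id or
        # retweeted_status.id legitimately repeat across different tweets.
        tweet_id = record.get("id_str")
        if tweet_id is None and "id" in record:
            tweet_id = str(record["id"])
        if tweet_id is None:
            continue  # not a tweet record (assumption: such records carry no id)
        counts[tweet_id] += 1
    return {tweet_id: n for tweet_id, n in counts.items() if n > 1}


# Hypothetical sample: two distinct tweets by the same user, one stored twice.
sample = [
    '{"id_str": "1", "user": {"id_str": "42"}, "text": "hello"}',
    '{"id_str": "2", "user": {"id_str": "42"}, "text": "another tweet"}',
    '{"id_str": "1", "user": {"id_str": "42"}, "text": "hello"}',
]
print(duplicate_tweet_ids(sample))  # {'1': 2}
```

If this reports duplicates for the top-level id, the same tweet really is arriving more than once. If it reports none, the repeated value is probably a nested id (for example the user's id) being read as the tweet id.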

That said, maybe you could edit your question to add some examples.