Can Apache Flume be used to extract tweets for a certain period of time?

Can Apache Flume be used to extract tweets for a certain period of time?


AFAIK, the TwitterSource from Cloudera only receives tweets in real time, as they are generated. I think the same applies to Flume's Twitter 1% firehose source.

Nevertheless, from what I can see the Twitter API can also work with timelines, so it should be a matter of modifying the TwitterSource source code to query a timeline instead of listening to the live stream.
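
To illustrate the idea, here is a minimal sketch of the time-window check that a modified source could apply to tweets fetched from a timeline before turning them into Flume events. Everything in it is an assumption for illustration: `TweetWindowFilter`, `Tweet`, and `withinWindow` are hypothetical names, not part of Flume, Cloudera's TwitterSource, or any Twitter client library, and the actual fetching of the timeline is left out.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of Flume or Cloudera's TwitterSource.
class TweetWindowFilter {

    // Minimal stand-in for a tweet; a real source would use the
    // status type of whatever Twitter client library it relies on.
    static final class Tweet {
        final String text;
        final Instant createdAt;

        Tweet(String text, Instant createdAt) {
            this.text = text;
            this.createdAt = createdAt;
        }
    }

    // Keeps only tweets with start <= createdAt < end.
    static List<Tweet> withinWindow(List<Tweet> tweets, Instant start, Instant end) {
        List<Tweet> kept = new ArrayList<>();
        for (Tweet t : tweets) {
            if (!t.createdAt.isBefore(start) && t.createdAt.isBefore(end)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample data, only to exercise the filter.
        List<Tweet> fetched = new ArrayList<>();
        fetched.add(new Tweet("too early", Instant.parse("2014-01-31T23:59:59Z")));
        fetched.add(new Tweet("in range", Instant.parse("2014-02-10T12:00:00Z")));
        fetched.add(new Tweet("too late", Instant.parse("2014-03-01T00:00:00Z")));

        Instant start = Instant.parse("2014-02-01T00:00:00Z");
        Instant end = Instant.parse("2014-03-01T00:00:00Z");

        for (Tweet t : withinWindow(fetched, start, end)) {
            System.out.println(t.createdAt + " " + t.text);
        }
    }
}
```

In a modified TwitterSource, a check like this would sit between the timeline query and the point where events are handed to the channel. Whether the timeline calls reach far enough back for the period you need is something to verify against the Twitter API documentation.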