Accumulo high-speed ingest options
As with any use case, people have personal preferences about how to implement a solution. I would run Flume agents on the feed nodes to collect the data in HDFS, and then periodically run a MapReduce job over the newly arrived data using the RFile approach, that is, having the job write RFiles that are then bulk-imported into Accumulo rather than sending the data through live writers.
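To make the periodic part concrete, here is a rough sketch of what each batch run has to do. It is plain Python that only simulates the logic and does not call Flume, Hadoop, or Accumulo. The `LandedFile` type, the `row,family,qualifier,value` line format, and the function names are all made up for illustration. The point it shows is that each run picks up only files newer than the last checkpoint and emits entries in sorted key order, since RFiles have to be written with keys in sorted order. In a real job the MapReduce shuffle would do the sorting, and the output would typically go through something like `AccumuloFileOutputFormat` followed by a bulk import.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

Key = Tuple[str, str, str]  # (row, column family, column qualifier)


@dataclass(frozen=True)
class LandedFile:
    """Hypothetical stand-in for a file that Flume has written into HDFS."""
    path: str
    modified: int            # modification time, epoch seconds
    lines: Tuple[str, ...]   # assumed format: "row,family,qualifier,value"


def select_new_files(files: Iterable[LandedFile], last_run: int) -> List[LandedFile]:
    """Keep only files that arrived after the previous batch checkpoint."""
    return sorted((f for f in files if f.modified > last_run),
                  key=lambda f: f.modified)


def to_sorted_entries(files: Iterable[LandedFile]) -> List[Tuple[Key, str]]:
    """Parse every line into (key, value) and sort by key.

    This mimics what the MapReduce shuffle provides: the writer that
    produces an RFile must receive keys in sorted order.
    """
    entries: List[Tuple[Key, str]] = []
    for f in files:
        for line in f.lines:
            # maxsplit=3 so that commas inside the value are preserved
            row, family, qualifier, value = line.split(",", 3)
            entries.append(((row, family, qualifier), value))
    entries.sort(key=lambda e: e[0])
    return entries


def run_batch(files: Iterable[LandedFile], last_run: int):
    """One periodic run: returns the sorted entries and the new checkpoint."""
    new_files = select_new_files(files, last_run)
    entries = to_sorted_entries(new_files)
    # If nothing new arrived, the checkpoint stays where it was.
    checkpoint = max((f.modified for f in new_files), default=last_run)
    return entries, checkpoint
```

You would call `run_batch` on a schedule, persist the returned checkpoint somewhere, and pass it back in as `last_run` next time so that files are not ingested twice. Tracking by modification time is just one option; moving processed files into a separate "done" directory works as well.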