Can I run a Time Series Database (TSDB) over Apache Spark?


I'm an OpenTSDB committer. I know this is an old question, but I wanted to answer. Assuming you just want to store the raw data and process it later, my suggestion is to write your incoming data to OpenTSDB, then have your Spark job execute OpenTSDB queries using the OpenTSDB classes.

You can also write data with those classes. I think you want the IncomingDataPoint construct, but I don't have the details at hand at the moment. Feel free to contact me on the OpenTSDB mailing list with more questions.

You can see how OpenTSDB handles an incoming "put" request here; you should be able to do the same thing in your code for writes:

https://github.com/OpenTSDB/opentsdb/blob/master/src/tsd/PutDataPointRpc.java#L42
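
If calling the Java classes directly is not convenient, the same "put" can also be sent over OpenTSDB's HTTP API. The following is only a rough sketch of that alternative, not the code path linked above: it assumes a TSD listening on localhost:4242 with the HTTP API enabled, and the helper names `build_put_payload` and `send_points` are made up for illustration.

```python
import json
import urllib.request


def build_put_payload(metric, timestamp, value, tags):
    """Build one data point in the JSON shape accepted by POST /api/put."""
    if not tags:
        # OpenTSDB rejects data points that carry no tags.
        raise ValueError("at least one tag is required per data point")
    return {
        "metric": metric,
        "timestamp": int(timestamp),  # Unix epoch seconds (or milliseconds)
        "value": value,
        "tags": dict(tags),
    }


def send_points(points, base_url="http://localhost:4242"):
    """POST a list of data points to the TSD (base_url is an assumption)."""
    request = urllib.request.Request(
        base_url + "/api/put",
        data=json.dumps(points).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


point = build_put_payload("sys.cpu.user", 1356998400, 42.5, {"host": "web01"})
# send_points([point])  # uncomment when a TSD is actually reachable
```

Batching several points into one request is usually preferable to posting them one at a time.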

You can see the Splicer project submitting OpenTSDB queries here; I think a similar method could be used in your Spark project:

https://github.com/turn/splicer/blob/master/src/main/java/com/turn/splicer/tsdbutils/SplicerQueryRunner.java#L87
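
To give a feel for what such a query looks like, here is a minimal sketch that goes through OpenTSDB's HTTP query endpoint instead of the Java classes Splicer uses. It assumes a TSD at localhost:4242; `build_query` and `run_query` are hypothetical helper names, not part of OpenTSDB or Splicer.

```python
import json
import urllib.request


def build_query(metric, start, aggregator="sum", tags=None, end=None):
    """Build a request body for POST /api/query with a single sub-query."""
    sub_query = {
        "aggregator": aggregator,
        "metric": metric,
        "tags": dict(tags or {}),
    }
    query = {"start": start, "queries": [sub_query]}
    if end is not None:
        query["end"] = end
    return query


def run_query(query, base_url="http://localhost:4242"):
    """Send the query to the TSD and decode the JSON reply (base_url is an assumption)."""
    request = urllib.request.Request(
        base_url + "/api/query",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


query = build_query("sys.cpu.user", "1h-ago", tags={"host": "web01"})
# results = run_query(query)  # uncomment when a TSD is actually reachable
```

In a Spark job, one possible arrangement (again an assumption, not something Splicer prescribes) is to split the time range into slices, parallelize the slices, and call something like `run_query` once per slice inside the executors.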