How to convert a DataFrame to JSON?
val result: DataFrame = sqlContext.read.json(path)
result.write.json("/yourPath")
The write method returns a DataFrameWriter, which provides json(), and it should be accessible to you on DataFrame objects. Just make sure that your data is of type DataFrame and not of the deprecated type SchemaRDD. You can explicitly provide a type annotation, val data: DataFrame, or convert to a DataFrame with toDF().
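To make the toDF() route concrete, here is a minimal sketch that starts from a plain RDD, converts it to a DataFrame, and writes it out as JSON. It assumes Spark 1.4 or later running in local mode; the Movie case class, the application name, and the output path /tmp/moviesJson are made up for illustration and are not part of the original answer.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical record type, used only for this illustration.
case class Movie(year: Int, title: String, rating: Double)

object WriteJsonSketch {
  def main(args: Array[String]): Unit = {
    // Local-mode setup (assumption: no cluster is needed for the example).
    val conf = new SparkConf().setAppName("WriteJsonSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // brings toDF() into scope for RDDs of case classes

    val rdd = sc.parallelize(Seq(Movie(2012, "Batman", 9.8), Movie(2011, "Git", 2.0)))

    // The explicit type annotation makes it obvious we hold a DataFrame.
    val data: DataFrame = rdd.toDF()

    // Writes a directory of part files, one JSON object per line.
    // "overwrite" replaces the (hypothetical) output directory if it already exists.
    data.write.mode("overwrite").json("/tmp/moviesJson")

    sc.stop()
  }
}
```

Note that the output path is a directory, not a single file; Spark writes one part file per partition.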
If you have a DataFrame there is an API to convert back to an RDD[String] that contains the json records.
val df = Seq((2012, 8, "Batman", 9.8), (2012, 8, "Hero", 8.7), (2012, 7, "Robot", 5.5), (2011, 7, "Git", 2.0)).toDF("year", "month", "title", "rating")
df.toJSON.saveAsTextFile("/tmp/jsonRecords")
df.toJSON.take(2).foreach(println)
This should be available from Spark 1.4 onward. Call the API on the result DataFrame you created.
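One caveat if you are on a newer release: in Spark 2.0 and later, toJSON returns a Dataset[String] rather than an RDD[String], so saveAsTextFile is not directly available on the result. The following is a rough sketch of the same idea for those versions, assuming a local SparkSession; the application name and the output path /tmp/jsonRecordsText are placeholders chosen for this example.

```scala
import org.apache.spark.sql.SparkSession

object ToJsonSketch {
  def main(args: Array[String]): Unit = {
    // Local-mode session (assumption: Spark 2.0 or later is on the classpath).
    val spark = SparkSession.builder()
      .appName("ToJsonSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // brings toDF() into scope for local Seqs

    val df = Seq((2012, 8, "Batman", 9.8), (2011, 7, "Git", 2.0))
      .toDF("year", "month", "title", "rating")

    // In Spark 2.x+ this is a Dataset[String], one JSON document per row.
    val jsonLines = df.toJSON
    jsonLines.take(2).foreach(println)

    // Save the JSON strings as plain text; the path is a hypothetical example.
    jsonLines.write.mode("overwrite").text("/tmp/jsonRecordsText")

    spark.stop()
  }
}
```

If you still need an RDD[String] on those versions, calling .rdd on the Dataset gives one, after which saveAsTextFile works as in the snippet above.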
The APIs available are listed here