How to convert Row to json in Spark 2 Scala


You can use getValuesMap to convert the Row object to a Map and then convert it to JSON:

import scala.util.parsing.json.JSONObject
import org.apache.spark.sql._

val df = Seq((1, 2, 3), (2, 3, 4)).toDF("A", "B", "C")
val row = df.first()   // this is an example Row object

def convertRowToJSON(row: Row): String = {
  val m = row.getValuesMap(row.schema.fieldNames)
  JSONObject(m).toString()
}

convertRowToJSON(row)
// res46: String = {"A" : 1, "B" : 2, "C" : 3}
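As a side note, if all you need is one JSON string per record, Spark can produce it directly: toJSON serializes each Row of a DataFrame/Dataset into a JSON document (continuing with the df from above):

val jsonStrings = df.toJSON   // Dataset[String]
jsonStrings.first()
// {"A":1,"B":2,"C":3}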


I need to read JSON input and produce JSON output. Most fields are handled individually, but a few JSON sub-objects need to be preserved as-is.

When Spark reads JSON into a DataFrame, it turns each record into a Row. The Row is a JSON-like structure that can be transformed and written back out as JSON.

But I need to pull some of the sub-JSON structures out as strings to use as new fields.

This can be done like this:

import org.apache.spark.sql.functions.to_json

val dataFrameWithJsonField = dataFrame.withColumn("address_json", to_json($"location.address"))

location.address is the path to the sub-JSON object in the incoming JSON-based DataFrame; address_json is the name of the new column holding that object converted to a JSON string.

to_json is available as of Spark 2.1.
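For illustration, here is a minimal sketch (assuming a spark-shell session so spark.implicits._ is in scope; the field names are hypothetical, matching the example above) that builds a nested location.address struct and flattens it back to a string with to_json:

import org.apache.spark.sql.functions.{struct, to_json}

// Mimic an incoming record with a nested location.address object
val dataFrame = Seq(("1", "Main St", "Springfield"))
  .toDF("id", "street", "city")
  .select($"id", struct(struct($"street", $"city").as("address")).as("location"))

// address_json now holds the address struct serialized as a JSON string
val dataFrameWithJsonField =
  dataFrame.withColumn("address_json", to_json($"location.address"))

dataFrameWithJsonField.select("address_json").first()
// [{"street":"Main St","city":"Springfield"}]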

If the output JSON is generated with json4s, address_json should be parsed back into an AST representation; otherwise the address_json part will appear as an escaped string in the output JSON.
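For example (a sketch using json4s-jackson; the field values here are hypothetical), parsing address_json back into a JValue AST lets it be embedded as a real JSON object:

import org.json4s._
import org.json4s.jackson.JsonMethods._
import org.json4s.JsonDSL._

// address_json arrives as a plain string column value; parse it back into
// an AST so it is embedded as a JSON object, not an escaped string.
val addressJson = """{"street":"Main St","city":"Springfield"}"""  // hypothetical value
val addressAst: JValue = parse(addressJson)

val output: JValue = ("id" -> 1) ~ ("address" -> addressAst)
compact(render(output))
// {"id":1,"address":{"street":"Main St","city":"Springfield"}}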


Note that the Scala class scala.util.parsing.json.JSONObject is deprecated and does not support null values.

@deprecated("This class will be removed.", "2.11.0")

"JSONFormat.defaultFormat doesn't handle null values"

https://issues.scala-lang.org/browse/SI-5092
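If nulls can appear in the Row, one alternative (a sketch, assuming json4s-jackson is on the classpath) is to serialize the value map with json4s instead, which renders null values as JSON null:

import org.apache.spark.sql.Row
import org.json4s.DefaultFormats
import org.json4s.jackson.Serialization

def convertRowToJSON(row: Row): String = {
  // DefaultFormats is enough for the primitive values getValuesMap returns
  implicit val formats: DefaultFormats.type = DefaultFormats
  val m = row.getValuesMap[Any](row.schema.fieldNames)
  Serialization.write(m)   // null values become JSON null instead of failing
}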