How can I print nulls when converting a dataframe to json in Spark


To print null values in JSON using Spark's toJSON method, you can use the following code, which replaces nulls in string columns with the literal string "null" before the conversion:

myData.na.fill("null").toJSON

It will give you the expected result:

+-------------------------------------------+
|value                                      |
+-------------------------------------------+
|{"name":"Alice","age":"23","pets":"dog"}   |
|{"name":"Bob","age":"30","pets":"dog"}     |
|{"name":"Charlie","age":"35","pets":"null"}|
+-------------------------------------------+
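Note that na.fill("null") only affects string columns and writes the four-character string "null", not a JSON null. If a real JSON null is wanted, one possible alternative is the spark.sql.jsonGenerator.ignoreNullFields setting. The sketch below assumes Spark 3.0 or later; the local session and the sample rows are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: Spark 3.0+ is on the classpath; this local session is illustrative only.
val spark = SparkSession.builder().master("local[*]").appName("json-nulls").getOrCreate()
import spark.implicits._

// By default the JSON generator skips null fields; ask it to keep them.
spark.conf.set("spark.sql.jsonGenerator.ignoreNullFields", "false")

// Hypothetical sample data mirroring the table above; Charlie has no pet.
val myData = Seq(
  ("Alice", "23", Some("dog")),
  ("Bob", "30", Some("dog")),
  ("Charlie", "35", None)
).toDF("name", "age", "pets")

// Each row becomes one JSON string; null columns should now be written as null.
myData.toJSON.show(false)
```

On older Spark versions this setting is not available, so the fill-based approach or a custom generator is needed.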

I hope it helps!


I modified the JacksonGenerator.writeFields function and included the modified class in my project. The steps are below:

1) Create package 'org.apache.spark.sql.catalyst.json' inside 'src/main/scala/'

2) Copy the JacksonGenerator class from the Spark source

3) Create JacksonGenerator.scala class in '' package and paste the copied code

4) Modify the writeFields function so that null fields are written instead of skipped:

private def writeFields(row: InternalRow, schema: StructType, fieldWriters: Seq[ValueWriter]): Unit = {
  var i = 0
  while (i < row.numFields) {
    val field = schema(i)
    if (!row.isNullAt(i)) {
      gen.writeFieldName(field.name)
      fieldWriters(i).apply(row, i)
    } else {
      // Emit an explicit JSON null instead of omitting the field
      gen.writeNullField(field.name)
    }
    i += 1
  }
}
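The change above comes down to Jackson's writeNullField call. As a standalone illustration, here is a minimal sketch that assumes jackson-core is on the classpath (Spark depends on it); the field names and values are made up.

```scala
import java.io.StringWriter
import com.fasterxml.jackson.core.JsonFactory

// Write one JSON object by hand, the way a row writer would.
val out = new StringWriter()
val gen = new JsonFactory().createGenerator(out)

gen.writeStartObject()
gen.writeStringField("name", "Charlie") // non-null field: name plus value
gen.writeNullField("pets")              // null field: name plus a JSON null
gen.writeEndObject()
gen.close()

println(out.toString) // {"name":"Charlie","pets":null}
```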


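import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
import scala.util.parsing.json.JSONObject

// Note: the filter removes null-valued fields, so they are omitted from the resulting JSON.
def convertRowToJSON(row: Row): String = {
  // The explicit [Any] keeps the value type from being inferred as Nothing.
  val m = row.getValuesMap[Any](row.schema.fieldNames).filter(_._2 != null)
  JSONObject(m).toString()
}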
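If the null fields should be kept rather than filtered out, the field/value pairs can be rendered by hand. The following is only a sketch in plain Scala with no Spark dependency; NullSafeJson and its methods are hypothetical names, not part of Spark or of the snippet above, and nested structures are simply stringified.

```scala
// Hypothetical helper: renders field/value pairs as a JSON object and
// writes null values as JSON null instead of dropping them.
object NullSafeJson {
  // Escape quotes, backslashes, and control characters for a JSON string literal.
  private def escape(s: String): String = {
    val sb = new StringBuilder
    s.foreach { c =>
      if (c == '"') sb.append("\\\"")
      else if (c == '\\') sb.append("\\\\")
      else if (c < ' ') sb.append("\\u%04x".format(c.toInt))
      else sb.append(c)
    }
    sb.toString
  }

  // Render a single value; anything that is not null, a string,
  // a boolean, or a number is written as a quoted string.
  private def value(v: Any): String = v match {
    case null                   => "null"
    case s: String              => "\"" + escape(s) + "\""
    case _: Boolean | _: Number => v.toString
    case other                  => "\"" + escape(other.toString) + "\""
  }

  // A Seq of pairs (rather than a Map) keeps the fields in schema order.
  def toJson(fields: Seq[(String, Any)]): String =
    fields.map { case (k, v) => "\"" + escape(k) + "\":" + value(v) }
      .mkString("{", ",", "}")
}
```

For example, NullSafeJson.toJson(Seq("name" -> "Charlie", "age" -> 35, "pets" -> null)) returns {"name":"Charlie","age":35,"pets":null}. With a Spark Row, the pairs could presumably be built from row.schema.fieldNames and row.toSeq.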