How to parse nested JSON objects in Spark SQL?


Assuming you read in a JSON file and print its schema (the one you showed us) like this:

    DataFrame df = sqlContext.read().json("/path/to/file").toDF();
    df.registerTempTable("df");
    df.printSchema();

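Then you can select nested objects inside a struct type like so:

    // Narrow the DataFrame down to the nested "app" column.
    DataFrame app = df.select("app");
    app.registerTempTable("app");
    app.printSchema();
    app.show();

    // Note: printSchema() labels the items of an array column "element", but that
    // label is not part of the column path. If "app" is an array of structs and
    // this select fails to resolve, use app.select("app.appName") instead.
    DataFrame appName = app.select("element.appName");
    appName.registerTempTable("appName");
    appName.printSchema();
    appName.show();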


Try this:

    val nameAndAddress = sqlContext.sql("""
        SELECT name, address.city, address.state
        FROM people
    """)
    nameAndAddress.collect.foreach(println)

Source: https://databricks.com/blog/2015/02/02/an-introduction-to-json-support-in-spark-sql.html


Have you tried doing it straight from the SQL query, like this?

    SELECT apps.element.Ratings FROM yourTableName

This will probably return an array, which makes it easier to access the elements inside. Also, when I have to deal with large JSON structures whose schema is too complex to read, I use this online JSON viewer: http://jsonviewer.stack.hu/
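
Once the query returns an array, there are two common ways to get at individual elements: flatten the array with explode(), or index into it from SQL. Below is a minimal sketch of both, not a definitive implementation. It assumes Spark 2.2 or later (so SparkSession and Dataset<Row> take the place of the SQLContext and DataFrame used in the answers above) and a made-up record whose `apps` array holds items with `appName` and `Ratings` fields. The class name, the view name `records`, and the sample data are all hypothetical, since the real schema is not shown here.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

public class NestedJsonSketch {

    // Builds a one-record DataFrame from an in-memory JSON string.
    // The record is invented for illustration; it stands in for the JSON file.
    static Dataset<Row> sample(SparkSession spark) {
        List<String> json = Arrays.asList(
            "{\"id\": 1, \"apps\": ["
                + "{\"appName\": \"a\", \"Ratings\": 4},"
                + "{\"appName\": \"b\", \"Ratings\": 5}]}");
        return spark.read().json(spark.createDataset(json, Encoders.STRING()));
    }

    // Option 1: explode() emits one row per array item, so the alias "app"
    // is a plain struct and "app.appName" is an ordinary string column.
    static List<String> appNames(SparkSession spark) {
        Dataset<Row> flat = sample(spark).select(explode(col("apps")).as("app"));
        return flat.select("app.appName").as(Encoders.STRING()).collectAsList();
    }

    // Option 2: index into the array directly from SQL.
    // Whole numbers in JSON are inferred as long, hence getLong().
    static long firstRating(SparkSession spark) {
        sample(spark).createOrReplaceTempView("records");
        Row row = spark.sql("SELECT apps[0].Ratings FROM records").first();
        return row.getLong(0);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("nested-json-sketch")
            .master("local[*]")
            .getOrCreate();
        try {
            System.out.println(appNames(spark));
            System.out.println(firstRating(spark));
        } finally {
            spark.stop();
        }
    }
}
```

explode() is usually the better fit when you want every element as its own row (for joins or aggregation), while the `apps[0]` form is enough when only one known position matters.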