How to parse nested JSON objects in Spark SQL?
Assuming you read in the JSON file and print the schema you are showing us, like this:
DataFrame df = sqlContext.read().json("/path/to/file").toDF();
df.registerTempTable("df");
df.printSchema();
Then you can select nested objects inside a struct type like so...
DataFrame app = df.select("app");
app.registerTempTable("app");
app.printSchema();
app.show();
DataFrame appName = app.select("element.appName");
appName.registerTempTable("appName");
appName.printSchema();
appName.show();
Try this:
val nameAndAddress = sqlContext.sql("""SELECT name, address.city, address.state FROM people""")
nameAndAddress.collect.foreach(println)
Source: https://databricks.com/blog/2015/02/02/an-introduction-to-json-support-in-spark-sql.html
Have you tried doing it straight from the SQL query, like this:
Select apps.element.Ratings from yourTableName
This will probably return an array, whose elements are then easier to access. Also, when I have to deal with large JSON structures and the schema is too complex, I use this online JSON viewer: http://jsonviewer.stack.hu/
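If the query does return an array, there are two common ways to get at the elements in Spark SQL. This is a minimal sketch, not a tested answer: it assumes `apps` is an array of structs with a `Ratings` field and reuses the placeholder table name `yourTableName` from the query above, so your actual schema may need different names. `LATERAL VIEW` is HiveQL syntax, so on older Spark 1.x versions it may need a `HiveContext` rather than a plain `SQLContext`.

```sql
-- Run each statement separately through sqlContext.sql(...).

-- Option 1: index into the array (indices are zero-based)
-- and read the nested field of that one element.
SELECT apps[0].Ratings FROM yourTableName

-- Option 2: explode the array into one row per element,
-- then select the nested field from each element.
-- "t" and "a" are arbitrary aliases chosen for this sketch.
SELECT a.Ratings FROM yourTableName LATERAL VIEW explode(apps) t AS a
```

Indexing suits the case where you only want a known position; exploding suits the case where you want to filter or aggregate over every element.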