Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the referenced columns only include the internal corrupt record column
You may try either of these two ways.
Option-1: Put the JSON on a single line, as answered above by @Avishek Bhattacharya.
Option-2: Add the multiline option when reading the JSON, as follows. You can then also select nested attributes, as shown below.
val df = spark.read.option("multiline", "true").json("C:\\data\\nested-data.json")
df.select("a.b").show()
Here is the output for Option-2.
20/07/29 23:14:35 INFO DAGScheduler: Job 1 finished: show at NestedJsonReader.scala:23, took 0.181579 s
+---+
|  b|
+---+
|  1|
+---+
The problem is with the JSON file. The file "D:/playground/input.json"
looks like this, as you described:
{ "a": { "b": 1 }}
This is not right. While processing JSON data, Spark considers each new line to be a complete JSON record, so it fails here.
You should keep the complete JSON on a single line, in compact form, by removing all whitespace and newlines.
Like
{"a":{"b":1}}
If you want multiple JSON records in a single file, keep each one on its own line, like this:
{"a":{"b":1}}
{"a":{"b":2}}
{"a":{"b":3}}
...
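As a sketch of the layout above (assuming a local SparkSession and a hypothetical file path, "C:\\data\\line-delimited.json"), a file with one compact JSON object per line can be read with the default json reader, without the multiline option:

import org.apache.spark.sql.SparkSession

// Hypothetical file "line-delimited.json" contains one compact
// JSON object per line, e.g.:
//   {"a":{"b":1}}
//   {"a":{"b":2}}
val spark = SparkSession.builder()
  .appName("LineDelimitedJsonExample")
  .master("local[*]")
  .getOrCreate()

// Default mode: each line must be a complete JSON record,
// so no multiline option is needed here.
val df = spark.read.json("C:\\data\\line-delimited.json")
df.select("a.b").show()

This is Spark's default (line-delimited, a.k.a. JSON Lines) mode; the multiline option is only needed when a single record spans several lines.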
For more info, see
This error means one of two things:
1- either your file format isn't what you think it is, and you are using the wrong method for it (e.g. it's plain text but you mistakenly used the json method), or
2- your file doesn't follow the standard for the format you are using (even though you used the correct method for the correct format); this usually happens with JSON.
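One way to check for case 2 is to inspect the inferred schema before querying anything (a hedged sketch, assuming an existing SparkSession named spark and the file path from the question): when Spark cannot parse any record, the schema typically contains only the internal corrupt record column.

// Sketch: diagnose a malformed JSON file before querying it.
val df = spark.read.json("D:/playground/input.json")

// If parsing failed, printSchema() typically shows only the
// internal column (named "_corrupt_record" by default):
df.printSchema()

// Querying only that column is what triggers the
// "queries from raw JSON/CSV files are disallowed" error.
if (df.columns.sameElements(Array("_corrupt_record"))) {
  println("File could not be parsed as line-delimited JSON; " +
    "compact it to one record per line or try .option(\"multiline\", \"true\").")
}

Fixing the file layout (or adding the multiline option) makes the real columns appear in the schema, and the error goes away.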