
Job 65 cancelled because SparkContext was shut down


Your job is getting aborted at the write step; "Job aborted." is the exception message for that, and it is what leads to the SparkContext being shut down.

Look into optimising the write step. maxRecordsPerFile might be the culprit; try a lower number, since you currently have 1M records per file!
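If you want a concrete starting point, here is a minimal sketch (the input/output paths and the 100k threshold are assumptions, not values from your job) that caps the records per output file at write time:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("write-tuning").getOrCreate()
val df = spark.read.parquet("/path/to/input") // placeholder input

df.write
  .option("maxRecordsPerFile", 100000L) // well below the current ~1M rows per file
  .mode("overwrite")
  .parquet("/path/to/output")           // placeholder output
```

The same limit can also be set cluster-wide via the spark.sql.files.maxRecordsPerFile configuration, if you would rather not repeat the option on every write.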


In general, Job ${job.jobId} cancelled because SparkContext was shut down just means an exception occurred due to which the DAG couldn't continue and had to error out. It's the Spark scheduler throwing this error when it faces an exception; that might be an exception unhandled in your code, or a job failure for any other reason. And as the DAG scheduler is stopped, the entire application gets stopped (this message is part of the cleanup).
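Because the cancellation message is emitted during cleanup, the root cause usually sits in the first failed stage, not in that message. As a hedged illustration (the paths are placeholders), wrapping the failing action lets you log the underlying exception directly:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("debug-write").getOrCreate()
val df = spark.read.parquet("/path/to/input") // placeholder input

try {
  df.write.mode("overwrite").parquet("/path/to/output") // the failing action
} catch {
  case e: Exception =>
    // This stack trace points at the real failure, not the generic
    // "cancelled because SparkContext was shut down" cleanup message.
    e.printStackTrace()
    throw e
}
```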


To your questions -

When a SparkContext shuts down, does that mean my bridge to the Spark cluster is down?

SparkContext represents the connection to a Spark cluster, so if it's dead you can't run jobs on it, as you have lost the link! On Zeppelin, you can just restart the SparkContext (Menu -> Interpreter -> Spark Interpreter -> restart).
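Outside Zeppelin (e.g. in a standalone application or a shell session), the usual way to get a fresh bridge is simply to build a new session. A small sketch, with the app name being an assumption:

```scala
import org.apache.spark.sql.SparkSession

// getOrCreate() hands back a working session (and with it a new
// SparkContext) once the previous one is gone.
val spark = SparkSession.builder().appName("rebuilt-session").getOrCreate()
val sc = spark.sparkContext // the new link to the cluster
```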

And, if that's the case, how can I cause the bridge to the spark cluster to go down?

With a SparkException/Error in your jobs, or manually by calling sc.stop().
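For example, this minimal sketch (the local master and the toy RDD are assumptions, just for demonstration) shows the manual route:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("shutdown-demo")
  .master("local[*]") // assumption: a local cluster for demonstration
  .getOrCreate()
val sc = spark.sparkContext

sc.parallelize(1 to 10).count() // runs fine while the context is alive
sc.stop()                       // manually bring the bridge down

// Any action attempted now fails: Spark refuses to run jobs on a stopped
// SparkContext, and jobs still in flight at shutdown are cancelled with
// the "SparkContext was shut down" message.
// sc.parallelize(1 to 10).count()
```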