
Databricks Exception: Total size of serialized results is bigger than spark.driver.maxResultSize


You need to change this parameter in the cluster configuration. Go into the cluster settings, under Advanced Options select the Spark tab, and paste spark.driver.maxResultSize 0 (for unlimited) or whatever value suits you. Using 0 is not recommended; you should instead optimize the job by repartitioning.
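
For example, the line pasted into the Spark Config box could look like this (4g is only an illustrative value; pick a size that fits your driver):

spark.driver.maxResultSize 4g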


It looks like your driver has a limited size for storing results, and your result has crossed that limit, so you can increase the limit with the following command in your notebook.

sqlContext.getConf("spark.driver.maxResultSize")
res19: String = 20g

This returns the current maximum result size; in my case it was 20 GB.

sqlContext.setConf("spark.driver.maxResultSize", "30g")

To increase the maxResultSize you can use the above command.

This is not recommended, because it reduces the performance of your cluster: it minimizes the free space available for the temporary files used during processing. But it should solve your issue.


You need to increase the maxResultSize value for the cluster.

The maxResultSize must be set BEFORE the cluster is started -- trying to set the maxResultSize in the notebook after the cluster is started will not work.

"Edit" the cluster and set the value in the "Spark Config" section under "Advanced Options".

Here is a screenshot of Configure Cluster for Databricks in AWS, but something similar probably exists for Databricks in Azure.

[screenshot: cluster configuration]

In your notebook you can verify that the value is already set by including the following command:

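For example, something along these lines (a sketch using spark.conf.get from the Scala API; the 8g shown is simply the value configured on my cluster) returns the setting:

spark.conf.get("spark.driver.maxResultSize")
res0: String = 8g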

Of course 8g may not be large enough in your case, so keep increasing it until the problem goes away -- or something else blows up! Best of luck.

Note: When I ran into this problem, my notebook was attempting to write to S3, not directly trying to "collect" the data, so to speak.