
Spark Error: Failed to Send RPC to Datanode


It could be due to insufficient disk space. In my case, I was running a Spark job on AWS EMR with 1 r4.2xlarge (master) and 2 r4.8xlarge (core) nodes. Spark tuning and increasing the number of slave nodes solved my problem. The most common cause is memory pressure, due to bad configuration (i.e. wrong-sized executors), long-running tasks, and tasks that result in Cartesian products. You can speed up jobs with appropriate caching and by accounting for data skew. For the best performance, monitor and review long-running and resource-consuming Spark job executions. Hope it helps.
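As a minimal sketch of what "right-sized executors" might look like on this cluster, assuming r4.8xlarge core nodes (32 vCPUs, 244 GiB RAM each) — the numbers below are illustrative assumptions, not EMR defaults, and should be tuned against your workload:

    # Illustrative executor sizing for 2x r4.8xlarge core nodes (assumed values).
    # Rule of thumb: ~5 cores per executor; leave memory headroom for the OS
    # and for the YARN memory overhead on top of the executor heap.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --num-executors 10 \
      --executor-cores 5 \
      --executor-memory 36G \
      --conf spark.yarn.executor.memoryOverhead=4096 \
      your_job.py

With 5 executors per node, 5 x (36G heap + 4G overhead) = 200G, which fits within the 244 GiB per node with headroom. Oversized executors are a frequent way for YARN to kill containers mid-shuffle, which then surfaces on the driver as "Failed to send RPC" errors.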

Reference: EMR Spark - TransportClient: Failed to send RPC


In my case I reduced the driver and executor memory from 8G to 4G:

spark.driver.memory=4G
spark.executor.memory=4G
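The same limits can be passed straight on the command line with the standard spark-submit flags (the application name below is just a placeholder):

    spark-submit \
      --driver-memory 4G \
      --executor-memory 4G \
      your_job.py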

Check your nodes' configuration; you should not request more memory than is available.
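One way to see how much memory YARN can actually allocate per node is to read the NodeManager limit from yarn-site.xml (the path below is the usual EMR location; adjust for your distribution):

    # Assumed EMR config path; prints the per-node allocatable memory in MiB.
    grep -A1 'yarn.nodemanager.resource.memory-mb' /etc/hadoop/conf/yarn-site.xml

The executor memory plus its memory overhead must stay below that value, or YARN will refuse or kill the containers.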