
Running a Spark cluster in standalone mode vs YARN/Mesos


The Spark standalone cluster manager can also give you cluster mode capabilities.

A Spark standalone cluster provides almost all the same features as the other cluster managers if you are only running Spark.

When you submit your application in cluster mode, all your job-related files are copied to one of the machines in the cluster, which then submits the job on your behalf. If you submit the application in client mode, the machine from which the job is submitted takes care of the driver-related activities. This means that in client mode the submitting machine cannot go offline while the job runs, whereas in cluster mode it can.
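To make the client vs cluster distinction concrete, here is a minimal sketch that only assembles the two `spark-submit` command lines. The `--class`, `--master` and `--deploy-mode` flags are real `spark-submit` options; the master URL, main class and jar name are hypothetical placeholders, and `build_submit_command` is a helper made up for this illustration.

```python
import shlex


def build_submit_command(master_url, deploy_mode, main_class, app_jar):
    """Assemble (but do not run) a spark-submit command line."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--class", main_class,
        "--master", master_url,
        "--deploy-mode", deploy_mode,
        app_jar,
    ]


# Hypothetical values, replace with your own cluster and application.
MASTER = "spark://master-host:7077"
MAIN_CLASS = "com.example.Main"
APP_JAR = "my-app.jar"

# Client mode: the driver runs on the machine that submits the job.
client_cmd = build_submit_command(MASTER, "client", MAIN_CLASS, APP_JAR)
# Cluster mode: the driver runs on a machine inside the cluster.
cluster_cmd = build_submit_command(MASTER, "cluster", MAIN_CLASS, APP_JAR)

print(shlex.join(client_cmd))
print(shlex.join(cluster_cmd))
```

The only difference between the two commands is the value passed to `--deploy-mode`, which decides where the driver lives.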

Having a Cassandra cluster would not change any of these behaviors either, except that it can save you network traffic if the Spark executor can use the nearest contact point (similar to data locality).

Failed stages get rescheduled whichever cluster manager you use.


I was wondering: if I switch to Hadoop and start using a resource manager like YARN or Mesos, does it give me additional performance advantages, such as shorter execution time and better resource management?

In the standalone cluster model, each application by default uses all the available cores in the cluster.

From the Spark standalone documentation page:

The standalone cluster mode currently only supports a simple FIFO scheduler across applications. However, to allow multiple concurrent users, you can control the maximum number of resources each application will use. By default, it will acquire all cores in the cluster, which only makes sense if you just run one application at a time.

In other cases (when you are running multiple applications in the cluster), you may prefer YARN.
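The scheduling behavior described in the quote above can be sketched with a toy model. This is not Spark code, just a simplified illustration under the assumption that applications are served strictly in submission order and each one takes every free core unless it has a cap (in real Spark the cap is set with the `spark.cores.max` property). The function name `allocate_fifo` and the application names are made up for the example.

```python
def allocate_fifo(total_cores, apps):
    """Toy FIFO allocation of cores.

    apps is a list of (name, max_cores) tuples in submission order;
    max_cores=None means "no cap", i.e. take every core that is free.
    Returns a dict mapping each application name to the cores granted.
    """
    remaining = total_cores
    granted = {}
    for name, max_cores in apps:
        if max_cores is None:
            wanted = remaining
        else:
            wanted = min(max_cores, remaining)
        granted[name] = wanted
        remaining -= wanted
    return granted


# Without a cap, the first application takes the whole 8-core cluster.
uncapped = allocate_fifo(8, [("app-1", None), ("app-2", None)])
# With a cap of 4 cores each, both applications can run side by side.
capped = allocate_fifo(8, [("app-1", 4), ("app-2", 4)])

print(uncapped)  # {'app-1': 8, 'app-2': 0}
print(capped)    # {'app-1': 4, 'app-2': 4}
```

In the uncapped case the second application gets nothing until the first one finishes, which is why the default only makes sense for one application at a time.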

Currently, when I process a huge chunk of data, stages sometimes fail during shuffling. If I migrate to YARN, can the Resource Manager address this issue?

Not sure, since your application logic is not known. But you can give YARN a try.

Have a look at this related SE question on the benefits of YARN over Standalone and Mesos:

Which cluster type should I choose for Spark?