How to set up Apache Spark to use the local hard disk when data does not fit in RAM in local mode? hadoop

How do I set up Apache Spark to use the local hard disk when data does not fit in RAM in local mode?


Look at http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence. You can choose among the various persistence (storage) levels according to your needs. MEMORY_AND_DISK is what will solve your problem: partitions that do not fit in RAM are spilled to the local disk rather than recomputed. If memory is tight, use MEMORY_AND_DISK_SER, which stores the data in serialized form; it uses less memory at the cost of extra CPU for deserialization.
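
A minimal sketch of applying that storage level in local mode, in Scala. The app name and the input path "/path/to/large-input.txt" are placeholders, not anything from your setup:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistExample {
  def main(args: Array[String]): Unit = {
    // Run Spark locally, using all available cores.
    val conf = new SparkConf().setMaster("local[*]").setAppName("persist-example")
    val sc = new SparkContext(conf)

    // Placeholder path: point this at your own large data set.
    val lines = sc.textFile("/path/to/large-input.txt")

    // MEMORY_AND_DISK keeps partitions in RAM and spills the ones that
    // do not fit to the local disk instead of recomputing them.
    lines.persist(StorageLevel.MEMORY_AND_DISK)

    // MEMORY_AND_DISK_SER stores partitions as serialized bytes, which
    // uses less memory at the cost of extra CPU for (de)serialization.
    // lines.persist(StorageLevel.MEMORY_AND_DISK_SER)

    println(lines.count())
    sc.stop()
  }
}
```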