Not able to retrieve data from SparkR created DataFrame
Adding this line, which points SparkR at YARN in client mode, before calling sparkR.init() made the difference:
Sys.setenv("SPARKR_SUBMIT_ARGS"="--master yarn-client sparkr-shell")
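As far as I understand SparkR 1.6, this variable is read when sparkR.init() launches the backend, so it has to be set before that call and should end with sparkr-shell. A minimal sketch, using only base R, for checking the value before initializing (the names submit_args, has_master and ends_with_shell are just illustrative):

```r
# Set the submit arguments; do this before sparkR.init() is called.
Sys.setenv("SPARKR_SUBMIT_ARGS" = "--master yarn-client sparkr-shell")

# Read the value back and check it (base R only, no Spark needed).
submit_args <- Sys.getenv("SPARKR_SUBMIT_ARGS")
has_master <- grepl("--master yarn-client", submit_args, fixed = TRUE)
ends_with_shell <- grepl("sparkr-shell$", submit_args)

# Fail early if the variable does not look as expected.
if (!has_master || !ends_with_shell) {
  stop("SPARKR_SUBMIT_ARGS is not set as expected: ", submit_args)
}
```

If the check stops with an error, the variable was probably changed or cleared somewhere else in the session before the Spark context was created.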
Here is the full code:
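# Point R at the Hadoop configuration and the Spark installation
Sys.setenv(HADOOP_CONF_DIR = "/etc/hadoop/conf.cloudera.yarn")
Sys.setenv(SPARK_HOME = "/home/user/Downloads/spark-1.6.1-bin-hadoop2.6")

# Load SparkR from the Spark installation
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)

# Run against YARN in client mode; set this before sparkR.init()
Sys.setenv("SPARKR_SUBMIT_ARGS"="--master yarn-client sparkr-shell")

sc <- sparkR.init(sparkEnvir = list(spark.shuffle.service.enabled=TRUE,
                                    spark.dynamicAllocation.enabled=TRUE,
                                    spark.dynamicAllocation.initialExecutors="40"))
hiveContext <- sparkRHive.init(sc)

# Build a local data.frame and turn it into a SparkR DataFrame
n = 1000
x = data.frame(id = 1:n, val = rnorm(n))
xs <- createDataFrame(hiveContext, x)

xs           # print the DataFrame's columns and types
head(xs)     # first rows
collect(xs)  # bring all rows back as a local data.frame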