Spark on K8s - getting error: kube mode not support referencing app dependencies in local
The error message comes from commit 5d7c4ba4d73a72f26d591108db3c20b4a6c84f3f, which also includes the "Running Spark on Kubernetes" documentation page you mention, with the note you indicate:
```scala
// TODO(SPARK-23153): remove once submission client local dependencies are supported.
if (existSubmissionLocalFiles(sparkJars) || existSubmissionLocalFiles(sparkFiles)) {
  throw new SparkException("The Kubernetes mode does not yet support referencing application " +
    "dependencies in the local file system.")
}
```
This is described in SPARK-18278:
it wouldn't accept running a local: jar file, e.g. local:///opt/spark/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar, on my spark docker image (allowsMixedArguments and isAppResourceReq booleans in SparkSubmitCommandBuilder.java get in the way).
And this is linked to kubernetes issue 34377
The issue SPARK-22962 "Kubernetes app fails if local files are used" mentions:
This is the resource staging server use-case. We'll upstream this in the 2.4.0 timeframe.
In the meantime, that error message was introduced in PR 20320.
It includes the comment:
The manual tests I did actually use a main app jar located on gcs and http.
To be specific and for record, I did the following tests:
- Using a gs:// main application jar and a http:// dependency jar. Succeeded.
- Using a https:// main application jar and a http:// dependency jar. Succeeded.
- Using a local:// main application jar. Succeeded.
- Using a file:// main application jar. Failed.
- Using a file:// dependency jar. Failed.
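In other words, the main application jar has to come from somewhere other than the submission client's file system. A sketch of the succeeding pattern above, with the jar fetched over https:// instead (the cluster address, image name, and jar URL below are placeholders, not values from the tests):

```shell
# Hypothetical example of the working pattern: the main application jar
# is referenced by a remote https:// URI, so the driver fetches it itself
# instead of relying on files from the submission client's machine.
bin/spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  https://example.com/jars/spark-examples_2.11-2.3.0.jar
```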
That issue should have been fixed by now, and the OP garfiny confirms it in the comments:
I used the newest spark-kubernetes jar to replace the one in the spark-2.3.0-bin-hadoop2.7 package. The exception is gone.
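Concretely, garfiny's jar swap could look like this (the Maven Central URL and local paths below are my assumptions, not something stated in the comment):

```shell
# Sketch of the workaround described above: replace the spark-kubernetes
# jar shipped with 2.3.0 by the fixed 2.3.1 build. Paths and the download
# URL are assumptions; adjust for your installation.
cd spark-2.3.0-bin-hadoop2.7/jars
mv spark-kubernetes_2.11-2.3.0.jar spark-kubernetes_2.11-2.3.0.jar.bak
curl -LO https://repo1.maven.org/maven2/org/apache/spark/spark-kubernetes_2.11/2.3.1/spark-kubernetes_2.11-2.3.1.jar
```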
According to the mentioned documentation:
Dependency Management
If your application’s dependencies are all hosted in remote locations like HDFS or HTTP servers, they may be referred to by their appropriate remote URIs. Also, application dependencies can be pre-mounted into custom-built Docker images. Those dependencies can be added to the classpath by referencing them with local:// URIs and/or setting the SPARK_EXTRA_CLASSPATH environment variable in your Dockerfiles. The local:// scheme is also required when referring to dependencies in custom-built Docker images in spark-submit.
Note that using application dependencies from the submission client’s local file system is currently not yet supported.
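Per the quoted documentation, one way around this limitation is to bake the dependency into a custom image and reference it with a local:// URI. A minimal sketch (base image, jar names, and paths are assumptions for illustration):

```shell
# Sketch: build a custom Spark image with the application jar pre-mounted,
# then reference it via local:// at submit time. Image name, registry,
# and paths are placeholders.
cat > Dockerfile <<'EOF'
FROM gcr.io/cloud-solutions-images/spark:v2.3.0-gcs
COPY my-app.jar /opt/spark/jars/my-app.jar
ENV SPARK_EXTRA_CLASSPATH /opt/spark/jars/my-app.jar
EOF
docker build -t <registry>/spark-my-app:v1 .
docker push <registry>/spark-my-app:v1
# then submit with the in-image path:
#   spark-submit ... local:///opt/spark/jars/my-app.jar
```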
I have the same case and do not know how to fix it. Spark version is 2.3.0.
I copied and renamed spark-kubernetes_2.11-2.3.1.jar to spark-kubernetes_2.11-2.3.0.jar, but Spark does not find the corresponding kubernetes files.
```shell
bin/spark-submit \
  --master k8s://https://lubernetes:6443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.executor.instances=5 \
  --conf spark.kubernetes.container.image=gcr.io/cloud-solutions-images/spark:v2.3.0-gcs \
  --conf spark.kubernetes.authenticate.submission.caCertFile=/var/run/secrets/kubernetes.io/serviceaccount/k8.crt \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///usr/spark-2.3.0/examples/jars/spark-examples_2.11-2.3.0.jar
```
Thanks for the help!