5e74570c8f
## What changes were proposed in this pull request?

- Solves the current issue with `--packages` in cluster mode (there is no ticket for it). Also note some past [issues](https://issues.apache.org/jira/browse/SPARK-22657) when Hadoop libs are used on the spark-submit side.
- Supports `spark.jars`, `spark.files`, and the app jar.

It works as follows: spark-submit uploads the dependencies to the HCFS. The driver then serves the dependencies via the Spark file server, so no HCFS URIs are propagated. The related design document is [here](https://docs.google.com/document/d/1peg_qVhLaAl4weo5C51jQicPwLclApBsdR1To2fgc48/edit).

The next option to add is the RSS, but it has to be improved given the past discussion about it (Spark 2.3).

## How was this patch tested?

- Ran the integration test suite.
- Ran an example using S3:

```
./bin/spark-submit \
  ...
  --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.6 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.memory=1G \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
  --conf spark.driver.memory=1G \
  --conf spark.executor.instances=2 \
  --conf spark.sql.streaming.metricsEnabled=true \
  --conf "spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp" \
  --conf spark.kubernetes.container.image.pullPolicy=Always \
  --conf spark.kubernetes.container.image=skonto/spark:k8s-3.0.0 \
  --conf spark.kubernetes.file.upload.path=s3a://fdp-stavros-test \
  --conf spark.hadoop.fs.s3a.access.key=... \
  --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  --conf spark.hadoop.fs.s3a.fast.upload=true \
  --conf spark.kubernetes.executor.deleteOnTermination=false \
  --conf spark.hadoop.fs.s3a.secret.key=... \
  --conf spark.files=client:///...resolv.conf \
  file:///my.jar
```

- Added integration tests based on [Ceph nano](https://github.com/ceph/cn).
  It looks very [active](http://www.sebastien-han.fr/blog/2019/02/24/Ceph-nano-is-getting-better-and-better/). Unfortunately, minio needs Hadoop >= 2.8.

Closes #23546 from skonto/support-client-deps.

Authored-by: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com>
Signed-off-by: Erik Erlandson <eerlands@redhat.com>
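As a rough illustration of the mechanism this PR describes (local dependencies are uploaded to a staging directory under `spark.kubernetes.file.upload.path`, and their URIs are rewritten so the driver can serve them), here is a hedged sketch. The `rewrite_dep` function, the staging-directory name, and the example paths are illustrative inventions, not the PR's actual code:

```shell
# Hypothetical sketch (not the PR's implementation): rewrite local
# dependency URIs to their uploaded HCFS location. Only file:// URIs and
# bare absolute paths are rewritten; remote URIs pass through unchanged.
upload_path="s3a://fdp-test"                   # stands in for spark.kubernetes.file.upload.path
staging_dir="$upload_path/spark-upload-1234"   # made-up unique per-submission directory

rewrite_dep() {
  local dep="$1"
  case "$dep" in
    file://*|/*) echo "$staging_dir/$(basename "$dep")" ;;  # local: point at the uploaded copy
    *)           echo "$dep" ;;                             # remote: leave untouched
  esac
}

rewrite_dep "file:///opt/my.jar"      # -> s3a://fdp-test/spark-upload-1234/my.jar
rewrite_dep "s3a://bucket/other.jar"  # -> s3a://bucket/other.jar
```

The actual upload itself would go through the Hadoop FileSystem API, which is what lets any HCFS (S3A, HDFS, etc.) act as the staging area.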