ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Jungtaek Lim (HeartSaVioR)	128ea37bda	[SPARK-28601][CORE][SQL] Use StandardCharsets.UTF_8 instead of "UTF-8" string representation, and get rid of UnsupportedEncodingException ## What changes were proposed in this pull request? This patch tries to keep consistency whenever UTF-8 charset is needed, as using `StandardCharsets.UTF_8` instead of using "UTF-8". If the String type is needed, `StandardCharsets.UTF_8.name()` is used. This change also brings the benefit of getting rid of `UnsupportedEncodingException`, as we're providing `Charset` instead of `String` whenever possible. This also changes some private Catalyst helper methods to operate on encodings as `Charset` objects rather than strings. ## How was this patch tested? Existing unit tests. Closes #25335 from HeartSaVioR/SPARK-28601. Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-05 20:45:54 -07:00
HyukjinKwon	946aef0535	[SPARK-28550][K8S][TESTS] Unset SPARK_HOME environment variable in K8S integration preparation ## What changes were proposed in this pull request? Currently, if we run the Kubernetes integration tests with `SPARK_HOME` already set, it refers the `SPARK_HOME` even when `--spark-tgz` is specified. This PR proposes to unset `SPARK_HOME` to let the docker-image-tool script detect `SPARK_HOME`. Otherwise, it cannot indicate the unpacked directory as its home. ## How was this patch tested? ```bash export SPARK_HOME=`pwd` dev/make-distribution.sh --pip --tgz -Phadoop-2.7 -Pkubernetes resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --spark-tgz $PWD/spark-.tgz ``` Before:* ``` + /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/bin/docker-image-tool.sh -r docker.io/kubespark -t 650B51C8-BBED-47C9-AEAB-E66FC9A0E64E -p /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory cp: assembly/target/scala-2.12/jars: No such file or directory cp: resource-managers/kubernetes/integration-tests/tests: No such file or directory cp: examples/target/scala-2.12/jars/: No such file or directory cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory Cannot find docker image. This script must be run from a runnable distribution of Apache Spark. ... [INFO] Spark Project Kubernetes Integration Tests ......... FAILURE [ 4.870 s] [INFO] ------------------------------------------------------------------------ [INFO] BUILD FAILURE ``` After:* ``` + /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/bin/docker-image-tool.sh -r docker.io/kubespark -t 2BA5883A-A0AC-4D2B-8D00-702D31B59B23 -p /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build Sending build context to Docker daemon 250.2MB Step 1/15 : FROM openjdk:8-alpine ---> a3562aa0b991 ... Successfully built 8614fb5ac279 Successfully tagged kubespark/spark:2BA5883A-A0AC-4D2B-8D00-702D31B59B23 ``` Closes #25283 from HyukjinKwon/SPARK-28550. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-07-29 10:47:28 -07:00
Junjie Chen	780d176136	[SPARK-28042][K8S] Support using volume mount as local storage ## What changes were proposed in this pull request? This pr is used to support using hostpath/PV volume mounts as local storage. In KubernetesExecutorBuilder.scala, the LocalDrisFeatureStep is built before MountVolumesFeatureStep which means we cannot use any volumes mount later. This pr adjust the order of feature building steps which moves localDirsFeature at last so that we can check if directories in SPARK_LOCAL_DIRS are set to volumes mounted such as hostPath, PV, or others. ## How was this patch tested? Unit tests Closes #24879 from chenjunjiedada/SPARK-28042. Lead-authored-by: Junjie Chen <jimmyjchen@tencent.com> Co-authored-by: Junjie Chen <cjjnjust@gmail.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-07-29 10:44:17 -07:00
Dongjoon Hyun	767802500c	[SPARK-28534][K8S][TEST] Update node affinity for DockerForDesktop backend in PVTestsSuite ## What changes were proposed in this pull request? This PR aims to recover our K8s integration test suite by extending node affinity in order to pass `PVTestsSuite` in `DockerForDesktop` environment, too. Previously, `PVTestsSuite` fails at `--deploy-mode docker-for-desktop` option because the node affinity requires `minibase` node. For `Docker Desktop`, there are two node names like the following. Note that Spark testing needs K8s v1.13 and above. So, this PR should be verified with `Docker Desktop (Edge)` version. This PR adds both because next stable `Docker Desktop` will have K8s v1.14.3. Docker Desktop (Stable, K8s v1.10.11) ``` $ kubectl get node NAME STATUS ROLES AGE VERSION docker-for-desktop Ready master 52s v1.10.11 ``` Docker Desktop 2.1.0.0 (Edge, K8s v1.14.3, Released 2019-07-26) ``` $ kubectl get node NAME STATUS ROLES AGE VERSION docker-desktop Ready master 16h v1.14.3 ``` ## How was this patch tested? Pass the Jenkins K8s integration test (`minibase`) and install `Docker Desktop 2.1.0.0 (Edge)` and run the integration test in `DockerForDesktop`. Note that this fixes only `PVTestsSuite`. ``` $ dev/make-distribution.sh --pip --tgz -Phadoop-2.7 -Pkubernetes $ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --spark-tgz $PWD/spark-*.tgz ... KubernetesSuite: ... - PVs with local storage ... ``` Closes #25269 from dongjoon-hyun/SPARK-28534. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-07-29 16:28:56 +09:00
Stavros Kontopoulos	7504eab42f	[SPARK-28465][K8S] Fix integration tests which fail due to missing ceph-nano image ## What changes were proposed in this pull request? Fixes the tests. Follows instructions here: https://github.com/ceph/cn/issues/115#issuecomment-497384369 ## How was this patch tested? Manually by running the tests with minikube. Closes #25222 from skonto/fix-ceph. Authored-by: Stavros Kontopoulos <st.kontopoulos@gmail.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-07-24 14:58:20 -07:00
Douglas R Colkitt	8fc5cb6285	[SPARK-28473][DOC] Stylistic consistency of build command in README ## What changes were proposed in this pull request? Change the format of the build command in the README to start with a `./` prefix ./build/mvn -DskipTests clean package This increases stylistic consistency across the README- all the other commands have a `./` prefix. Having a visible `./` prefix also makes it clear to the user that the shell command requires the current working directory to be at the repository root. ## How was this patch tested? README.md was reviewed both in raw markdown and in the Github rendered landing page for stylistic consistency. Closes #25231 from Mister-Meeseeks/master. Lead-authored-by: Douglas R Colkitt <douglas.colkitt@gmail.com> Co-authored-by: Mister-Meeseeks <douglas.colkitt@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-07-23 16:29:46 -07:00
Onur Satici	e7c97a3d86	[SPARK-28145][K8S] safe runnable in polling executor source ## What changes were proposed in this pull request? Add error handling to `ExecutorPodsPollingSnapshotSource` Closes #24952 from onursatici/os/polling-source. Authored-by: Onur Satici <onursatici@gmail.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-06-28 09:38:43 -05:00
Xiangrui Meng	7056e004ee	[SPARK-27823][CORE] Refactor resource handling code ## What changes were proposed in this pull request? Continue the work from https://github.com/apache/spark/pull/24821. Refactor resource handling code to make the code more readable. Major changes: * Moved resource-related classes to `spark.resource` from `spark`. * Added ResourceUtils and helper classes so we don't need to directly deal with Spark conf. * ResourceID: resource identifier and it provides conf keys * ResourceRequest/Allocation: abstraction for requested and allocated resources * Added `TestResourceIDs` to reference commonly used resource IDs in tests like `spark.executor.resource.gpu`. cc: tgravescs jiangxb1987 Ngone51 ## How was this patch tested? Unit tests for added utils and existing unit tests. Closes #24856 from mengxr/SPARK-27823. Lead-authored-by: Xiangrui Meng <meng@databricks.com> Co-authored-by: Thomas Graves <tgraves@nvidia.com> Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>	2019-06-18 17:18:17 -07:00
Stavros Kontopoulos	7912ab85a6	[SPARK-27872][K8S] Fix executor service account inconsistency ## What changes were proposed in this pull request? Fixes the service account inconsistency that breaks pull secrets. It gives the option to the user to setup a specific service account for the executors if he has to (via `spark.kubernetes.authenticate.executor.serviceAccountName`). Defaults to the driver's one. We are not supporting special authentication credentials for the executors with this PR. ## How was this patch tested? Tested manually by launching a Spark job exercising the introduced settings. Added a new integration tests for this fix. Closes #24748 from skonto/fix_executor_sa. Authored-by: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-06-09 16:28:37 -05:00
Thomas Graves	d30284b5a5	[SPARK-27760][CORE] Spark resources - change user resource config from .count to .amount ## What changes were proposed in this pull request? Change the resource config spark.{executor/driver}.resource.{resourceName}.count to .amount to allow future usage of containing both a count and a unit. Right now we only support counts - # of gpus for instance, but in the future we may want to support units for things like memory - 25G. I think making the user only have to specify a single config .amount is better then making them specify 2 separate configs of a .count and then a .unit. Change it now since its a user facing config. Amount also matches how the spark on yarn configs are setup. ## How was this patch tested? Unit tests and manually verified on yarn and local cluster mode Closes #24810 from tgravescs/SPARK-27760-amount. Authored-by: Thomas Graves <tgraves@nvidia.com> Signed-off-by: Thomas Graves <tgraves@apache.org>	2019-06-06 14:16:05 -05:00
Thomas Graves	1277f8fa92	[SPARK-27362][K8S] Resource Scheduling support for k8s ## What changes were proposed in this pull request? Add ability to map the spark resource configs spark.{executor/driver}.resource.{resourceName} to kubernetes Container builder so that we request resources (gpu,s/fpgas/etc) from kubernetes. Note that the spark configs will overwrite any resource configs users put into a pod template. I added a generic vendor config which is only used by kubernetes right now. I intentionally didn't put it into the kubernetes config namespace just to avoid adding more config prefixes. I will add more documentation for this under jira SPARK-27492. I think it will be easier to do all at once to get cohesive story. ## How was this patch tested? Unit tests and manually testing on k8s cluster. Closes #24703 from tgravescs/SPARK-27362. Authored-by: Thomas Graves <tgraves@nvidia.com> Signed-off-by: Thomas Graves <tgraves@apache.org>	2019-05-31 15:26:14 -05:00
Yuming Wang	db3e746b64	[SPARK-27875][CORE][SQL][ML][K8S] Wrap all PrintWriter with Utils.tryWithResource ## What changes were proposed in this pull request? This pr wrap all `PrintWriter` with `Utils.tryWithResource` to prevent resource leak. ## How was this patch tested? Existing test Closes #24739 from wangyum/SPARK-27875. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-05-30 19:54:32 +09:00
Stavros Kontopoulos	5e74570c8f	[SPARK-23153][K8S] Support client dependencies with a Hadoop Compatible File System ## What changes were proposed in this pull request? - solves the current issue with --packages in cluster mode (there is no ticket for it). Also note of some [issues](https://issues.apache.org/jira/browse/SPARK-22657) of the past here when hadoop libs are used at the spark submit side. - supports spark.jars, spark.files, app jar. It works as follows: Spark submit uploads the deps to the HCFS. Then the driver serves the deps via the Spark file server. No hcfs uris are propagated. The related design document is [here](https://docs.google.com/document/d/1peg_qVhLaAl4weo5C51jQicPwLclApBsdR1To2fgc48/edit). the next option to add is the RSS but has to be improved given the discussion in the past about it (Spark 2.3). ## How was this patch tested? - Run integration test suite. - Run an example using S3: ``` ./bin/spark-submit \ ... --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.6 \ --deploy-mode cluster \ --name spark-pi \ --class org.apache.spark.examples.SparkPi \ --conf spark.executor.memory=1G \ --conf spark.kubernetes.namespace=spark \ --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \ --conf spark.driver.memory=1G \ --conf spark.executor.instances=2 \ --conf spark.sql.streaming.metricsEnabled=true \ --conf "spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp" \ --conf spark.kubernetes.container.image.pullPolicy=Always \ --conf spark.kubernetes.container.image=skonto/spark:k8s-3.0.0 \ --conf spark.kubernetes.file.upload.path=s3a://fdp-stavros-test \ --conf spark.hadoop.fs.s3a.access.key=... \ --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \ --conf spark.hadoop.fs.s3a.fast.upload=true \ --conf spark.kubernetes.executor.deleteOnTermination=false \ --conf spark.hadoop.fs.s3a.secret.key=... \ --conf spark.files=client:///...resolv.conf \ file:///my.jar ** ``` Added integration tests based on [Ceph nano](https://github.com/ceph/cn). Looks very [active](http://www.sebastien-han.fr/blog/2019/02/24/Ceph-nano-is-getting-better-and-better/). Unfortunately minio needs hadoop >= 2.8. Closes #23546 from skonto/support-client-deps. Authored-by: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com> Signed-off-by: Erik Erlandson <eerlands@redhat.com>	2019-05-22 16:15:42 -07:00
Arun Mahadevan	1a8c09334d	[SPARK-27754][K8S] Introduce additional config (spark.kubernetes.driver.request.cores) for driver request cores for spark on k8s ## What changes were proposed in this pull request? Spark on k8s supports config for specifying the executor cpu requests (spark.kubernetes.executor.request.cores) but a similar config is missing for the driver. Instead, currently `spark.driver.cores` value is used for integer value. Although `pod spec` can have `cpu` for the fine-grained control like the following, this PR proposes additional configuration `spark.kubernetes.driver.request.cores` for driver request cores. ``` resources: requests: memory: "64Mi" cpu: "250m" ``` ## How was this patch tested? Unit tests Closes #24630 from arunmahadevan/SPARK-27754. Authored-by: Arun Mahadevan <arunm@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-18 21:28:46 -07:00
Sean Owen	bfb3ffe9b3	[SPARK-27682][CORE][GRAPHX][MLLIB] Replace use of collections and methods that will be removed in Scala 2.13 with work-alikes ## What changes were proposed in this pull request? This replaces use of collection classes like `MutableList` and `ArrayStack` with workalikes that are available in 2.12, as they will be removed in 2.13. It also removes use of `.to[Collection]` as its uses was superfluous anyway. Removing `collection.breakOut` will have to wait until 2.13 ## How was this patch tested? Existing tests Closes #24586 from srowen/SPARK-27682. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-05-15 09:29:12 -05:00
Adi Muraru	8ef4da753d	[SPARK-27610][YARN] Shade netty native libraries ## What changes were proposed in this pull request? Fixed the `spark-<version>-yarn-shuffle.jar` artifact packaging to shade the native netty libraries: - shade the `META-INF/native/libnetty_*` native libraries when packagin the yarn shuffle service jar. This is required as netty library loader derives that based on shaded package name. - updated the `org/spark_project` shade package prefix to `org/sparkproject` (i.e. removed underscore) as the former breaks the netty native lib loading. This was causing the yarn external shuffle service to fail when spark.shuffle.io.mode=EPOLL ## How was this patch tested? Manual tests Closes #24502 from amuraru/SPARK-27610_master. Authored-by: Adi Muraru <amuraru@adobe.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-05-07 10:47:36 -07:00
Rob Vesse	b1c6b60ce7	[SPARK-26729][K8S] Fix typo with default value for R image name ## What changes were proposed in this pull request? As discovered by users making use of this feature, there is a bug in the declaration of the `R_IMAGE_NAME` variable that causes the default name to not be properly set to `spark-r` but rather to just `-r` ## How was this patch tested? Verified that the image name for the R image is now appropriately populated in the integration test script via Bash debug output. NB - The fact that this wasn't spotted earlier highlights the fact that currently the K8S integration test suite does not have any tests for the R image as if it had this would have failed integration testing in the original PR #23846 Closes #24449 from rvesse/SPARK-26729. Authored-by: Rob Vesse <rvesse@dotnetrdf.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-04-24 21:08:42 -07:00
Sean Owen	d4420b455a	[SPARK-27323][CORE][SQL][STREAMING] Use Single-Abstract-Method support in Scala 2.12 to simplify code ## What changes were proposed in this pull request? Use Single Abstract Method syntax where possible (and minor related cleanup). Comments below. No logic should change here. ## How was this patch tested? Existing tests. Closes #24241 from srowen/SPARK-27323. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-04-02 07:37:05 -07:00
Yuming Wang	b670f39fc6	[SPARK-24793][FOLLOW-UP][K8S] Remove duplicate declaration of mockito-core ## What changes were proposed in this pull request? ``` [WARNING] Some problems were encountered while building the effective model for org.apache.spark:spark-kubernetes_2.12🫙3.0.0-SNAPSHOT [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: org.mockito:mockito-core:jar -> duplicate declaration of version (?) org.apache.spark:spark-kubernetes_2.12:[unknown-version], /Users/yumwang/spark/resource-managers/kubernetes/core/pom.xml, line 98, column 17 ``` This pr remove duplicate declaration of `mockito-core`. ## How was this patch tested? N/A Closes #24256 from wangyum/SPARK-24793-FOLLOW-UP. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-03-30 21:29:32 -07:00
Stavros Kontopoulos	39577a27a0	[SPARK-24902][K8S] Add PV integration tests ## What changes were proposed in this pull request? - Adds persistent volume integration tests - Adds a custom tag to the test to exclude it if it is run against a cloud backend. - Assumes default fs type for the host, AFAIK that is ext4. ## How was this patch tested? Manually run the tests against minikube as usual: ``` [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) spark-kubernetes-integration-tests_2.12 --- Discovery starting. Discovery completed in 192 milliseconds. Run starting. Expected test count is: 16 KubernetesSuite: - Run SparkPi with no resources - Run SparkPi with a very long application name. - Use SparkLauncher.NO_RESOURCE - Run SparkPi with a master URL without a scheme. - Run SparkPi with an argument. - Run SparkPi with custom labels, annotations, and environment variables. - Run extraJVMOptions check on driver - Run SparkRemoteFileTest using a remote data file - Run SparkPi with env and mount secrets. - Run PySpark on simple pi.py example - Run PySpark with Python2 to test a pyfiles example - Run PySpark with Python3 to test a pyfiles example - Run PySpark with memory customization - Run in client mode. - Start pod creation from template - Test PVs with local storage ``` Closes #23514 from skonto/pvctests. Authored-by: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com> Signed-off-by: shane knapp <incomplete@gmail.com>	2019-03-27 13:00:56 -07:00
Stavros Kontopoulos	05168e725d	[SPARK-24793][K8S] Enhance spark-submit for app management - supports `--kill` & `--status` flags. - supports globs which is useful in general check this long standing [issue](https://github.com/kubernetes/kubernetes/issues/17144#issuecomment-272052461) for kubectl. Manually against running apps. Example output: Submission Id reported at launch time: ``` 2019-01-20 23:47:56 INFO Client:58 - Waiting for application spark-pi with submissionId spark:spark-pi-1548020873671-driver to finish... ``` Killing the app: ``` ./bin/spark-submit --kill spark:spark-pi-1548020873671-driver --master k8s://https://192.168.2.8:8443 2019-01-20 23:48:07 WARN Utils:70 - Your hostname, universe resolves to a loopback address: 127.0.0.1; using 192.168.2.8 instead (on interface wlp2s0) 2019-01-20 23:48:07 WARN Utils:70 - Set SPARK_LOCAL_IP if you need to bind to another address ``` App terminates with 143 (SIGTERM, since we have tiny this should lead to [graceful shutdown](https://cloud.google.com/solutions/best-practices-for-building-containers)): ``` 2019-01-20 23:48:08 INFO LoggingPodStatusWatcherImpl:58 - State changed, new state: pod name: spark-pi-1548020873671-driver namespace: spark labels: spark-app-selector -> spark-e4730c80e1014b72aa77915a2203ae05, spark-role -> driver pod uid: 0ba9a794-1cfd-11e9-8215-a434d9270a65 creation time: 2019-01-20T21:47:55Z service account name: spark-sa volumes: spark-local-dir-1, spark-conf-volume, spark-sa-token-b7wcm node name: minikube start time: 2019-01-20T21:47:55Z phase: Running container status: container name: spark-kubernetes-driver container image: skonto/spark:k8s-3.0.0 container state: running container started at: 2019-01-20T21:48:00Z 2019-01-20 23:48:09 INFO LoggingPodStatusWatcherImpl:58 - State changed, new state: pod name: spark-pi-1548020873671-driver namespace: spark labels: spark-app-selector -> spark-e4730c80e1014b72aa77915a2203ae05, spark-role -> driver pod uid: 0ba9a794-1cfd-11e9-8215-a434d9270a65 creation time: 2019-01-20T21:47:55Z service account name: spark-sa volumes: spark-local-dir-1, spark-conf-volume, spark-sa-token-b7wcm node name: minikube start time: 2019-01-20T21:47:55Z phase: Failed container status: container name: spark-kubernetes-driver container image: skonto/spark:k8s-3.0.0 container state: terminated container started at: 2019-01-20T21:48:00Z container finished at: 2019-01-20T21:48:08Z exit code: 143 termination reason: Error 2019-01-20 23:48:09 INFO LoggingPodStatusWatcherImpl:58 - Container final statuses: container name: spark-kubernetes-driver container image: skonto/spark:k8s-3.0.0 container state: terminated container started at: 2019-01-20T21:48:00Z container finished at: 2019-01-20T21:48:08Z exit code: 143 termination reason: Error 2019-01-20 23:48:09 INFO Client:58 - Application spark-pi with submissionId spark:spark-pi-1548020873671-driver finished. 2019-01-20 23:48:09 INFO ShutdownHookManager:58 - Shutdown hook called 2019-01-20 23:48:09 INFO ShutdownHookManager:58 - Deleting directory /tmp/spark-f114b2e0-5605-4083-9203-a4b1c1f6059e ``` Glob scenario: ``` ./bin/spark-submit --status spark:spark-pi* --master k8s://https://192.168.2.8:8443 2019-01-20 22:27:44 WARN Utils:70 - Your hostname, universe resolves to a loopback address: 127.0.0.1; using 192.168.2.8 instead (on interface wlp2s0) 2019-01-20 22:27:44 WARN Utils:70 - Set SPARK_LOCAL_IP if you need to bind to another address Application status (driver): pod name: spark-pi-1547948600328-driver namespace: spark labels: spark-app-selector -> spark-f13f01702f0b4503975ce98252d59b94, spark-role -> driver pod uid: c576e1c6-1c54-11e9-8215-a434d9270a65 creation time: 2019-01-20T01:43:22Z service account name: spark-sa volumes: spark-local-dir-1, spark-conf-volume, spark-sa-token-b7wcm node name: minikube start time: 2019-01-20T01:43:22Z phase: Running container status: container name: spark-kubernetes-driver container image: skonto/spark:k8s-3.0.0 container state: running container started at: 2019-01-20T01:43:27Z Application status (driver): pod name: spark-pi-1547948792539-driver namespace: spark labels: spark-app-selector -> spark-006d252db9b24f25b5069df357c30264, spark-role -> driver pod uid: 38375b4b-1c55-11e9-8215-a434d9270a65 creation time: 2019-01-20T01:46:35Z service account name: spark-sa volumes: spark-local-dir-1, spark-conf-volume, spark-sa-token-b7wcm node name: minikube start time: 2019-01-20T01:46:35Z phase: Succeeded container status: container name: spark-kubernetes-driver container image: skonto/spark:k8s-3.0.0 container state: terminated container started at: 2019-01-20T01:46:39Z container finished at: 2019-01-20T01:46:56Z exit code: 0 termination reason: Completed ``` Closes #23599 from skonto/submit_ops_extension. Authored-by: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-26 11:55:03 -07:00
Sean Owen	8bc304f97e	[SPARK-26132][BUILD][CORE] Remove support for Scala 2.11 in Spark 3.0.0 ## What changes were proposed in this pull request? Remove Scala 2.11 support in build files and docs, and in various parts of code that accommodated 2.11. See some targeted comments below. ## How was this patch tested? Existing tests. Closes #23098 from srowen/SPARK-26132. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-03-25 10:46:42 -05:00
Rob Vesse	61d99462a0	[SPARK-26729][K8S] Make image names under test configurable ## What changes were proposed in this pull request? Allow specifying system properties to customise the image names for the images used in the integration testing. Useful if your CI/CD pipeline or policy requires using a different naming format. This is one part of addressing SPARK-26729, I plan to have a follow up patch that will also make the names configurable when using `docker-image-tool.sh` ## How was this patch tested? Ran integration tests against custom images generated by our CI/CD pipeline that do not follow Spark's existing hardcoded naming conventions using the new system properties to override the image names appropriately: ``` mvn clean integration-test -pl :spark-kubernetes-integration-tests_${SCALA_VERSION} \ -Pkubernetes -Pkubernetes-integration-tests \ -P${SPARK_HADOOP_PROFILE} -Dhadoop.version=${HADOOP_VERSION} \ -Dspark.kubernetes.test.sparkTgz=${TARBALL} \ -Dspark.kubernetes.test.imageTag=${TAG} \ -Dspark.kubernetes.test.imageRepo=${REPO} \ -Dspark.kubernetes.test.namespace=${K8S_NAMESPACE} \ -Dspark.kubernetes.test.kubeConfigContext=${K8S_CONTEXT} \ -Dspark.kubernetes.test.deployMode=${K8S_TEST_DEPLOY_MODE} \ -Dspark.kubernetes.test.jvmImage=apache-spark \ -Dspark.kubernetes.test.pythonImage=apache-spark-py \ -Dspark.kubernetes.test.rImage=apache-spark-r \ -Dtest.include.tags=k8s ... [INFO] --- scalatest-maven-plugin:1.0:test (integration-test) spark-kubernetes-integration-tests_2.12 --- Discovery starting. Discovery completed in 230 milliseconds. Run starting. Expected test count is: 15 KubernetesSuite: - Run SparkPi with no resources - Run SparkPi with a very long application name. - Use SparkLauncher.NO_RESOURCE - Run SparkPi with a master URL without a scheme. - Run SparkPi with an argument. - Run SparkPi with custom labels, annotations, and environment variables. - Run extraJVMOptions check on driver - Run SparkRemoteFileTest using a remote data file - Run SparkPi with env and mount secrets. - Run PySpark on simple pi.py example - Run PySpark with Python2 to test a pyfiles example - Run PySpark with Python3 to test a pyfiles example - Run PySpark with memory customization - Run in client mode. - Start pod creation from template Run completed in 8 minutes, 33 seconds. Total number of tests run: 15 Suites: completed 2, aborted 0 Tests: succeeded 15, failed 0, canceled 0, ignored 0, pending 0 All tests passed. ``` Closes #23846 from rvesse/SPARK-26729. Authored-by: Rob Vesse <rvesse@dotnetrdf.org> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-20 14:28:27 -07:00
shane knapp	5564fe5151	[SPARK-27178][K8S] add nss to the spark/k8s Dockerfile ## What changes were proposed in this pull request? while performing some tests on our existing minikube and k8s infrastructure, i noticed that the integration tests were failing. i dug in and discovered the following message buried at the end of the stacktrace: ``` Caused by: java.io.FileNotFoundException: /usr/lib/libnss3.so at sun.security.pkcs11.Secmod.initialize(Secmod.java:193) at sun.security.pkcs11.SunPKCS11.<init>(SunPKCS11.java:218) ... 81 more ``` after i added the `nss` package to `resource-managers/kubernetes/docker/src/main/dockerfiles/spark/Dockerfile`, everything worked. this is also impacting current builds. see: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/8959/console ## How was this patch tested? i tested locally before pushing, and the build system will test the rest. Closes #24111 from shaneknapp/add-nss-package-to-dockerfile. Authored-by: shane knapp <incomplete@gmail.com> Signed-off-by: shane knapp <incomplete@gmail.com>	2019-03-18 16:38:42 -07:00
Holden Karau	ce89d09bdf	[SPARK-26343][K8S] Try to speed up running local k8s integration tests Speed up running k8s integration tests locally by allowing folks to skip the tgz dist build and extraction Run tests locally without a distribution of Spark, just a local build Closes #23380 from holdenk/SPARK-26343-Speed-up-running-the-kubernetes-integration-tests-locally. Authored-by: Holden Karau <holden@pigscanfly.ca> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-14 19:39:48 -07:00
Jiaxin Shan	2d0b7cfe44	[SPARK-26742][K8S] Update Kubernetes-Client version to 4.1.2 ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/23814 was reverted because of Jenkins integration tests failure. After minikube upgrade, Kubernetes client SDK v1.4.2 work with kubernetes v1.13. We can bring this change back. Reference: [Bump Kubernetes Client Version to 4.1.2](https://issues.apache.org/jira/browse/SPARK-26742) [Original PR against master](https://github.com/apache/spark/pull/23814) [Kubernetes client upgrade for Spark 2.4](https://github.com/apache/spark/pull/23993) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Unit Tests: ``` All tests passed. [INFO] ------------------------------------------------------------------------ [INFO] Reactor Summary for Spark Project Parent POM 3.0.0-SNAPSHOT: [INFO] [INFO] Spark Project Parent POM ........................... SUCCESS [ 2.343 s] [INFO] Spark Project Tags ................................. SUCCESS [ 2.039 s] [INFO] Spark Project Sketch ............................... SUCCESS [ 12.714 s] [INFO] Spark Project Local DB ............................. SUCCESS [ 2.185 s] [INFO] Spark Project Networking ........................... SUCCESS [ 38.154 s] [INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 7.989 s] [INFO] Spark Project Unsafe ............................... SUCCESS [ 2.297 s] [INFO] Spark Project Launcher ............................. SUCCESS [ 2.813 s] [INFO] Spark Project Core ................................. SUCCESS [38:03 min] [INFO] Spark Project ML Local Library ..................... SUCCESS [ 3.848 s] [INFO] Spark Project GraphX ............................... SUCCESS [ 56.084 s] [INFO] Spark Project Streaming ............................ SUCCESS [04:58 min] [INFO] Spark Project Catalyst ............................. SUCCESS [06:39 min] [INFO] Spark Project SQL .................................. SUCCESS [37:12 min] [INFO] Spark Project ML Library ........................... SUCCESS [18:59 min] [INFO] Spark Project Tools ................................ SUCCESS [ 0.767 s] [INFO] Spark Project Hive ................................. SUCCESS [33:45 min] [INFO] Spark Project REPL ................................. SUCCESS [01:14 min] [INFO] Spark Project Assembly ............................. SUCCESS [ 1.444 s] [INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [01:12 min] [INFO] Kafka 0.10+ Token Provider for Streaming ........... SUCCESS [ 6.719 s] [INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [07:00 min] [INFO] Spark Project Examples ............................. SUCCESS [ 21.805 s] [INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [ 0.906 s] [INFO] Spark Avro ......................................... SUCCESS [ 50.486 s] [INFO] ------------------------------------------------------------------------ [INFO] BUILD SUCCESS [INFO] ------------------------------------------------------------------------ [INFO] Total time: 02:32 h [INFO] Finished at: 2019-03-07T08:39:34Z [INFO] ------------------------------------------------------------------------ ``` Please review http://spark.apache.org/contributing.html before opening a pull request. Closes #24002 from Jeffwan/update_k8s_sdk_master. Authored-by: Jiaxin Shan <seedjeffwan@gmail.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-13 15:04:27 -07:00
chandulal.kavar	d4542a8ba8	[SPARK-27061][K8S] Expose Driver UI port on driver service to access … ## What changes were proposed in this pull request? Expose Spark UI port on driver service to access logs from service. ## How was this patch tested? The patch was tested using unit tests being contributed as a part of the PR Closes #23990 from chandulal/SPARK-27061. Authored-by: chandulal.kavar <cckavar@gmail.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-11 10:41:31 -07:00
Onur Satici	e9e8bb33ef	[SPARK-27023][K8S] Make k8s client timeouts configurable ## What changes were proposed in this pull request? Make k8s client timeouts configurable. No test suite exists for the client factory class, happy to add one if needed Closes #23928 from onursatici/os/k8s-client-timeouts. Lead-authored-by: Onur Satici <osatici@palantir.com> Co-authored-by: Onur Satici <onursatici@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-03-06 11:14:39 -08:00
Luca Canali	f13ea15d79	[SPARK-26995][K8S] Make ld-linux-x86-64.so.2 visible to snappy native library under /lib in docker image with Alpine Linux Running Spark in Docker image with Alpine Linux 3.9.0 throws errors when using snappy. The issue can be reproduced for example as follows: `Seq(1,2).toDF("id").write.format("parquet").save("DELETEME1")` The key part of the error stack is as follows `SparkException: Task failed while writing rows. .... Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.7-2b4872f1-7c41-4b84-bda1-dbcb8dd0ce4c-libsnappyjava.so: Error loading shared library ld-linux-x86-64.so.2: Noded by /tmp/snappy-1.1.7-2b4872f1-7c41-4b84-bda1-dbcb8dd0ce4c-libsnappyjava.so)` The source of the error appears to be that libsnappyjava.so needs ld-linux-x86-64.so.2 and looks for it in /lib, while in Alpine Linux 3.9.0 with libc6-compat version 1.1.20-r3 ld-linux-x86-64.so.2 is located in /lib64. Note: this issue is not present with Alpine Linux 3.8 and libc6-compat version 1.1.19-r10 ## What changes were proposed in this pull request? A possible workaround proposed with this PR is to modify the Dockerfile by adding a symbolic link between /lib and /lib64 so that linux-x86-64.so.2 can be found in /lib. This is probably not the cleanest solution, but I have observed that this is what happened/happens already when using Alpine Linux 3.8.1 (a version of Alpine Linux which was not affected by the issue reported here). ## How was this patch tested? Manually tested by running a simple workload with spark-shell, using docker on a client machine and using Spark on a Kubernetes cluster. The test workload is: `Seq(1,2).toDF("id").write.format("parquet").save("DELETEME1")` Added a test to the KubernetesSuite / BasicTestsSuite Closes #23898 from LucaCanali/dockerfileUpdateSPARK26995. Authored-by: Luca Canali <luca.canali@cern.ch> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-04 09:59:12 -08:00
Marcelo Vanzin	9f16af6366	[K8S][MINOR] Log minikube version when running integration tests. Closes #23893 from vanzin/minikube-version. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-03-01 11:24:08 -08:00
Marcelo Vanzin	14f714fb30	[SPARK-26420][K8S] Generate more unique IDs when creating k8s resource names. Using the current time as an ID is more prone to clashes than people generally realize, so try to make things a bit more unique without necessarily using a UUID, which would eat too much space in the names otherwise. The implemented approach uses some bits from the current time, plus some random bits, which should be more resistant to clashes. Closes #23805 from vanzin/SPARK-26420. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-02-28 20:39:13 -08:00
Marcelo Vanzin	a6ddc9d083	[SPARK-24736][K8S] Let spark-submit handle dependency resolution. Before this change, there was some code in the k8s backend to deal with how to resolve dependencies and make them available to the Spark application. It turns out that none of that code is necessary, since spark-submit already handles all that for applications started in client mode - like the k8s driver that is run inside a Spark-created pod. For that reason, specifically for pyspark, there's no need for the k8s backend to deal with PYTHONPATH; or, in general, to change the URIs provided by the user at all. spark-submit takes care of that. For testing, I created a pyspark script that depends on another module that is shipped with --py-files. Then I used: - --py-files http://.../dep.py http://.../test.py - --py-files http://.../dep.zip http://.../test.py - --py-files local:/.../dep.py local:/.../test.py - --py-files local:/.../dep.zip local:/.../test.py Without this change, all of the above commands fail. With the change, the driver is able to see the dependencies in all the above cases; but executors don't see the dependencies in the last two. That's a bug in shared Spark code that deals with local: dependencies in pyspark (SPARK-26934). I also tested a Scala app using the main jar from an http server. Closes #23793 from vanzin/SPARK-24736. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-02-27 09:49:31 -08:00
liuxian	7912dbb88f	[MINOR] Simplify boolean expression ## What changes were proposed in this pull request? Comparing whether Boolean expression is equal to true is redundant For example: The datatype of `a` is boolean. Before: if (a == true) After: if (a) ## How was this patch tested? N/A Closes #23884 from 10110346/simplifyboolean. Authored-by: liuxian <liu.xian3@zte.com.cn> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-02-27 08:38:00 -06:00
Marcelo Vanzin	afbff6446f	Revert "[SPARK-26742][K8S] Update Kubernetes-Client version to 4.1.2" This reverts commit `a3192d966a`.	2019-02-26 13:42:07 -08:00
Jiaxin Shan	a3192d966a	[SPARK-26742][K8S] Update Kubernetes-Client version to 4.1.2 ## What changes were proposed in this pull request? Changed the `kubernetes-client` version to 4.1.2. Latest version fix error with exec credentials (used by aws eks) and this will be used to talk with kubernetes API server. Users can submit spark job to EKS api endpoint now with this patch. ## How was this patch tested? unit tests and manual tests. Closes #23814 from Jeffwan/update_k8s_sdk. Authored-by: Jiaxin Shan <seedjeffwan@gmail.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-02-25 04:56:04 -06:00
Marcelo Vanzin	61c3cdc706	[SPARK-24894][K8S] Make sure valid host names are created for executors. Since the host name is derived from the app name, which can contain arbitrary characters, it needs to be sanitized so that only valid characters are allowed. On top of that, take extra care that truncation doesn't leave characters that are valid except at the start of a host name. Closes #23781 from vanzin/SPARK-24894. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-02-19 15:19:59 -08:00
Holden Karau	6b3c832dac	[SPARK-26882] Check the Kubernetes integration tests scalatyle ## What changes were proposed in this pull request? Add the kubernetes integration tests to the scalastyle profiles. ## How was this patch tested? Run ./dev/scalastyle with a bad change manually ## Follow on work See SPARK-26898 to add scalastyle for k8s integration to the CI Closes #23792 from holdenk/SPARK-26882-check-k8s-integration-tests-when-linting. Authored-by: Holden Karau <holden@pigscanfly.ca> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-02-19 13:49:47 -08:00
Marcelo Vanzin	c624f5d683	[SPARK-26733][K8S] Cleanup entrypoint.sh. Merge both case statements, and remove unused variables that are not set by the Scala code anymore. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #23655 from vanzin/SPARK-26733.	2019-02-05 16:00:18 -08:00
Stavros Kontopoulos	196ca0c8f5	[SPARK-26603][K8S] Update minikube backend ## What changes were proposed in this pull request? - Covers latest minikube versions. - keeps the older version support Note: While I was facing disk pressure issues locally on machine, I noticed minikube status command would report that everything was working fine even if some kube-system pods were not up. I don't think the output is 100% reliable but it is good enough for most cases. ## How was this patch tested? Run it against latest version of minikube (v0.32.0). Author: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com> Closes #23520 from skonto/update-mini-backend.	2019-02-03 17:15:20 -08:00
Marcelo Vanzin	2a67dbfbd3	[SPARK-26595][CORE] Allow credential renewal based on kerberos ticket cache. This change addes a new mode for credential renewal that does not require a keytab; it uses the local ticket cache instead, so it works while the user keeps the cache valid. This can be useful for, e.g., people running long spark-shell sessions where their kerberos login is kept up-to-date. The main change to enable this behavior is in HadoopDelegationTokenManager, with a small change in the HDFS token provider. The other changes are to avoid creating duplicate tokens when submitting the application to YARN; they allow the tokens from the scheduler to be sent to the YARN AM, reducing the round trips to HDFS. For that, the scheduler initialization code was changed a little bit so that the tokens are available when the YARN client is initialized. That basically takes care of a long-standing TODO that was in the code to clean up configuration propagation to the driver's RPC endpoint (in CoarseGrainedSchedulerBackend). Tested with an app designed to stress this functionality, with both keytab and cache-based logins. Some basic kerberos tests on k8s also. Closes #23525 from vanzin/SPARK-26595. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-01-28 13:32:34 -08:00
Rob Vesse	dc2da72100	[SPARK-26685][K8S] Correct placement of ARG declaration Latest Docker releases are stricter in their enforcement of build argument scope. The location of the `ARG spark_uid` declaration in the Python and R Dockerfiles means the variable is out of scope by the time it is used in a `USER` declaration resulting in a container running as root rather than the default/configured UID. Also with some of the refactoring of the script that has happened since my PR that introduced the configurable UID it turns out the `-u <uid>` argument is not being properly passed to the Python and R image builds when those are opted into ## What changes were proposed in this pull request? This commit moves the `ARG` declaration to just before the argument is used such that it is in scope. It also ensures that Python and R image builds receive the build arguments that include the `spark_uid` argument where relevant ## How was this patch tested? Prior to the patch images are produced where the Python and R images ignore the default/configured UID: ``` > docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456 bash-4.4# whoami root bash-4.4# id -u 0 bash-4.4# exit > docker run -it --entrypoint /bin/bash rvesse/spark:uid456 bash-4.4$ id -u 456 bash-4.4$ exit ``` Note that the Python image is still running as `root` having ignored the configured UID of 456 while the base image has the correct UID because the relevant `ARG` declaration is correctly in scope. After the patch the correct UID is observed: ``` > docker run -it --entrypoint /bin/bash rvesse/spark-r:uid456 bash-4.4$ id -u 456 bash-4.4$ exit exit > docker run -it --entrypoint /bin/bash rvesse/spark-py:uid456 bash-4.4$ id -u 456 bash-4.4$ exit exit > docker run -it --entrypoint /bin/bash rvesse/spark:uid456 bash-4.4$ id -u 456 bash-4.4$ exit ``` Closes #23611 from rvesse/SPARK-26685. Authored-by: Rob Vesse <rvesse@dotnetrdf.org> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-01-22 10:31:17 -08:00
Rob Vesse	c542c247bb	[SPARK-25887][K8S] Configurable K8S context support This enhancement allows for specifying the desired context to use for the initial K8S client auto-configuration. This allows users to more easily access alternative K8S contexts without having to first explicitly change their current context via kubectl. Explicitly set my K8S context to a context pointing to a non-existent cluster, then launched Spark jobs with explicitly specified contexts via the new `spark.kubernetes.context` configuration property. Example Output: ``` > kubectl config current-context minikube > minikube status minikube: Stopped cluster: kubectl: > ./spark-submit --master k8s://https://localhost:6443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=2 --conf spark.kubernetes.context=docker-for-desktop --conf spark.kubernetes.container.image=rvesse/spark:debian local:///opt/spark/examples/jars/spark-examples_2.11-3.0.0-SNAPSHOT.jar 4 18/10/31 11:57:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 18/10/31 11:57:51 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using context docker-for-desktop from users K8S config file 18/10/31 11:57:52 INFO LoggingPodStatusWatcherImpl: State changed, new state: pod name: spark-pi-1540987071845-driver namespace: default labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, spark-role -> driver pod uid: 32462cac-dd04-11e8-b6c6-025000000001 creation time: 2018-10-31T11:57:52Z service account name: default volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv node name: N/A start time: N/A phase: Pending container status: N/A 18/10/31 11:57:52 INFO LoggingPodStatusWatcherImpl: State changed, new state: pod name: spark-pi-1540987071845-driver namespace: default labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, spark-role -> driver pod uid: 32462cac-dd04-11e8-b6c6-025000000001 creation time: 2018-10-31T11:57:52Z service account name: default volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv node name: docker-for-desktop start time: N/A phase: Pending container status: N/A ... 18/10/31 11:58:03 INFO LoggingPodStatusWatcherImpl: State changed, new state: pod name: spark-pi-1540987071845-driver namespace: default labels: spark-app-selector -> spark-2c4abc226ed3415986eb602bd13f3582, spark-role -> driver pod uid: 32462cac-dd04-11e8-b6c6-025000000001 creation time: 2018-10-31T11:57:52Z service account name: default volumes: spark-local-dir-1, spark-conf-volume, default-token-glpfv node name: docker-for-desktop start time: 2018-10-31T11:57:52Z phase: Succeeded container status: container name: spark-kubernetes-driver container image: rvesse/spark:debian container state: terminated container started at: 2018-10-31T11:57:54Z container finished at: 2018-10-31T11:58:02Z exit code: 0 termination reason: Completed ``` Without the `spark.kubernetes.context` setting this will fail because the current context - `minikube` - is pointing to a non-running cluster e.g. ``` > ./spark-submit --master k8s://https://localhost:6443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=2 --conf spark.kubernetes.container.image=rvesse/spark:debian local:///opt/spark/examples/jars/spark-examples_2.11-3.0.0-SNAPSHOT.jar 4 18/10/31 12:02:30 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 18/10/31 12:02:30 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file 18/10/31 12:02:31 WARN WatchConnectionManager: Exec Failure javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1509) at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979) at sun.security.ssl.Handshaker.process_record(Handshaker.java:914) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281) at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251) at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195) at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121) at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100) at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:66) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:109) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387) at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292) at sun.security.validator.Validator.validate(Validator.java:260) at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324) at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229) at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1491) ... 39 more Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141) at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126) at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280) at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382) ... 45 more Exception in thread "kubernetes-dispatcher-0" Exception in thread "main" java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask611a9c09 rejected from java.util.concurrent.ScheduledThreadPoolExecutor404819e4[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326) at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533) at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632) at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:678) at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.scheduleReconnect(WatchConnectionManager.java:300) at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$800(WatchConnectionManager.java:48) at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:213) at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543) at okhttp3.internal.ws.RealWebSocket$2.onFailure(RealWebSocket.java:208) at okhttp3.RealCall$AsyncCall.execute(RealCall.java:148) at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) io.fabric8.kubernetes.client.KubernetesClientException: Failed to start websocket at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:204) at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543) at okhttp3.internal.ws.RealWebSocket$2.onFailure(RealWebSocket.java:208) at okhttp3.RealCall$AsyncCall.execute(RealCall.java:148) at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Caused by: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.ssl.Alerts.getSSLException(Alerts.java:192) at sun.security.ssl.SSLSocketImpl.fatal(SSLSocketImpl.java:1949) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:302) at sun.security.ssl.Handshaker.fatalSE(Handshaker.java:296) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1509) at sun.security.ssl.ClientHandshaker.processMessage(ClientHandshaker.java:216) at sun.security.ssl.Handshaker.processLoop(Handshaker.java:979) at sun.security.ssl.Handshaker.process_record(Handshaker.java:914) at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1062) at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1403) at sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:1387) at okhttp3.internal.connection.RealConnection.connectTls(RealConnection.java:281) at okhttp3.internal.connection.RealConnection.establishProtocol(RealConnection.java:251) at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:151) at okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:195) at okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121) at okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100) at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:66) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:109) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185) at okhttp3.RealCall$AsyncCall.execute(RealCall.java:135) ... 4 more Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:387) at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:292) at sun.security.validator.Validator.validate(Validator.java:260) at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:324) at sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:229) at sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:124) at sun.security.ssl.ClientHandshaker.serverCertificate(ClientHandshaker.java:1491) ... 39 more Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target at sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:141) at sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126) at java.security.cert.CertPathBuilder.build(CertPathBuilder.java:280) at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:382) ... 45 more 18/10/31 12:02:31 INFO ShutdownHookManager: Shutdown hook called 18/10/31 12:02:31 INFO ShutdownHookManager: Deleting directory /private/var/folders/6b/y1010qp107j9w2dhhy8csvz0000xq3/T/spark-5e649891-8a0f-4f17-bf3a-33b34082eba8 ``` Suggested reviews: mccheah liyinan926 - this is the follow up fix to the bug discovered while working on SPARK-25809 (PR #22805) Closes #22904 from rvesse/SPARK-25887. Authored-by: Rob Vesse <rvesse@dotnetrdf.org> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-01-22 10:25:21 -08:00
Kazuaki Ishizaki	7bf0794651	[SPARK-26463][CORE] Use ConfigEntry for hardcoded configs for scheduler categories. ## What changes were proposed in this pull request? The PR makes hardcoded `spark.dynamicAllocation`, `spark.scheduler`, `spark.rpc`, `spark.task`, `spark.speculation`, and `spark.cleaner` configs to use `ConfigEntry`. ## How was this patch tested? Existing tests Closes #23416 from kiszk/SPARK-26463. Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-01-22 07:44:36 -06:00
Jungtaek Lim (HeartSaVioR)	38f030725c	[SPARK-26466][CORE] Use ConfigEntry for hardcoded configs for submit categories. ## What changes were proposed in this pull request? The PR makes hardcoded configs below to use `ConfigEntry`. * spark.kryo * spark.kryoserializer * spark.serializer * spark.jars * spark.files * spark.submit * spark.deploy * spark.worker This patch doesn't change configs which are not relevant to SparkConf (e.g. system properties). ## How was this patch tested? Existing tests. Closes #23532 from HeartSaVioR/SPARK-26466-v2. Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-01-16 20:57:21 -06:00
Dongjoon Hyun	e00ebd5c72	[SPARK-26482][K8S][TEST][FOLLOWUP] Fix compile failure ## What changes were proposed in this pull request? This fixes K8S integration test compilation failure introduced by #23423 . ```scala $ build/sbt -Pkubernetes-integration-tests test:package ... [error] /Users/dongjoon/APACHE/spark/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesTestComponents.scala:71: type mismatch; [error] found : org.apache.spark.internal.config.OptionalConfigEntry[Boolean] [error] required: String [error] .set(IS_TESTING, false) [error] ^ [error] /Users/dongjoon/APACHE/spark/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesTestComponents.scala:71: type mismatch; [error] found : Boolean(false) [error] required: String [error] .set(IS_TESTING, false) [error] ^ [error] two errors found ``` ## How was this patch tested? Pass the K8S integration test. Closes #23527 from dongjoon-hyun/SPARK-26482. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2019-01-11 21:58:06 -08:00
Jungtaek Lim (HeartSaVioR)	d9e4cf67c0	[SPARK-26482][CORE] Use ConfigEntry for hardcoded configs for ui categories ## What changes were proposed in this pull request? The PR makes hardcoded configs below to use `ConfigEntry`. * spark.ui * spark.ssl * spark.authenticate * spark.master.rest * spark.master.ui * spark.metrics * spark.admin * spark.modify.acl This patch doesn't change configs which are not relevant to SparkConf (e.g. system properties). ## How was this patch tested? Existing tests. Closes #23423 from HeartSaVioR/SPARK-26466. Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-01-11 10:18:07 -08:00
Dongjoon Hyun	b316ebf0c2	[SPARK-26491][K8S][FOLLOWUP] Fix compile failure ## What changes were proposed in this pull request? This fixes the compilation error. ``` $ cd resource-managers/kubernetes/integration-tests $ mvn test-compile [ERROR] /Users/dongjoon/APACHE/spark/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/KubernetesTestComponents.scala:71: type mismatch; found : org.apache.spark.internal.config.OptionalConfigEntry[Boolean] required: String [ERROR] .set(IS_TESTING, false) [ERROR] ^ ``` ## How was this patch tested? Pass the Jenkins K8S Integration test or Manual. Closes #23505 from dongjoon-hyun/SPARK-26491. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2019-01-09 23:39:42 -08:00
Marco Gaido	1a641525e6	[SPARK-26491][CORE][TEST] Use ConfigEntry for hardcoded configs for test categories ## What changes were proposed in this pull request? The PR makes hardcoded `spark.test` and `spark.testing` configs to use `ConfigEntry` and put them in the config package. ## How was this patch tested? existing UTs Closes #23413 from mgaido91/SPARK-26491. Authored-by: Marco Gaido <marcogaido91@gmail.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-01-07 15:35:33 -08:00
Marcelo Vanzin	669e8a1559	[SPARK-25689][YARN] Make driver, not AM, manage delegation tokens. This change modifies the behavior of the delegation token code when running on YARN, so that the driver controls the renewal, in both client and cluster mode. For that, a few different things were changed: * The AM code only runs code that needs DTs when DTs are available. In a way, this restores the AM behavior to what it was pre-SPARK-23361, but keeping the fix added in that bug. Basically, all the AM code is run in a "UGI.doAs()" block; but code that needs to talk to HDFS (basically the distributed cache handling code) was delayed to the point where the driver is up and running, and thus when valid delegation tokens are available. * SparkSubmit / ApplicationMaster now handle user login, not the token manager. The previous AM code was relying on the token manager to keep the user logged in when keytabs are used. This required some odd APIs in the token manager and the AM so that the right UGI was exposed and used in the right places. After this change, the logged in user is handled separately from the token manager, so the API was cleaned up, and, as explained above, the whole AM runs under the logged in user, which also helps with simplifying some more code. * Distributed cache configs are sent separately to the AM. Because of the delayed initialization of the cached resources in the AM, it became easier to write the cache config to a separate properties file instead of bundling it with the rest of the Spark config. This also avoids having to modify the SparkConf to hide things from the UI. * Finally, the AM doesn't manage the token manager anymore. The above changes allow the token manager to be completely handled by the driver's scheduler backend code also in YARN mode (whether client or cluster), making it similar to other RMs. To maintain the fix added in SPARK-23361 also in client mode, the AM now sends an extra message to the driver on initialization to fetch delegation tokens; and although it might not really be needed, the driver also keeps the running AM updated when new tokens are created. Tested in a kerberized cluster with the same tests used to validate SPARK-23361, in both client and cluster mode. Also tested with a non-kerberized cluster. Closes #23338 from vanzin/SPARK-25689. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Imran Rashid <irashid@cloudera.com>	2019-01-07 14:40:08 -06:00
Dongjoon Hyun	e15a319ccd	[SPARK-26536][BUILD][TEST] Upgrade Mockito to 2.23.4 ## What changes were proposed in this pull request? This PR upgrades Mockito from 1.10.19 to 2.23.4. The following changes are required. - Replace `org.mockito.Matchers` with `org.mockito.ArgumentMatchers` - Replace `anyObject` with `any` - Replace `getArgumentAt` with `getArgument` and add type annotation. - Use `isNull` matcher in case of `null` is invoked. ```scala saslHandler.channelInactive(null); - verify(handler).channelInactive(any(TransportClient.class)); + verify(handler).channelInactive(isNull()); ``` - Make and use `doReturn` wrapper to avoid [SI-4775](https://issues.scala-lang.org/browse/SI-4775) ```scala private def doReturn(value: Any) = org.mockito.Mockito.doReturn(value, Seq.empty: _*) ``` ## How was this patch tested? Pass the Jenkins with the existing tests. Closes #23452 from dongjoon-hyun/SPARK-26536. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2019-01-04 19:23:38 -08:00

1 2 3

145 commits