Commit graph

100 commits

Author SHA1 Message Date
Dongjoon Hyun 13c64c2980 [SPARK-32448][K8S][TESTS] Use single version for exec-maven-plugin/scalatest-maven-plugin
### What changes were proposed in this pull request?

Two different versions are used for the same artifacts, `exec-maven-plugin` and `scalatest-maven-plugin`. This PR aims to use the same versions for `exec-maven-plugin` and `scalatest-maven-plugin`. In addition, this PR removes `scala-maven-plugin.version` from `K8s` integration suite because it's unused.

### Why are the changes needed?

This will prevent the mistake which upgrades only one place and forgets the others.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the Jenkins K8S IT.

Closes #29248 from dongjoon-hyun/SPARK-32448.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-07-26 19:25:41 -07:00
Sean Owen be2eca22e9 [SPARK-32398][TESTS][CORE][STREAMING][SQL][ML] Update to scalatest 3.2.0 for Scala 2.13.3+
### What changes were proposed in this pull request?

Updates to scalatest 3.2.0. Though it looks large, it is 99% changes to the new location of scalatest classes.

### Why are the changes needed?

3.2.0+ has a fix that is required for Scala 2.13.3+ compatibility.

### Does this PR introduce _any_ user-facing change?

No, only affects tests.

### How was this patch tested?

Existing tests.

Closes #29196 from srowen/SPARK-32398.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-07-23 16:20:17 -07:00
maruilei ffdca8285e [SPARK-32367][K8S][TESTS] Correct the spelling of parameter in KubernetesTestComponents
### What changes were proposed in this pull request?

Correct the spelling of parameter 'spark.executor.instances' in KubernetesTestComponents

### Why are the changes needed?

Parameter spelling error

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Test is not needed.

Closes #29164 from merrily01/SPARK-32367.

Authored-by: maruilei <maruilei@jd.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-07-20 13:48:57 -07:00
Holden Karau a4ca355af8 [SPARK-20629][CORE][K8S] Copy shuffle data when nodes are being shutdown
### What is changed?

This pull request adds the ability to migrate shuffle files during Spark's decommissioning. The design document associated with this change is at https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE .

To allow this change the `MapOutputTracker` has been extended to allow the location of shuffle files to be updated with `updateMapOutput`. When a shuffle block is put, a block update message will be sent which triggers the `updateMapOutput`.

Instead of rejecting remote puts of shuffle blocks `BlockManager` delegates the storage of shuffle blocks to it's shufflemanager's resolver (if supported). A new, experimental, trait is added for shuffle resolvers to indicate they handle remote putting of blocks.

The existing block migration code is moved out into a separate file, and a producer/consumer model is introduced for migrating shuffle files from the host as quickly as possible while not overwhelming other executors.

### Why are the changes needed?

Recomputting shuffle blocks can be expensive, we should take advantage of our decommissioning time to migrate these blocks.

### Does this PR introduce any user-facing change?

This PR introduces two new configs parameters, `spark.storage.decommission.shuffleBlocks.enabled` & `spark.storage.decommission.rddBlocks.enabled` that control which blocks should be migrated during storage decommissioning.

### How was this patch tested?

New unit test & expansion of the Spark on K8s decom test to assert that decommisioning with shuffle block migration means that the results are not recomputed even when the original executor is terminated.

This PR is a cleaned-up version of the previous WIP PR I made https://github.com/apache/spark/pull/28331 (thanks to attilapiros for his very helpful reviewing on it :)).

Closes #28708 from holdenk/SPARK-20629-copy-shuffle-data-when-nodes-are-being-shutdown-cleaned-up.

Lead-authored-by: Holden Karau <hkarau@apple.com>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Co-authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Co-authored-by: Attila Zsolt Piros <attilazsoltpiros@apiros-mbp16.lan>
Signed-off-by: Holden Karau <hkarau@apple.com>
2020-07-19 21:33:13 -07:00
Sean Owen ee624821a9 [SPARK-29292][YARN][K8S][MESOS] Fix Scala 2.13 compilation for remaining modules
### What changes were proposed in this pull request?

See again the related PRs like https://github.com/apache/spark/pull/28971
This completes fixing compilation for 2.13 for all but `repl`, which is a separate task.

### Why are the changes needed?

Eventually, we need to support a Scala 2.13 build, perhaps in Spark 3.1.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests. (2.13 was not tested; this is about getting it to compile without breaking 2.12)

Closes #29147 from srowen/SPARK-29292.4.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-07-18 15:08:00 -07:00
Dongjoon Hyun fb51925123 [SPARK-32335][K8S][TESTS] Remove Python2 test from K8s IT
### What changes were proposed in this pull request?

This PR aims to remove Python 2 test case from K8s IT.

### Why are the changes needed?

Since Apache Spark 3.1.0 dropped Python 2.7, 3.4 and 3.5 support officially via SPARK-32138, K8s IT fails.

```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example *** FAILED ***
  The code passed to eventually never returned normally. Attempted 113 times over 2.0014854648999996 minutes. Last failure message: false was not true. (KubernetesSuite.scala:370)
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
- Run SparkR on simple dataframe.R example
Run completed in 11 minutes, 15 seconds.
Total number of tests run: 20
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 1, canceled 0, ignored 0, pending 0
*** 1 TEST FAILED ***
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass Jenkins K8s IT.

Closes #29136 from dongjoon-hyun/SPARK-32335.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-07-16 11:21:14 -07:00
HyukjinKwon 4ad9bfd53b [SPARK-32138] Drop Python 2.7, 3.4 and 3.5
### What changes were proposed in this pull request?

This PR aims to drop Python 2.7, 3.4 and 3.5.

Roughly speaking, it removes all the widely known Python 2 compatibility workarounds such as `sys.version` comparison, `__future__`. Also, it removes the Python 2 dedicated codes such as `ArrayConstructor` in Spark.

### Why are the changes needed?

 1. Unsupport EOL Python versions
 2. Reduce maintenance overhead and remove a bit of legacy codes and hacks for Python 2.
 3. PyPy2 has a critical bug that causes a flaky test, SPARK-28358 given my testing and investigation.
 4. Users can use Python type hints with Pandas UDFs without thinking about Python version
 5. Users can leverage one latest cloudpickle, https://github.com/apache/spark/pull/28950. With Python 3.8+ it can also leverage C pickle.

### Does this PR introduce _any_ user-facing change?

Yes, users cannot use Python 2.7, 3.4 and 3.5 in the upcoming Spark version.

### How was this patch tested?

Manually tested and also tested in Jenkins.

Closes #28957 from HyukjinKwon/SPARK-32138.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-07-14 11:22:44 +09:00
Dongjoon Hyun 9c134b57bf [SPARK-32058][BUILD] Use Apache Hadoop 3.2.0 dependency by default
### What changes were proposed in this pull request?

According to the dev mailing list discussion, this PR aims to switch the default Apache Hadoop dependency from 2.7.4 to 3.2.0 for Apache Spark 3.1.0 on December 2020.

| Item | Default Hadoop Dependency |
|------|-----------------------------|
| Apache Spark Website | 3.2.0 |
| Apache Download Site | 3.2.0 |
| Apache Snapshot | 3.2.0 |
| Maven Central | 3.2.0 |
| PyPI | 2.7.4 (We will switch later) |
| CRAN | 2.7.4 (We will switch later) |
| Homebrew | 3.2.0 (already) |

In Apache Spark 3.0.0 release, we focused on the other features. This PR targets for [Apache Spark 3.1.0 scheduled on December 2020](https://spark.apache.org/versioning-policy.html).

### Why are the changes needed?

Apache Hadoop 3.2 has many fixes and new cloud-friendly features.

**Reference**
- 2017-08-04: https://hadoop.apache.org/release/2.7.4.html
- 2019-01-16: https://hadoop.apache.org/release/3.2.0.html

### Does this PR introduce _any_ user-facing change?

Since the default Hadoop dependency changes, the users will get a better support in a cloud environment.

### How was this patch tested?

Pass the Jenkins.

Closes #28897 from dongjoon-hyun/SPARK-32058.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-06-26 19:43:29 -07:00
Dongjoon Hyun e5b9b862e6 [SPARK-31881][K8S][TESTS][FOLLOWUP] Activate hadoop-2.7 by default in K8S IT
### What changes were proposed in this pull request?

This PR aims to activate `hadoop-2.7` profile by default in Kubernetes IT module.

### Why are the changes needed?

While SPARK-31881 added Hadoop 3.2 support, one default test dependency was moved to `hadoop-2.7` profile. It works when we give one of `hadoop-2.7` and `hadoop-3.2`, but it fails when we don't give any profile.

**BEFORE**
```
$ mvn test-compile -pl resource-managers/kubernetes/integration-tests -Pkubernetes-integration-tests
...
[ERROR] [Error] /APACHE/spark-merge/resource-managers/kubernetes/integration-tests/src/test/scala/org/apache/spark/deploy/k8s/integrationtest/DepsTestsSuite.scala:23:
object amazonaws is not a member of package com
```

**AFTER**
```
$ mvn test-compile -pl resource-managers/kubernetes/integration-tests -Pkubernetes-integration-tests
..
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
```

The default activated profile will be override when we give `hadoop-3.2`.
```
$ mvn help:active-profiles -Pkubernetes-integration-tests
...
Active Profiles for Project 'org.apache.spark:spark-kubernetes-integration-tests_2.12🫙3.1.0-SNAPSHOT':

The following profiles are active:

 - hadoop-2.7 (source: org.apache.spark:spark-kubernetes-integration-tests_2.12:3.1.0-SNAPSHOT)
 - kubernetes-integration-tests (source: org.apache.spark:spark-parent_2.12:3.1.0-SNAPSHOT)
 - test-java-home (source: org.apache.spark:spark-parent_2.12:3.1.0-SNAPSHOT)
```
```
$ mvn help:active-profiles -Pkubernetes-integration-tests -Phadoop-3.2
...
Active Profiles for Project 'org.apache.spark:spark-kubernetes-integration-tests_2.12🫙3.1.0-SNAPSHOT':

The following profiles are active:

 - hadoop-3.2 (source: org.apache.spark:spark-kubernetes-integration-tests_2.12:3.1.0-SNAPSHOT)
 - hadoop-3.2 (source: org.apache.spark:spark-parent_2.12:3.1.0-SNAPSHOT)
 - kubernetes-integration-tests (source: org.apache.spark:spark-parent_2.12:3.1.0-SNAPSHOT)
 - test-java-home (source: org.apache.spark:spark-parent_2.12:3.1.0-SNAPSHOT)
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the Jenkins UT and IT.

Currently, all Jenkins build and tests (UT & IT) passes without this patch. This should be tested manually with the above command.

`hadoop-3.2` K8s IT also passed like the following.
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 8 minutes, 33 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #28716 from dongjoon-hyun/SPARK-31881-2.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-06-03 02:17:25 -07:00
Dongjoon Hyun 17586f9ed2 [SPARK-31881][K8S][TESTS] Support Hadoop 3.2 K8s integration tests
### What changes were proposed in this pull request?

This PR aims to support Hadoop 3.2 K8s integration tests.

### Why are the changes needed?

Currently, K8s integration suite assumes Hadoop 2.7 and has hard-coded parts.

### Does this PR introduce _any_ user-facing change?

No. This is a dev-only change.

### How was this patch tested?

Pass the Jenkins K8s IT (with Hadoop 2.7) and do the manual testing for Hadoop 3.2 as described in `README.md`.

```
./dev/dev-run-integration-tests.sh --hadoop-profile hadoop-3.2
```

I verified this manually like the following.
```
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh \
--spark-tgz .../spark-3.1.0-SNAPSHOT-bin-3.2.0.tgz \
--exclude-tags r \
--hadoop-profile hadoop-3.2
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 8 minutes, 49 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #28689 from dongjoon-hyun/SPARK-31881.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-06-01 11:19:42 -07:00
Dongjoon Hyun 64ffc66496
[SPARK-31786][K8S][BUILD] Upgrade kubernetes-client to 4.9.2
### What changes were proposed in this pull request?

This PR aims to upgrade `kubernetes-client` library to bring the JDK8 related fixes. Please note that JDK11 works fine without any problem.
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.9.2
  - JDK8 always uses http/1.1 protocol (Prevent OkHttp from wrongly enabling http/2)

### Why are the changes needed?

OkHttp "wrongly" detects the Platform as Jdk9Platform on JDK 8u251.
- https://github.com/fabric8io/kubernetes-client/issues/2212
- https://stackoverflow.com/questions/61565751/why-am-i-not-able-to-run-sparkpi-example-on-a-kubernetes-k8s-cluster

Although there is a workaround `export HTTP2_DISABLE=true` and `Downgrade JDK or K8s`, we had better avoid this problematic situation.

### Does this PR introduce _any_ user-facing change?

No. This will recover the failures on JDK 8u252.

### How was this patch tested?

- [x] Pass the Jenkins UT (https://github.com/apache/spark/pull/28601#issuecomment-632474270)
- [x] Pass the Jenkins K8S IT with the K8s 1.13 (https://github.com/apache/spark/pull/28601#issuecomment-632438452)
- [x] Manual testing with K8s 1.17.3. (Below)

**v1.17.6 result (on Minikube)**
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 8 minutes, 27 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #28601 from dongjoon-hyun/SPARK-K8S-CLIENT.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-05-23 11:07:45 -07:00
Dongjoon Hyun a06768ec4d
[SPARK-31780][K8S][TESTS] Add R test tag to exclude R K8s image building and test
### What changes were proposed in this pull request?

This PR aims to skip R image building and one R test during integration tests by using `--exclude-tags r`.

### Why are the changes needed?

We have only one R integration test case, `Run SparkR on simple dataframe.R example`, for submission test coverage. Since this is rarely changed, we can skip this and save the efforts required for building the whole R image and running the single test.
```
KubernetesSuite:
...
- Run SparkR on simple dataframe.R example
Run completed in 10 minutes, 20 seconds.
Total number of tests run: 20
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the K8S integration test and do the following manually. (Note that R test is skipped)
```
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --exclude-tags r --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 10 minutes, 23 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #28594 from dongjoon-hyun/SPARK-31780.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-05-20 18:33:38 -07:00
Dongjoon Hyun b7947e0285 [SPARK-31766][K8S][TESTS] Add Spark version prefix to K8s UUID test image tag
### What changes were proposed in this pull request?

This PR aims to add Spark version prefix during generating test image tag for K8s integration testing.

### Why are the changes needed?

This helps to distinguish the images by version.

**BEFORE**
```
$ docker images | grep kubespark
kubespark/spark-py  F7188CBD-AE08-4705-9C8A-D0DD3DC8B86F  ...
kubespark/spark     F7188CBD-AE08-4705-9C8A-D0DD3DC8B86F  ...
```

**AFTER**
```
$ docker images | grep kubespark
kubespark/spark-py  3.1.0-SNAPSHOT_F7188CBD-AE08-4705-9C8A-D0DD3DC8B86F ...
kubespark/spark     3.1.0-SNAPSHOT_F7188CBD-AE08-4705-9C8A-D0DD3DC8B86F ...
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the K8s integration test.

```
...
Successfully tagged kubespark/spark:3.1.0-SNAPSHOT_688b46c8-c119-404d-aadb-d05a14262db7
...
Successfully tagged kubespark/spark-py:3.1.0-SNAPSHOT_688b46c8-c119-404d-aadb-d05a14262db7
...
Successfully tagged kubespark/spark-r:3.1.0-SNAPSHOT_688b46c8-c119-404d-aadb-d05a14262db7
```

Closes #28587 from dongjoon-hyun/SPARK-31766.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-05-20 14:55:23 +09:00
William Hyun 5bb1a09b5f
[SPARK-31740][K8S][TESTS] Use github URL instead of a broken link
This PR aims to use GitHub URL instead of a broken link in `BasicTestsSuite.scala`.

Currently, K8s integration test is broken:

https://amplab.cs.berkeley.edu/jenkins/view/Spark%20K8s%20Builds/job/spark-master-test-k8s/534/console

```
- Run SparkRemoteFileTest using a remote data file *** FAILED ***
  The code passed to eventually never returned normally. Attempted 130 times over 2.00109555135 minutes. Last failure message: false was not true. (KubernetesSuite.scala:370)
```

No.

Pass the K8s integration test.

Closes #28561 from williamhyun/williamhyun-patch-1.

Authored-by: williamhyun <62487364+williamhyun@users.noreply.github.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-05-17 22:13:16 -07:00
Dongjoon Hyun dba525c997 [SPARK-31313][K8S][TEST] Add m01 node name to support Minikube 1.8.x
### What changes were proposed in this pull request?

This PR aims to add `m01` as a node name additionally to `PVTestsSuite`.

### Why are the changes needed?

minikube 1.8.0 ~ 1.8.2 generate a cluster with a nodename `m01` while all the other versions have `minikube`. This causes `PVTestSuite` failure.
```
$ minikube --vm-driver=hyperkit start --memory 6000 --cpus 8
* minikube v1.8.2 on Darwin 10.15.3
  - MINIKUBE_ACTIVE_DOCKERD=minikube
* Using the hyperkit driver based on user configuration
* Creating hyperkit VM (CPUs=8, Memory=6000MB, Disk=20000MB) ...
* Preparing Kubernetes v1.18.0 on Docker 19.03.6 ...
* Launching Kubernetes ...
* Enabling addons: default-storageclass, storage-provisioner
* Waiting for cluster to come online ...
* Done! kubectl is now configured to use "minikube"

$ kubectl get nodes
NAME   STATUS   ROLES    AGE   VERSION
m01    Ready    master   22s   v1.17.3
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

This only adds a new node name. So, K8S Jenkins job should passed.
In addition, `K8s` integration test suite should be tested on `minikube 1.8.2` manually.

```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
- Run SparkR on simple dataframe.R example
Run completed in 10 minutes, 23 seconds.
Total number of tests run: 20
Suites: completed 2, aborted 0
Tests: succeeded 20, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

For the above test, Minikube 1.8.2 and K8s v1.18.0 is used.
```
$ minikube version
minikube version: v1.8.2
commit: eb13446e786c9ef70cb0a9f85a633194e62396a1

$ kubectl version --short
Client Version: v1.18.0
Server Version: v1.18.0
```

Closes #28080 from dongjoon-hyun/SPARK-31313.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2020-04-01 03:42:26 +00:00
Dongjoon Hyun f206bbde3a
[SPARK-31244][K8S][TEST] Use Minio instead of Ceph in K8S DepsTestsSuite
### What changes were proposed in this pull request?

This PR (SPARK-31244) replaces `Ceph` with `Minio` in K8S `DepsTestSuite`.

### Why are the changes needed?

Currently, `DepsTestsSuite` is using `ceph` for S3 storage. However, the used version and all new releases are broken on new `minikube` releases. We had better use more robust and small one.

```
$ minikube version
minikube version: v1.8.2

$ minikube -p minikube docker-env | source

$ docker run -it --rm -e NETWORK_AUTO_DETECT=4 -e RGW_FRONTEND_PORT=8000 -e SREE_PORT=5001 -e CEPH_DEMO_UID=nano -e CEPH_DAEMON=demo ceph/daemon:v4.0.3-stable-4.0-nautilus-centos-7-x86_64 /bin/sh
2020-03-25 04:26:21  /opt/ceph-container/bin/entrypoint.sh: ERROR- it looks like we have not been able to discover the network settings

$ docker run -it --rm -e NETWORK_AUTO_DETECT=4 -e RGW_FRONTEND_PORT=8000 -e SREE_PORT=5001 -e CEPH_DEMO_UID=nano -e CEPH_DAEMON=demo ceph/daemon:v4.0.11-stable-4.0-nautilus-centos-7 /bin/sh
2020-03-25 04:20:30  /opt/ceph-container/bin/entrypoint.sh: ERROR- it looks like we have not been able to discover the network settings
```

Also, the image size is unnecessarily big (almost `1GB`) and growing while `minio` is `55.8MB` with the same features.
```
$ docker images | grep ceph
ceph/daemon v4.0.3-stable-4.0-nautilus-centos-7-x86_64 a6a05ccdf924 6 months ago 852MB
ceph/daemon v4.0.11-stable-4.0-nautilus-centos-7       87f695550d8e 12 hours ago 901MB

$ docker images | grep minio
minio/minio latest                                     95c226551ea6 5 days ago   55.8MB
```

### Does this PR introduce any user-facing change?

No. (This is a test case change)

### How was this patch tested?

Pass the existing Jenkins K8s integration test job and test with the latest minikube.
```
$ minikube version
minikube version: v1.8.2

$ kubectl version --short
Client Version: v1.17.4
Server Version: v1.17.4

$ NO_MANUAL=1 ./dev/make-distribution.sh --r --pip --tgz -Pkubernetes
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage *** FAILED *** // This is irrelevant to this PR.
- Launcher client dependencies          // This is the fixed test case by this PR.
- Test basic decommissioning
- Run SparkR on simple dataframe.R example
Run completed in 12 minutes, 4 seconds.
...
```

The following is the working snapshot of `DepsTestSuite` test.
```
$ kubectl get all -ncf9438dd8a65436686b1196a6b73000f
NAME                                                  READY   STATUS    RESTARTS   AGE
pod/minio-0                                           1/1     Running   0          70s
pod/spark-test-app-8494bddca3754390b9e59a2ef47584eb   1/1     Running   0          55s

NAME                                                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/minio-s3                                     NodePort    10.109.54.180   <none>        9000:30678/TCP               70s
service/spark-test-app-fd916b711061c7b8-driver-svc   ClusterIP   None            <none>        7078/TCP,7079/TCP,4040/TCP   55s

NAME                     READY   AGE
statefulset.apps/minio   1/1     70s
```

Closes #28015 from dongjoon-hyun/SPARK-31244.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-03-25 12:38:15 -07:00
Prashant Sharma 3799d2b9d8
[SPARK-30715][K8S][TESTS][FOLLOWUP] Update k8s client version in IT as well
### What changes were proposed in this pull request?
This is a follow up for SPARK-30715 . Kubernetes client version in sync in integration-tests and kubernetes/core

### Why are the changes needed?
More than once, the kubernetes client version has gone out of sync between integration tests and kubernetes/core. So brought them up in sync again and added a comment to save us from future need of this additional followup.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Manually.

Closes #27948 from ScrapCodes/follow-up-spark-30715.

Authored-by: Prashant Sharma <prashsh1@in.ibm.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-03-21 18:26:53 -07:00
Holden Karau 2825237448 [SPARK-31062][K8S][TESTS] Improve spark decommissioning k8s test reliability
### What changes were proposed in this pull request?

Replace a sleep with waiting for the first collect to happen to try and make the K8s test code more reliable.

### Why are the changes needed?

Currently the Decommissioning test appears to be flaky in Jenkins.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Ran K8s test suite in a loop on minikube on my desktop for 10 iterations without this test failing on any of the runs.

Closes #27858 from holdenk/SPARK-31062-Improve-Spark-Decommissioning-K8s-test-teliability.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Holden Karau <hkarau@apple.com>
2020-03-11 14:42:31 -07:00
gatorsmile 28b8713036 [SPARK-30950][BUILD] Setting version to 3.1.0-SNAPSHOT
### What changes were proposed in this pull request?
This patch is to bump the master branch version to 3.1.0-SNAPSHOT.

### Why are the changes needed?
N/A

### Does this PR introduce any user-facing change?
N/A

### How was this patch tested?
N/A

Closes #27698 from gatorsmile/updateVersion.

Authored-by: gatorsmile <gatorsmile@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-02-25 19:44:31 -08:00
Holden Karau d273a2bb0f [SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support
This PR is based on an existing/previou PR - https://github.com/apache/spark/pull/19045

### What changes were proposed in this pull request?

This changes adds a decommissioning state that we can enter when the cloud provider/scheduler lets us know we aren't going to be removed immediately but instead will be removed soon. This concept fits nicely in K8s and also with spot-instances on AWS / preemptible instances all of which we can get a notice that our host is going away. For now we simply stop scheduling jobs, in the future we could perform some kind of migration of data during scale-down, or at least stop accepting new blocks to cache.

There is a design document at https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit?usp=sharing

### Why are the changes needed?

With more move to preemptible multi-tenancy, serverless environments, and spot-instances better handling of node scale down is required.

### Does this PR introduce any user-facing change?

There is no API change, however an additional configuration flag is added to enable/disable this behaviour.

### How was this patch tested?

New integration tests in the Spark K8s integration testing. Extension of the AppClientSuite to test decommissioning seperate from the K8s.

Closes #26440 from holdenk/SPARK-20628-keep-track-of-nodes-which-are-going-to-be-shutdown-r4.

Lead-authored-by: Holden Karau <hkarau@apple.com>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Signed-off-by: Holden Karau <hkarau@apple.com>
2020-02-14 12:36:52 -08:00
Dongjoon Hyun 74cd46eb69 [SPARK-30816][K8S][TESTS] Fix dev-run-integration-tests.sh to ignore empty params
### What changes were proposed in this pull request?

This PR aims to fix `dev-run-integration-tests.sh` to ignore empty params correctly.

### Why are the changes needed?

The following script runs `mvn` integration test like the following.
```
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh
...
build/mvn integration-test
-f /Users/dongjoon/APACHE/spark/pom.xml
-pl resource-managers/kubernetes/integration-tests
-am
-Pscala-2.12
-Pkubernetes
-Pkubernetes-integration-tests
-Djava.version=8
-Dspark.kubernetes.test.sparkTgz=N/A
-Dspark.kubernetes.test.imageTag=N/A
-Dspark.kubernetes.test.imageRepo=docker.io/kubespark
-Dspark.kubernetes.test.deployMode=minikube
-Dtest.include.tags=k8s
-Dspark.kubernetes.test.namespace=
-Dspark.kubernetes.test.serviceAccountName=
-Dspark.kubernetes.test.kubeConfigContext=
-Dspark.kubernetes.test.master=
-Dtest.exclude.tags=
-Dspark.kubernetes.test.jvmImage=spark
-Dspark.kubernetes.test.pythonImage=spark-py
-Dspark.kubernetes.test.rImage=spark-r
```

After this PR, the empty parameters like the followings will be skipped like the original design.
```
-Dspark.kubernetes.test.namespace=
-Dspark.kubernetes.test.serviceAccountName=
-Dspark.kubernetes.test.kubeConfigContext=
-Dspark.kubernetes.test.master=
-Dtest.exclude.tags=
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins K8S integration test.

Closes #27566 from dongjoon-hyun/SPARK-30816.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-02-13 11:42:00 -08:00
Dongjoon Hyun 859699135c [SPARK-30807][K8S][TESTS] Support Java 11 in K8S integration tests
### What changes were proposed in this pull request?

This PR aims to support JDK11 test in K8S integration tests.
- This is an update in testing framework instead of individual tests.
- This will enable JDK11 runtime test when you didn't installed JDK11 on your local system.

### Why are the changes needed?

Apache Spark 3.0.0 adds JDK11 support, but K8s integration tests use JDK8 until now.

### Does this PR introduce any user-facing change?

No. This is a dev-only test-related PR.

### How was this patch tested?

This is irrelevant to Jenkins UT, but Jenkins K8S IT (JDK8) should pass.
- https://github.com/apache/spark/pull/27559#issuecomment-585903489 (JDK8 Passed)

And, manually do the following for JDK11 test.
```
$ NO_MANUAL=1 ./dev/make-distribution.sh --r --pip --tgz -Phadoop-3.2 -Pkubernetes
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --java-image-tag 11-jre-slim --spark-tgz $PWD/spark-*.tgz
```

```
$ docker run -it --rm kubespark/spark:1318DD8A-2B15-4A00-BC69-D0E90CED235B /usr/local/openjdk-11/bin/java --version | tail -n1
OpenJDK 64-Bit Server VM 18.9 (build 11.0.6+10, mixed mode)
```

Closes #27559 from dongjoon-hyun/SPARK-30807.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-02-13 11:17:27 -08:00
Dongjoon Hyun 9d90c8b898 [SPARK-30738][K8S] Use specific image version in "Launcher client dependencies" test
### What changes were proposed in this pull request?

This PR use a specific version of docker image instead of `latest`. As of today, when I run K8s integration test locally, this test case fails always.

Also, in this PR, I shows two consecutive failures with a dummy change.
- https://github.com/apache/spark/pull/27465#issuecomment-582326614
- https://github.com/apache/spark/pull/27465#issuecomment-582329114
```
- Launcher client dependencies *** FAILED ***
```

After that, I added the patch and K8s Integration test passed.
- https://github.com/apache/spark/pull/27465#issuecomment-582361696

### Why are the changes needed?

[SPARK-28465](https://github.com/apache/spark/pull/25222) switched from `v4.0.0-stable-4.0-master-centos-7-x86_64` to `latest` to catch up the API change. However, the API change seems to occur again. We had better use a specific version to prevent accidental failures.

```scala
- .withImage("ceph/daemon:v4.0.0-stable-4.0-master-centos-7-x86_64")
+ .withImage("ceph/daemon:latest")
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass `Launcher client dependencies` test in Jenkins K8s Integration Suite.
Or, run K8s Integration test locally.

Closes #27465 from dongjoon-hyun/SPARK-K8S-IT.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-02-05 11:01:53 -08:00
Onur Satici 86fdb818bf [SPARK-30715][K8S] Bump fabric8 to 4.7.1
### What changes were proposed in this pull request?
Bump fabric8 kubernetes-client to 4.7.1

### Why are the changes needed?
New fabric8 version brings support for Kubernetes 1.17 clusters.
Full release notes:
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.0
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.7.1

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing unit and integration tests cover creation of K8S objects. Adjusted them to work with the new fabric8 version

Closes #27443 from onursatici/os/bump-fabric8.

Authored-by: Onur Satici <onursatici@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-02-05 01:17:30 -08:00
Yuming Wang 696288f623 [INFRA] Reverts commit 56dcd79 and c216ef1
### What changes were proposed in this pull request?
1. Revert "Preparing development version 3.0.1-SNAPSHOT": 56dcd79

2. Revert "Preparing Spark release v3.0.0-preview2-rc2": c216ef1

### Why are the changes needed?
Shouldn't change master.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
manual test:
https://github.com/apache/spark/compare/5de5e46..wangyum:revert-master

Closes #26915 from wangyum/revert-master.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Yuming Wang <wgyumg@gmail.com>
2019-12-16 19:57:44 -07:00
Yuming Wang 56dcd79992 Preparing development version 3.0.1-SNAPSHOT 2019-12-17 01:57:27 +00:00
Yuming Wang c216ef1d03 Preparing Spark release v3.0.0-preview2-rc2 2019-12-17 01:57:21 +00:00
Dongjoon Hyun cc276f8a6e [SPARK-30243][BUILD][K8S] Upgrade K8s client dependency to 4.6.4
### What changes were proposed in this pull request?

This PR aims to upgrade K8s client library from 4.6.1 to 4.6.4 for `3.0.0-preview2`.

### Why are the changes needed?

This will bring the latest bug fixes.
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.6.4
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.6.3
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.6.2

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins with K8s integration test.

Closes #26874 from dongjoon-hyun/SPARK-30243.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-13 08:25:51 -08:00
Marcelo Vanzin b095232f63 [SPARK-29865][K8S] Ensure client-mode executors have same name prefix
This basically does what BasicDriverFeatureStep already does to achieve the
same thing in cluster mode; but since that class (or any other feature) is
not invoked in client mode, it needs to be done elsewhere.

I also modified the client mode integration test to check the executor name
prefix; while there I had to fix the minikube backend to parse the output
from newer minikube versions (I have 1.5.2).

Closes #26488 from vanzin/SPARK-29865.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Erik Erlandson <eerlands@redhat.com>
2019-11-14 15:52:39 -07:00
Xingbo Jiang 8207c835b4 Revert "Prepare Spark release v3.0.0-preview-rc2"
This reverts commit 007c873ae3.
2019-10-30 17:45:44 -07:00
Xingbo Jiang 007c873ae3 Prepare Spark release v3.0.0-preview-rc2
### What changes were proposed in this pull request?

To push the built jars to maven release repository, we need to remove the 'SNAPSHOT' tag from the version name.

Made the following changes in this PR:
* Update all the `3.0.0-SNAPSHOT` version name to `3.0.0-preview`
* Update the sparkR version number check logic to allow jvm version like `3.0.0-preview`

**Please note those changes were generated by the release script in the past, but this time since we manually add tags on master branch, we need to manually apply those changes too.**

We shall revert the changes after 3.0.0-preview release passed.

### Why are the changes needed?

To make the maven release repository to accept the built jars.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A
2019-10-30 17:42:59 -07:00
Xingbo Jiang b33a58c0c6 Revert "Prepare Spark release v3.0.0-preview-rc1"
This reverts commit 5eddbb5f1d.
2019-10-28 22:32:34 -07:00
Xingbo Jiang 5eddbb5f1d Prepare Spark release v3.0.0-preview-rc1
### What changes were proposed in this pull request?

To push the built jars to maven release repository, we need to remove the 'SNAPSHOT' tag from the version name.

Made the following changes in this PR:
* Update all the `3.0.0-SNAPSHOT` version name to `3.0.0-preview`
* Update the PySpark version from `3.0.0.dev0` to `3.0.0`

**Please note those changes were generated by the release script in the past, but this time since we manually add tags on master branch, we need to manually apply those changes too.**

We shall revert the changes after 3.0.0-preview release passed.

### Why are the changes needed?

To make the maven release repository to accept the built jars.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A

Closes #26243 from jiangxb1987/3.0.0-preview-prepare.

Lead-authored-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-10-28 22:31:29 -07:00
igor.calabria 78bdcfade1 [SPARK-27812][K8S] Bump K8S client version to 4.6.1
### What changes were proposed in this pull request?

Updated kubernetes client.

### Why are the changes needed?

https://issues.apache.org/jira/browse/SPARK-27812
https://issues.apache.org/jira/browse/SPARK-27927

We need this fix https://github.com/fabric8io/kubernetes-client/pull/1768 that was released on version 4.6 of the client. The root cause of the problem is better explained in https://github.com/apache/spark/pull/25785

### Does this PR introduce any user-facing change?

Nope, it should be transparent to users

### How was this patch tested?

This patch was tested manually using a simple pyspark job

```python
from pyspark.sql import SparkSession

if __name__ == '__main__':
    spark = SparkSession.builder.getOrCreate()
```

The expected behaviour of this "job" is that both python's and jvm's process exit automatically after the main runs. This is the case for spark versions <= 2.4. On version 2.4.3, the jvm process hangs because there's a non daemon thread running

```
"OkHttp WebSocket https://10.96.0.1/..." #121 prio=5 os_prio=0 tid=0x00007fb27c005800 nid=0x24b waiting on condition [0x00007fb300847000]
"OkHttp WebSocket https://10.96.0.1/..." #117 prio=5 os_prio=0 tid=0x00007fb28c004000 nid=0x247 waiting on condition [0x00007fb300e4b000]
```
This is caused by a bug on `kubernetes-client` library, which is fixed on the version that we are upgrading to.

When the mentioned job is run with this patch applied, the behaviour from spark <= 2.4.3 is restored and both processes terminate successfully

Closes #26093 from igorcalabria/k8s-client-update.

Authored-by: igor.calabria <igor.calabria@ubee.in>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-17 12:23:24 -07:00
Ilan Filonenko 52186afd84 [SPARK-25152][K8S] Enable SparkR Integration Tests for Kubernetes
## What changes were proposed in this pull request?

Re-introduced SparkR integration tests as part of the SparkR on K8S release. This PR awaits Jenkins availability.

## How was this patch tested?

This patch was tested with unit tests and integration tests.

Closes #22145 from ifilonenko/spark-r-with-tests.

Authored-by: Ilan Filonenko <if56@cornell.edu>
Signed-off-by: shane knapp <incomplete@gmail.com>
2019-10-14 13:25:54 -07:00
Holden Karau 4080c4beeb [SPARK-28937][SPARK-28936][KUBERNETES] Reduce test flakyness
### What changes were proposed in this pull request?

Switch from using a Thread sleep for waiting for commands to finish to just waiting for the command to finish with a watcher & improve the error messages in the SecretsTestsSuite.

### Why are the changes needed?
Currently some of the Spark Kubernetes tests have race conditions with command execution, and the frequent use of eventually makes debugging test failures difficult.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing tests pass after removal of thread.sleep

Closes #25765 from holdenk/SPARK-28937SPARK-28936-improve-kubernetes-integration-tests.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Holden Karau <hkarau@apple.com>
2019-09-20 10:08:16 -07:00
Dongjoon Hyun 3bf43fb60d [SPARK-29159][BUILD] Increase ReservedCodeCacheSize to 1G
### What changes were proposed in this pull request?

This PR aims to increase the JVM CodeCacheSize from 0.5G to 1G.

### Why are the changes needed?

After upgrading to `Scala 2.12.10`, the following is observed during building.
```
2019-09-18T20:49:23.5030586Z OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
2019-09-18T20:49:23.5032920Z OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
2019-09-18T20:49:23.5034959Z CodeCache: size=524288Kb used=521399Kb max_used=521423Kb free=2888Kb
2019-09-18T20:49:23.5035472Z  bounds [0x00007fa62c000000, 0x00007fa64c000000, 0x00007fa64c000000]
2019-09-18T20:49:23.5035781Z  total_blobs=156549 nmethods=155863 adapters=592
2019-09-18T20:49:23.5036090Z  compilation: disabled (not enough contiguous free space left)
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually check the Jenkins or GitHub Action build log (which should not have the above).

Closes #25836 from dongjoon-hyun/SPARK-CODE-CACHE-1G.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-19 00:24:15 -07:00
Holden Karau 0ed9fae457 [SPARK-28886][K8S] Fix the DepsTestsSuite with minikube 1.3.1
### What changes were proposed in this pull request?

Matches the response from minikube service against a regex to extract the URL

### Why are the changes needed?

minikube 1.3.1 on OSX has different formatting than expected

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Ran the existing integration test run on OSX with minikube 1.3.1

Closes #25599 from holdenk/SPARK-28886-fix-deps-tests-with-minikube-1.3.1.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-08 20:04:16 -05:00
Sean Owen ded23f83dd [SPARK-28921][K8S][FOLLOWUP] Also bump K8S client version in integration-tests
### What changes were proposed in this pull request?

Per https://github.com/apache/spark/pull/25640#issuecomment-527397689 also bump K8S client version in integration-tests module.

### Why are the changes needed?

Harmonize the version as intended.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing tests.

Closes #25664 from srowen/SPARK-28921.2.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-04 19:51:04 -05:00
shane knapp 13fd32c9a9 [SPARK-28701][TEST-HADOOP3.2][TEST-JAVA11][K8S] adding java11 support for pull request builds
## What changes were proposed in this pull request?

we need to add the ability to test PRBs against java11.

see comments here:  https://github.com/apache/spark/pull/25405

## How was this patch tested?

the build system will test this.

Closes #25423 from shaneknapp/spark-prb-java11.

Authored-by: shane knapp <incomplete@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-08-27 00:48:01 +09:00
Liang-Chi Hsieh 37eedf6149 [SPARK-28652][TESTS][K8S] Add python version check for executor
## What changes were proposed in this pull request?

Current two PySpark version tests in PythonTestsSuite, just test against Python version at driver side. Because the test script doesn't run any spark job requiring python worker, it doesn't actually do version check at worker side. This patch adds pieces of code to the test script, to run a simple job to verify Python version.

## How was this patch tested?

Unit test. Locally manual test.

Closes #25411 from viirya/SPARK-28652.

Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-08-12 14:43:32 +09:00
Jungtaek Lim (HeartSaVioR) 128ea37bda [SPARK-28601][CORE][SQL] Use StandardCharsets.UTF_8 instead of "UTF-8" string representation, and get rid of UnsupportedEncodingException
## What changes were proposed in this pull request?

This patch tries to keep consistency whenever UTF-8 charset is needed, as using `StandardCharsets.UTF_8` instead of using "UTF-8". If the String type is needed, `StandardCharsets.UTF_8.name()` is used.

This change also brings the benefit of getting rid of `UnsupportedEncodingException`, as we're providing `Charset` instead of `String` whenever possible.

This also changes some private Catalyst helper methods to operate on encodings as `Charset` objects rather than strings.

## How was this patch tested?

Existing unit tests.

Closes #25335 from HeartSaVioR/SPARK-28601.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-08-05 20:45:54 -07:00
HyukjinKwon 946aef0535 [SPARK-28550][K8S][TESTS] Unset SPARK_HOME environment variable in K8S integration preparation
## What changes were proposed in this pull request?

Currently, if we run the Kubernetes integration tests with `SPARK_HOME` already set, it refers the `SPARK_HOME` even when `--spark-tgz` is specified.

This PR proposes to unset `SPARK_HOME` to let the docker-image-tool script detect `SPARK_HOME`. Otherwise, it cannot indicate the unpacked directory as its home.

## How was this patch tested?

```bash
export SPARK_HOME=`pwd`
dev/make-distribution.sh --pip --tgz -Phadoop-2.7 -Pkubernetes
resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --spark-tgz $PWD/spark-*.tgz
```

**Before:**

```
+ /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/bin/docker-image-tool.sh -r docker.io/kubespark -t 650B51C8-BBED-47C9-AEAB-E66FC9A0E64E -p /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory
cp: assembly/target/scala-2.12/jars: No such file or directory
cp: resource-managers/kubernetes/integration-tests/tests: No such file or directory
cp: examples/target/scala-2.12/jars/*: No such file or directory
cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory
cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory
Cannot find docker image. This script must be run from a runnable distribution of Apache Spark.
...
[INFO] Spark Project Kubernetes Integration Tests ......... FAILURE [  4.870 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
```

**After:**

```
+ /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/bin/docker-image-tool.sh -r docker.io/kubespark -t 2BA5883A-A0AC-4D2B-8D00-702D31B59B23 -p /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
Sending build context to Docker daemon  250.2MB
Step 1/15 : FROM openjdk:8-alpine
 ---> a3562aa0b991
...
Successfully built 8614fb5ac279
Successfully tagged kubespark/spark:2BA5883A-A0AC-4D2B-8D00-702D31B59B23
```

Closes #25283 from HyukjinKwon/SPARK-28550.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-29 10:47:28 -07:00
Dongjoon Hyun 767802500c [SPARK-28534][K8S][TEST] Update node affinity for DockerForDesktop backend in PVTestsSuite
## What changes were proposed in this pull request?

This PR aims to recover our K8s integration test suite by extending node affinity in order to pass `PVTestsSuite` in `DockerForDesktop` environment, too. Previously, `PVTestsSuite` fails at `--deploy-mode docker-for-desktop` option because the node affinity requires `minibase` node.

For `Docker Desktop`, there are two node names like the following. Note that Spark testing needs K8s v1.13 and above. So, this PR should be verified with `Docker Desktop (Edge)` version. This PR adds both because next stable `Docker Desktop` will have K8s v1.14.3.

**Docker Desktop (Stable, K8s v1.10.11)**
```
$ kubectl get node
NAME                 STATUS   ROLES    AGE   VERSION
docker-for-desktop   Ready    master   52s   v1.10.11
```

**Docker Desktop 2.1.0.0 (Edge, K8s v1.14.3, Released 2019-07-26)**
```
$ kubectl get node
NAME             STATUS   ROLES    AGE   VERSION
docker-desktop   Ready    master   16h   v1.14.3
```

## How was this patch tested?

Pass the Jenkins K8s integration test (`minibase`) and install `Docker Desktop 2.1.0.0 (Edge)` and run the integration test in `DockerForDesktop`. Note that this fixes only `PVTestsSuite`.

```
$ dev/make-distribution.sh --pip --tgz -Phadoop-2.7 -Pkubernetes
$ resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --spark-tgz $PWD/spark-*.tgz
...
KubernetesSuite:
...
- PVs with local storage
...
```

Closes #25269 from dongjoon-hyun/SPARK-28534.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-29 16:28:56 +09:00
Stavros Kontopoulos 7504eab42f [SPARK-28465][K8S] Fix integration tests which fail due to missing ceph-nano image
## What changes were proposed in this pull request?

Fixes the tests. Follows instructions here: https://github.com/ceph/cn/issues/115#issuecomment-497384369
## How was this patch tested?

Manually by running the tests with minikube.

Closes #25222 from skonto/fix-ceph.

Authored-by: Stavros Kontopoulos <st.kontopoulos@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-07-24 14:58:20 -07:00
Douglas R Colkitt 8fc5cb6285 [SPARK-28473][DOC] Stylistic consistency of build command in README
## What changes were proposed in this pull request?

Change the format of the build command in the README to start with a `./` prefix

    ./build/mvn -DskipTests clean package

This increases stylistic consistency across the README- all the other commands have a `./` prefix. Having a visible `./` prefix also makes it clear to the user that the shell command requires the current working directory to be at the repository root.

## How was this patch tested?

README.md was reviewed both in raw markdown and in the Github rendered landing page for stylistic consistency.

Closes #25231 from Mister-Meeseeks/master.

Lead-authored-by: Douglas R Colkitt <douglas.colkitt@gmail.com>
Co-authored-by: Mister-Meeseeks <douglas.colkitt@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-23 16:29:46 -07:00
Stavros Kontopoulos 7912ab85a6 [SPARK-27872][K8S] Fix executor service account inconsistency
## What changes were proposed in this pull request?

Fixes the service account inconsistency that breaks pull secrets. It gives the option to the user to setup a specific service account for the executors if he has to
(via `spark.kubernetes.authenticate.executor.serviceAccountName`). Defaults to the driver's one.
We are not supporting special authentication credentials for the executors with this PR.

## How was this patch tested?

Tested manually by launching a Spark job exercising the introduced settings.
Added a new integration tests for this fix.

Closes #24748 from skonto/fix_executor_sa.

Authored-by: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-06-09 16:28:37 -05:00
Stavros Kontopoulos 5e74570c8f [SPARK-23153][K8S] Support client dependencies with a Hadoop Compatible File System
## What changes were proposed in this pull request?
- solves the current issue with --packages in cluster mode (there is no ticket for it). Also note of some [issues](https://issues.apache.org/jira/browse/SPARK-22657) of the past here when hadoop libs are used at the spark submit side.
- supports spark.jars, spark.files, app jar.

It works as follows:
Spark submit uploads the deps to the HCFS. Then the driver serves the deps via the Spark file server.
No hcfs uris are propagated.

The related design document is [here](https://docs.google.com/document/d/1peg_qVhLaAl4weo5C51jQicPwLclApBsdR1To2fgc48/edit). the next option to add is the RSS but has to be improved given the discussion in the past about it (Spark 2.3).
## How was this patch tested?

- Run integration test suite.
- Run an example using S3:

```
 ./bin/spark-submit \
...
 --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.6 \
 --deploy-mode cluster \
 --name spark-pi \
 --class org.apache.spark.examples.SparkPi \
 --conf spark.executor.memory=1G \
 --conf spark.kubernetes.namespace=spark \
 --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
 --conf spark.driver.memory=1G \
 --conf spark.executor.instances=2 \
 --conf spark.sql.streaming.metricsEnabled=true \
 --conf "spark.driver.extraJavaOptions=-Divy.cache.dir=/tmp -Divy.home=/tmp" \
 --conf spark.kubernetes.container.image.pullPolicy=Always \
 --conf spark.kubernetes.container.image=skonto/spark:k8s-3.0.0 \
 --conf spark.kubernetes.file.upload.path=s3a://fdp-stavros-test \
 --conf spark.hadoop.fs.s3a.access.key=... \
 --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
 --conf spark.hadoop.fs.s3a.fast.upload=true \
 --conf spark.kubernetes.executor.deleteOnTermination=false \
 --conf spark.hadoop.fs.s3a.secret.key=... \
 --conf spark.files=client:///...resolv.conf \
file:///my.jar **
```
Added integration tests based on [Ceph nano](https://github.com/ceph/cn). Looks very [active](http://www.sebastien-han.fr/blog/2019/02/24/Ceph-nano-is-getting-better-and-better/).
Unfortunately minio needs hadoop >= 2.8.

Closes #23546 from skonto/support-client-deps.

Authored-by: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com>
Signed-off-by: Erik Erlandson <eerlands@redhat.com>
2019-05-22 16:15:42 -07:00
Sean Owen bfb3ffe9b3 [SPARK-27682][CORE][GRAPHX][MLLIB] Replace use of collections and methods that will be removed in Scala 2.13 with work-alikes
## What changes were proposed in this pull request?

This replaces use of collection classes like `MutableList` and `ArrayStack` with workalikes that are available in 2.12, as they will be removed in 2.13. It also removes use of `.to[Collection]` as its uses was superfluous anyway. Removing `collection.breakOut` will have to wait until 2.13

## How was this patch tested?

Existing tests

Closes #24586 from srowen/SPARK-27682.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-05-15 09:29:12 -05:00
Adi Muraru 8ef4da753d [SPARK-27610][YARN] Shade netty native libraries
## What changes were proposed in this pull request?

Fixed the `spark-<version>-yarn-shuffle.jar` artifact packaging to shade the native netty libraries:
- shade the `META-INF/native/libnetty_*` native libraries when packagin
the yarn shuffle service jar. This is required as netty library loader
derives that based on shaded package name.
- updated the `org/spark_project` shade package prefix to `org/sparkproject`
(i.e. removed underscore) as the former breaks the netty native lib loading.

This was causing the yarn external shuffle service to fail
when spark.shuffle.io.mode=EPOLL

## How was this patch tested?
Manual tests

Closes #24502 from amuraru/SPARK-27610_master.

Authored-by: Adi Muraru <amuraru@adobe.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-05-07 10:47:36 -07:00