## What changes were proposed in this pull request?
Re-introduced SparkR integration tests as part of the SparkR on K8S release. This PR awaits Jenkins availability.
## How was this patch tested?
This patch was tested with unit tests and integration tests.
Closes#22145 from ifilonenko/spark-r-with-tests.
Authored-by: Ilan Filonenko <if56@cornell.edu>
Signed-off-by: shane knapp <incomplete@gmail.com>
## What changes were proposed in this pull request?
Currently, if we run the Kubernetes integration tests with `SPARK_HOME` already set, it refers the `SPARK_HOME` even when `--spark-tgz` is specified.
This PR proposes to unset `SPARK_HOME` to let the docker-image-tool script detect `SPARK_HOME`. Otherwise, it cannot indicate the unpacked directory as its home.
## How was this patch tested?
```bash
export SPARK_HOME=`pwd`
dev/make-distribution.sh --pip --tgz -Phadoop-2.7 -Pkubernetes
resource-managers/kubernetes/integration-tests/dev/dev-run-integration-tests.sh --deploy-mode docker-for-desktop --spark-tgz $PWD/spark-*.tgz
```
**Before:**
```
+ /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/bin/docker-image-tool.sh -r docker.io/kubespark -t 650B51C8-BBED-47C9-AEAB-E66FC9A0E64E -p /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory
cp: assembly/target/scala-2.12/jars: No such file or directory
cp: resource-managers/kubernetes/integration-tests/tests: No such file or directory
cp: examples/target/scala-2.12/jars/*: No such file or directory
cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory
cp: resource-managers/kubernetes/docker/src/main/dockerfiles: No such file or directory
Cannot find docker image. This script must be run from a runnable distribution of Apache Spark.
...
[INFO] Spark Project Kubernetes Integration Tests ......... FAILURE [ 4.870 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
```
**After:**
```
+ /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/bin/docker-image-tool.sh -r docker.io/kubespark -t 2BA5883A-A0AC-4D2B-8D00-702D31B59B23 -p /.../spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked/kubernetes/dockerfiles/spark/bindings/python/Dockerfile build
Sending build context to Docker daemon 250.2MB
Step 1/15 : FROM openjdk:8-alpine
---> a3562aa0b991
...
Successfully built 8614fb5ac279
Successfully tagged kubespark/spark:2BA5883A-A0AC-4D2B-8D00-702D31B59B23
```
Closes#25283 from HyukjinKwon/SPARK-28550.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Speed up running k8s integration tests locally by allowing folks to skip the tgz dist build and extraction
Run tests locally without a distribution of Spark, just a local build
Closes#23380 from holdenk/SPARK-26343-Speed-up-running-the-kubernetes-integration-tests-locally.
Authored-by: Holden Karau <holden@pigscanfly.ca>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
## What changes were proposed in this pull request?
bin/docker-image-tool.sh tries to build all docker images (JVM, PySpark
and SparkR) by default. But not all spark distributions are built with
SparkR and hence this script will fail on such distros.
With this change, we make building alternate language binding docker images (PySpark and SparkR) optional. User has to specify dockerfile for those language bindings using -p and -R flags accordingly, to build the binding docker images.
## How was this patch tested?
Tested following scenarios.
*bin/docker-image-tool.sh -r <repo> -t <tag> build* --> Builds only JVM docker image (default behavior)
*bin/docker-image-tool.sh -r <repo> -t <tag> -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile build* --> Builds both JVM and PySpark docker images
*bin/docker-image-tool.sh -r <repo> -t <tag> -p kubernetes/dockerfiles/spark/bindings/python/Dockerfile -R kubernetes/dockerfiles/spark/bindings/R/Dockerfile build* --> Builds JVM, PySpark and SparkR docker images.
Author: Nagaram Prasad Addepally <ram@cloudera.com>
Closes#23053 from ramaddepally/SPARK-25957.
## What changes were proposed in this pull request?
Currently K8S integration tests are hardcoded to use a `minikube` based backend. `minikube` is VM based so can be resource hungry and also doesn't cope well with certain networking setups (for example using Cisco AnyConnect software VPN `minikube` is unusable as it detects its own IP incorrectly).
This PR Adds a new K8S integration testing backend that allows for using the Kubernetes support in [Docker for Desktop](https://blog.docker.com/2018/07/kubernetes-is-now-available-in-docker-desktop-stable-channel/). It also generalises the framework to be able to run the integration tests against an arbitrary Kubernetes cluster.
To Do:
- [x] General Kubernetes cluster backend
- [x] Documentation on Kubernetes integration testing
- [x] Testing of general K8S backend
- [x] Check whether change from timestamps being `Time` to `String` in Fabric 8 upgrade needs additional fix up
## How was this patch tested?
Ran integration tests with Docker for Desktop and all passed:
![screen shot 2018-10-23 at 14 19 56](https://user-images.githubusercontent.com/2104864/47363460-c5816a00-d6ce-11e8-9c15-56b34698e797.png)
Suggested Reviewers: ifilonenko srowen
Author: Rob Vesse <rvesse@dotnetrdf.org>
Closes#22805 from rvesse/SPARK-25809.
These tests were developed in the https://github.com/apache-spark-on-k8s/spark-integration repo
by several contributors. This is a copy of the current state into the main apache spark repo.
The only changes from the current spark-integration repo state are:
* Move the files from the repo root into resource-managers/kubernetes/integration-tests
* Add a reference to these tests in the root README.md
* Fix a path reference in dev/dev-run-integration-tests.sh
* Add a TODO in include/util.sh
## What changes were proposed in this pull request?
Incorporation of Kubernetes integration tests.
## How was this patch tested?
This code has its own unit tests, but the main purpose is to provide the integration tests.
I tested this on my laptop by running dev/dev-run-integration-tests.sh --spark-tgz ~/spark-2.4.0-SNAPSHOT-bin--.tgz
The spark-integration tests have already been running for months in AMPLab, here is an example:
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-scheduled-spark-integration-master/
Please review http://spark.apache.org/contributing.html before opening a pull request.
Author: Sean Suchter <sean-github@suchter.com>
Author: Sean Suchter <ssuchter@pepperdata.com>
Closes#20697 from ssuchter/ssuchter-k8s-integration-tests.