Commit graph

7 commits

Author SHA1 Message Date
Stavros Kontopoulos 20ce1a8f8b [SPARK-24551][K8S] Add integration tests for secrets
## What changes were proposed in this pull request?

- Adds integration tests for env and mount secrets.

## How was this patch tested?

Manually by checking that secrets were added to the containers and by tuning the tests.

![image](https://user-images.githubusercontent.com/7945591/42968472-fee3740a-8bab-11e8-9eac-573f67d861fc.png)

Author: Stavros Kontopoulos <stavros.kontopoulos@lightbend.com>

Closes #21652 from skonto/add-secret-its.
2018-07-20 07:55:58 -05:00
Devaraj K d404e54e64 [SPARK-24129][K8S] Add option to pass --build-arg's to docker-image-tool.sh
## What changes were proposed in this pull request?

Adding `-b arg` option to take `--build-arg` parameters to pass into the docker command

## How was this patch tested?

I verified by passing proxy details which fails without this change and succeeds with the changes.

Author: Devaraj K <devaraj@apache.org>

Closes #21202 from devaraj-kavali/SPARK-24129.
2018-07-18 16:18:29 -05:00
Ray Burgemeestre 15747cfd32 [SPARK-24547][K8S] Allow for building spark on k8s docker images without cache and don't forget to push spark-py container.
## What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-24547

TL;DR from JIRA issue:

- First time I generated images for 2.4.0 Docker was using it's cache, so actually when running jobs, old jars where still in the Docker image. This produces errors like this in the executors:

`java.io.InvalidClassException: org.apache.spark.storage.BlockManagerId; local class incompatible: stream classdesc serialVersionUID = 6155820641931972169, local class serialVersionUID = -3720498261147521051`

- The second problem was that the spark container is pushed, but the spark-py container wasn't yet. This was just forgotten in the initial PR.

- A third problem I also ran into because I had an older docker was https://github.com/apache/spark/pull/21551 so I have not included a fix for that in this ticket.

## How was this patch tested?

I've tested it on my own Spark on k8s deployment.

Author: Ray Burgemeestre <ray.burgemeestre@brightcomputing.com>

Closes #21555 from rayburgemeestre/SPARK-24547.
2018-06-20 17:09:37 -07:00
Ilan Filonenko 1a644afbac [SPARK-23984][K8S] Initial Python Bindings for PySpark on K8s
## What changes were proposed in this pull request?

Introducing Python Bindings for PySpark.

- [x] Running PySpark Jobs
- [x] Increased Default Memory Overhead value
- [ ] Dependency Management for virtualenv/conda

## How was this patch tested?

This patch was tested with

- [x] Unit Tests
- [x] Integration tests with [this addition](https://github.com/apache-spark-on-k8s/spark-integration/pull/46)
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- Run SparkPi with a test secret mounted into the driver and executor pods
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
Run completed in 4 minutes, 28 seconds.
Total number of tests run: 11
Suites: completed 2, aborted 0
Tests: succeeded 11, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Author: Ilan Filonenko <if56@cornell.edu>
Author: Ilan Filonenko <ifilondz@gmail.com>

Closes #21092 from ifilonenko/master.
2018-06-08 11:18:34 -07:00
Ilan Filonenko f15906da15 [SPARK-22839][K8S] Remove the use of init-container for downloading remote dependencies
## What changes were proposed in this pull request?

Removal of the init-container for downloading remote dependencies. Built off of the work done by vanzin in an attempt to refactor driver/executor configuration elaborated in [this](https://issues.apache.org/jira/browse/SPARK-22839) ticket.

## How was this patch tested?

This patch was tested with unit and integration tests.

Author: Ilan Filonenko <if56@cornell.edu>

Closes #20669 from ifilonenko/remove-init-container.
2018-03-19 11:29:56 -07:00
Jooseong Kim d5b41aea62 [SPARK-23618][K8S][BUILD] Initialize BUILD_ARGS in docker-image-tool.sh
## What changes were proposed in this pull request?

This change initializes BUILD_ARGS to an empty array when $SPARK_HOME/RELEASE exists.

In function build, "local BUILD_ARGS" effectively creates an array of one element where the first and only element is an empty string, so "${BUILD_ARGS[]}" expands to "" and passes an extra argument to docker.

Setting BUILD_ARGS to an empty array makes "${BUILD_ARGS[]}" expand to nothing.

## How was this patch tested?

Manually tested.

$ cat RELEASE
Spark 2.3.0 (git revision a0d7949896) built for Hadoop 2.7.3
Build flags: -Phadoop-2.7 -Phive -Phive-thriftserver -Pkafka-0-8 -Pmesos -Pyarn -Pkubernetes -Pflume -Psparkr -DzincPort=3036
$ ./bin/docker-image-tool.sh -m t testing build
Sending build context to Docker daemon  256.4MB
...

vanzin

Author: Jooseong Kim <jooseong@pinterest.com>

Closes #20791 from jooseong/SPARK-23618.
2018-03-12 11:31:34 -07:00
Marcelo Vanzin 0b2eefb674 [SPARK-22994][K8S] Use a single image for all Spark containers.
This change allows a user to submit a Spark application on kubernetes
having to provide a single image, instead of one image for each type
of container. The image's entry point now takes an extra argument that
identifies the process that is being started.

The configuration still allows the user to provide different images
for each container type if they so desire.

On top of that, the entry point was simplified a bit to share more
code; mainly, the same env variable is used to propagate the user-defined
classpath to the different containers.

Aside from being modified to match the new behavior, the
'build-push-docker-images.sh' script was renamed to 'docker-image-tool.sh'
to more closely match its purpose; the old name was a little awkward
and now also not entirely correct, since there is a single image. It
was also moved to 'bin' since it's not necessarily an admin tool.

Docs have been updated to match the new behavior.

Tested locally with minikube.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #20192 from vanzin/SPARK-22994.
2018-01-11 10:37:35 -08:00
Renamed from sbin/build-push-docker-images.sh (Browse further)