<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.12</artifactId>
<version>3.0.0-SNAPSHOT</version>
<relativePath>../../../pom.xml</relativePath>
</parent>
<artifactId>spark-kubernetes-integration-tests_2.12</artifactId>
<properties>
<download-maven-plugin.version>1.3.0</download-maven-plugin.version>
<exec-maven-plugin.version>1.4.0</exec-maven-plugin.version>
<extraScalaTestArgs></extraScalaTestArgs>
<kubernetes-client.version>4.6.4</kubernetes-client.version>
<scala-maven-plugin.version>3.2.2</scala-maven-plugin.version>
<scalatest-maven-plugin.version>1.0</scalatest-maven-plugin.version>
<sbt.project.name>kubernetes-integration-tests</sbt.project.name>
<!-- Integration Test Configuration Properties -->
<!-- Please see README.md in this directory for explanation of these -->
<spark.kubernetes.test.sparkTgz></spark.kubernetes.test.sparkTgz>
<spark.kubernetes.test.unpackSparkDir>${project.build.directory}/spark-dist-unpacked</spark.kubernetes.test.unpackSparkDir>
<spark.kubernetes.test.imageTag>N/A</spark.kubernetes.test.imageTag>
<spark.kubernetes.test.javaImageTag>8-jre-slim</spark.kubernetes.test.javaImageTag>
<spark.kubernetes.test.imageTagFile>${project.build.directory}/imageTag.txt</spark.kubernetes.test.imageTagFile>
<spark.kubernetes.test.deployMode>minikube</spark.kubernetes.test.deployMode>
<spark.kubernetes.test.imageRepo>docker.io/kubespark</spark.kubernetes.test.imageRepo>
<spark.kubernetes.test.kubeConfigContext></spark.kubernetes.test.kubeConfigContext>
<spark.kubernetes.test.master></spark.kubernetes.test.master>
<spark.kubernetes.test.namespace></spark.kubernetes.test.namespace>
<spark.kubernetes.test.serviceAccountName></spark.kubernetes.test.serviceAccountName>
<test.exclude.tags></test.exclude.tags>
<test.include.tags></test.include.tags>
</properties>
<packaging>jar</packaging>
<name>Spark Project Kubernetes Integration Tests</name>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${project.version}</version>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.fabric8</groupId>
<artifactId>kubernetes-client</artifactId>
<version>${kubernetes-client.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-tags_${scala.binary.version}</artifactId>
<type>test-jar</type>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk</artifactId>
<version>1.7.4</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>${exec-maven-plugin.version}</version>
<executions>
<execution>
<id>setup-integration-test-env</id>
<phase>pre-integration-test</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<executable>scripts/setup-integration-test-env.sh</executable>
<arguments>
<argument>--unpacked-spark-tgz</argument>
<argument>${spark.kubernetes.test.unpackSparkDir}</argument>
<argument>--image-repo</argument>
<argument>${spark.kubernetes.test.imageRepo}</argument>
<argument>--image-tag</argument>
<argument>${spark.kubernetes.test.imageTag}</argument>
<argument>--java-image-tag</argument>
<argument>${spark.kubernetes.test.javaImageTag}</argument>
<argument>--image-tag-output-file</argument>
<argument>${spark.kubernetes.test.imageTagFile}</argument>
<argument>--deploy-mode</argument>
<argument>${spark.kubernetes.test.deployMode}</argument>
<argument>--spark-tgz</argument>
<argument>${spark.kubernetes.test.sparkTgz}</argument>
</arguments>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<skipTests>true</skipTests>
</configuration>
</plugin>
<plugin>
<!-- Triggers scalatest plugin in the integration-test phase instead of
the test phase. -->
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>${scalatest-maven-plugin.version}</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<junitxml>.</junitxml>
<filereports>SparkTestSuite.txt</filereports>
<argLine>-ea -Xmx4g -XX:ReservedCodeCacheSize=1g ${extraScalaTestArgs}</argLine>
<stderr/>
<systemProperties>
<log4j.configuration>file:src/test/resources/log4j.properties</log4j.configuration>
<java.awt.headless>true</java.awt.headless>
<spark.kubernetes.test.imageTagFile>${spark.kubernetes.test.imageTagFile}</spark.kubernetes.test.imageTagFile>
<spark.kubernetes.test.unpackSparkDir>${spark.kubernetes.test.unpackSparkDir}</spark.kubernetes.test.unpackSparkDir>
<spark.kubernetes.test.imageRepo>${spark.kubernetes.test.imageRepo}</spark.kubernetes.test.imageRepo>
<spark.kubernetes.test.deployMode>${spark.kubernetes.test.deployMode}</spark.kubernetes.test.deployMode>
<spark.kubernetes.test.kubeConfigContext>${spark.kubernetes.test.kubeConfigContext}</spark.kubernetes.test.kubeConfigContext>
<spark.kubernetes.test.master>${spark.kubernetes.test.master}</spark.kubernetes.test.master>
<spark.kubernetes.test.namespace>${spark.kubernetes.test.namespace}</spark.kubernetes.test.namespace>
<spark.kubernetes.test.serviceAccountName>${spark.kubernetes.test.serviceAccountName}</spark.kubernetes.test.serviceAccountName>
<spark.kubernetes.test.jvmImage>${spark.kubernetes.test.jvmImage}</spark.kubernetes.test.jvmImage>
<spark.kubernetes.test.pythonImage>${spark.kubernetes.test.pythonImage}</spark.kubernetes.test.pythonImage>
<spark.kubernetes.test.rImage>${spark.kubernetes.test.rImage}</spark.kubernetes.test.rImage>
</systemProperties>
<tagsToExclude>${test.exclude.tags}</tagsToExclude>
<tagsToInclude>${test.include.tags}</tagsToInclude>
</configuration>
<executions>
<execution>
<id>test</id>
<phase>none</phase>
<goals>
<goal>test</goal>
</goals>
</execution>
<execution>
<id>integration-test</id>
<phase>integration-test</phase>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>