### What changes were proposed in this pull request?
Pull out NoOpMergedShuffleFileManager inner class outside. This is required since passing dollar sign ($) for the config (`spark.shuffle.server.mergedShuffleFileManagerImpl`) value can be an issue. Currently `spark.shuffle.server.mergedShuffleFileManagerImpl` is by default set to `org.apache.spark.network.shuffle.ExternalBlockHandler$NoOpMergedShuffleFileManager`. After this change the default value be set to `org.apache.spark.network.shuffle.NoOpMergedShuffleFileManager`
### Why are the changes needed?
Passing `$` for the config value can be an issue.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Modified existing unit tests.
Closes#33688 from venkata91/SPARK-36460.
Authored-by: Venkata krishnan Sowrirajan <vsowrirajan@linkedin.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
In stage-level resource scheduling, the allocated 3rd party resources can be obtained in TaskContext using resources() interface, however there is no API to get how many cpus are allocated for the task. Will add a cpus() interface to TaskContext to complement resources(). Althrough the task cpu requests can be got from profile, it's more convenient to get it inside the task code without the need to pass profile from driver side to the executor side.
### What changes were proposed in this pull request?
Add cpus() interface in TaskContext and modify relevant code.
### Why are the changes needed?
TaskContext has resources() to get 3rd party resources allocated. the is no API to get CPU allocated for the task.
### Does this PR introduce _any_ user-facing change?
Add cpus() interface for TaskContext
### How was this patch tested?
Unit tests
Closes#33385 from xwu99/taskcontext-cpus.
Lead-authored-by: Wu, Xiaochang <xiaochang.wu@intel.com>
Co-authored-by: Xiaochang Wu <xiaochang.wu@intel.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
### What changes were proposed in this pull request?
There are 3 ways to use Guava cache in spark code:
1. `Loadingcache` is the main way to use Guava cache in spark code and the key usages are as follows:
a. `LoadingCache` with `maximumsize` data eviction policy, such as `appCache` in `ApplicationCache`, `cache` in `Codegenerator`
b. `LoadingCache` with `maximumWeight` data eviction policy, such as `shuffleIndexCache` in `ExternalShuffleBlockResolver`
c. `LoadingCache` with 'expireAfterWrite' data eviction policy, such as `tableRelationCache` in `SessionCatalog`
2. `ManualCache` is another way to use Guava cache in spark code and the key usage is `cache` in `SharedInMemoryCache`, it use to caches partition file statuses in memory
3. The last use way is `hadoopJobMetadata` in `SparkEnv`, it uses Guava Cache to build a `soft-reference map`.
The goal of this pr is use `Caffeine` instead of `Guava Cache` because `Caffeine` is faster than `Guava Cache` from benchmarks, the main changes as follows:
1. Add `Caffeine` deps to maven `pom.xml`
2. Use `Caffeine` instead of Guava `LoadingCache`, `ManualCache` and soft-reference map in `SparkEnv`
3. Add `LocalCacheBenchmark` to compare performance of `Loadingcache` between `Guava Cache` and `Caffeine`
### Why are the changes needed?
`Caffeine` is faster than `Guava Cache` from benchmarks
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Pass the Jenkins or GitHub Action
- Add `LocalCacheBenchmark` to compare performance of `Loadingcache` between `Guava Cache` and `Caffeine`
Closes#31517 from LuciferYang/guava-cache-to-caffeine.
Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Holden Karau <hkarau@netflix.com>
### What changes were proposed in this pull request?
Switching back the "PVs with local storage" integration test on Docker driver.
I have analyzed why this test was failing on my machine (I hope the root cause of the problem is OS agnostic).
It failed because of the mounting of the host directory into the Minikube node using the `--uid=185` (Spark user user id):
```
$ minikube mount ${PVC_TESTS_HOST_PATH}:${PVC_TESTS_VM_PATH} --9p-version=9p2000.L --gid=0 --uid=185 &; MOUNT_PID=$!
```
Are referring to a nonexistent user. See the the number of occurence of 185 in "/etc/passwd":
```
$ minikube ssh "grep -c 185 /etc/passwd"
0
```
This leads to a permission denied. Skipping the `--uid=185` won't help although the path will listable before the test execution:
```
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
╰─$ 📁 Mounting host path /var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/tmp.k9X4Gecv into VM as /var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/tmp.k9X4Gecv ...
▪ Mount type:
▪ User ID: docker
▪ Group ID: 0
▪ Version: 9p2000.L
▪ Message Size: 262144
▪ Permissions: 755 (-rwxr-xr-x)
▪ Options: map[]
▪ Bind Address: 127.0.0.1:51740
🚀 Userspace file server: ufs starting
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
╰─$ minikube ssh "ls /var/folders/t_/fr_vqcyx23vftk81ftz1k5hw0000gn/T/tmp.k9X4Gecv"
╭─attilazsoltpirosapiros-MBP16 ~/git/attilapiros/spark ‹SPARK-35430*›
╰─$
```
But the test will fail and after its execution the `dmesg` shows the following error:
```
[13670.493359] bpfilter: Loaded bpfilter_umh pid 66153
[13670.493363] bpfilter: write fail -32
[13670.530737] bpfilter: Loaded bpfilter_umh pid 66155
...
```
This `bpfilter` is a firewall module and we are back to a permission denied when we want to list the mounted directory.
The solution is to add a spark user with 185 uid when the minikube is started.
**So this must be added to Jenkins job (and the mount should use --gid=0 --uid=185)**:
```
$ minikube ssh "sudo useradd spark -u 185 -g 0 -m -s /bin/bash"
```
### Why are the changes needed?
This integration test is needed to validate the PVs feature.
### Does this PR introduce _any_ user-facing change?
No. It is just testing.
### How was this patch tested?
Running the test locally:
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
```
The "PVs with local storage" was successful but the next test `Launcher client dependencies` the minio stops the test executions on Mac (only on Mac):
```
21/06/29 04:33:32.449 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: 🏃 Starting tunnel for service minio-s3.
21/06/29 04:33:33.425 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: |----------------------------------|----------|-------------|------------------------|
21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: | NAMESPACE | NAME | TARGET PORT | URL |
21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: |----------------------------------|----------|-------------|------------------------|
21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: | 7855c37ca34340c49a98aa8439f4935c | minio-s3 | | http://127.0.0.1:62138 |
21/06/29 04:33:33.426 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: |----------------------------------|----------|-------------|------------------------|
21/06/29 04:33:33.449 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: http://127.0.0.1:62138
21/06/29 04:33:33.449 ScalaTest-main-running-KubernetesSuite INFO ProcessUtils: ❗ Because you are using a Docker driver on darwin, the terminal needs to be open to run it.
```
This is a different problem which is a docker desktop limitation (https://docs.docker.com/docker-for-mac/networking/#per-container-ip-addressing-is-not-possible).
Of course with the default driver on Mac, on hyperkit, all the tests are passing:
```
[INFO] --- scalatest-maven-plugin:2.0.0:test (integration-test) spark-kubernetes-integration-tests_2.12 ---
Discovery starting.
Discovery completed in 498 milliseconds.
Run starting. Expected test count is: 26
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
...
[INFO] BUILD SUCCESS
```
Closes#32793 from attilapiros/SPARK-35430.
Authored-by: attilapiros <piros.attila.zsolt@gmail.com>
Signed-off-by: shane knapp <incomplete@gmail.com>
### What changes were proposed in this pull request?
This PR aims to add a new config to allow K8s API server-side caching for pod listing.
### Why are the changes needed?
Apache Spark currently requests the most recent data which should be consistent. New configuration looses the restriction to reduce the server-side overhead by allowing K8S API server side caching.
https://kubernetes.io/docs/reference/using-api/api-concepts/#the-resourceversion-parameter
- `resourceVersion`: unset
> Most Recent: Return data at the most recent resource version. The returned data must be consistent (i.e. served from etcd via a quorum read).
- `resourceVersion`: "0"
> Any: Return data at any resource version. The newest available resource version is preferred, but strong consistency is not required; data at any resource version may be served.
### Does this PR introduce _any_ user-facing change?
Yes, this is a new feature to reduce the K8s API server side overhead.
### How was this patch tested?
Pass the CIs.
Closes#33563 from dongjoon-hyun/SPARK-36334.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
`ExternalBlockHandler` exposes 4 metrics which are Dropwizard `Timer` metrics, and are named with a `millis` suffix:
```
private final Timer openBlockRequestLatencyMillis = new Timer();
private final Timer registerExecutorRequestLatencyMillis = new Timer();
private final Timer fetchMergedBlocksMetaLatencyMillis = new Timer();
private final Timer finalizeShuffleMergeLatencyMillis = new Timer();
```
However these Dropwizard Timers by default use nanoseconds ([documentation](https://metrics.dropwizard.io/3.2.3/getting-started.html#timers)).
This causes `YarnShuffleServiceMetrics` to expose confusingly-named metrics like `openBlockRequestLatencyMillis_nanos_max` (the actual values are currently in nanos).
This PR adds a new `Timer` subclass, `TimerWithCustomTimeUnit`, which accepts a `TimeUnit` at creation time and exposes timing information using this time unit when values are read. Internally, values are still stored with nanosecond-level precision. The `Timer` metrics within `ExternalBlockHandler` are updated to use the new class with milliseconds as the unit. The logic to include the `nanos` suffix in the metric name within `YarnShuffleServiceMetrics` has also been removed, with the assumption that the metric name itself includes the units.
### Does this PR introduce _any_ user-facing change?
Yes, there are two changes.
First, the names for metrics exposed by `ExternalBlockHandler` via `YarnShuffleServiceMetrics` such as `openBlockRequestLatencyMillis_nanos_max` and `openBlockRequestLatencyMillis_nanos_50thPercentile` have been changed to remove the `_nanos` suffix. This would be considered a breaking change, but these names were only exposed as part of #32388, which has not yet been released (slated for 3.2.0). New names are like `openBlockRequestLatencyMillis_max` and `openBlockRequestLatencyMillis_50thPercentile`
Second, the values of the metrics themselves have changed, to expose milliseconds instead of nanoseconds. Note that this does not affect metrics such as `openBlockRequestLatencyMillis_count` or `openBlockRequestLatencyMillis_rate1`, only the `Snapshot`-related metrics (`max`, `median`, percentiles, etc.). For the YARN case, these metrics were also introduced by #32388, and thus also have not yet been released. It was possible for the nanosecond values to be consumed by some other metrics reporter reading the Dropwizard metrics directly, but I'm not aware of any such usages.
### How was this patch tested?
Unit tests have been updated.
Closes#33116 from xkrogen/xkrogen-SPARK-35259-ess-fix-metric-unit-prefix.
Authored-by: Erik Krogen <xkrogen@apache.org>
Signed-off-by: yi.wu <yi.wu@databricks.com>
### What changes were proposed in this pull request?
Add a new configuration flag to allow Spark to provide hints to the scheduler when we are decommissioning or exiting a pod that this pod will have the least impact for a pre-emption event.
### Why are the changes needed?
Kubernetes added the concepts of pod disruption budgets (which can have selectors based on labels) as well pod deletion for providing hints to the scheduler as to what we would prefer to have pre-empted.
### Does this PR introduce _any_ user-facing change?
New configuration flag
### How was this patch tested?
The deletion unit test was extended.
Closes#33270 from holdenk/SPARK-35956-support-auto-assigning-labels-to-decommissioning-pods.
Lead-authored-by: Holden Karau <hkarau@netflix.com>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Co-authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Holden Karau <hkarau@netflix.com>
### What changes were proposed in this pull request?
Add the support for specifiying executor/driver node selector:
- spark.kubernetes.driver.node.selector.
- spark.kubernetes.executor.node.selector.
### Why are the changes needed?
Now we can only use "spark.kubernetes.node.selector" to set lable for executor/driver. Sometimes, we need set executor/driver pods to different selector separately.
### Does this PR introduce _any_ user-facing change?
Yes
### How was this patch tested?
- KubernetesConfSuite for new added configure
- BasicDriverFeatureStepSuite to make sure driver pods node selector set properly
- BasicExecutorFeatureStepSuite to make sure excutor pods node selector set properly
Closes#33283 from Yikun/SPARK-36075.
Authored-by: Yikun Jiang <yikunkero@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
```logtalk
21/05/20 21:41:21 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying snapshot subscriber.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://kubernetes.docker.internal:6443/api/v1/namespaces/default/pods. Message: Pod "spark_exec-exec-688" is invalid: [metadata.name: Invalid value: "spark_exec-exec-688": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), spec.hostname: Invalid value: "spark_exec-exec-688": a DNS-1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')]. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[StatusCause(field=metadata.name, message=Invalid value: "spark_exec-exec-688": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), reason=FieldValueInvalid, additionalProperties={}), StatusCause(field=spec.hostname, message=Invalid value: "spark_exec-exec-688": a DNS-1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?'), reason=FieldValueInvalid, additionalProperties={})], group=null, kind=Pod, name=spark_exec-exec-688, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Pod "spark_exec-exec-688" is invalid: [metadata.name: Invalid value: "spark_exec-exec-688": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*'), spec.hostname: Invalid value: "spark_exec-exec-688": a DNS-1123 label must consist of lower case alphanumeric characters or '-', and must start and end with an alphanumeric character (e.g. 'my-name', or '123-abc', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?')], metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:583)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:522)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:487)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:448)
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:263)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:870)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:365)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:86)
```
When `spark.kubernetes.executor.podNamePrefix` contains invalid characters, the driver will continuously fail to request executors from k8s master, which causes the app to hang with the above message - `'Message: Pod "spark_exec-exec-688" is invalid'`.
In this PR we fail the app when the setting is wrong.
### Why are the changes needed?
`spark.kubernetes.executor.podNamePrefix` is used when users may want full control of executor pod names. It will hang apps w/ wrong characters, it's better to fail directly.
### Does this PR introduce _any_ user-facing change?
yes, invalid `spark.kubernetes.executor.podNamePrefix` cause app to fail not to hang.
### How was this patch tested?
new tests
Closes#32610 from yaooqinn/SPARK-35460.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Use uuid instead of `System. currentTimeMillis` as app id in kubernetes client mode.
### Why are the changes needed?
Currently, spark on kubernetes with client mode would use `"spark-application-" + System.currentTimeMillis` as app id by default. It would cause app id conflict if submit several spark applications to kubernetes cluster in a short time.
Unfortunately, the event log use app id as the file name. With the conflict event log file, the exception was thrown.
```
Caused by: java.io.FileNotFoundException: File does not exist: xxx/spark-application-1624766876324.lz4.inprogress (inode 5984170846) Holder does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2697)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.analyzeFileState(FSDirWriteFileOp.java:521)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.validateAddBlock(FSDirWriteFileOp.java:161)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2579)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:846)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:510)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817)
```
### Does this PR introduce _any_ user-facing change?
yes
### How was this patch tested?
manual test
![image](https://user-images.githubusercontent.com/12025282/124435341-7a88e180-dda7-11eb-8e62-bdfec6a0ee3b.png)
Closes#33211 from ulysses-you/k8s-appid.
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This PR fixes an issue that `YarnClusterSuite` fails due to `NoClassDefFoundError unless `hadoop-3.2` profile is activated explicitly regardless of building with SBT or Maven.
```
build/sbt -Pyarn "yarn/testOnly org.apache.spark.deploy.yarn.YarnClusterSuite"
...
[info] YarnClusterSuite:
[info] org.apache.spark.deploy.yarn.YarnClusterSuite *** ABORTED *** (598 milliseconds)
[info] java.lang.NoClassDefFoundError: org/bouncycastle/operator/OperatorCreationException
[info] at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:888)
[info] at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
[info] at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1410)
[info] at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:344)
[info] at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.initResourceManager(MiniYARNCluster.java:359)
```
The solution is modifying `yarn/pom.xml` to activate `hadoop-3.2` profiles by default.
### Why are the changes needed?
hadoop-3.2 profile should be enabled by default so `YarnClusterSuite` should also successfully finishes without `-Phadoop-3.2`.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Run `YarnClusterSuite` with both SBT and Maven without `-Phadoop-3.2` and it successfully finished.
```
build/sbt -Pyarn "yarn/testOnly org.apache.spark.deploy.yarn.YarnClusterSuite"
...
[info] Run completed in 5 minutes, 38 seconds.
[info] Total number of tests run: 27
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
build/mvn -Pyarn -pl resource-managers/yarn test -Dtest=none -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite
...
Run completed in 5 minutes, 49 seconds.
Total number of tests run: 27
Suites: completed 2, aborted 0
Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```
Closes#33276 from sarutak/fix-bouncy-issue.
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This PR aims to update `master` branch version to 3.3.0-SNAPSHOT.
### Why are the changes needed?
Start to prepare Apache Spark 3.3.0 and the published snapshot version should not conflict with `branch-3.2`.
### Does this PR introduce _any_ user-facing change?
N/A.
### How was this patch tested?
Pass the CIs.
Closes#33196 from dongjoon-hyun/SPARK-35996.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
By default, the executor pod prefix is generated by the app name. It handles characters that match [^a-z0-9\\-] differently. The '.' and all whitespaces will be converted to '-', but other ones to empty string. Especially, characters like '_', '|' are commonly used as a word separator in many languages.
According to the K8S DNS Label Names, see https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#dns-label-names, we can convert all special characters to `-`.
For example,
```
scala> "xyz_abc_i_am_a_app_name_w/_some_abbrs".replaceAll("[^a-z0-9\\-]", "-").replaceAll("-+", "-")
res11: String = xyz-abc-i-am-a-app-name-w-some-abbrs
scala> "xyz_abc_i_am_a_app_name_w/_some_abbrs".replaceAll("\\s+", "-").replaceAll("\\.", "-").replaceAll("[^a-z0-9\\-]", "").replaceAll("-+", "-")
res12: String = xyzabciamaappnamewsomeabbrs
```
```scala
scala> "time.is%the¥most$valuable_——————thing,it's about time.".replaceAll("[^a-z0-9\\-]", "-").replaceAll("-+", "-")
res9: String = time-is-the-most-valuable-thing-it-s-about-time-
scala> "time.is%the¥most$valuable_——————thing,it's about time.".replaceAll("\\s+", "-").replaceAll("\\.", "-").replaceAll("[^a-z0-9\\-]", "").replaceAll("-+", "-")
res10: String = time-isthemostvaluablethingits-about-time-
```
### Why are the changes needed?
For better UX
### Does this PR introduce _any_ user-facing change?
yes, the executor pod name might look better
### How was this patch tested?
add new ones
Closes#33171 from yaooqinn/SPARK-35969.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This adds two new additional metrics to `ExternalBlockHandler`:
- `blockTransferRate` -- for indicating the rate of transferring blocks, vs. the data within them
- `blockTransferAvgSize_1min` -- a 1-minute trailing average of block sizes transferred by the ESS
Additionally, this enhances `YarnShuffleServiceMetrics` to expose the histogram/`Snapshot` information from `Timer` metrics within `ExternalBlockHandler`.
### Why are the changes needed?
Currently `ExternalBlockHandler` exposes some useful metrics, but is lacking around metrics for the rate of block transfers. We have `blockTransferRateBytes` to tell us the rate of _bytes_, but no metric to tell us the rate of _blocks_, which is especially relevant when running the ESS on HDDs that are sensitive to random reads. Many small block transfers can have a negative impact on performance, but won't show up as a spike in `blockTransferRateBytes` since the sizes are small. Thus the new metrics to show information around average block size and block transfer rate are very useful to monitor the health/performance of the ESS, especially when running on HDDs.
For the `YarnShuffleServiceMetrics`, currently the three `Timer` metrics exposed by `ExternalBlockHandler` are being underutilized in a YARN-based environment -- they are basically treated as a `Meter`, only exposing rate-based information, when the metrics themselves are collected detailed histograms of timing information. We should expose this information for better observability.
### Does this PR introduce _any_ user-facing change?
Yes, there are two entirely new metrics for the ESS, as documented in `monitoring.md`. Additionally in a YARN environment, `Timer` metrics exposed by the ESS will include more rich timing information.
### How was this patch tested?
New unit tests are added to verify that new metrics are showing up as expected.
We have been running this patch internally for approx. 1 year and have found it to be useful for monitoring the health of ESS and diagnosing performance issues.
Closes#32388 from xkrogen/xkrogen-SPARK-35258-ess-new-metrics.
Authored-by: Erik Krogen <xkrogen@apache.org>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
### What changes were proposed in this pull request?
Refactor the logic for constructing the user classpath from `yarn.ApplicationMaster` into `yarn.Client` so that it can be leveraged on the executor side as well, instead of having the driver construct it and pass it to the executor via command-line arguments. A new method, `getUserClassPath`, is added to `CoarseGrainedExecutorBackend` which defaults to `Nil` (consistent with the existing behavior where non-YARN resource managers do not configure the user classpath). `YarnCoarseGrainedExecutorBackend` overrides this to construct the user classpath from the existing `APP_JAR` and `SECONDARY_JARS` configs.
### Why are the changes needed?
User-provided JARs are made available to executors using a custom classloader, so they do not appear on the standard Java classpath. Instead, they are passed as a list to the executor which then creates a classloader out of the URLs. Currently in the case of YARN, this list of JARs is crafted by the Driver (in `ExecutorRunnable`), which then passes the information to the executors (`CoarseGrainedExecutorBackend`) by specifying each JAR on the executor command line as `--user-class-path /path/to/myjar.jar`. This can cause extremely long argument lists when there are many JARs, which can cause the OS argument length to be exceeded, typically manifesting as the error message:
> /bin/bash: Argument list too long
A [Google search](https://www.google.com/search?q=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22&oq=spark%20%22%2Fbin%2Fbash%3A%20argument%20list%20too%20long%22) indicates that this is not a theoretical problem and afflicts real users, including ours. Passing this list using the configurations instead resolves this issue.
### Does this PR introduce _any_ user-facing change?
No, except for fixing the bug, allowing for larger JAR lists to be passed successfully. Configuration of JARs is identical to before.
### How was this patch tested?
New unit tests were added in `YarnClusterSuite`. Also, we have been running a similar fix internally for 4 months with great success.
Closes#32810 from xkrogen/xkrogen-SPARK-35672-classpath-scalable.
Authored-by: Erik Krogen <xkrogen@apache.org>
Signed-off-by: Thomas Graves <tgraves@apache.org>
### What changes were proposed in this pull request?
This PR aims to use `keyserver.ubuntu.com` as a keyserver for CRAN.
### Why are the changes needed?
Currently, both servers fail and K8s IT fails at SparkR image building phase.
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20K8s%20Builds/job/spark-master-test-k8s/801/console
```
$ docker run -it --rm openjdk:11 /bin/bash
root3e89a8d05378:/# echo "deb http://cloud.r-project.org/bin/linux/debian buster-cran35/" >> /etc/apt/sources.list
root3e89a8d05378:/# (apt-key adv --keyserver keys.gnupg.net --recv-key 'E19F5F87128899B192B1A2C2AD5F960A256A04AF' || apt-key adv --keyserver keys.openpgp.org --recv-key 'E19F5F87128899B192B1A2C2AD5F960A256A04AF')
Executing: /tmp/apt-key-gpghome.8lNIiUuhoE/gpg.1.sh --keyserver keys.gnupg.net --recv-key E19F5F87128899B192B1A2C2AD5F960A256A04AF
gpg: keyserver receive failed: No name
Executing: /tmp/apt-key-gpghome.stxb8XUlx8/gpg.1.sh --keyserver keys.openpgp.org --recv-key E19F5F87128899B192B1A2C2AD5F960A256A04AF
gpg: key AD5F960A256A04AF: new key but contains no user ID - skipped
gpg: Total number processed: 1
gpg: w/o user IDs: 1
root3e89a8d05378:/# apt-get update
...
Err:3 http://cloud.r-project.org/bin/linux/debian buster-cran35/ InRelease
The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FCAE2A0E115C3D8A
...
W: GPG error: http://cloud.r-project.org/bin/linux/debian buster-cran35/ InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY FCAE2A0E115C3D8A
E: The repository 'http://cloud.r-project.org/bin/linux/debian buster-cran35/ InRelease' is not signed.
N: Updating from such a repository can't be done securely, and is therefore disabled by default.
N: See apt-secure(8) manpage for repository creation and user configuration details.
```
`keyserver.ubuntu.com` is a recommended backup server in CRAN document.
- http://cloud.r-project.org/bin/linux/debian/
```
$ docker run -it --rm openjdk:11 /bin/bash
rootc9b183e45ffe:/# echo "deb http://cloud.r-project.org/bin/linux/debian buster-cran35/" >> /etc/apt/sources.list
rootc9b183e45ffe:/# apt-key adv --keyserver keyserver.ubuntu.com --recv-key 'E19F5F87128899B192B1A2C2AD5F960A256A04AF'
Executing: /tmp/apt-key-gpghome.P6cxYkOge7/gpg.1.sh --keyserver keyserver.ubuntu.com --recv-key E19F5F87128899B192B1A2C2AD5F960A256A04AF
gpg: key AD5F960A256A04AF: public key "Johannes Ranke (Wissenschaftlicher Berater) <johannes.rankejrwb.de>" imported
gpg: Total number processed: 1
gpg: imported: 1
rootc9b183e45ffe:/# apt-get update
Get:1 http://deb.debian.org/debian buster InRelease [122 kB]
Get:2 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
Get:3 http://cloud.r-project.org/bin/linux/debian buster-cran35/ InRelease [4375 B]
Get:4 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
Get:5 http://cloud.r-project.org/bin/linux/debian buster-cran35/ Packages [53.3 kB]
Get:6 http://security.debian.org/debian-security buster/updates/main arm64 Packages [287 kB]
Get:7 http://deb.debian.org/debian buster/main arm64 Packages [7735 kB]
Get:8 http://deb.debian.org/debian buster-updates/main arm64 Packages [14.5 kB]
Fetched 8334 kB in 2s (4537 kB/s)
Reading package lists... Done
```
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass K8s IT Jenkins SparkR image building. Or, manually do the following.
```
$ bin/docker-image-tool.sh -R kubernetes/dockerfiles/spark/bindings/R/Dockerfile build
```
Closes#33071 from dongjoon-hyun/SPARK-35885.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Improve error message when clients use wrong master URL to submit a job to k8s.
### Why are the changes needed?
Current error messages are not clear for users.
```
(base) ➜ spark git:(master) ./bin/spark-submit \
--master k8s://https://192.168.49.3:8443 \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=3 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=pingsutw/spark:testing \
local:///opt/spark/examples/jars/spark-examples_2.12-3.2.0-SNAPSHOT.jar
21/06/09 20:50:37 WARN Utils: Your hostname, kobe-pc resolves to a loopback address: 127.0.1.1; using 192.168.103.20 instead (on interface ens160)
21/06/09 20:50:37 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
21/06/09 20:50:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/06/09 20:50:38 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
21/06/09 20:50:39 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create] for kind: [Pod] with name: [null] in namespace: [default] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64) at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72) at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:380) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:380)
at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:86) at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:86)
```
Below command to reproduce;
```
./bin/spark-submit \
--master k8s://https://192.168.49.2:8443 \
--deploy-mode cluster \
--name spark-pi \
--class org.apache.spark.examples.SparkPi \
--conf spark.executor.instances=3 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.container.image=pingsutw/spark:testing \
local:///opt/spark/examples/jars/spark-examples_2.12-3.2.0-SNAPSHOT.jar
```
### Does this PR introduce _any_ user-facing change?
Yes, users will see more clear error messages.
### How was this patch tested?
Pass the CIs.
Closes#32874 from pingsutw/SPARK-35699.
Authored-by: Kevin Su <pingsutw@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This PR aims to be more robust on the underlying Hadoop library changes. Apache Spark's `copyFileToRemote` has an option, `force`, to invoke copying always and it can hit `org.apache.hadoop.fs.PathOperationException` in some Hadoop versions.
From Apache Hadoop 3.3.1, we reverted [HADOOP-16878](https://issues.apache.org/jira/browse/HADOOP-16878) as the last revert commit on `branch-3.3.1`. However, it's still in Apache Hadoop 3.4.0.
- a3b9c37a39
### Why are the changes needed?
Currently, Apache Spark Jenkins hits a flakiness issue.
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2/lastCompletedBuild/testReport/org.apache.spark.deploy.yarn/ClientSuite/distribute_jars_archive/history/
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-3.2-jdk-11/2459/testReport/junit/org.apache.spark.deploy.yarn/ClientSuite/distribute_jars_archive/
```
org.apache.hadoop.fs.PathOperationException:
`Source (file:/home/jenkins/workspace/spark-master-test-maven-hadoop-3.2/resource-managers/yarn/target/tmp/spark-703b8e99-63cc-4ba6-a9bc-25c7cae8f5f9/testJar9120517778809167117.jar) and destination (/home/jenkins/workspace/spark-master-test-maven-hadoop-3.2/resource-managers/yarn/target/tmp/spark-703b8e99-63cc-4ba6-a9bc-25c7cae8f5f9/testJar9120517778809167117.jar)
are equal in the copy command.': Operation not supported
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:403)
```
Apache Spark has three cases.
- `!compareFs(srcFs, destFs)`: This is safe because we will not have this exception.
- `"file".equals(srcFs.getScheme)`: This is safe because this cannot be a `false` alarm.
- `force=true`:
- For the `good` alarm part, Spark works in the same way.
- For the `false` alarm part, Spark is safe because we use `force = true` only for copying `localConfArchive` instead of a general copy between two random clusters.
```scala
val localConfArchive = new Path(createConfArchive(confsToOverride).toURI())
copyFileToRemote(destDir, localConfArchive, replication, symlinkCache, force = true,
destName = Some(LOCALIZED_CONF_ARCHIVE))
```
### Does this PR introduce _any_ user-facing change?
No. This preserves the previous Apache Spark behavior.
### How was this patch tested?
Pass the Jenkins with Maven.
Closes#32983 from dongjoon-hyun/SPARK-35831.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
### What changes were proposed in this pull request?
This adds support in the ESS to serve merged shuffle block meta and data requests to executors.
This change is needed for fetching remote merged shuffle data from the remote shuffle services. This is part of push-based shuffle SPIP [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).
This change introduces new messages between clients and the external shuffle service:
1. `MergedBlockMetaRequest`: The client sends this to external shuffle to get the meta information for a merged block. The response to this is one of these :
- `MergedBlockMetaSuccess` : contains request id, number of chunks, and a `ManagedBuffer` which is a `FileSegmentBuffer` backed by the merged block meta file.
- `RpcFailure`: this is sent back to client in case of failure. This is an existing message.
2. `FetchShuffleBlockChunks`: This is similar to `FetchShuffleBlocks` message but it is to fetch merged shuffle chunks instead of blocks.
### Why are the changes needed?
These changes are needed for push-based shuffle. Refer to the SPIP in [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Added unit tests.
The reference PR with the consolidated changes covering the complete implementation is also provided in [SPARK-30602](https://issues.apache.org/jira/browse/SPARK-30602).
We have already verified the functionality and the improved performance as documented in the SPIP doc.
Lead-authored-by: Chandni Singh chsinghlinkedin.com
Co-authored-by: Min Shen mshenlinkedin.com
Closes#32811 from otterc/SPARK-35671.
Lead-authored-by: Chandni Singh <singh.chandni@gmail.com>
Co-authored-by: Min Shen <mshen@linkedin.com>
Co-authored-by: Chandni Singh <chsingh@linkedin.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
### What changes were proposed in this pull request?
To make the test suite more robust, this PR aims to add a new trait, `LocalRootDirsTest`, by refactoring `SortShuffleSuite`'s helper functions and applying it to the following:
- ShuffleNettySuite
- ShuffleOldFetchProtocolSuite
- ExternalShuffleServiceSuite
- KubernetesLocalDiskShuffleDataIOSuite
- LocalDirsSuite
- RDDCleanerSuite
- ALSCleanerSuite
In addition, this fixes a UT in `KubernetesLocalDiskShuffleDataIOSuite`.
### Why are the changes needed?
`ShuffleSuite` is extended by four classes but only `SortShuffleSuite` does the clean-up correctly.
```
ShuffleSuite
- SortShuffleSuite
- ShuffleNettySuite
- ShuffleOldFetchProtocolSuite
- ExternalShuffleServiceSuite
```
Since `KubernetesLocalDiskShuffleDataIOSuite` is looking for the other storage directory, the leftover of `ShuffleSuite` causes flakiness.
- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/2649/testReport/junit/org.apache.spark.shuffle/KubernetesLocalDiskShuffleDataIOSuite/recompute_is_not_blocked_by_the_recovery/
```
org.apache.spark.SparkException: Job aborted due to stage failure: task 0.0 in stage 1.0 (TID 3) had a not serializable result: org.apache.spark.ShuffleSuite$NonJavaSerializableClass
...
org.apache.spark.shuffle.KubernetesLocalDiskShuffleDataIOSuite.$anonfun$new$2(KubernetesLocalDiskShuffleDataIOSuite.scala:52)
```
For the other suites, the clean-up implementation is used but not complete. So, they are refactored to use new trait.
### Does this PR introduce _any_ user-facing change?
No, this is a test-only change.
### How was this patch tested?
Pass the CIs.
Closes#32986 from dongjoon-hyun/SPARK-35832.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This increases the timeout from 10 seconds to 60 seconds in KubernetesLocalDiskShuffleDataIOSuite to reduce the flakiness.
### Why are the changes needed?
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140003/testReport/
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs
Closes#32967 from dongjoon-hyun/SPARK-35593-2.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
### What changes were proposed in this pull request?
This upgrade default Hadoop version from 3.2.1 to 3.3.1. The changes here are simply update the version number and dependency file.
### Why are the changes needed?
Hadoop 3.3.1 just came out, which comes with many client-side improvements such as for S3A/ABFS (20% faster when accessing S3). These are important for users who want to use Spark in a cloud environment.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Existing unit tests in Spark
- Manually tested using my S3 bucket for event log dir:
```
bin/spark-shell \
-c spark.hadoop.fs.s3a.access.key=$AWS_ACCESS_KEY_ID \
-c spark.hadoop.fs.s3a.secret.key=$AWS_SECRET_ACCESS_KEY \
-c spark.eventLog.enabled=true
-c spark.eventLog.dir=s3a://<my-bucket>
```
- Manually tested against docker-based YARN dev cluster, by running `SparkPi`.
Closes#30135 from sunchao/SPARK-29250.
Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Print the driver pod name instead of Some(name) if absent
### What changes were proposed in this pull request?
### Why are the changes needed?
fix error hint
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
new test
Closes#32889 from yaooqinn/minork8s.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Previously, the following two commits allow driver-owned on-demand PVC reuse.
- SPARK-35182 Support driver-owned on-demand PVC
- SPARK-35416 Support PersistentVolumeClaim Reuse
This PR aims to recover the shuffle data on those remounted PVCs. The lifecycle of PVCs are tied to the one of Spark jobs. Since this is K8s specific feature, `ShuffleDataIO` plugin is used.
### Why are the changes needed?
Although Pod is killed, we can remount PVCs and recover some data from it.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the newly added test cases.
Closes#32730 from dongjoon-hyun/SPARK-RECOVER-SHUFFLE-DATA.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
AtomicInteger is enough for executor ids, in this PR, we use it to replace AtomicLong like other cluster managers, e.g. yarn, standalone
### Why are the changes needed?
See the discussion here https://github.com/apache/spark/pull/32610#discussion_r648007320
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
pass CI with existing tests
Closes#32837 from yaooqinn/SPARK-35692.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
A follow-up for SPARK-32975 to avoid unexpected the `None.get` exception
Run SparkPi with docker desktop, as podName is an option, we will got
```logtalk
21/06/09 01:09:12 ERROR Utils: Uncaught exception in thread main
java.util.NoSuchElementException: None.get
at scala.None$.get(Option.scala:529)
at scala.None$.get(Option.scala:527)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$1(ExecutorPodsAllocator.scala:110)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1417)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.start(ExecutorPodsAllocator.scala:111)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.start(KubernetesClusterSchedulerBackend.scala:99)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:220)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:581)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2686)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:948)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:942)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:30)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```
### Why are the changes needed?
fix a regression
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
Manual.
Closes#32830 from yaooqinn/SPARK-32975.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Add a new config that controls the timeout of waiting for driver pod's readiness before allocating executor pods. This wait only happens once on application start.
### Why are the changes needed?
The driver's headless service can be resolved by DNS only after the driver pod is ready. If the executor tries to connect to the headless service before driver pod is ready, it will hit UnkownHostException and get into error state but will not be restarted. **This case usually happens when the driver pod has sidecar containers but hasn't finished their creation when executors start.** So basically there is a race condition. This issue can be mitigated by tweaking this config.
### Does this PR introduce _any_ user-facing change?
A new config `spark.kubernetes.allocation.driver.readinessTimeout` added.
### How was this patch tested?
Exisiting tests.
Closes#32752 from cchriswu/SPARK-32975-fix.
Lead-authored-by: Chris Wu <wucaowei19@gmail.com>
Co-authored-by: Chris Wu <wcaowei@vmware.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This is a follow-up of https://github.com/apache/spark/pull/32564 .
### Why are the changes needed?
To use Set instead of ArrayBuffer and add a return type.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs.
Closes#32758 from dongjoon-hyun/SPARK-35416-2.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
### What changes were proposed in this pull request?
This PR set a default value for `spark.kubernetes.test.sparkTgz` in `kubernetes/integration-tests/pom.xml` for Kubernetes integration tests.
### Why are the changes needed?
In the current master, running the integration tests with the following command will fail because there is no default value set for the property.
```
build/mvn -Dspark.kubernetes.test.namespace=default -Pkubernetes -Pkubernetes-integration-tests -Psparkr -pl resource-managers/kubernetes/integration-tests integration-test
```
```
+ mkdir -p /home/kou/work/oss/spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked
+ tar -xzvf --test-exclude-tags --strip-components=1 -C /home/kou/work/oss/spark/resource-managers/kubernetes/integration-tests/target/spark-dist-unpacked
tar (child): --test-exclude-tags: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
[ERROR] Command execution failed.
```
According to `setup-integration-test-env.sh`, `N/A` is intended as the default value so this PR choose it.
```
SPARK_TGZ="N/A"
MVN="$TEST_ROOT_DIR/build/mvn"
EXCLUDE_TAGS=""
```
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Build and tests successfully finish with the command shown above.
Closes#32722 from sarutak/fix-pom-for-kube-integ.
Authored-by: Kousuke <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
`spark.blockManager.port` does not work for k8s driver pods now, we should make it work as other cluster managers.
### Why are the changes needed?
`spark.blockManager.port` should be able to work for spark driver pod
### Does this PR introduce _any_ user-facing change?
yes, `spark.blockManager.port` will be respect iff it is present && `spark.driver.blockManager.port` is absent
### How was this patch tested?
new tests
Closes#32639 from yaooqinn/SPARK-35493.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
most spark conf keys are case sensitive, including `spark.blockManager.port`, we can not get the correct port number with `spark.blockmanager.port`.
This PR changes the wrong key to `spark.blockManager.port` in `BasicExecutorFeatureStep`.
This PR also ensures a fast fail when the port value is invalid for executor containers. When 0 is specified(it is valid as random port, but invalid as a k8s request), it should not be put in the `containerPort` field of executor pod desc. We do not expect executor pods to continuously fail to create because of invalid requests.
### Why are the changes needed?
bugfix
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
new tests
Closes#32621 from yaooqinn/SPARK-35482.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Kubernetes supports marking secrets and config maps as immutable to gain performance.
https://kubernetes.io/docs/concepts/configuration/configmap/#configmap-immutablehttps://kubernetes.io/docs/concepts/configuration/secret/#secret-immutable
For K8s clusters that run many thousands of Spark applications, this can yield significant reduction in load on the kube-apiserver.
From the K8s docs:
> For clusters that extensively use Secrets (at least tens of thousands of unique Secret to Pod mounts), preventing changes to their data has the following advantages:
> - protects you from accidental (or unwanted) updates that could cause applications outages
> - improves performance of your cluster by significantly reducing load on kube-apiserver, by closing watches for secrets marked as immutable.
For any secrets and config maps we create in Spark that are immutable, we could mark them as immutable by including the following when building the secret/config map
```
.withImmutable(true)
```
This feature has been supported in K8s as beta since K8s 1.19 and as GA since K8s 1.21
### What changes were proposed in this pull request?
All K8s secrets and config maps created by Spark are marked "immutable".
### Why are the changes needed?
See description above.
### Does this PR introduce _any_ user-facing change?
Don't think so
### How was this patch tested?
Augmented existing unit tests.
Closes#32588 from ashrayjain/patch-1.
Authored-by: Ashray Jain <ashrayjain@users.noreply.github.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR aims to add a new configuration, `spark.kubernetes.driver.reusePersistentVolumeClaim`, to reuse driver-owned `PersistentVolumeClaims` of the **deleted** executor pods.
Note also that `driver-owned PersistentVolumeClaims` is controlled by `spark.kubernetes.driver.ownPersistentVolumeClaim` which is recently added.
### Why are the changes needed?
PVC creations take some times. This feature can reduce it by reusing it.
For example, we can start `Pi` app with two executors with PVCs.
```
$ k logs -f pi | grep ExecutorPodsAllocator
21/05/16 23:36:32 INFO ExecutorPodsAllocator: Going to request 2 executors from Kubernetes for ResourceProfile Id: 0, target: 2 running: 0.
21/05/16 23:36:32 INFO ExecutorPodsAllocator: Found 0 reusable PVCs from 0 PVCs
21/05/16 23:36:32 INFO ExecutorPodsAllocator: Trying to create PersistentVolumeClaim pi-exec-1-pvc-0 with StorageClass scaleio
21/05/16 23:36:33 INFO ExecutorPodsAllocator: Trying to create PersistentVolumeClaim pi-exec-2-pvc-0 with StorageClass scaleio
```
After killing one executor, Spark is trying to look up the reusable PVCs, but the dead-executor's PVC may not returned yet because K8s works asynchronously. In this case, Spark is trying to create a new PVC as a normal operation.
```
21/05/16 23:38:51 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes for ResourceProfile Id: 0, target: 2 running: 1.
21/05/16 23:38:51 INFO ExecutorPodsAllocator: Found 0 reusable PVCs from 2 PVCs
21/05/16 23:38:51 INFO ExecutorPodsAllocator: Trying to create PersistentVolumeClaim pi-exec-3-pvc-0 with StorageClass scaleio
```
After killing another executor, Spark found one reusable PVC, `pi-exec-1-pvc-0`, and reuse it.
```
21/05/16 23:39:18 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes for ResourceProfile Id: 0, target: 2 running: 1.
21/05/16 23:39:18 INFO ExecutorPodsAllocator: Found 1 reusable PVCs from 3 PVCs
21/05/16 23:39:18 INFO ExecutorPodsAllocator: Reuse PersistentVolumeClaim pi-exec-1-pvc-0
```
In this case, we can easily notice the remounted PVC because `ClaimName`, `pi-exec-1-pvc-0`, doesn't have the prefix of pod name, `pi-exec-4`.
```
$ k describe pod pi-exec-4 | grep pi-exec-1-pvc-0
ClaimName: pi-exec-1-pvc-0
```
### Does this PR introduce _any_ user-facing change?
Yes, but this is a new feature which is disabled by the new conf.
### How was this patch tested?
Pass the CIs with the newly added test case.
K8S IT test also passed.
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- Run SparkR on simple dataframe.R example
Run completed in 17 minutes, 7 seconds.
Total number of tests run: 26
Suites: completed 2, aborted 0
Tests: succeeded 26, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 24:14 min
[INFO] Finished at: 2021-05-16T17:24:40-07:00
[INFO] ------------------------------------------------------------------------
```
Closes#32564 from dongjoon-hyun/SPARK-35416.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Adds the exec loss reason to the Spark web UI & in doing so also fix the Kube integration to pass exec loss reason into core.
UI change:
![image](https://user-images.githubusercontent.com/59893/117045762-b975ba80-acc4-11eb-9679-8edab3cfadc2.png)
### Why are the changes needed?
Debugging Spark jobs is *hard*, making it clearer why executors have exited could help.
### Does this PR introduce _any_ user-facing change?
Yes a new column on the executor page.
### How was this patch tested?
K8s unit test updated to validate exec loss reasons are passed through regardless of exec alive state, manual testing to validate the UI.
Closes#32436 from holdenk/SPARK-34764-propegate-reason-for-exec-loss.
Lead-authored-by: Holden Karau <hkarau@apple.com>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Signed-off-by: Holden Karau <hkarau@apple.com>
### What changes were proposed in this pull request?
This PR aims to unify two K8s version variables in two `pom.xml`s into one. `kubernetes-client.version` is correct because the artifact ID is `kubernetes-client`.
```
kubernetes.client.version (kubernetes/core module)
kubernetes-client.version (kubernetes/integration-test module)
```
### Why are the changes needed?
Having two variables for the same value is confusing and inconvenient when we upgrade K8s versions.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs. (The compilation test passes are enough.)
Closes#32531 from dongjoon-hyun/SPARK-35394.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR upgrades Kubernetes and Minikube version for integration tests and removes/updates the old code for this new version.
Details of this changes:
- As [discussed in the mailing list](http://apache-spark-developers-list.1001551.n3.nabble.com/minikube-and-kubernetes-cluster-versions-for-integration-testing-td30856.html): updating Minikube version from v0.34.1 to v1.7.3 and kubernetes version from v1.15.12 to v1.17.3.
- making Minikube version checked and fail with an explanation when the test is started with on a version < v1.7.3.
- removing minikube status checking code related to old Minikube versions
- in the Minikube backend using fabric8's `Config.autoConfigure()` method to configure the kubernetes client to use the `minikube` k8s context (like it was in [one of the Minikube's example](https://github.com/fabric8io/kubernetes-client/blob/master/kubernetes-examples/src/main/java/io/fabric8/kubernetes/examples/kubectl/equivalents/ConfigUseContext.java#L36))
- Introducing `persistentVolume` test tag: this would be a temporary change to skip PVC tests in the Kubernetes integration test, as currently the PCV tests are blocking the move to Docker as Minikube's driver (for details please check https://issues.apache.org/jira/browse/SPARK-34738).
### Why are the changes needed?
With the current suggestion one can run into several problems without noticing the Minikube/kubernetes version is the problem.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
It was tested on Mac with [this script](https://gist.github.com/attilapiros/cd58a16bdde833c80c5803c337fffa94#file-check_minikube_versions-zsh) which installs each Minikube versions from v1.7.2 (including this version to test the negative case of the version check) and runs the integration tests.
It was started with:
```
./check_minikube_versions.zsh > test_log 2>&1
```
And there was only one build failure the rest was successful:
```
$ grep "BUILD SUCCESS" test_log | wc -l
26
$ grep "BUILD FAILURE" test_log | wc -l
1
```
It was for Minikube v1.7.2 and the log is:
```
KubernetesSuite:
*** RUN ABORTED ***
java.lang.AssertionError: assertion failed: Unsupported Minikube version is detected: minikube version: v1.7.2.For integration testing Minikube version 1.7.3 or greater is expected.
at scala.Predef$.assert(Predef.scala:223)
at org.apache.spark.deploy.k8s.integrationtest.backend.minikube.Minikube$.getKubernetesClient(Minikube.scala:52)
at org.apache.spark.deploy.k8s.integrationtest.backend.minikube.MinikubeTestBackend$.initialize(MinikubeTestBackend.scala:33)
at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.beforeAll(KubernetesSuite.scala:163)
at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
at org.apache.spark.deploy.k8s.integrationtest.KubernetesSuite.org$scalatest$BeforeAndAfter$$super$run(KubernetesSuite.scala:43)
at org.scalatest.BeforeAndAfter.run(BeforeAndAfter.scala:273)
at org.scalatest.BeforeAndAfter.run$(BeforeAndAfter.scala:271)
...
```
Moreover I made a test with having multiple k8s cluster contexts, too.
Closes#31829 from attilapiros/SPARK-34736.
Lead-authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Co-authored-by: attilapiros <piros.attila.zsolt@gmail.com>
Signed-off-by: attilapiros <piros.attila.zsolt@gmail.com>
### What changes were proposed in this pull request?
This PR aims to upgrade K8s client to 5.3.1.
### Why are the changes needed?
This will bring the latest bug fixes.
- https://github.com/fabric8io/kubernetes-client/releases/tag/v5.3.1
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs.
K8s IT is manually tested like the following.
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- Run SparkR on simple dataframe.R example
Run completed in 18 minutes, 33 seconds.
Total number of tests run: 27
Suites: completed 2, aborted 0
Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Spark Project Parent POM 3.2.0-SNAPSHOT:
[INFO]
[INFO] Spark Project Parent POM ........................... SUCCESS [ 3.959 s]
[INFO] Spark Project Tags ................................. SUCCESS [ 7.830 s]
[INFO] Spark Project Local DB ............................. SUCCESS [ 3.457 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 5.496 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 3.239 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 9.006 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 2.422 s]
[INFO] Spark Project Core ................................. SUCCESS [02:17 min]
[INFO] Spark Project Kubernetes Integration Tests ......... SUCCESS [21:05 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 23:59 min
[INFO] Finished at: 2021-05-05T11:59:19-07:00
[INFO] ------------------------------------------------------------------------
```
Closes#32443 from dongjoon-hyun/SPARK-35319.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Since SPARK-22757, `KubernetesUtils` has been used as an important utility class by all K8s modules and `ExternalClusterManager`s. This PR aims to promote `KubernetesUtils` to `DeveloperApi` in order to maintain it officially in a backward compatible way at Apache Spark 3.2.0.
### Why are the changes needed?
Apache Spark 3.1.1 makes `Kubernetes` module GA and provides an extensible external cluster manager framework. To have `ExternalClusterManager` for K8s environment, `KubernetesUtils` class is crucial and needs to be stable. By promoting to a subset of K8s developer API, we can maintain these more sustainable way and give a better and stable functionality to K8s users.
In this PR, `Since` annotations denote the last function signature changes because these are going to become public at Apache Spark 3.2.0.
| Version | Function Name |
|-|-|
| 2.3.0 | parsePrefixedKeyValuePairs |
| 2.3.0 | requireNandDefined |
| 2.3.0 | parsePrefixedKeyValuePairs |
| 2.4.0 | parseMasterUrl |
| 3.0.0 | requireBothOrNeitherDefined |
| 3.0.0 | requireSecondIfFirstIsDefined |
| 3.0.0 | selectSparkContainer |
| 3.0.0 | formatPairsBundle |
| 3.0.0 | formatPodState |
| 3.0.0 | containersDescription |
| 3.0.0 | containerStatusDescription |
| 3.0.0 | formatTime |
| 3.0.0 | uniqueID |
| 3.0.0 | buildResourcesQuantities |
| 3.0.0 | uploadAndTransformFileUris |
| 3.0.0 | uploadFileUri |
| 3.0.0 | requireBothOrNeitherDefined |
| 3.0.0 | buildPodWithServiceAccount |
| 3.0.0 | isLocalAndResolvable |
| 3.1.1 | renameMainAppResource |
| 3.1.1 | addOwnerReference |
| 3.2.0 | loadPodFromTemplate |
### Does this PR introduce _any_ user-facing change?
Yes, but this is new API additions.
### How was this patch tested?
Pass the CIs.
Closes#32406 from dongjoon-hyun/SPARK-35280.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR aims to support driver-owned on-demand PVC(Persistent Volume Claim)s. It means dynamically-created PVCs will have the `ownerReference` to `driver` pod instead of `executor` pod.
### Why are the changes needed?
This allows K8s backend scheduler can reuse this later.
**BEFORE**
```
$ k get pvc tpcds-pvc-exec-1-pvc-0 -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
...
ownerReferences:
- apiVersion: v1
controller: true
kind: Pod
name: tpcds-pvc-exec-1
```
**AFTER**
```
$ k get pvc tpcds-pvc-exec-1-pvc-0 -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
...
ownerReferences:
- apiVersion: v1
controller: true
kind: Pod
name: tpcds-pvc
```
### Does this PR introduce _any_ user-facing change?
No. (The default is `false`)
### How was this patch tested?
Manually check the above and pass K8s IT.
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- Run SparkR on simple dataframe.R example
Run completed in 16 minutes, 40 seconds.
Total number of tests run: 27
Suites: completed 2, aborted 0
Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```
Closes#32288 from dongjoon-hyun/SPARK-35182.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
In YARN, ship the `spark.jars.ivySettings` file to the driver when using `cluster` deploy mode so that `addJar` is able to find it in order to resolve ivy paths.
### Why are the changes needed?
SPARK-33084 introduced support for Ivy paths in `sc.addJar` or Spark SQL `ADD JAR`. If we use a custom ivySettings file using `spark.jars.ivySettings`, it is loaded at b26e7b510b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala (L1280). However, this file is only accessible on the client machine. In YARN cluster mode, this file is not available on the driver and so `addJar` fails to find it.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit tests to verify that the `ivySettings` file is localized by the YARN client and that a YARN cluster mode application is able to find to load the `ivySettings` file.
Closes#31591 from shardulm94/SPARK-34472.
Authored-by: Shardul Mahadik <smahadik@linkedin.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
### What changes were proposed in this pull request?
On Running Spark job with yarn and deployment mode as client, Spark Driver and Spark Application master launch in two separate containers. In various scenarios there is need to see Spark Application master logs to see the resource allocation, Decommissioning status and other information shared between yarn RM and Spark Application master.
In Cluster mode Spark driver and Spark AM is on same container, So Log link of the driver already there to see the logs in Spark UI
This PR is for adding the spark AM log link for spark job running in the client mode for yarn. Instead of searching the container id and then find the logs. We can directly check in the Spark UI
This change is only for showing the AM log links in the Client mode when resource manager is yarn.
### Why are the changes needed?
Till now the only way to check this by finding the container id of the AM and check the logs either using Yarn utility or Yarn RM Application History server.
This PR is for adding the spark AM log link for spark job running in the client mode for yarn. Instead of searching the container id and then find the logs. We can directly check in the Spark UI
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added the unit test also checked the Spark UI
**In Yarn Client mode**
Before Change
![image](https://user-images.githubusercontent.com/34540906/112644861-e1733200-8e6b-11eb-939b-c76ca9902a4e.png)
After the Change - The AM info is there
![image](https://user-images.githubusercontent.com/34540906/115264198-b7075280-a153-11eb-98f3-2aed66ffad2a.png)
AM Log
![image](https://user-images.githubusercontent.com/34540906/112645680-c0f7a780-8e6c-11eb-8b82-4ccc0aee927b.png)
**In Yarn Cluster Mode** - The AM log link will not be there
![image](https://user-images.githubusercontent.com/34540906/112649512-86900980-8e70-11eb-9b37-69d5c4b53ffa.png)
Closes#31974 from SaurabhChawla100/SPARK-34877.
Authored-by: SaurabhChawla <s.saurabhtim@gmail.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
### What changes were proposed in this pull request?
This PR aims to support a new configuration, `spark.kubernetes.driver.service.deleteOnTermination`, to clean up `Driver Service` resource during app termination.
### Why are the changes needed?
The K8s service is one of the important resources and sometimes it's controlled by quota.
```
$ k describe quota
Name: service
Namespace: default
Resource Used Hard
-------- ---- ----
services 1 3
```
Apache Spark creates a service for driver whose lifecycle is the same with driver pod.
It means a new Spark job submission fails if the number of completed Spark jobs equals the number of service quota.
**BEFORE**
```
$ k get pod
NAME READY STATUS RESTARTS AGE
org-apache-spark-examples-sparkpi-a32c9278e7061b4d-driver 0/1 Completed 0 31m
org-apache-spark-examples-sparkpi-a9f1f578e721ef62-driver 0/1 Completed 0 78s
$ k get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 80m
org-apache-spark-examples-sparkpi-a32c9278e7061b4d-driver-svc ClusterIP None <none> 7078/TCP,7079/TCP,4040/TCP 31m
org-apache-spark-examples-sparkpi-a9f1f578e721ef62-driver-svc ClusterIP None <none> 7078/TCP,7079/TCP,4040/TCP 80s
$ k describe quota
Name: service
Namespace: default
Resource Used Hard
-------- ---- ----
services 3 3
$ bin/spark-submit...
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException:
Failure executing: POST at: https://192.168.64.50:8443/api/v1/namespaces/default/services.
Message: Forbidden! User minikube doesn't have permission.
services "org-apache-spark-examples-sparkpi-843f6978e722819c-driver-svc" is forbidden:
exceeded quota: service, requested: services=1, used: services=3, limited: services=3.
```
**AFTER**
```
$ k get pod
NAME READY STATUS RESTARTS AGE
org-apache-spark-examples-sparkpi-23d5f278e77731a7-driver 0/1 Completed 0 26s
org-apache-spark-examples-sparkpi-d1292278e7768ed4-driver 0/1 Completed 0 67s
org-apache-spark-examples-sparkpi-e5bedf78e776ea9d-driver 0/1 Completed 0 44s
$ k get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 172m
$ k describe quota
Name: service
Namespace: default
Resource Used Hard
-------- ---- ----
services 1 3
```
### Does this PR introduce _any_ user-facing change?
Yes, this PR adds a new configuration, `spark.kubernetes.driver.service.deleteOnTermination`, and enables it by default.
The change is documented at the migration guide.
### How was this patch tested?
Pass the CIs.
This is tested with K8s IT manually.
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- Run SparkR on simple dataframe.R example
Run completed in 19 minutes, 9 seconds.
Total number of tests run: 27
Suites: completed 2, aborted 0
Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```
Closes#32226 from dongjoon-hyun/SPARK-35131.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Although AS-IS master branch already works with K8s 1.20, this PR aims to upgrade K8s client to 5.3.0 to support K8s 1.20 officially.
- https://github.com/fabric8io/kubernetes-client#compatibility-matrix
The following are the notable breaking API changes.
1. Remove Doneable (5.0+):
- https://github.com/fabric8io/kubernetes-client/pull/2571
2. Change Watcher.onClose signature (5.0+):
- https://github.com/fabric8io/kubernetes-client/pull/2616
3. Change Readiness (5.1+)
- https://github.com/fabric8io/kubernetes-client/pull/2796
### Why are the changes needed?
According to the compatibility matrix, this makes Apache Spark and its external cluster manager extension support all K8s 1.20 features officially for Apache Spark 3.2.0.
### Does this PR introduce _any_ user-facing change?
Yes, this is a dev dependency change which affects K8s cluster extension users.
### How was this patch tested?
Pass the CIs.
This is manually tested with K8s IT.
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- Run SparkR on simple dataframe.R example
Run completed in 17 minutes, 44 seconds.
Total number of tests run: 27
Suites: completed 2, aborted 0
Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```
Closes#32221 from dongjoon-hyun/SPARK-K8S-530.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Introducing a new test construct:
```
withHttpServer() { baseURL =>
...
}
```
Which starts and stops a Jetty server to serve files via HTTP.
Moreover this PR uses this new construct in the test `Run SparkRemoteFileTest using a remote data file`.
### Why are the changes needed?
Before this PR github URLs was used like "https://raw.githubusercontent.com/apache/spark/master/data/mllib/pagerank_data.txt".
This connects two Spark version in an unhealthy way like connecting the "master" branch which is moving part with the committed test code which is a non-moving (as it might be even released).
So this way a test running for an earlier version of Spark expects something (filename, content, path) from a the latter release and what is worse when the moving version is changed the earlier test will break.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing unit test.
Closes#31935 from attilapiros/SPARK-34789.
Authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR fixes two tests below:
https://github.com/apache/spark/runs/2320161984
```
[info] YarnShuffleIntegrationSuite:
[info] org.apache.spark.deploy.yarn.YarnShuffleIntegrationSuite *** ABORTED *** (228 milliseconds)
[info] org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:373)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:128)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:503)
[info] at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
[info] at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:322)
[info] at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
[info] at org.apache.spark.deploy.yarn.BaseYarnClusterSuite.beforeAll(BaseYarnClusterSuite.scala:95)
...
[info] Cause: java.net.BindException: Port in use: fv-az186-831:0
[info] at org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1231)
[info] at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1253)
[info] at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1316)
[info] at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1167)
[info] at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:449)
[info] at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startWepApp(ResourceManager.java:1247)
[info] at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceStart(ResourceManager.java:1356)
[info] at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.startResourceManager(MiniYARNCluster.java:365)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.access$300(MiniYARNCluster.java:128)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster$ResourceManagerWrapper.serviceStart(MiniYARNCluster.java:503)
[info] at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
[info] at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
[info] at org.apache.hadoop.yarn.server.MiniYARNCluster.serviceStart(MiniYARNCluster.java:322)
[info] at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
[info] at org.apache.spark.deploy.yarn.BaseYarnClusterSuite.beforeAll(BaseYarnClusterSuite.scala:95)
[info] at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)
[info] at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
[info] at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
[info] at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:61)
...
```
https://github.com/apache/spark/runs/2323342094
```
[info] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testBadSecret started
[error] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testBadSecret failed: java.lang.AssertionError: Connecting to /10.1.0.161:39895 timed out (120000 ms), took 120.081 sec
[error] at org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testBadSecret(ExternalShuffleSecuritySuite.java:85)
[error] ...
[info] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testBadAppId started
[error] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testBadAppId failed: java.lang.AssertionError: Connecting to /10.1.0.198:44633 timed out (120000 ms), took 120.08 sec
[error] at org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testBadAppId(ExternalShuffleSecuritySuite.java:76)
[error] ...
[info] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testValid started
[error] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testValid failed: java.io.IOException: Connecting to /10.1.0.119:43575 timed out (120000 ms), took 120.089 sec
[error] at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:285)
[error] at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
[error] at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
[error] at org.apache.spark.network.shuffle.ExternalBlockStoreClient.registerWithShuffleServer(ExternalBlockStoreClient.java:211)
[error] at org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.validate(ExternalShuffleSecuritySuite.java:108)
[error] at org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testValid(ExternalShuffleSecuritySuite.java:68)
[error] ...
[info] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testEncryption started
[error] Test org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testEncryption failed: java.io.IOException: Connecting to /10.1.0.248:35271 timed out (120000 ms), took 120.014 sec
[error] at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:285)
[error] at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
[error] at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
[error] at org.apache.spark.network.shuffle.ExternalBlockStoreClient.registerWithShuffleServer(ExternalBlockStoreClient.java:211)
[error] at org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.validate(ExternalShuffleSecuritySuite.java:108)
[error] at org.apache.spark.network.shuffle.ExternalShuffleSecuritySuite.testEncryption(ExternalShu
```
For Yarn cluster suites, its difficult to fix. This PR makes it skipped if it fails to bind.
For shuffle related suites, it uses local host
### Why are the changes needed?
To make the tests stable
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
Its tested in GitHub Actions: https://github.com/HyukjinKwon/spark/runs/2340210765Closes#32126 from HyukjinKwon/SPARK-35002-followup.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Yuming Wang <yumwang@ebay.com>
### What changes were proposed in this pull request?
This PR aims to add `ownerReference` to the executor ConfigMap to fix leakage.
### Why are the changes needed?
SPARK-30985 maintains the executor config map explicitly inside Spark. However, this config map can be leaked when Spark drivers die accidentally or are killed by K8s. We need to add `ownerReference` to make K8s do the garbage collection these automatically.
The number of ConfigMap is one of the resource quota. So, the leaked configMaps currently cause Spark jobs submission failures.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs and check manually.
K8s IT is tested manually.
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- SPARK-33615: Launcher client archives
- SPARK-33748: Launcher python client respecting PYSPARK_PYTHON
- SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python
- Launcher python client dependencies using a zip file
- Test basic decommissioning
- Test basic decommissioning with shuffle cleanup
- Test decommissioning with dynamic allocation & shuffle cleanups
- Test decommissioning timeouts
- Run SparkR on simple dataframe.R example
Run completed in 19 minutes, 2 seconds.
Total number of tests run: 27
Suites: completed 2, aborted 0
Tests: succeeded 27, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```
**BEFORE**
```
$ k get cm spark-exec-450b417895b3b2c7-conf-map -oyaml | grep ownerReferences
```
**AFTER**
```
$ k get cm spark-exec-bb37a27895b1c26c-conf-map -oyaml | grep ownerReferences
f:ownerReferences:
```
Closes#32042 from dongjoon-hyun/SPARK-34948.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Add a new config, `spark.shuffle.service.name`, which allows for Spark applications to look for a YARN shuffle service which is defined at a name other than the default `spark_shuffle`.
Add a new config, `spark.yarn.shuffle.service.metrics.namespace`, which allows for configuring the namespace used when emitting metrics from the shuffle service into the NodeManager's `metrics2` system.
Add a new mechanism by which to override shuffle service configurations independently of the configurations in the NodeManager. When a resource `spark-shuffle-site.xml` is present on the classpath of the shuffle service, the configs present within it will be used to override the configs coming from `yarn-site.xml` (via the NodeManager).
### Why are the changes needed?
There are two use cases which can benefit from these changes.
One use case is to run multiple instances of the shuffle service side-by-side in the same NodeManager. This can be helpful, for example, when running a YARN cluster with a mixed workload of applications running multiple Spark versions, since a given version of the shuffle service is not always compatible with other versions of Spark (e.g. see SPARK-27780). With this PR, it is possible to run two shuffle services like `spark_shuffle` and `spark_shuffle_3.2.0`, one of which is "legacy" and one of which is for new applications. This is possible because YARN versions since 2.9.0 support the ability to run shuffle services within an isolated classloader (see YARN-4577), meaning multiple Spark versions can coexist.
Besides this, the separation of shuffle service configs into `spark-shuffle-site.xml` can be useful for administrators who want to change and/or deploy Spark shuffle service configurations independently of the configurations for the NodeManager (e.g., perhaps they are owned by two different teams).
### Does this PR introduce _any_ user-facing change?
Yes. There are two new configurations related to the external shuffle service, and a new mechanism which can optionally be used to configure the shuffle service. `docs/running-on-yarn.md` has been updated to provide user instructions; please see this guide for more details.
### How was this patch tested?
In addition to the new unit tests added, I have deployed this to a live YARN cluster and successfully deployed two Spark shuffle services simultaneously, one running a modified version of Spark 2.3.0 (which supports some of the newer shuffle protocols) and one running Spark 3.1.1. Spark applications of both versions are able to communicate with their respective shuffle services without issue.
Closes#31936 from xkrogen/xkrogen-SPARK-34828-shufflecompat-config-from-classpath.
Authored-by: Erik Krogen <xkrogen@apache.org>
Signed-off-by: Thomas Graves <tgraves@apache.org>
### What changes were proposed in this pull request?
Extending "EXTRA LOGS FOR THE FAILED TEST" section of k8s integration test log with `kubectl describe pods` output for the failed test.
### Why are the changes needed?
PR builds frequently fails as the k8s integration tests are very flaky now in Amplab Jenkins environment.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Locally by making temporary one of the test fail. The output is:
```
21/03/25 16:55:16.722 ScalaTest-main-running-KubernetesSuite INFO KubernetesSuite:
===== EXTRA LOGS FOR THE FAILED TEST
21/03/25 16:55:17.167 ScalaTest-main-running-KubernetesSuite INFO KubernetesSuite: BEGIN driver DESCRIBE POD
Name: spark-test-app-a2b03971b7c049e8a2629f6a3198842b
Namespace: 35bdb17e308743afaec17538f89a7c3e
Priority: 0
Node: minikube/192.168.64.119
Start Time: Thu, 25 Mar 2021 16:52:10 +0100
Labels: spark-app-locator=75f695685ae44314a99ec13bb39332bc
spark-app-selector=spark-150230742d364a77927a08eed0222065
spark-role=driver
Annotations: <none>
Status: Succeeded
IP: 172.17.0.4
Containers:
spark-kubernetes-driver:
Container ID: docker://d6d27b0551060d9b094f12d1e232dfb5ae78ce38559680c7126c548996da4d95
Image: docker.io/kubespark/spark:3.2.0-SNAPSHOT_9575B805-9CB0-4A16-8A31-AA2F8DDA8EE5
Image ID: docker://sha256:3fc556c73a0d5187b5a14dbdc2f69ef292e60b544b4b4d3715f6749417c20918
Ports: 7078/TCP, 7079/TCP, 4040/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Args:
driver
--properties-file
/opt/spark/conf/spark.properties
--class
org.apache.spark.examples.SparkPi
local:///opt/spark/examples/jars/spark-examples_2.12-3.2.0-SNAPSHOT.jar
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 25 Mar 2021 16:52:11 +0100
Finished: Thu, 25 Mar 2021 16:52:20 +0100
Ready: False
Restart Count: 0
Limits:
memory: 1408Mi
Requests:
cpu: 1
memory: 1408Mi
Environment:
SPARK_USER: attilazsoltpiros
SPARK_APPLICATION_ID: spark-150230742d364a77927a08eed0222065
SPARK_DRIVER_BIND_ADDRESS: (v1:status.podIP)
SPARK_LOCAL_DIRS: /var/data/spark-dab6f1c9-e538-40c8-a7d9-3e88f9b82cfa
SPARK_CONF_DIR: /opt/spark/conf
Mounts:
/opt/spark/conf from spark-conf-volume-driver (rw)
/var/data/spark-dab6f1c9-e538-40c8-a7d9-3e88f9b82cfa from spark-local-dir-1 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-nmfwl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
spark-local-dir-1:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
spark-conf-volume-driver:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: spark-drv-c60832786a15ffbe-conf-map
Optional: false
default-token-nmfwl:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-nmfwl
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m7s default-scheduler Successfully assigned 35bdb17e308743afaec17538f89a7c3e/spark-test-app-a2b03971b7c049e8a2629f6a3198842b to minikube
Normal Pulled 3m7s kubelet, minikube Container image "docker.io/kubespark/spark:3.2.0-SNAPSHOT_9575B805-9CB0-4A16-8A31-AA2F8DDA8EE5" already present on machine
Normal Created 3m7s kubelet, minikube Created container spark-kubernetes-driver
Normal Started 3m6s kubelet, minikube Started container spark-kubernetes-driver
21/03/25 16:55:17.168 ScalaTest-main-running-KubernetesSuite INFO KubernetesSuite: END driver DESCRIBE POD
```
Closes#31962 from attilapiros/SPARK-34869.
Authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Support submit to k8s only with token.
### Why are the changes needed?
Now, sumbit to k8s always need oauth files.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Before, submit job out of k8s cluster without correct ca.crt, we may get this exception:
```
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:439)
at sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:306)
at sun.security.validator.Validator.validate(Validator.java:271)
at sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:312)
```
When set spark.kubernetes.trust.certificates = true, we can submit only with correct token, no need to config ca.crt in local env.
Submit as:
```
bin/spark-submit \
--master $master \
--name pi \
--deploy-mode cluster \
--conf spark.kubernetes.container.image=$image \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
--conf spark.kubernetes.authenticate.submission.oauthToken=$clusterToken \
--conf spark.kubernetes.trust.certificates=true \
local:///opt/spark/examples/src/main/python/pi.py 200
```
Closes#30684 from hddong/trust-certs.
Authored-by: hongdongdong <hongdongdong@cmss.chinamobile.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>