## What changes were proposed in this pull request?
Because of an API newly added in Hadoop 2.6.4+, Spark builds against Hadoop 2.6.0 to 2.6.3 fail with a compile error. This change reverts to using reflection to call that API so that those builds compile again.
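As a rough illustration of the reflection approach (not the code in this patch), the sketch below looks a method up by name at runtime and calls it only if the class on the classpath actually has it. The object name `ReflectiveCall` and the helper `invokeIfPresent` are hypothetical, and the example uses a JDK class rather than a Hadoop one so that it stays self-contained.

```scala
import scala.util.Try

// Hypothetical helper, for illustration only; not part of Spark or Hadoop.
object ReflectiveCall {
  /**
   * Invoke a public no-arg method by name if the runtime class has it.
   * Returns None when the method does not exist, so the caller still
   * compiles and runs against versions that lack the method.
   */
  def invokeIfPresent(target: AnyRef, methodName: String): Option[AnyRef] =
    Try(target.getClass.getMethod(methodName)).toOption.map(_.invoke(target))
}
```

Because the method is resolved by name, the compiler never sees a direct reference to an API that may be missing in older versions; the cost is that a misspelled name is only detected at runtime.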
## How was this patch tested?
Manual verification.
Author: jerryshao <sshao@hortonworks.com>
Closes #16884 from jerryshao/SPARK-19545.
Add back mockito test dep in YARN module, as it ends up being required in a Maven build
## How was this patch tested?
PR builder again, but also a local `mvn` run using the command that the broken Jenkins job uses
Author: Sean Owen <sowen@cloudera.com>
Closes #16853 from srowen/SPARK-19464.2.
## What changes were proposed in this pull request?
- Remove support for Hadoop 2.5 and earlier
- Remove reflection and code constructs only needed to support multiple versions at once
- Update docs to reflect newer versions
- Remove older versions' builds and profiles.
## How was this patch tested?
Existing tests
Author: Sean Owen <sowen@cloudera.com>
Closes #16810 from srowen/SPARK-19464.
That method is prone to stack overflows when the input map is really
large; instead, use plain "map". Also includes a unit test that was
tested and caused stack overflows without the fix.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #16667 from vanzin/SPARK-18750.
## What changes were proposed in this pull request?
Remove unused imports and outdated comments, and fix some minor code style issues.
## How was this patch tested?
Existing unit tests.
Author: uncleGen <hustyugm@gmail.com>
Closes #16591 from uncleGen/SPARK-19227.
## What changes were proposed in this pull request?
The name `spark.yarn.access.namenodes` does not reflect how the configuration is actually used: in the code, tokens are obtained for Hadoop filesystems, not for NameNodes. This change renames the configuration and updates the related code and docs.
## How was this patch tested?
Local verification.
Author: jerryshao <sshao@hortonworks.com>
Closes #16560 from jerryshao/SPARK-19179.
## What changes were proposed in this pull request?
#16092 moved the YARN resource manager related code to the `resource-managers/yarn` directory. The test suite `YarnSchedulerBackendSuite` was added after that, but in the wrong place. This PR moves it to the correct directory.
## How was this patch tested?
Existing test.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes #16595 from yanboliang/yarn.
Currently Spark can only get the token renewal interval from secure HDFS (hdfs://). If Spark runs with other secure file systems such as webHDFS (webhdfs://), wasb (wasb://), or ADLS, it ignores their tokens and does not get renewal intervals from them. This makes Spark unable to work with these secure clusters. So instead of only checking the HDFS token, we should generalize to support different DelegationTokenIdentifier implementations.
## How was this patch tested?
Manually verified in a secure cluster.
Author: jerryshao <sshao@hortonworks.com>
Closes #16432 from jerryshao/SPARK-19021.
## What changes were proposed in this pull request?
There are many locations in the Spark repo where the same word occurs consecutively. Sometimes they are appropriately placed, but many times they are not. This PR removes the inappropriately duplicated words.
## How was this patch tested?
N/A since only docs or comments were updated.
Author: Niranjan Padmanabhan <niranjan.padmanabhan@gmail.com>
Closes #16455 from neurons/np.structure_streaming_doc.
## What changes were proposed in this pull request?
LauncherState should be only set to SUBMITTED after the application is submitted.
Currently the state is set before the application is actually submitted.
## How was this patch tested?
No test was added in this patch.
Author: mingfei <mingfei.smf@alipay.com>
Closes #16459 from shimingfei/fixLauncher.
## What changes were proposed in this pull request?
The configuration `spark.yarn.security.tokens.{service}.enabled` is deprecated; `spark.yarn.security.credentials.{service}.enabled` should be used instead. Some places in the docs had not been updated yet, and this change updates them.
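To make the relationship between the two keys concrete, here is a minimal sketch of a lookup that prefers the current key and falls back to the deprecated one. It is not Spark's implementation: the object `CredentialProviderConf`, the plain `Map` used in place of a Spark configuration object, the precedence order, and the default of `true` are all assumptions made for illustration.

```scala
// Hypothetical helper; the key templates come from the text above,
// everything else is an assumption for illustration.
object CredentialProviderConf {
  // "%s" stands in for {service}.
  private val CurrentKey = "spark.yarn.security.credentials.%s.enabled"
  private val DeprecatedKey = "spark.yarn.security.tokens.%s.enabled"

  /** Prefer the current key, then the deprecated key, then the assumed default. */
  def isEnabled(
      conf: Map[String, String],
      service: String,
      default: Boolean = true): Boolean =
    conf.get(CurrentKey.format(service))
      .orElse(conf.get(DeprecatedKey.format(service)))
      .map(_.toBoolean)
      .getOrElse(default)
}
```

For example, `CredentialProviderConf.isEnabled(Map("spark.yarn.security.tokens.hbase.enabled" -> "false"), "hbase")` honors the deprecated key when the current one is absent.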
## How was this patch tested?
N/A. Just doc change.
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #16444 from viirya/minor-credential-provider-doc.
Remove spark-tags' compile-scope dependency (and, indirectly, spark-core's compile-scope transitive dependency) on scalatest by splitting test-oriented tags into spark-tags' test JAR.
Alternative to #16303.
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #16311 from ryan-williams/tt.
## What changes were proposed in this pull request?
Commit 93cdb8a7d0 introduced a compile error under Scala 2.10; this change fixes that error.
## How was this patch tested?
Ran locally:
```
dev/change-version-to-2.10.sh
build/sbt -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Dscala-2.10 "project yarn" "test-only *YarnAllocatorSuite"
```
(which failed at test compilation before this change)
Author: Imran Rashid <irashid@cloudera.com>
Closes #16298 from squito/blacklist-2.10.
## What changes were proposed in this pull request?
This builds upon the blacklisting introduced in SPARK-17675 to add blacklisting of executors and nodes for an entire Spark application. Resources are blacklisted based on tasks that fail in tasksets that eventually complete successfully; they are automatically returned to the pool of active resources after a timeout. Full details are available in a design doc attached to the JIRA.
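The timeout-based return of resources can be pictured with the simplified sketch below. It is not Spark's `BlacklistTracker`: the class `TimeoutBlacklist`, its methods, and the explicit `nowMs` parameter are hypothetical, and it tracks only executors, leaving out nodes and failure counting.

```scala
import scala.collection.mutable

/** Hypothetical, simplified tracker for illustration; not Spark's BlacklistTracker. */
class TimeoutBlacklist(timeoutMs: Long) {
  // executor id -> time (ms) at which its blacklisting expires
  private val expiry = mutable.Map.empty[String, Long]

  /** Blacklist an executor until nowMs + timeoutMs. */
  def blacklist(executorId: String, nowMs: Long): Unit =
    expiry(executorId) = nowMs + timeoutMs

  /** True while the executor's blacklisting has not yet expired. */
  def isBlacklisted(executorId: String, nowMs: Long): Boolean =
    expiry.get(executorId).exists(nowMs < _)

  /** Drop expired entries and return the executors put back into the active pool. */
  def releaseExpired(nowMs: Long): Set[String] = {
    val released = expiry.collect { case (id, until) if until <= nowMs => id }.toSet
    expiry --= released
    released
  }
}
```

Passing the current time in explicitly, rather than reading a clock inside the class, keeps the sketch deterministic and easy to exercise in a unit test.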
## How was this patch tested?
Added unit tests, ran them via Jenkins, also ran a handful of them in a loop to check for flakiness.
The added tests include:
- verifying BlacklistTracker works correctly
- verifying TaskSchedulerImpl interacts with BlacklistTracker correctly (via a mock BlacklistTracker)
- an integration test for the entire scheduler with blacklisting in a few different scenarios
Author: Imran Rashid <irashid@cloudera.com>
Author: mwws <wei.mao@intel.com>
Closes #14079 from squito/blacklist-SPARK-8425.
## What changes were proposed in this pull request?
Fix `java.util.NoSuchElementException` when running Spark in a secure environment without HDFS.
The current code assumes that an `HDFS_DELEGATION_KIND` token will be found in Credentials. But in some cloud environments HDFS is not required, so we should avoid this exception.
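The general shape of such a fix can be sketched as follows: look the token up as an `Option` and keep working with the `Option`, instead of assuming the token is present. This is only an illustration, not the patch itself; the `Token` case class, the `TokenLookup` object, and the kind string `"HDFS_DELEGATION_TOKEN"` are stand-ins assumed for the example.

```scala
// Hypothetical stand-ins for illustration; not Hadoop's or Spark's classes.
object TokenLookup {
  /** Minimal model of a delegation token: only the fields needed here. */
  final case class Token(kind: String, renewalIntervalMs: Long)

  // Assumed kind label for HDFS delegation tokens.
  val HdfsKind = "HDFS_DELEGATION_TOKEN"

  /** Returns None rather than throwing when no HDFS token is present. */
  def hdfsRenewalInterval(tokens: Seq[Token]): Option[Long] =
    tokens.find(_.kind == HdfsKind).map(_.renewalIntervalMs)
}
```

Calling `.get` on the result of `find` is what would raise `NoSuchElementException` on an empty match; returning the `Option` leaves the caller to decide what a missing HDFS token means.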
## How was this patch tested?
Manually verified in a local environment.
Author: jerryshao <sshao@hortonworks.com>
Closes #16265 from jerryshao/SPARK-18840.