spark-instrumented-optimizer/common
Thomas Graves a74ec6d7bb [SPARK-22218] spark shuffle services fails to update secret on app re-attempts
This patch fixes application re-attempts when running Spark on YARN using the external shuffle service with security on.  Currently executors fail to launch on any application re-attempt when launched on a NodeManager that had an executor from the first attempt.  This happens because we aren't updating the secret key after the first application attempt.  The fix is to remove the containsKey check that tests whether the key already exists. This way we always add the secret and make sure it's the most recent one.  Similarly, remove the containsKey check on the remove, since it's just an extra check that isn't really needed.

Note this worked before Spark 2.2 because the check used to be contains (which looks for the value) rather than containsKey, so it never matched and the new secret was always added.
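
To illustrate the difference described above, here is a minimal, self-contained sketch. It is not the patched Spark code: the class name SecretRegistrySketch and its method names are hypothetical, and the only assumption is that the secrets live in a ConcurrentHashMap keyed by application ID. It contrasts the containsKey guard, the older contains guard (which tests values, not keys), and the unconditional put.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a per-application secret store; not Spark's class.
class SecretRegistrySketch {
    private final ConcurrentHashMap<String, String> secrets = new ConcurrentHashMap<>();

    // Guarded by containsKey: a re-attempt with the same app ID keeps the stale secret.
    void registerGuarded(String appId, String secret) {
        if (!secrets.containsKey(appId)) {
            secrets.put(appId, secret);
        }
    }

    // Guarded by the legacy contains(), which checks VALUES. An app ID is
    // normally never stored as a value, so the guard does not match and the
    // secret is overwritten anyway.
    void registerLegacy(String appId, String secret) {
        if (!secrets.contains(appId)) {
            secrets.put(appId, secret);
        }
    }

    // Unconditional put: the most recent secret always wins.
    void registerAlways(String appId, String secret) {
        secrets.put(appId, secret);
    }

    // remove() is already a no-op for a missing key, so no containsKey guard is needed.
    void unregister(String appId) {
        secrets.remove(appId);
    }

    String secretFor(String appId) {
        return secrets.get(appId);
    }

    public static void main(String[] args) {
        SecretRegistrySketch guarded = new SecretRegistrySketch();
        guarded.registerGuarded("app_1", "secret-attempt-1");
        guarded.registerGuarded("app_1", "secret-attempt-2");
        System.out.println("containsKey guard: " + guarded.secretFor("app_1"));

        SecretRegistrySketch always = new SecretRegistrySketch();
        always.registerAlways("app_1", "secret-attempt-1");
        always.registerAlways("app_1", "secret-attempt-2");
        System.out.println("unconditional put: " + always.secretFor("app_1"));
    }
}
```

With the containsKey guard, the second registration is skipped and the first attempt's secret is kept, which is consistent with the SASL failures described in the test notes; with the unconditional put, the second attempt's secret replaces it.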

The patch was tested on a 10-node cluster, and a unit test was added.
The test run was a wordcount where the output directory already existed.  With the bug present, the application re-attempt failed after reaching the maximum number of executor failures, all of which were SaslExceptions.  With the fix present, the application re-attempts fail with "directory already exists", or, when you remove the directory between attempts, the re-attempts succeed.

Author: Thomas Graves <tgraves@unharmedunarmed.corp.ne1.yahoo.com>

Closes #19450 from tgravescs/SPARK-22218.
2017-10-09 12:56:37 -07:00
..
kvstore [SPARK-20642][CORE] Store FsHistoryProvider listing data in a KVStore. 2017-09-27 20:33:41 +08:00
network-common [SPARK-22066][BUILD] Update checkstyle to 8.2, enable it, fix violations 2017-09-20 10:01:46 +01:00
network-shuffle [SPARK-22218] spark shuffle services fails to update secret on app re-attempts 2017-10-09 12:56:37 -07:00
network-yarn [SPARK-17321][YARN] Avoid writing shuffle metadata to disk if NM recovery is disabled 2017-08-31 09:26:20 +08:00
sketch [SPARK-22066][BUILD] Update checkstyle to 8.2, enable it, fix violations 2017-09-20 10:01:46 +01:00
tags [SPARK-20453] Bump master branch version to 2.3.0-SNAPSHOT 2017-04-24 21:48:04 -07:00
unsafe [SPARK-22130][CORE] UTF8String.trim() scans " " twice 2017-09-27 23:19:10 +09:00