spark-instrumented-optimizer/resource-managers
DB Tsai 9b792518b2 [SPARK-31960][YARN][BUILD] Only populate Hadoop classpath for no-hadoop build
### What changes were proposed in this pull request?
If a Spark distribution has a built-in Hadoop runtime, Spark will not populate the Hadoop classpath from `yarn.application.classpath` and `mapreduce.application.classpath` when a job is submitted to YARN. Users can override this behavior by setting `spark.yarn.populateHadoopClasspath` to `true`.
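
The rule described above can be modeled as a small decision function. This is only an illustrative sketch, not the code in this patch: the function name, the `has_builtin_hadoop` flag, and the plain-dict configuration are hypothetical, and the boolean parsing is a simplifying assumption. Only the configuration key comes from the description.

```python
from typing import Mapping

# Key taken from the description above; everything else here is hypothetical.
POPULATE_KEY = "spark.yarn.populateHadoopClasspath"


def should_populate_hadoop_classpath(has_builtin_hadoop: bool,
                                     conf: Mapping[str, str]) -> bool:
    """Return True if the cluster's Hadoop classpath entries should be added."""
    explicit = conf.get(POPULATE_KEY)
    if explicit is not None:
        # An explicit user setting takes precedence over the build-derived default.
        # Assumption: any value other than "true" (case-insensitive) means False.
        return explicit.strip().lower() == "true"
    # Default: populate only when the distribution ships without Hadoop.
    return not has_builtin_hadoop


if __name__ == "__main__":
    print(should_populate_hadoop_classpath(True, {}))
    print(should_populate_hadoop_classpath(False, {}))
    print(should_populate_hadoop_classpath(True, {POPULATE_KEY: "true"}))
```

In this model a with-hadoop build skips the cluster classpath unless the user opts in, while a no-hadoop build keeps the previous behavior.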

### Why are the changes needed?
Without this, Spark populates the Hadoop classpath from `yarn.application.classpath` and `mapreduce.application.classpath` even when the Spark distribution has built-in Hadoop. This results in jar conflicts and many unexpected behaviors at runtime.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually tested with two builds: with-hadoop and no-hadoop.

Closes #28788 from dbtsai/yarn-classpath.

Authored-by: DB Tsai <d_tsai@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2020-06-18 06:08:40 +00:00
kubernetes [SPARK-31994][K8S] Docker image should use https urls for only deb.debian.org mirrors 2020-06-15 11:26:03 -07:00
mesos [SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling 2020-04-09 11:00:29 +00:00
yarn [SPARK-31960][YARN][BUILD] Only populate Hadoop classpath for no-hadoop build 2020-06-18 06:08:40 +00:00