dba314029b

Sent secondary jars to the distributed cache of all containers and added the cached jars to the classpath before executors start. Tested on a YARN cluster (CDH-5.0). `spark-submit --jars` also works in standalone server and `yarn-client`. Thanks to @andrewor14 for testing!

I removed "Doesn't work for drivers in standalone mode with "cluster" deploy mode." from `spark-submit`'s help message, though we haven't tested mesos yet.

CC: @dbtsai @sryza

Author: Xiangrui Meng <meng@databricks.com>

Closes #848 from mengxr/yarn-classpath and squashes the following commits:

- 23e7df4 [Xiangrui Meng] rename spark.jar to __spark__.jar and app.jar to __app__.jar to avoid conflicts; append $CWD/ and $CWD/* to the classpath; remove unused methods
- a40f6ed [Xiangrui Meng] standalone -> cluster
- 65e04ad [Xiangrui Meng] update spark-submit help message and add a comment for yarn-client
- 11e5354 [Xiangrui Meng] minor changes
- 3e7e1c4 [Xiangrui Meng] use sparkConf instead of hadoop conf
- dc3c825 [Xiangrui Meng] add secondary jars to classpath in yarn

| Name |
|---|
| alpha |
| common/src |
| stable |
| pom.xml |
| README.md |

YARN DIRECTORY LAYOUT
Hadoop YARN related code is organized in separate directories to minimize duplication.
- common : Common code that does not depend on a specific version of Hadoop.
- alpha / stable : Code that depends on a specific version of the Hadoop YARN API.
  alpha represents 0.23 and 2.0.x; stable represents 2.2 and later, until the API changes again.

Either alpha or stable is built together with the common dir into a single jar.
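
As a rough illustration of the version split described above, the sketch below maps a Hadoop version string to the directory that would be compiled alongside common. The function name `yarn_api_dir` is hypothetical and not part of Spark; the real selection is made by the build, not by code like this. Versions the layout does not mention (for example 2.1.x or 1.x) are rejected rather than guessed.

```python
def yarn_api_dir(hadoop_version: str) -> str:
    """Return 'alpha' or 'stable' for a Hadoop version string.

    Hypothetical helper for illustration only. It encodes the rule stated
    in this README: 0.23 and 2.0.x use the alpha directory, 2.2 and later
    use stable.
    """
    parts = hadoop_version.split(".")
    # Only the major and minor components matter for this rule, so vendor
    # suffixes such as "2.0.0-cdh4.2.0" are handled without extra parsing.
    major, minor = int(parts[0]), int(parts[1])

    if (major, minor) in ((0, 23), (2, 0)):
        return "alpha"
    if (major, minor) >= (2, 2):
        return "stable"
    # The README does not say which directory other versions belong to.
    raise ValueError(f"no YARN directory documented for Hadoop {hadoop_version}")


if __name__ == "__main__":
    for version in ("0.23.9", "2.0.0-cdh4.2.0", "2.2.0", "2.4.0"):
        print(version, "->", yarn_api_dir(version))
```

Treating "2.2 and later" as an open-ended range mirrors the README's wording; if the YARN API changes again, the upper bound would need to be revisited.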