spark-instrumented-optimizer/sbin
Timothy Chen 53befacced [SPARK-5338] [MESOS] Add cluster mode support for Mesos
This patch adds support for running Spark in cluster mode on Mesos.
It introduces a new Mesos framework dedicated to launching new apps/drivers. The framework is invoked through the spark-submit script by pointing the --master flag at the cluster mode REST interface instead of the Mesos master.

Example:
./bin/spark-submit --deploy-mode cluster --class org.apache.spark.examples.SparkPi --master mesos://10.0.0.206:8077 --executor-memory 1G --total-executor-cores 100 examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar 30
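
The two-step flow implied above (start the dispatcher, then submit to it) can be sketched roughly as follows. This is a dry run that only prints the commands. The Mesos master address is a hypothetical placeholder, the dispatcher address is the one from the example, and the --master option of start-mesos-dispatcher.sh is assumed from the script added in this patch.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.

MESOS_MASTER="mesos://10.0.0.1:5050"   # hypothetical Mesos master address
DISPATCHER="mesos://10.0.0.206:8077"   # dispatcher REST endpoint from the example

# Step 1: start the dispatcher and register it with the Mesos master.
# Assumption: the sbin script accepts --master.
dispatcher_cmd() {
  echo "./sbin/start-mesos-dispatcher.sh --master $1"
}

# Step 2: submit a driver. Note that --master points at the dispatcher,
# not at the Mesos master itself.
submit_cmd() {
  master=$1; class=$2; jar=$3
  shift 3
  echo "./bin/spark-submit --deploy-mode cluster --class $class --master $master $jar $*"
}

dispatcher_cmd "$MESOS_MASTER"
submit_cmd "$DISPATCHER" org.apache.spark.examples.SparkPi \
  examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar 30
```

Removing the echo wrappers would turn the printed lines into real invocations, assuming a Spark build with this patch and a reachable Mesos master.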

This patch also abstracts the StandaloneRestServer so that it can have different implementations of the REST endpoints.

Features of the cluster mode in this PR:
- Supports supervise mode, where the scheduler keeps trying to reschedule an exited job.
- Adds a new UI for the cluster mode scheduler that shows all running jobs, finished jobs, and supervised jobs waiting to be retried.
- Supports state persistence to ZooKeeper (ZK), so that when the cluster scheduler fails over it can pick up all queued and running jobs.
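
As a rough illustration of the supervise and ZK features listed above, the dry-run sketch below prints a dispatcher start command with ZooKeeper persistence and a supervised submission. The --zk option is an assumption based on the "Allow zk cli param override" commit, the ZooKeeper quorum and Mesos master addresses are hypothetical, and --supervise is the existing spark-submit flag for cluster deploy mode.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.

MESOS_MASTER="mesos://10.0.0.1:5050"   # hypothetical Mesos master address
ZK_URL="zk1:2181,zk2:2181"             # hypothetical ZooKeeper quorum
DISPATCHER="mesos://10.0.0.206:8077"   # dispatcher address from the example

# Dispatcher with state persistence, so queued and running drivers can be
# recovered after failover. Assumption: the script accepts --zk.
ha_dispatcher_cmd() {
  echo "./sbin/start-mesos-dispatcher.sh --master $1 --zk $2"
}

# Submission in supervise mode: the scheduler retries the driver if it exits.
supervised_submit_cmd() {
  echo "./bin/spark-submit --deploy-mode cluster --supervise --class $2 --master $1 $3"
}

ha_dispatcher_cmd "$MESOS_MASTER" "$ZK_URL"
supervised_submit_cmd "$DISPATCHER" org.apache.spark.examples.SparkPi \
  examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar
```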

Author: Timothy Chen <tnachen@gmail.com>
Author: Luc Bourlier <luc.bourlier@typesafe.com>

Closes #5144 from tnachen/mesos_cluster_mode and squashes the following commits:

069e946 [Timothy Chen] Fix rebase.
e24b512 [Timothy Chen] Persist submitted driver.
390c491 [Timothy Chen] Fix zk conf key for mesos zk engine.
e324ac1 [Timothy Chen] Fix merge.
fd5259d [Timothy Chen] Address review comments.
1553230 [Timothy Chen] Address review comments.
c6c6b73 [Timothy Chen] Pass spark properties to mesos cluster tasks.
f7d8046 [Timothy Chen] Change app name to spark cluster.
17f93a2 [Timothy Chen] Fix head of line blocking in scheduling drivers.
6ff8e5c [Timothy Chen] Address comments and add logging.
df355cd [Timothy Chen] Add metrics to mesos cluster scheduler.
20f7284 [Timothy Chen] Address review comments
7252612 [Timothy Chen] Fix tests.
a46ad66 [Timothy Chen] Allow zk cli param override.
920fc4b [Timothy Chen] Fix scala style issues.
862b5b5 [Timothy Chen] Support asking driver status when it's retrying.
7f214c2 [Timothy Chen] Fix RetryState visibility
e0f33f7 [Timothy Chen] Add supervise support and persist retries.
371ce65 [Timothy Chen] Handle cluster mode recovery and state persistence.
3d4dfa1 [Luc Bourlier] Adds support to kill submissions
febfaba [Timothy Chen] Bound the finished drivers in memory
543a98d [Timothy Chen] Schedule multiple jobs
6887e5e [Timothy Chen] Support looking at SPARK_EXECUTOR_URI env variable in schedulers
8ec76bc [Timothy Chen] Fix Mesos dispatcher UI.
d57d77d [Timothy Chen] Add documentation
825afa0 [Luc Bourlier] Supports more spark-submit parameters
b8e7181 [Luc Bourlier] Adds a shutdown latch to keep the daemon running
0fa7780 [Luc Bourlier] Launch task through the mesos scheduler
5b7a12b [Timothy Chen] WIP: Making a cluster mode a mesos framework.
4b2f5ef [Timothy Chen] Specify user jar in command to be replaced with local.
e775001 [Timothy Chen] Support fetching remote uris in driver runner.
7179495 [Timothy Chen] Change Driver page output and add logging
880bc27 [Timothy Chen] Add Mesos Cluster UI to display driver results
9986731 [Timothy Chen] Kill drivers when shutdown
67cbc18 [Timothy Chen] Rename StandaloneRestClient to RestClient and add sbin scripts
e3facdd [Timothy Chen] Add Mesos Cluster dispatcher
2015-04-28 13:33:57 -07:00
slaves.sh [SPARK-3584] sbin/slaves doesn't work when we use password authentication for SSH 2014-09-25 16:49:15 -07:00
spark-config.sh [SPARK-4076] Parameter expansion in spark-config is wrong 2014-10-24 13:04:35 -07:00
spark-daemon.sh [SPARK-6952] Handle long args when detecting PID reuse 2015-04-17 11:08:37 +01:00
spark-daemons.sh Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts 2014-01-02 17:55:21 +05:30
start-all.sh SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within. 2014-09-08 10:24:15 -07:00
start-history-server.sh SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within. 2014-09-08 10:24:15 -07:00
start-master.sh SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within. 2014-09-08 10:24:15 -07:00
start-mesos-dispatcher.sh [SPARK-5338] [MESOS] Add cluster mode support for Mesos 2015-04-28 13:33:57 -07:00
start-shuffle-service.sh [SPARK-4286] Add an external shuffle service that can be run as a daemon. 2015-04-28 12:08:18 -07:00
start-slave.sh [Spark-4848] Allow different Worker configurations in standalone cluster 2015-04-13 18:21:16 -07:00
start-slaves.sh [Spark-4848] Allow different Worker configurations in standalone cluster 2015-04-13 18:21:16 -07:00
start-thriftserver.sh [SPARK-4924] Add a library for launching Spark jobs programmatically. 2015-03-11 01:03:01 -07:00
stop-all.sh [Minor]fix the wrong description 2015-03-07 12:35:26 +00:00
stop-history-server.sh SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within. 2014-09-08 10:24:15 -07:00
stop-master.sh [Minor]fix the wrong description 2015-03-07 12:35:26 +00:00
stop-mesos-dispatcher.sh [SPARK-5338] [MESOS] Add cluster mode support for Mesos 2015-04-28 13:33:57 -07:00
stop-shuffle-service.sh [SPARK-4286] Add an external shuffle service that can be run as a daemon. 2015-04-28 12:08:18 -07:00
stop-slave.sh [Spark-4848] Allow different Worker configurations in standalone cluster 2015-04-13 18:21:16 -07:00
stop-slaves.sh [Spark-4848] Allow different Worker configurations in standalone cluster 2015-04-13 18:21:16 -07:00
stop-thriftserver.sh [SPARK-3658][SQL] Start thrift server as a daemon 2014-10-01 15:15:24 -07:00