This patch adds partial support for running spark on mesos inside of a docker container. Only fine-grained mode is presently supported, and there is no checking done to ensure that the version of libmesos is recent enough to have a DockerInfo structure in the protobuf (other than pinning a mesos version in the pom.xml).
Author: Chris Heller <hellertime@gmail.com>
Closes#3074 from hellertime/SPARK-2691 and squashes the following commits:
d504af6 [Chris Heller] Assist type inference
f64885d [Chris Heller] Fix errant line length
17c41c0 [Chris Heller] Base Dockerfile on mesosphere/mesos image
8aebda4 [Chris Heller] Simplfy Docker image docs
1ae7f4f [Chris Heller] Style points
974bd56 [Chris Heller] Convert map to flatMap
5d8bdf7 [Chris Heller] Factor out the DockerInfo construction.
7b75a3d [Chris Heller] Align to styleguide
80108e7 [Chris Heller] Bend to the will of RAT
ba77056 [Chris Heller] Explicit RAT exclude
abda5e5 [Chris Heller] Wildcard .rat-excludes
2f2873c [Chris Heller] Exclude spark-mesos from RAT
a589a5b [Chris Heller] Add example Dockerfile
b6825ce [Chris Heller] Remove use of EasyMock
eae1b86 [Chris Heller] Move properties under 'spark.mesos.'
c184d00 [Chris Heller] Use map on Option to be consistent with non-coarse code
fb9501a [Chris Heller] Bumped mesos version to current release
fa11879 [Chris Heller] Add listenerBus to EasyMock
882151e [Chris Heller] Changes to scala style
b22d42d [Chris Heller] Exclude template from RAT
db536cf [Chris Heller] Remove unneeded mocks
dea1bd5 [Chris Heller] Force default protocol
7dac042 [Chris Heller] Add test for DockerInfo
5456c0c [Chris Heller] Adjust syntax style
521c194 [Chris Heller] Adjust version info
6e38f70 [Chris Heller] Document Mesos Docker properties
29572ab [Chris Heller] Support all DockerInfo fields
b8c0dea [Chris Heller] Support for mesos DockerInfo in coarse-mode.
482a9fd [Chris Heller] Support for mesos DockerInfo in fine-grained mode.
This patch adds the support for cluster mode to run on Mesos.
It introduces a new Mesos framework dedicated to launch new apps/drivers, and can be called with the spark-submit script and specifying --master flag to the cluster mode REST interface instead of Mesos master.
Example:
./bin/spark-submit --deploy-mode cluster --class org.apache.spark.examples.SparkPi --master mesos://10.0.0.206:8077 --executor-memory 1G --total-executor-cores 100 examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar 30
Part of this patch is also to abstract the StandaloneRestServer so it can have different implementations of the REST endpoints.
Features of the cluster mode in this PR:
- Supports supervise mode where scheduler will keep trying to reschedule exited job.
- Adds a new UI for the cluster mode scheduler to see all the running jobs, finished jobs, and supervise jobs waiting to be retried
- Supports state persistence to ZK, so when the cluster scheduler fails over it can pick up all the queued and running jobs
Author: Timothy Chen <tnachen@gmail.com>
Author: Luc Bourlier <luc.bourlier@typesafe.com>
Closes#5144 from tnachen/mesos_cluster_mode and squashes the following commits:
069e946 [Timothy Chen] Fix rebase.
e24b512 [Timothy Chen] Persist submitted driver.
390c491 [Timothy Chen] Fix zk conf key for mesos zk engine.
e324ac1 [Timothy Chen] Fix merge.
fd5259d [Timothy Chen] Address review comments.
1553230 [Timothy Chen] Address review comments.
c6c6b73 [Timothy Chen] Pass spark properties to mesos cluster tasks.
f7d8046 [Timothy Chen] Change app name to spark cluster.
17f93a2 [Timothy Chen] Fix head of line blocking in scheduling drivers.
6ff8e5c [Timothy Chen] Address comments and add logging.
df355cd [Timothy Chen] Add metrics to mesos cluster scheduler.
20f7284 [Timothy Chen] Address review comments
7252612 [Timothy Chen] Fix tests.
a46ad66 [Timothy Chen] Allow zk cli param override.
920fc4b [Timothy Chen] Fix scala style issues.
862b5b5 [Timothy Chen] Support asking driver status when it's retrying.
7f214c2 [Timothy Chen] Fix RetryState visibility
e0f33f7 [Timothy Chen] Add supervise support and persist retries.
371ce65 [Timothy Chen] Handle cluster mode recovery and state persistence.
3d4dfa1 [Luc Bourlier] Adds support to kill submissions
febfaba [Timothy Chen] Bound the finished drivers in memory
543a98d [Timothy Chen] Schedule multiple jobs
6887e5e [Timothy Chen] Support looking at SPARK_EXECUTOR_URI env variable in schedulers
8ec76bc [Timothy Chen] Fix Mesos dispatcher UI.
d57d77d [Timothy Chen] Add documentation
825afa0 [Luc Bourlier] Supports more spark-submit parameters
b8e7181 [Luc Bourlier] Adds a shutdown latch to keep the deamon running
0fa7780 [Luc Bourlier] Launch task through the mesos scheduler
5b7a12b [Timothy Chen] WIP: Making a cluster mode a mesos framework.
4b2f5ef [Timothy Chen] Specify user jar in command to be replaced with local.
e775001 [Timothy Chen] Support fetching remote uris in driver runner.
7179495 [Timothy Chen] Change Driver page output and add logging
880bc27 [Timothy Chen] Add Mesos Cluster UI to display driver results
9986731 [Timothy Chen] Kill drivers when shutdown
67cbc18 [Timothy Chen] Rename StandaloneRestClient to RestClient and add sbin scripts
e3facdd [Timothy Chen] Add Mesos Cluster dispatcher
- Defined executorCores from "spark.mesos.executor.cores"
- Changed the amount of mesosExecutor's cores to executorCores.
- Added new configuration option on running-on-mesos.md
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes#5063 from jongyoul/SPARK-6350 and squashes the following commits:
9238d6e [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs - Changed configuration name - Made mesosExecutorCores private
2d41241 [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs
89edb4f [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs
8ba7694 [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs
7549314 [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Fixed docs
4ae7b0c [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Removed TODO
c27efce [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Fixed Mesos*Suite for supporting integer WorkerOffers - Fixed Documentation
1fe4c03 [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Change available resources of cpus to integer value beacuse WorkerOffer support the amount cpus as integer value
5f3767e [Jongyoul Lee] Revert "[SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode"
4b7c69e [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Changed configruation name and description from "spark.mesos.executor.cores" to "spark.executor.frameworkCores"
0556792 [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Defined executorCores from "spark.mesos.executor.cores" - Changed the amount of mesosExecutor's cores to executorCores. - Added new configuration option on running-on-mesos.md
- Fixed calculateTotalMemory to use spark.mesos.executor.memoryOverhead
- Added testCase
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes#5099 from jongyoul/SPARK-6423 and squashes the following commits:
6747fce [Jongyoul Lee] [SPARK-6423][Mesos] MemoryUtils should use memoryOverhead if it's set - Changed a description of spark.mesos.executor.memoryOverhead
475a7c8 [Jongyoul Lee] [SPARK-6423][Mesos] MemoryUtils should use memoryOverhead if it's set - Fit the import rules
453c5a2 [Jongyoul Lee] [SPARK-6423][Mesos] MemoryUtils should use memoryOverhead if it's set - Fixed calculateTotalMemory to use spark.mesos.executor.memoryOverhead - Added testCase
- fixed a description of spark.mesos.executor.memoryOverhead from 7% to 10%
- This is a second part of SPARK-6085
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes#5065 from jongyoul/SPARK-6085-1 and squashes the following commits:
c5af84c [Jongyoul Lee] SPARK-6085 Part. 2 Increase default value for memory overhead - Changed "MiB" to "MB"
dbac1c0 [Jongyoul Lee] SPARK-6085 Part. 2 Increase default value for memory overhead - fixed a description of spark.mesos.executor.memoryOverhead from 7% to 10%
- MESOS_NATIVE_LIBRARY become deprecated
- Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes#4361 from jongyoul/SPARK-3619-1 and squashes the following commits:
f1ea91f [Jongyoul Lee] Merge branch 'SPARK-3619-1' of https://github.com/jongyoul/spark into SPARK-3619-1
a6a00c2 [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - Removed 'Known issues' section
2e15a21 [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY
0dace7b [Jongyoul Lee] [SPARK-3619] Upgrade to Mesos 0.21 to work around MESOS-1688 - MESOS_NATIVE_LIBRARY become deprecated - Chagned MESOS_NATIVE_LIBRARY to MESOS_NATIVE_JAVA_LIBRARY
Author: tedyu <yuzhihong@gmail.com>
Closes#4836 from tedyu/master and squashes the following commits:
d65b495 [tedyu] SPARK-6085 Increase default value for memory overhead
1fdd4df [tedyu] SPARK-6085 Increase default value for memory overhead
Author: Timothy Chen <tnachen@gmail.com>
Closes#3349 from tnachen/mesos_doc and squashes the following commits:
737ef49 [Timothy Chen] Add TOC
5ca546a [Timothy Chen] Update description around cores requested.
26283a5 [Timothy Chen] Add mesos specific configurations into doc
When using Mesos with the fine-grained mode, a Spark job can run into a dead lock on low allocatable memory on Mesos slaves. As a work-around 32 MB (= Mesos MIN_MEM) are allocated for each task, to ensure Mesos making new offers after task completion.
From my perspective, it would be better to fix this problem in Mesos by dropping the constraint on memory for offers, but as temporary solution this patch helps to avoid the dead lock on current Mesos versions.
See [[MESOS-1688] No offers if no memory is allocatable](https://issues.apache.org/jira/browse/MESOS-1688) for details for this problem.
Author: Martin Weindel <martin.weindel@gmail.com>
Closes#1860 from MartinWeindel/master and squashes the following commits:
5762030 [Martin Weindel] reverting work-around
a6bf837 [Martin Weindel] added known issue for issue MESOS-1688
d9d2ca6 [Martin Weindel] work around for problem with Mesos offering semantic (see [https://issues.apache.org/jira/browse/MESOS-1688])
It should be `spark-env.sh` rather than `spark.env.sh`.
Author: Cheng Lian <lian.cs.zju@gmail.com>
Closes#2119 from liancheng/fix-mesos-doc and squashes the following commits:
f360548 [Cheng Lian] Fixed a typo in docs/running-on-mesos.md
The existing docs are highly misleading. For standalone mode, for example, it encourages the user to use standalone-cluster mode, which is not officially supported. The safeguards have been added in Spark submit itself to prevent bad documentation from leading users down the wrong path in the future.
This PR is prompted by countless headaches users of Spark have run into on the mailing list.
Author: Andrew Or <andrewor14@gmail.com>
Closes#1200 from andrewor14/submit-docs and squashes the following commits:
5ea2460 [Andrew Or] Rephrase cluster vs client explanation
c827f32 [Andrew Or] Clarify spark submit messages
9f7ed8f [Andrew Or] Clarify client vs cluster deploy mode + add safeguards
This is a fairly large PR to clean up and update the docs for 1.0. The major changes are:
* A unified programming guide for all languages replaces language-specific ones and shows language-specific info in tabs
* New programming guide sections on key-value pairs, unit testing, input formats beyond text, migrating from 0.9, and passing functions to Spark
* Spark-submit guide moved to a separate page and expanded slightly
* Various cleanups of the menu system, security docs, and others
* Updated look of title bar to differentiate the docs from previous Spark versions
You can find the updated docs at http://people.apache.org/~matei/1.0-docs/_site/ and in particular http://people.apache.org/~matei/1.0-docs/_site/programming-guide.html.
Author: Matei Zaharia <matei@databricks.com>
Closes#896 from mateiz/1.0-docs and squashes the following commits:
03e6853 [Matei Zaharia] Some tweaks to configuration and YARN docs
0779508 [Matei Zaharia] tweak
ef671d4 [Matei Zaharia] Keep frames in JavaDoc links, and other small tweaks
1bf4112 [Matei Zaharia] Review comments
4414f88 [Matei Zaharia] tweaks
d04e979 [Matei Zaharia] Fix some old links to Java guide
a34ed33 [Matei Zaharia] tweak
541bb3b [Matei Zaharia] miscellaneous changes
fcefdec [Matei Zaharia] Moved submitting apps to separate doc
61d72b4 [Matei Zaharia] stuff
181f217 [Matei Zaharia] migration guide, remove old language guides
e11a0da [Matei Zaharia] Add more API functions
6a030a9 [Matei Zaharia] tweaks
8db0ae3 [Matei Zaharia] Added key-value pairs section
318d2c9 [Matei Zaharia] tweaks
1c81477 [Matei Zaharia] New section on basics and function syntax
e38f559 [Matei Zaharia] Actually added programming guide to Git
a33d6fe [Matei Zaharia] First pass at updating programming guide to support all languages, plus other tweaks throughout
3b6a876 [Matei Zaharia] More CSS tweaks
01ec8bf [Matei Zaharia] More CSS tweaks
e6d252e [Matei Zaharia] Change color of doc title bar to differentiate from 0.9.0
- Mention Apache downloads first
- Shorten some wording
Author: Matei Zaharia <matei@databricks.com>
Closes#806 from mateiz/doc-update and squashes the following commits:
d9345cd [Matei Zaharia] typo
a179f8d [Matei Zaharia] Tweaks to Mesos docs
Place more emphasis on using precompiled binary versions of Spark and Mesos
instead of encouraging the reader to compile from source.
Author: Andrew Ash <andrew@andrewash.com>
Closes#756 from ash211/spark-1818 and squashes the following commits:
7ef3b33 [Andrew Ash] Brief explanation of the interactions between Spark and Mesos
e7dea8e [Andrew Ash] Add troubleshooting and debugging section
956362d [Andrew Ash] Don't need to pass spark.executor.uri into the spark shell
de3353b [Andrew Ash] Wrap to 100char
7ebf6ef [Andrew Ash] Polish on the section on Mesos Master URLs
3dcc2c1 [Andrew Ash] Use --tgz parameter of make-distribution
41b68ed [Andrew Ash] Period at end of sentence; formatting on :5050
8bf2c53 [Andrew Ash] Update site.MESOS_VERSIOn to match /pom.xml
74f2040 [Andrew Ash] SPARK-1818 Freshen Mesos documentation
throughout the docs: SPARK_VERSION, SCALA_VERSION, and MESOS_VERSION.
To use them, e.g. use {{site.SPARK_VERSION}}.
Also removes uses of {{HOME_PATH}} which were being resolved to ""
by the templating system anyway.
- Rework/expand the nav bar with more of the docs site
- Removing parts of docs about EC2 and Mesos that differentiate between
running 0.5 and before
- Merged subheadings from running-on-amazon-ec2.html that are still relevant
(i.e., "Using a newer version of Spark" and "Accessing Data in S3") into
ec2-scripts.html and deleted running-on-amazon-ec2.html
- Added some TODO comments to a few docs
- Updated the blurb about AMP Camp
- Renamed programming-guide to spark-programming-guide
- Fixing typos/etc. in Standalone Spark doc
which can be compiled via jekyll, using the command `jekyll`. To compile
and run a local webserver to serve the doc as a website, run
`jekyll --server`.