Commit graph

50 commits

Author SHA1 Message Date
Timothy Chen 53befacced [SPARK-5338] [MESOS] Add cluster mode support for Mesos
This patch adds the support for cluster mode to run on Mesos.
It introduces a new Mesos framework dedicated to launch new apps/drivers, and can be called with the spark-submit script and specifying --master flag to the cluster mode REST interface instead of Mesos master.

Example:
./bin/spark-submit --deploy-mode cluster --class org.apache.spark.examples.SparkPi --master mesos://10.0.0.206:8077 --executor-memory 1G --total-executor-cores 100 examples/target/spark-examples_2.10-1.3.0-SNAPSHOT.jar 30

Part of this patch is also to abstract the StandaloneRestServer so it can have different implementations of the REST endpoints.

Features of the cluster mode in this PR:
- Supports supervise mode where scheduler will keep trying to reschedule exited job.
- Adds a new UI for the cluster mode scheduler to see all the running jobs, finished jobs, and supervise jobs waiting to be retried
- Supports state persistence to ZK, so when the cluster scheduler fails over it can pick up all the queued and running jobs

Author: Timothy Chen <tnachen@gmail.com>
Author: Luc Bourlier <luc.bourlier@typesafe.com>

Closes #5144 from tnachen/mesos_cluster_mode and squashes the following commits:

069e946 [Timothy Chen] Fix rebase.
e24b512 [Timothy Chen] Persist submitted driver.
390c491 [Timothy Chen] Fix zk conf key for mesos zk engine.
e324ac1 [Timothy Chen] Fix merge.
fd5259d [Timothy Chen] Address review comments.
1553230 [Timothy Chen] Address review comments.
c6c6b73 [Timothy Chen] Pass spark properties to mesos cluster tasks.
f7d8046 [Timothy Chen] Change app name to spark cluster.
17f93a2 [Timothy Chen] Fix head of line blocking in scheduling drivers.
6ff8e5c [Timothy Chen] Address comments and add logging.
df355cd [Timothy Chen] Add metrics to mesos cluster scheduler.
20f7284 [Timothy Chen] Address review comments
7252612 [Timothy Chen] Fix tests.
a46ad66 [Timothy Chen] Allow zk cli param override.
920fc4b [Timothy Chen] Fix scala style issues.
862b5b5 [Timothy Chen] Support asking driver status when it's retrying.
7f214c2 [Timothy Chen] Fix RetryState visibility
e0f33f7 [Timothy Chen] Add supervise support and persist retries.
371ce65 [Timothy Chen] Handle cluster mode recovery and state persistence.
3d4dfa1 [Luc Bourlier] Adds support to kill submissions
febfaba [Timothy Chen] Bound the finished drivers in memory
543a98d [Timothy Chen] Schedule multiple jobs
6887e5e [Timothy Chen] Support looking at SPARK_EXECUTOR_URI env variable in schedulers
8ec76bc [Timothy Chen] Fix Mesos dispatcher UI.
d57d77d [Timothy Chen] Add documentation
825afa0 [Luc Bourlier] Supports more spark-submit parameters
b8e7181 [Luc Bourlier] Adds a shutdown latch to keep the deamon running
0fa7780 [Luc Bourlier] Launch task through the mesos scheduler
5b7a12b [Timothy Chen] WIP: Making a cluster mode a mesos framework.
4b2f5ef [Timothy Chen] Specify user jar in command to be replaced with local.
e775001 [Timothy Chen] Support fetching remote uris in driver runner.
7179495 [Timothy Chen] Change Driver page output and add logging
880bc27 [Timothy Chen] Add Mesos Cluster UI to display driver results
9986731 [Timothy Chen] Kill drivers when shutdown
67cbc18 [Timothy Chen] Rename StandaloneRestClient to RestClient and add sbin scripts
e3facdd [Timothy Chen] Add Mesos Cluster dispatcher
2015-04-28 13:33:57 -07:00
Iulian Dragos 8aab94d898 [SPARK-4286] Add an external shuffle service that can be run as a daemon.
This allows Mesos deployments to use the shuffle service (and implicitly dynamic allocation). It does so by adding a new "main" class and two corresponding scripts in `sbin`:

- `sbin/start-shuffle-service.sh`
- `sbin/stop-shuffle-service.sh`

Specific options can be passed in `SPARK_SHUFFLE_OPTS`.

This is picking up work from #3861 /cc tnachen

Author: Iulian Dragos <jaguarul@gmail.com>

Closes #4990 from dragos/feature/external-shuffle-service and squashes the following commits:

6c2b148 [Iulian Dragos] Import order and wrong name fixup.
07804ad [Iulian Dragos] Moved ExternalShuffleService to the `deploy` package + other minor tweaks.
4dc1f91 [Iulian Dragos] Reviewer’s comments:
8145429 [Iulian Dragos] Add an external shuffle service that can be run as a daemon.
2015-04-28 12:08:18 -07:00
Punya Biswal f6a9a57a72 [SPARK-6952] Handle long args when detecting PID reuse
sbin/spark-daemon.sh used

    ps -p "$TARGET_PID" -o args=

to figure out whether the process running with the expected PID is actually a Spark
daemon. When running with a large classpath, the output of ps gets
truncated and the check fails spuriously.

This weakens the check to see if it's a java command (which is something
we do in other parts of the script) rather than looking for the specific
main class name. This means that SPARK-4832 might happen under a
slightly broader range of circumstances (a java program happened to
reuse the same PID), but it seems worthwhile compared to failing
consistently with a large classpath.

Author: Punya Biswal <pbiswal@palantir.com>

Closes #5535 from punya/feature/SPARK-6952 and squashes the following commits:

7ea12d1 [Punya Biswal] Handle long args when detecting PID reuse
2015-04-17 11:08:37 +01:00
Nathan Kronenfeld 435b8779df [Spark-4848] Allow different Worker configurations in standalone cluster
This refixes #3699 with the latest code.
This fixes SPARK-4848

I've changed the stand-alone cluster scripts to allow different workers to have different numbers of instances, with both port and web-ui port following allong appropriately.

I did this by moving the loop over instances from start-slaves and stop-slaves (on the master) to start-slave and stop-slave (on the worker).

Wile I was at it, I changed SPARK_WORKER_PORT to work the same way as SPARK_WORKER_WEBUI_PORT, since the new methods work fine for both.

Author: Nathan Kronenfeld <nkronenfeld@oculusinfo.com>

Closes #5140 from nkronenfeld/feature/spark-4848 and squashes the following commits:

cf5f47e [Nathan Kronenfeld] Merge remote branch 'upstream/master' into feature/spark-4848
044ca6f [Nathan Kronenfeld] Documentation and formatting as requested by by andrewor14
d739640 [Nathan Kronenfeld] Move looping through instances from the master to the workers, so that each worker respects its own number of instances and web-ui port
2015-04-13 18:21:16 -07:00
Pradeep Chanumolu 240ea03fae [SPARK-6671] Add status command for spark daemons
SPARK-6671
Currently using the spark-daemon.sh script we can start and stop the spark demons. But we cannot get the status of the daemons. It will be nice to include the status command in the spark-daemon.sh script, through which we can know if the spark demon is alive or not.

Author: Pradeep Chanumolu <pchanumolu@maprtech.com>

Closes #5327 from pchanumolu/master and squashes the following commits:

d3a1f05 [Pradeep Chanumolu] Make status command check consistent with Stop command
5062926 [Pradeep Chanumolu] Fix indentation in spark-daemon.sh
3e66bc8 [Pradeep Chanumolu] SPARK-6671 : Add status command to spark daemons
1ac3918 [Pradeep Chanumolu] Add status command to spark-daemon
2015-04-13 13:02:55 +01:00
Jose Manuel Gomez 19d4c392fa [HOTFIX] Update start-slave.sh
wihtout this change the below error happens when I execute sbin/start-all.sh

localhost: /spark-1.3/sbin/start-slave.sh: line 32: unexpected EOF while looking for matching `"'
localhost: /spark-1.3/sbin/start-slave.sh: line 33: syntax error: unexpected end of file

my operating system is Linux Mint 17.1 Rebecca

Author: Jose Manuel Gomez <jmgomez@stratio.com>

Closes #5262 from josegom/patch-2 and squashes the following commits:

453af8b [Jose Manuel Gomez] Update start-slave.sh
2c456bd [Jose Manuel Gomez] Update start-slave.sh
2015-03-30 14:59:26 +01:00
WangTaoTheTonic 99631438c0 [SPARK-6552][Deploy][Doc]expose start-slave.sh to user and update outdated doc
https://issues.apache.org/jira/browse/SPARK-6552

/cc srowen

Author: WangTaoTheTonic <wangtao111@huawei.com>

Closes #5205 from WangTaoTheTonic/SPARK-6552 and squashes the following commits:

b02263c [WangTaoTheTonic] use less than rather than less equal
f0fa408 [WangTaoTheTonic] expose start-slave.sh
2015-03-28 12:32:35 +00:00
Marcelo Vanzin 517975d89d [SPARK-4924] Add a library for launching Spark jobs programmatically.
This change encapsulates all the logic involved in launching a Spark job
into a small Java library that can be easily embedded into other applications.

The overall goal of this change is twofold, as described in the bug:

- Provide a public API for launching Spark processes. This is a common request
  from users and currently there's no good answer for it.

- Remove a lot of the duplicated code and other coupling that exists in the
  different parts of Spark that deal with launching processes.

A lot of the duplication was due to different code needed to build an
application's classpath (and the bootstrapper needed to run the driver in
certain situations), and also different code needed to parse spark-submit
command line options in different contexts. The change centralizes those
as much as possible so that all code paths can rely on the library for
handling those appropriately.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #3916 from vanzin/SPARK-4924 and squashes the following commits:

18c7e4d [Marcelo Vanzin] Fix make-distribution.sh.
2ce741f [Marcelo Vanzin] Add lots of quotes.
3b28a75 [Marcelo Vanzin] Update new pom.
a1b8af1 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
897141f [Marcelo Vanzin] Review feedback.
e2367d2 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
28cd35e [Marcelo Vanzin] Remove stale comment.
b1d86b0 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
00505f9 [Marcelo Vanzin] Add blurb about new API in the programming guide.
5f4ddcc [Marcelo Vanzin] Better usage messages.
92a9cfb [Marcelo Vanzin] Fix Win32 launcher, usage.
6184c07 [Marcelo Vanzin] Rename field.
4c19196 [Marcelo Vanzin] Update comment.
7e66c18 [Marcelo Vanzin] Fix pyspark tests.
0031a8e [Marcelo Vanzin] Review feedback.
c12d84b [Marcelo Vanzin] Review feedback. And fix spark-submit on Windows.
e2d4d71 [Marcelo Vanzin] Simplify some code used to launch pyspark.
43008a7 [Marcelo Vanzin] Don't make builder extend SparkLauncher.
b4d6912 [Marcelo Vanzin] Use spark-submit script in SparkLauncher.
28b1434 [Marcelo Vanzin] Add a comment.
304333a [Marcelo Vanzin] Fix propagation of properties file arg.
bb67b93 [Marcelo Vanzin] Remove unrelated Yarn change (that is also wrong).
8ec0243 [Marcelo Vanzin] Add missing newline.
95ddfa8 [Marcelo Vanzin] Fix handling of --help for spark-class command builder.
72da7ec [Marcelo Vanzin] Rename SparkClassLauncher.
62978e4 [Marcelo Vanzin] Minor cleanup of Windows code path.
9cd5b44 [Marcelo Vanzin] Make all non-public APIs package-private.
e4c80b6 [Marcelo Vanzin] Reorganize the code so that only SparkLauncher is public.
e50dc5e [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
de81da2 [Marcelo Vanzin] Fix CommandUtils.
86a87bf [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
2061967 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
46d46da [Marcelo Vanzin] Clean up a test and make it more future-proof.
b93692a [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
ad03c48 [Marcelo Vanzin] Revert "Fix a thread-safety issue in "local" mode."
0b509d0 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
23aa2a9 [Marcelo Vanzin] Read java-opts from conf dir, not spark home.
7cff919 [Marcelo Vanzin] Javadoc updates.
eae4d8e [Marcelo Vanzin] Fix new unit tests on Windows.
e570fb5 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
44cd5f7 [Marcelo Vanzin] Add package-info.java, clean up javadocs.
f7cacff [Marcelo Vanzin] Remove "launch Spark in new thread" feature.
7ed8859 [Marcelo Vanzin] Some more feedback.
54cd4fd [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
61919df [Marcelo Vanzin] Clean leftover debug statement.
aae5897 [Marcelo Vanzin] Use launcher classes instead of jars in non-release mode.
e584fc3 [Marcelo Vanzin] Rework command building a little bit.
525ef5b [Marcelo Vanzin] Rework Unix spark-class to handle argument with newlines.
8ac4e92 [Marcelo Vanzin] Minor test cleanup.
e946a99 [Marcelo Vanzin] Merge PySparkLauncher into SparkSubmitCliLauncher.
c617539 [Marcelo Vanzin] Review feedback round 1.
fc6a3e2 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
f26556b [Marcelo Vanzin] Fix a thread-safety issue in "local" mode.
2f4e8b4 [Marcelo Vanzin] Changes needed to make this work with SPARK-4048.
799fc20 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
bb5d324 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
53faef1 [Marcelo Vanzin] Merge branch 'master' into SPARK-4924
a7936ef [Marcelo Vanzin] Fix pyspark tests.
656374e [Marcelo Vanzin] Mima fixes.
4d511e7 [Marcelo Vanzin] Fix tools search code.
7a01e4a [Marcelo Vanzin] Fix pyspark on Yarn.
1b3f6e9 [Marcelo Vanzin] Call SparkSubmit from spark-class launcher for unknown classes.
25c5ae6 [Marcelo Vanzin] Centralize SparkSubmit command line parsing.
27be98a [Marcelo Vanzin] Modify Spark to use launcher lib.
6f70eea [Marcelo Vanzin] [SPARK-4924] Add a library for launching Spark jobs programatically.
2015-03-11 01:03:01 -07:00
WangTaoTheTonic 729c05bda8 [Minor]fix the wrong description
Found it by accident. I'm not gonna file jira for this as it is a very tiny fix.

Author: WangTaoTheTonic <wangtao111@huawei.com>

Closes #4936 from WangTaoTheTonic/wrongdesc and squashes the following commits:

fb8a8ec [WangTaoTheTonic] fix the wrong description
aca5596 [WangTaoTheTonic] fix the wrong description
2015-03-07 12:35:26 +00:00
Zhang, Liye d8b3da9ddf [CORE, DEPLOY][minor] align arguments order with docs of worker
The help message for starting `worker` is `Usage: Worker [options] <master>`. While in `start-slaves.sh`, the format is not align with that, it is confusing for the fist glance.

Author: Zhang, Liye <liye.zhang@intel.com>

Closes #4924 from liyezhang556520/startSlaves and squashes the following commits:

7fd5deb [Zhang, Liye] align arguments order with docs of worker
2015-03-06 09:34:07 +00:00
Zhan Zhang ad6b169dee [Spark-5889] Remove pid file after stopping service.
Currently the pid file is not deleted, and potentially may cause some problem after service is stopped. The fix remove the pid file after service stopped.

Author: Zhan Zhang <zhazhan@gmail.com>

Closes #4676 from zhzhan/spark-5889 and squashes the following commits:

eb01be1 [Zhan Zhang] solve review comments
b4c009e [Zhan Zhang] solve review comments
018110a [Zhan Zhang] spark-5889: remove pid file after stopping service
088d2a2 [Zhan Zhang] squash all commits
c1f1fa5 [Zhan Zhang] test
2015-02-19 23:13:02 +00:00
Cheng Hao 94cdb05ff7 [SPARK-5825] [Spark Submit] Remove the double checking instance name when stopping the service
`spark-daemon.sh` will confirm the process id by fuzzy matching the class name while stopping the service, however, it will fail if the java process arguments is very long (greater than 4096 characters).
This PR looses the check for the service process.

Author: Cheng Hao <hao.cheng@intel.com>

Closes #4611 from chenghao-intel/stopping_service and squashes the following commits:

a0051f6 [Cheng Hao] loosen the process checking while stopping a service
2015-02-19 12:07:51 -08:00
WangTaoTheTonic 1768bd5143 [SPARK-4832][Deploy]some other processes might take the daemon pid
Some other processes might use the pid saved in pid file. In that case we should ignore it and launch daemons.

JIRA is down for maintenance. I will file one once it return.

Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTaoTheTonic <wangtao111@huawei.com>

Closes #3683 from WangTaoTheTonic/otherproc and squashes the following commits:

daa86a1 [WangTaoTheTonic] some bash style fix
8befee7 [WangTaoTheTonic] handle the mistake scenario
cf4ecc6 [WangTaoTheTonic] remove redundant condition
f36cfb4 [WangTaoTheTonic] some other processes might take the pid
2015-02-13 10:27:23 +00:00
Tom Panning 1ca0a1014e [SPARK-5176] The thrift server does not support cluster mode
Output an error message if the thrift server is started in cluster mode.

Author: Tom Panning <tom.panning@nextcentury.com>

Closes #4137 from tpanningnextcen/spark-5176-thrift-cluster-mode-error and squashes the following commits:

f5c0509 [Tom Panning] [SPARK-5176] The thrift server does not support cluster mode
2015-02-01 17:57:31 -08:00
Jongyoul Lee 4a4f9ccba2 [SPARK-5088] Use spark-class for running executors directly
Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #3897 from jongyoul/SPARK-5088 and squashes the following commits:

8232aa8 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Added a listenerBus for fixing test cases
932289f [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Rebased from master
613cb47 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Fixed code if spark.executor.uri doesn't have any value - Added test cases
ff57bda [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Adjusted orders of import
97e4bd4 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Changed command for using spark-class directly - Delete sbin/spark-executor and moved some codes into spark-class' case statement
2015-01-19 02:01:56 -08:00
jbencook 61f1a70227 [SPARK-874] adding a --wait flag
This PR adds a --wait flag to the `./sbin/stop-all.sh` script.

Author: jbencook <jbenjamincook@gmail.com>

Closes #3567 from jbencook/master and squashes the following commits:

d05c5bb [jbencook] [SPARK-874] adding a --wait flag
2014-12-09 12:16:19 -08:00
Kousuke Saruta 44d8b45a38 [SPARK-4110] Wrong comments about default settings in spark-daemon.sh
In spark-daemon.sh, thare are following comments.

    #   SPARK_CONF_DIR  Alternate conf dir. Default is ${SPARK_PREFIX}/conf.
    #   SPARK_LOG_DIR   Where log files are stored.  PWD by default.

But, I think the default value for SPARK_CONF_DIR is `${SPARK_HOME}/conf` and for SPARK_LOG_DIR is `${SPARK_HOME}/logs`.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #2972 from sarutak/SPARK-4110 and squashes the following commits:

5a171a2 [Kousuke Saruta] Fixed wrong comments
2014-10-28 12:29:01 -07:00
Kousuke Saruta 30ea2868e7 [SPARK-4076] Parameter expansion in spark-config is wrong
In sbin/spark-config.sh, parameter expansion is used to extract source root as follows.

    this="${BASH_SOURCE-$0}"

I think, the parameter expansion should be ":" instead of "".
If we use "-" and BASH_SOURCE="", (empty character is set, not unset),
"" (empty character) is set to $this.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #2930 from sarutak/SPARK-4076 and squashes the following commits:

32a0370 [Kousuke Saruta] Fixed wrong parameter expansion
2014-10-24 13:04:35 -07:00
WangTaoTheTonic 9d320e222c [SPARK-3696]Do not override the user-difined conf_dir
https://issues.apache.org/jira/browse/SPARK-3696

We see if SPARK_CONF_DIR is already defined before assignment.

Author: WangTaoTheTonic <barneystinson@aliyun.com>

Closes #2541 from WangTaoTheTonic/confdir and squashes the following commits:

c3f31e0 [WangTaoTheTonic] Do not override the user-difined conf_dir
2014-10-03 10:42:41 -07:00
WangTaoTheTonic d61f2c15bb [SPARK-3658][SQL] Start thrift server as a daemon
https://issues.apache.org/jira/browse/SPARK-3658

And keep the `CLASS_NOT_FOUND_EXIT_STATUS` and exit message in `SparkSubmit.scala`.

Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>

Closes #2509 from WangTaoTheTonic/thriftserver and squashes the following commits:

5dcaab2 [WangTaoTheTonic] issue about coupling
8ad9f95 [WangTaoTheTonic] generalization
598e21e [WangTao] take thrift server as a daemon
2014-10-01 15:15:24 -07:00
Kousuke Saruta 0dc868e787 [SPARK-3584] sbin/slaves doesn't work when we use password authentication for SSH
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #2444 from sarutak/slaves-scripts-modification and squashes the following commits:

eff7394 [Kousuke Saruta] Improve the description about Cluster Launch Script in docs/spark-standalone.md
7858225 [Kousuke Saruta] Modified sbin/slaves to use the environment variable "SPARK_SSH_FOREGROUND" as a flag
53d7121 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into slaves-scripts-modification
e570431 [Kousuke Saruta] Added a description for SPARK_SSH_FOREGROUND variable
7120a0c [Kousuke Saruta] Added a description about default host for sbin/slaves
1bba8a9 [Kousuke Saruta] Added SPARK_SSH_FOREGROUND flag to sbin/slaves
88e2f17 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into slaves-scripts-modification
297e75d [Kousuke Saruta] Modified sbin/slaves not to export HOSTLIST
2014-09-25 16:49:15 -07:00
WangTaoTheTonic 3447d10090 [SPARK-3547]Using a special exit code instead of 1 to represent ClassNotFoundExcepti...
...on

As improvement of https://github.com/apache/spark/pull/1944, we should use more special exit code to represent ClassNotFoundException.

Author: WangTaoTheTonic <barneystinson@aliyun.com>

Closes #2421 from WangTaoTheTonic/classnotfoundExitCode and squashes the following commits:

645a22a [WangTaoTheTonic] Serveral typos to trigger Jenkins
d6ae559 [WangTaoTheTonic] use 101 instead
a2d6465 [WangTaoTheTonic] use 127 instead
fbb232f [WangTaoTheTonic] Using a special exit code instead of 1 to represent ClassNotFoundException
2014-09-18 10:17:18 -07:00
Prashant Sharma e16a8e7db5 SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.
...

Tested ! TBH, it isn't a great idea to have directory with spaces within. Because emacs doesn't like it then hadoop doesn't like it. and so on...

Author: Prashant Sharma <prashant.s@imaginea.com>

Closes #2229 from ScrapCodes/SPARK-3337/quoting-shell-scripts and squashes the following commits:

d4ad660 [Prashant Sharma] SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.
2014-09-08 10:24:15 -07:00
Cheng Lian faeb9c0e14 [SPARK-2964] [SQL] Remove duplicated code from spark-sql and start-thriftserver.sh
Author: Cheng Lian <lian.cs.zju@gmail.com>
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #1886 from sarutak/SPARK-2964 and squashes the following commits:

8ef8751 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-2964
26e7c95 [Kousuke Saruta] Revert "Shorten timeout to more reasonable value"
ffb68fa [Kousuke Saruta] Modified spark-sql and start-thriftserver.sh to use bin/utils.sh
8c6f658 [Kousuke Saruta] Merge branch 'spark-3026' of https://github.com/liancheng/spark into SPARK-2964
81b43a8 [Cheng Lian] Shorten timeout to more reasonable value
a89e66d [Cheng Lian] Fixed command line options quotation in scripts
9c894d3 [Cheng Lian] Fixed bin/spark-sql -S option typo
be4736b [Cheng Lian] Report better error message when running JDBC/CLI without hive-thriftserver profile enabled
2014-08-26 17:33:40 -07:00
witgo 9f04db17e5 SPARK-2481: The environment variables SPARK_HISTORY_OPTS is covered in spark-env.sh
Author: witgo <witgo@qq.com>
Author: GuoQiang Li <witgo@qq.com>

Closes #1341 from witgo/history_env and squashes the following commits:

b4fd9f8 [GuoQiang Li] review commit
0ebe401 [witgo] *-history-server.sh load spark-config.sh
2014-08-25 19:22:27 -07:00
wangfei a1e8b1bc97 SPARK_LOGFILE and SPARK_ROOT_LOGGER no longer need in spark-daemon.sh
Author: wangfei <wangfei_hello@126.com>

Closes #2057 from scwf/patch-7 and squashes the following commits:

1b7b9a5 [wangfei] SPARK_LOGFILE and SPARK_ROOT_LOGGER no longer need in spark-daemon.sh
2014-08-20 16:00:46 -07:00
wangfei 267fdffe27 [SPARK-2925] [sql]fix spark-sql and start-thriftserver shell bugs when set --driver-java-options
https://issues.apache.org/jira/browse/SPARK-2925

Run cmd like this will get the error
bin/spark-sql --driver-java-options '-Xdebug -Xnoagent -Xrunjdwp:transport=dt_socket,address=8788,server=y,suspend=y'

Error: Unrecognized option '-Xnoagent'.
Run with --help for usage help or --verbose for debug output

Author: wangfei <wangfei_hello@126.com>
Author: wangfei <wangfei1@huawei.com>

Closes #1851 from scwf/patch-2 and squashes the following commits:

516554d [wangfei] quote variables to fix this issue
8bd40f2 [wangfei] quote variables to fix this problem
e6d79e3 [wangfei] fix start-thriftserver bug when set driver-java-options
948395d [wangfei] fix spark-sql error when set --driver-java-options
2014-08-14 10:55:51 -07:00
Cheng Lian a6cd31108f [SPARK-2678][Core][SQL] A workaround for SPARK-2678
JIRA issues:

- Main: [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678)
- Related: [SPARK-2874](https://issues.apache.org/jira/browse/SPARK-2874)

Related PR:

- #1715

This PR is both a fix for SPARK-2874 and a workaround for SPARK-2678. Fixing SPARK-2678 completely requires some API level changes that need further discussion, and we decided not to include it in Spark 1.1 release. As currently SPARK-2678 only affects Spark SQL scripts, this workaround is enough for Spark 1.1. Command line option handling logic in bash scripts looks somewhat dirty and duplicated, but it helps to provide a cleaner user interface as well as retain full downward compatibility for now.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #1801 from liancheng/spark-2874 and squashes the following commits:

8045d7a [Cheng Lian] Make sure test suites pass
8493a9e [Cheng Lian] Using eval to retain quoted arguments
aed523f [Cheng Lian] Fixed typo in bin/spark-sql
f12a0b1 [Cheng Lian] Worked arount SPARK-2678
daee105 [Cheng Lian] Fixed usage messages of all Spark SQL related scripts
2014-08-06 12:28:35 -07:00
Josh Rosen 22649b6cde [SPARK-2305] [PySpark] Update Py4J to version 0.8.2.1
Author: Josh Rosen <joshrosen@apache.org>

Closes #1626 from JoshRosen/SPARK-2305 and squashes the following commits:

03fb283 [Josh Rosen] Update Py4J to version 0.8.2.1.
2014-07-29 19:02:06 -07:00
Cheng Lian a7a9d14479 [SPARK-2410][SQL] Merging Hive Thrift/JDBC server (with Maven profile fix)
JIRA issue: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410)

Another try for #1399 & #1600. Those two PR breaks Jenkins builds because we made a separate profile `hive-thriftserver` in sub-project `assembly`, but the `hive-thriftserver` module is defined outside the `hive-thriftserver` profile. Thus every time a pull request that doesn't touch SQL code will also execute test suites defined in `hive-thriftserver`, but tests fail because related .class files are not included in the assembly jar.

In the most recent commit, module `hive-thriftserver` is moved into its own profile to fix this problem. All previous commits are squashed for clarity.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #1620 from liancheng/jdbc-with-maven-fix and squashes the following commits:

629988e [Cheng Lian] Moved hive-thriftserver module definition into its own profile
ec3c7a7 [Cheng Lian] Cherry picked the Hive Thrift server
2014-07-28 12:07:30 -07:00
Patrick Wendell e5bbce9a60 Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"
This reverts commit f6ff2a61d0.
2014-07-27 18:46:58 -07:00
Cheng Lian f6ff2a61d0 [SPARK-2410][SQL] Merging Hive Thrift/JDBC server
(This is a replacement of #1399, trying to fix potential `HiveThriftServer2` port collision between parallel builds. Please refer to [these comments](https://github.com/apache/spark/pull/1399#issuecomment-50212572) for details.)

JIRA issue: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410)

Merging the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc).

Thanks chenghao-intel for his initial contribution of the Spark SQL CLI.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #1600 from liancheng/jdbc and squashes the following commits:

ac4618b [Cheng Lian] Uses random port for HiveThriftServer2 to avoid collision with parallel builds
090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR
21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs
fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd]
199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver
1083e9d [Cheng Lian] Fixed failed test suites
7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic
9cc0f06 [Cheng Lian] Starts beeline with spark-submit
cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile
061880f [Cheng Lian] Addressed all comments by @pwendell
7755062 [Cheng Lian] Adapts test suites to spark-submit settings
40bafef [Cheng Lian] Fixed more license header issues
e214aab [Cheng Lian] Added missing license headers
b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh
f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft
3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit
a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit
61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit
2c4c539 [Cheng Lian] Cherry picked the Hive Thrift server
2014-07-27 13:03:38 -07:00
Michael Armbrust afd757a241 Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"
This reverts commit 06dc0d2c6b.

#1399 is making Jenkins fail.  We should investigate and put this back after its passing tests.

Author: Michael Armbrust <michael@databricks.com>

Closes #1594 from marmbrus/revertJDBC and squashes the following commits:

59748da [Michael Armbrust] Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"
2014-07-25 15:36:57 -07:00
Cheng Lian 06dc0d2c6b [SPARK-2410][SQL] Merging Hive Thrift/JDBC server
JIRA issue:

- Main: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410)
- Related: [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678)

Cherry picked the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc).

(Thanks chenghao-intel for his initial contribution of the Spark SQL CLI.)

TODO

- [x] Use `spark-submit` to launch the server, the CLI and beeline
- [x] Migration guideline draft for Shark users

----

Hit by a bug in `SparkSubmitArguments` while working on this PR: all application options that are recognized by `SparkSubmitArguments` are stolen as `SparkSubmit` options. For example:

```bash
$ spark-submit --class org.apache.hive.beeline.BeeLine spark-internal --help
```

This actually shows usage information of `SparkSubmit` rather than `BeeLine`.

~~Fixed this bug here since the `spark-internal` related stuff also touches `SparkSubmitArguments` and I'd like to avoid conflict.~~

**UPDATE** The bug mentioned above is now tracked by [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678). Decided to revert changes to this bug since it involves more subtle considerations and worth a separate PR.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #1399 from liancheng/thriftserver and squashes the following commits:

090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR
21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs
fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd]
199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver
1083e9d [Cheng Lian] Fixed failed test suites
7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic
9cc0f06 [Cheng Lian] Starts beeline with spark-submit
cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile
061880f [Cheng Lian] Addressed all comments by @pwendell
7755062 [Cheng Lian] Adapts test suites to spark-submit settings
40bafef [Cheng Lian] Fixed more license header issues
e214aab [Cheng Lian] Added missing license headers
b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh
f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft
3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit
a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit
61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit
2c4c539 [Cheng Lian] Cherry picked the Hive Thrift server
2014-07-25 12:20:49 -07:00
Marcelo Vanzin 21ddd7d1e9 [SPARK-1768] History server enhancements.
Two improvements to the history server:

- Separate the HTTP handling from history fetching, so that it's easy to add
  new backends later (thinking about SPARK-1537 in the long run)

- Avoid loading all UIs in memory. Do lazy loading instead, keeping a few in
  memory for faster access. This allows the app limit to go away, since holding
  just the listing in memory shouldn't be too expensive unless the user has millions
  of completed apps in the history (at which point I'd expect other issues to arise
  aside from history server memory usage, such as FileSystem.listStatus()
  starting to become ridiculously expensive).

I also fixed a few minor things along the way which aren't really worth mentioning.
I also removed the app's log path from the UI since that information may not even
exist depending on which backend is used (even though there is only one now).

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #718 from vanzin/hist-server and squashes the following commits:

53620c9 [Marcelo Vanzin] Add mima exclude, fix scaladoc wording.
c21f8d8 [Marcelo Vanzin] Feedback: formatting, docs.
dd8cc4b [Marcelo Vanzin] Standardize on using spark.history.* configuration.
4da3a52 [Marcelo Vanzin] Remove UI from ApplicationHistoryInfo.
2a7f68d [Marcelo Vanzin] Address review feedback.
4e72c77 [Marcelo Vanzin] Remove comment about ordering.
249bcea [Marcelo Vanzin] Remove offset / count from provider interface.
ca5d320 [Marcelo Vanzin] Remove code that deals with unfinished apps.
6e2432f [Marcelo Vanzin] Second round of feedback.
b2c570a [Marcelo Vanzin] Make class package-private.
4406f61 [Marcelo Vanzin] Cosmetic change to listing header.
e852149 [Marcelo Vanzin] Initialize new app array to expected size.
e8026f4 [Marcelo Vanzin] Review feedback.
49d2fd3 [Marcelo Vanzin] Fix a comment.
91e96ca [Marcelo Vanzin] Fix scalastyle issues.
6fbe0d8 [Marcelo Vanzin] Better handle failures when loading app info.
eee2f5a [Marcelo Vanzin] Ensure server.stop() is called when shutting down.
bda2fa1 [Marcelo Vanzin] Rudimentary paging support for the history UI.
b284478 [Marcelo Vanzin] Separate history server from history backend.
2014-06-23 13:53:44 -07:00
Bouke van der Bijl 2fd2752e57 Include the sbin/spark-config.sh in spark-executor
This is needed because broadcast values are broken on pyspark on Mesos, it tries to import pyspark but can't, as the PYTHONPATH is not set due to changes in ff5be9a4

https://issues.apache.org/jira/browse/SPARK-1725

Author: Bouke van der Bijl <boukevanderbijl@gmail.com>

Closes #651 from bouk/include-spark-config-in-mesos-executor and squashes the following commits:

b2f1295 [Bouke van der Bijl] Inline PYTHONPATH in spark-executor
eedbbcc [Bouke van der Bijl] Include the sbin/spark-config.sh in spark-executor
2014-05-08 20:43:37 -07:00
Sandy Ryza ff5be9a41e SPARK-1004. PySpark on YARN
This reopens https://github.com/apache/incubator-spark/pull/640 against the new repo

Author: Sandy Ryza <sandy@cloudera.com>

Closes #30 from sryza/sandy-spark-1004 and squashes the following commits:

89889d4 [Sandy Ryza] Move unzipping py4j to the generate-resources phase so that it gets included in the jar the first time
5165a02 [Sandy Ryza] Fix docs
fd0df79 [Sandy Ryza] PySpark on YARN
2014-04-29 23:24:34 -07:00
Andrew Or 79820fe825 [SPARK-1276] Add a HistoryServer to render persisted UI
The new feature of event logging, introduced in #42, allows the user to persist the details of his/her Spark application to storage, and later replay these events to reconstruct an after-the-fact SparkUI.
Currently, however, a persisted UI can only be rendered through the standalone Master. This greatly limits the use case of this new feature as many people also run Spark on Yarn / Mesos.

This PR introduces a new entity called the HistoryServer, which, given a log directory, keeps track of all completed applications independently of a Spark Master. Unlike Master, the HistoryServer needs not be running while the application is still running. It is relatively light-weight in that it only maintains static information of applications and performs no scheduling.

To quickly test it out, generate event logs with ```spark.eventLog.enabled=true``` and run ```sbin/start-history-server.sh <log-dir-path>```. Your HistoryServer awaits on port 18080.

Comments and feedback are most welcome.

---

A few other changes introduced in this PR include refactoring the WebUI interface, which is beginning to have a lot of duplicate code now that we have added more functionality to it. Two new SparkListenerEvents have been introduced (SparkListenerApplicationStart/End) to keep track of application name and start/finish times. This PR also clarifies the semantics of the ReplayListenerBus introduced in #42.

A potential TODO in the future (not part of this PR) is to render live applications in addition to just completed applications. This is useful when applications fail, a condition that our current HistoryServer does not handle unless the user manually signals application completion (by creating the APPLICATION_COMPLETION file). Handling live applications becomes significantly more challenging, however, because it is now necessary to render the same SparkUI multiple times. To avoid reading the entire log every time, which is inefficient, we must handle reading the log from where we previously left off, but this becomes fairly complicated because we must deal with the arbitrary behavior of each input stream.

Author: Andrew Or <andrewor14@gmail.com>

Closes #204 from andrewor14/master and squashes the following commits:

7b7234c [Andrew Or] Finished -> Completed
b158d98 [Andrew Or] Address Patrick's comments
69d1b41 [Andrew Or] Do not block on posting SparkListenerApplicationEnd
19d5dd0 [Andrew Or] Merge github.com:apache/spark
f7f5bf0 [Andrew Or] Make history server's web UI port a Spark configuration
2dfb494 [Andrew Or] Decouple checking for application completion from replaying
d02dbaa [Andrew Or] Expose Spark version and include it in event logs
2282300 [Andrew Or] Add documentation for the HistoryServer
567474a [Andrew Or] Merge github.com:apache/spark
6edf052 [Andrew Or] Merge github.com:apache/spark
19e1fb4 [Andrew Or] Address Thomas' comments
248cb3d [Andrew Or] Limit number of live applications + add configurability
a3598de [Andrew Or] Do not close file system with ReplayBus + fix bind address
bc46fc8 [Andrew Or] Merge github.com:apache/spark
e2f4ff9 [Andrew Or] Merge github.com:apache/spark
050419e [Andrew Or] Merge github.com:apache/spark
81b568b [Andrew Or] Fix strange error messages...
0670743 [Andrew Or] Decouple page rendering from loading files from disk
1b2f391 [Andrew Or] Minor changes
a9eae7e [Andrew Or] Merge branch 'master' of github.com:apache/spark
d5154da [Andrew Or] Styling and comments
5dbfbb4 [Andrew Or] Merge branch 'master' of github.com:apache/spark
60bc6d5 [Andrew Or] First complete implementation of HistoryServer (only for finished apps)
7584418 [Andrew Or] Report application start/end times to HistoryServer
8aac163 [Andrew Or] Add basic application table
c086bd5 [Andrew Or] Add HistoryServer and scripts ++ Refactor WebUI interface
2014-04-10 10:39:34 -07:00
Aaron Davidson 007a733434 SPARK-1286: Make usage of spark-env.sh idempotent
Various spark scripts load spark-env.sh. This can cause growth of any variables that may be appended to (SPARK_CLASSPATH, SPARK_REPL_OPTS) and it makes the precedence order for options specified in spark-env.sh less clear.

One use-case for the latter is that we want to set options from the command-line of spark-shell, but these options will be overridden by subsequent loading of spark-env.sh. If we were to load the spark-env.sh first and then set our command-line options, we could guarantee correct precedence order.

Note that we use SPARK_CONF_DIR if available to support the sbin/ scripts, which always set this variable from sbin/spark-config.sh. Otherwise, we default to the ../conf/ as usual.

Author: Aaron Davidson <aaron@databricks.com>

Closes #184 from aarondav/idem and squashes the following commits:

e291f91 [Aaron Davidson] Use "private" variables in load-spark-env.sh
8da8360 [Aaron Davidson] Add .sh extension to load-spark-env.sh
93a2471 [Aaron Davidson] SPARK-1286: Make usage of spark-env.sh idempotent
2014-03-24 22:24:21 -07:00
Nick Lanham a18ea00f3a Bundle tachyon: SPARK-1269
This should all work as expected with the current version of the tachyon tarball (0.4.1)

Author: Nick Lanham <nick@afternight.org>

Closes #137 from nicklan/bundle-tachyon and squashes the following commits:

2eee15b [Nick Lanham] Put back in exec, start tachyon first
738ba23 [Nick Lanham] Move tachyon out of sbin
f2f9bc6 [Nick Lanham] More checks for tachyon script
111e8e1 [Nick Lanham] Only try tachyon operations if tachyon script exists
0561574 [Nick Lanham] Copy over web resources so web interface can run
4dc9809 [Nick Lanham] Update to tachyon 0.4.1
0a1a20c [Nick Lanham] Add scripts using tachyon tarball
2014-03-18 22:04:57 -07:00
CodingCat 437b62fcb0 [SPARK-1041] remove dead code in start script, remind user to set that in spark-env.sh
the lines in start-master.sh and start-slave.sh no longer work

in ec2, the host name has changed, e.g.

ubuntu@ip-172-31-36-93:~$ hostname
ip-172-31-36-93

also, the URL to fetch public DNS name also changed, e.g.

ubuntu@ip-172-31-36-93:~$ wget -q -O - http://instance-data.ec2.internal/latest/meta-data/public-hostname
ubuntu@ip-172-31-36-93:~$  (returns nothing)

since we have spark-ec2 project, we don't need to have such ec2-specific lines here, instead, user only need to set in spark-env.sh

Author: CodingCat <zhunansjtu@gmail.com>

Closes #588 from CodingCat/deadcode_in_sbin and squashes the following commits:

e4236e0 [CodingCat] remove dead code in start script, remind user set that in spark-env.sh
2014-02-22 20:21:15 -08:00
sproblvem dea4ba9d80 Update stop-slaves.sh
The most recently version has changed the directory structure, but this script "sbin/stop-all.sh" doesn't change with it accordingly. This mistake makes "sbin/stop-all.sh" can't stop the slave node.
2014-01-07 11:11:59 +08:00
Prashant Sharma 9ae382c363 sbin/compute-classpath* bin/compute-classpath* 2014-01-03 15:12:29 +05:30
Prashant Sharma 74ba97fcf7 sbin/spark-class* -> bin/spark-class* 2014-01-03 15:08:01 +05:30
Prashant Sharma 980afd280a Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts
Conflicts:
	bin/spark-shell
	core/pom.xml
	core/src/main/scala/org/apache/spark/SparkContext.scala
	core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala
	core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala
	core/src/test/scala/org/apache/spark/DriverSuite.scala
	python/run-tests
	sbin/compute-classpath.sh
	sbin/spark-class
	sbin/stop-slaves.sh
2014-01-02 17:55:21 +05:30
Andrew xia 52ccf4f859 deprecate "spark" script and SPAKR_CLASSPATH environment variable 2013-10-12 14:34:14 +08:00
Andrew xia cc37b3151c refactor $FWD variable 2013-09-29 22:00:19 +08:00
shane-huang 1d53792a0a add scripts in bin
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-09-23 16:13:46 +08:00
shane-huang fcfe4f9204 add admin scripts to sbin
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-09-23 12:42:34 +08:00
shane-huang dfbdc9ddb7 added spark-class and spark-executor to sbin
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-09-23 11:28:58 +08:00