2164 commits

Author SHA1 Message Date
Matei Zaharia 85019d76a4 Merge pull request #427 from woggling/dag-sched-tests
Tests for DAGScheduler
2013-02-02 19:09:59 -08:00
Stephen Haberman 7aba123f0c Further simplify checking for Nil. 2013-02-02 13:53:28 -06:00
Charles Reiss 6107957962 Merge remote-tracking branch 'base/master' into dag-sched-tests
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
2013-02-02 00:33:30 -08:00
Stephen Haberman cae8a6795c Fix dangling old variable names. 2013-02-02 02:15:39 -06:00
Stephen Haberman 696eec32c9 Move executorMemory up into SchedulerBackend. 2013-02-02 02:03:26 -06:00
Stephen Haberman 103c375ba0 Merge branch 'master' into sparkmem 2013-02-02 01:57:18 -06:00
Stephen Haberman 28e0cb9f31 Fix createActorSystem not actually using the systemName parameter.
This meant all system names were "spark", which worked, but didn't
lead to the most intuitive log output.

This fixes createActorSystem to use the passed system name, and
refactors Master/Worker to encapsulate their system/actor names
instead of having the clients guess at them.

Note that the driver system name, "spark", is left as is, and is
still repeated a few times, but that seems like a separate issue.
2013-02-02 01:11:37 -06:00
Charles Reiss 1fd5ee323d Code review changes: add sc.stop; style of multiline comments; parens on procedure calls. 2013-02-01 22:33:38 -08:00
Matei Zaharia ae26911ec0 Add back test for distinct without parens 2013-02-01 21:07:24 -08:00
Matei Zaharia 7ae4b6a23d Merge pull request #441 from stephenh/lessnoisyakka
Reduce the amount of duplicate logging Akka does to stdout.
2013-02-01 21:03:37 -08:00
Stephen Haberman 12c1eb4756 Reduce the amount of duplicate logging Akka does to stdout.
Given we have Akka logging go through SLF4j to log4j, we don't need
all the extra noise of Akka's stdout logger that is supposedly only
used during Akka init time but seems to continue logging lots of
noisy network events that we either don't care about or are in the
log4j logs anyway.

See:

http://doc.akka.io/docs/akka/2.0/general/configuration.html

    # Log level for the very basic logger activated during AkkaApplication startup
    # Options: ERROR, WARNING, INFO, DEBUG
    # stdout-loglevel = "WARNING"
2013-02-01 21:21:44 -06:00
Matei Zaharia 8b3041c723 Reduced the memory usage of reduce and similar operations
These operations used to wait for all the results to be available in an
array on the driver program before merging them. They now merge values
incrementally as they arrive.
2013-02-01 15:38:42 -08:00
Matei Zaharia 4529876db0 Merge branch 'master' of github.com:mesos/spark 2013-02-01 14:07:38 -08:00
Matei Zaharia 9970926ede formatting 2013-02-01 14:07:34 -08:00
Matei Zaharia 79c24abe4c Merge pull request #432 from stephenh/moreprivacy
Add more private declarations.
2013-02-01 14:06:55 -08:00
Matei Zaharia de340ddf0b Merge pull request #437 from stephenh/cancelmetacleaner
Stop BlockManagers metadataCleaner.
2013-02-01 12:59:25 -08:00
Matei Zaharia 0455650713 Merge pull request #439 from JoshRosen/spark-580
Use spark.local.dir for PySpark temp files (SPARK-580).
2013-02-01 12:07:42 -08:00
Josh Rosen e211f405bc Use spark.local.dir for PySpark temp files (SPARK-580). 2013-02-01 11:50:27 -08:00
Matei Zaharia b6a6092177 Merge pull request #438 from JoshRosen/spark-674
Do not launch JavaGateways on workers (SPARK-674).
2013-02-01 11:29:47 -08:00
Josh Rosen 9cc6ff9c4e Do not launch JavaGateways on workers (SPARK-674).
The problem was that the gateway was being initialized whenever the
pyspark.context module was loaded.  The fix uses lazy initialization
that occurs only when SparkContext instances are actually constructed.

I also made the gateway and jvm variables private.

This change results in ~3-4x performance improvement when running the
PySpark unit tests.
2013-02-01 11:13:10 -08:00
Imran Rashid c6190067ae remove unneeded (and unused) filter on block info 2013-02-01 09:55:25 -08:00
Stephen Haberman 59c57e48df Stop BlockManagers metadataCleaner. 2013-02-01 10:34:02 -06:00
Matei Zaharia 571af31304 Merge pull request #433 from rxin/master
Changed PartitionPruningRDD's split to make sure it returns the correct split index.
2013-02-01 00:32:41 -08:00
Matei Zaharia 5ce5efec10 Merge pull request #435 from JoshRosen/pyspark_stdout_fix
Fix stdout redirection in PySpark.
2013-02-01 00:32:07 -08:00
Josh Rosen 57b64d0d19 Fix stdout redirection in PySpark. 2013-02-01 00:25:19 -08:00
Imran Rashid 8a0a5ed533 track total partitions, in addition to cached partitions; use scala string formatting 2013-02-01 00:23:38 -08:00
Imran Rashid f127f2ae76 fixup merge (master -> driver renaming) 2013-02-01 00:20:49 -08:00
Reynold Xin f9af9cee6f Moved PruneDependency into PartitionPruningRDD.scala. 2013-02-01 00:02:46 -08:00
Matei Zaharia 7e2e046e37 Merge pull request #434 from pwendell/python-exceptions
SPARK-673: Capture and re-throw Python exceptions
2013-01-31 21:58:26 -08:00
Patrick Wendell 39ab83e957 Small fix from last commit 2013-01-31 21:52:52 -08:00
Patrick Wendell c33f0ef41a Some style cleanup 2013-01-31 21:50:02 -08:00
Matei Zaharia 95e14fbc38 Merge pull request #431 from mbautin/revert_default_profile
Remove activation of profiles by default
2013-01-31 21:34:59 -08:00
Patrick Wendell 3446d5c8d6 SPARK-673: Capture and re-throw Python exceptions
This patch alters the Python <-> executor protocol to pass on
exception data when they occur in user Python code.
2013-01-31 18:06:11 -08:00
Reynold Xin 6289d9654e Removed the TODO comment from PartitionPruningRDD. 2013-01-31 17:49:36 -08:00
Reynold Xin 5b0fc265c2 Changed PartitionPruningRDD's split to make sure it returns the correct
split index.
2013-01-31 17:48:39 -08:00
Stephen Haberman 782187c210 Once we find a split with no block, we don't have to look for more. 2013-01-31 18:27:25 -06:00
Stephen Haberman 418e36caa8 Add more private declarations. 2013-01-31 17:18:33 -06:00
Mikhail Bautin fe3eceab57 Remove activation of profiles by default
See the discussion at https://github.com/mesos/spark/pull/355 for why
default profile activation is a problem.
2013-01-31 13:30:41 -08:00
Imran Rashid 02a6761589 Merge branch 'master' into blockmanager_info
Conflicts:
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
2013-01-30 18:52:35 -08:00
Imran Rashid c1df24d085 rename Slaves --> Executor 2013-01-30 18:51:14 -08:00
Matei Zaharia 55327a283e Merge pull request #430 from pwendell/pyspark-guide
Minor improvements to PySpark docs
2013-01-30 15:35:29 -08:00
Patrick Wendell 3f945e3b83 Make module help available in python shell.
Also, adds a line in doc explaining how to use.
2013-01-30 15:04:06 -08:00
Patrick Wendell 58a7d320d7 Include packaging and launching pyspark in guide.
It's nicer if all the commands you need are made explicit.
2013-01-30 15:04:02 -08:00
Matei Zaharia d12330bd2c Merge pull request #426 from woggling/conn-manager-ips
Remember ConnectionManagerId used to initiate SendingConnections
2013-01-30 15:02:53 -08:00
Matei Zaharia 612a9fee71 Merge pull request #428 from woggling/mesos-exec-id
Make ExecutorIDs include SlaveIDs when running Mesos
2013-01-30 15:01:46 -08:00
Matei Zaharia dfb721b970 Merge pull request #429 from stephenh/includemessage
Include message and exitStatus if available.
2013-01-30 15:01:24 -08:00
Stephen Haberman 871476d506 Include message and exitStatus if available. 2013-01-30 16:56:46 -06:00
Charles Reiss 252845d304 Remove remnants of attempt to use slaveId-executorId in MesosExecutorBackend 2013-01-30 10:38:06 -08:00
Charles Reiss f7de6978c1 Use Mesos ExecutorIDs to hold SlaveIDs. Then we can safely use
the Mesos ExecutorID as a Spark ExecutorID.
2013-01-30 09:38:57 -08:00
Charles Reiss 7f51458774 Comment at top of DAGSchedulerSuite 2013-01-30 09:34:53 -08:00