Commit graph

2130 commits

Author SHA1 Message Date
Reynold Xin 802bfb870d - Created AsyncRDDActions.
- Make FutureJob a Scala Future instead of Java Future.
2013-10-03 01:22:28 -07:00
Reynold Xin e8e917f209 Merge branch 'master' into kill
Conflicts:
	core/src/main/scala/org/apache/spark/TaskEndReason.scala
	core/src/main/scala/org/apache/spark/executor/Executor.scala
	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
	core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
2013-10-02 23:01:34 -07:00
Reynold Xin 1c48ba0d9f Merge remote-tracking branch 'origin' into kill
Conflicts:
	core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala
	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
	core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
2013-10-02 16:40:44 -07:00
Kay Ousterhout 0dcad2edcb Added additional unit test for repeated task failures 2013-09-30 23:26:15 -07:00
Kay Ousterhout dea4677c88 Fixed compilation errors and broken test. 2013-09-30 22:07:01 -07:00
Kay Ousterhout 8deda427bc Merge remote-tracking branch 'upstream/master' into results_through-bm
Conflicts:
	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterScheduler.scala
	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
	core/src/main/scala/org/apache/spark/scheduler/local/LocalTaskSetManager.scala
2013-09-30 10:16:58 -07:00
Kay Ousterhout 58b764b7c6 Addressed Matei's code review comments 2013-09-30 10:11:59 -07:00
Reynold Xin 70a0b993d4 Merge pull request #14 from kayousterhout/untangle_scheduler
Improved organization of scheduling packages.

This commit does not change any code -- only file organization.
Please let me know if there was some masterminded strategy behind
the existing organization that I failed to understand!

There are two components of this change:
(1) Moving files out of the cluster package, and down
a level to the scheduling package. These files are all used by
the local scheduler in addition to the cluster scheduler(s), so
should not be in the cluster package. As a result of this change,
none of the files in the local package reference files in the
cluster package.

(2) Moving the mesos package to within the cluster package.
The mesos scheduling code is for a cluster, and represents a
specific case of cluster scheduling (the Mesos-related classes
often subclass cluster scheduling classes). Thus, the most logical
place for it seems to be within the cluster package.

The one thing about the scheduling code that seems a little funny to me
is the naming of the SchedulerBackends.  The StandaloneSchedulerBackend
is not just for Standalone mode, but instead is used by Mesos coarse grained
mode and Yarn, and the backend that *is* just for Standalone mode is instead
called SparkDeploySchedulerBackend. I didn't change this because I wasn't sure
if there was a reason for this naming that I'm just not aware of.
2013-09-26 14:11:54 -07:00
Reynold Xin c514cd1587 Merge pull request #930 from holdenk/master
Add mapPartitionsWithIndex
2013-09-26 13:48:20 -07:00
Reynold Xin 560ee5c9bb Merge pull request #7 from wannabeast/memorystore-fixes
some minor fixes to MemoryStore

This is a repeat of #5, moved to its own branch in my repo.

This makes all updates to   on ; it skips on synchronizing the reads where it can get away with it.
2013-09-26 11:27:34 -07:00
Patrick Wendell 6566a19b38 Merge pull request #9 from rxin/limit
Smarter take/limit implementation.
2013-09-26 08:01:04 -07:00
Kay Ousterhout d85fe41b2b Improved organization of scheduling packages.
This commit does not change any code -- only file organization.

There are two components of this change:
(1) Moving files out of the cluster package, and down
a level to the scheduling package. These files are all used by
the local scheduler in addition to the cluster scheduler(s), so
should not be in the cluster package. As a result of this change,
none of the files in the local package reference files in the
cluster package.

(2) Moving the mesos package to within the cluster package.
The mesos scheduling code is for a cluster, and represents a
specific case of cluster scheduling (the Mesos-related classes
often subclass cluster scheduling classes). Thus, the most logical
place for it is within the cluster package.
2013-09-25 12:45:46 -07:00
Patrick Wendell 6079721fa1 Update build version in master 2013-09-24 11:41:51 -07:00
Holden Karau 0cef683553 Fix formatting :) 2013-09-23 19:39:42 -07:00
Reynold Xin ff540a015b Merge branch 'master' of github.com:markhamstra/incubator-spark 2013-09-23 11:55:02 -07:00
Kay Ousterhout c75eb14fe5 Send Task results through the block manager when larger than Akka frame size.
This change requires adding an extra failure mode: tasks can complete
successfully, but the result gets lost or flushed from the block manager
before it's been fetched.
2013-09-22 21:20:48 -07:00
Holden Karau 7fe0b0ff56 Switch indent from 2 to 4 spaces 2013-09-22 19:44:51 -07:00
jerryshao 77e9da1f34 Change Exception to NoSuchElementException and minor style fix 2013-09-22 16:50:08 +08:00
jerryshao 85024acd2e Remove infix style and others 2013-09-22 14:20:55 +08:00
jerryshao 5850f599dd Refactor FairSchedulableBuilder:
1. Configuration can be read from classpath if not set explicitly.
2. Add missing close handler.
2013-09-22 14:20:55 +08:00
Reynold Xin a2ea069a5f Merge pull request #937 from jerryshao/localProperties-fix
Fix PR926 local properties issues in Spark Streaming like scenarios
2013-09-21 23:04:42 -07:00
jerryshao aa0c29f747 Add barrier for local properties unit test and fix some styles 2013-09-22 09:53:11 +08:00
Reynold Xin 42571d30d0 Smarter take/limit implementation. 2013-09-20 17:09:53 -07:00
Reynold Xin 1d87616b61 Made output of CoGroup and aggregations interruptible. 2013-09-19 23:31:36 -07:00
Mike 9524b943a4 Synchronize on "entries" the remaining update to "currentMemory".
Make "currentMemory" @volatile, so that its reads in ensureFreeSpace() are atomic and up-to-date--i.e., currentMemory can't increase while putLock is held (though it could decrease, which would only help ensureFreeSpace()).
2013-09-19 23:31:35 -07:00
Ankur Dave 026dba6aba After unit tests, clear port properties unconditionally
In MapOutputTrackerSuite, the "remote fetch" test sets spark.driver.port
and spark.hostPort, assuming that they will be cleared by
LocalSparkContext. However, the test never sets sc, so it remains null,
causing LocalSparkContext to skip clearing these properties. Subsequent
tests therefore fail with java.net.BindException: "Address already in
use".

This commit makes LocalSparkContext clear the properties even if sc is
null.
2013-09-19 22:05:23 -07:00
Reynold Xin c5e40954eb Wrap around cached data to InterruptibleIterator. 2013-09-19 18:44:38 -07:00
Reynold Xin c68e72be59 Added comment to InterruptibleIterator. 2013-09-19 18:40:06 -07:00
Reynold Xin 70953810b4 Added task killing iterator to RDDs that take inputs. 2013-09-19 18:33:16 -07:00
Reynold Xin f19984dafe More logging changes (task killing for local cluster doesn't work yet). 2013-09-19 18:14:51 -07:00
Reynold Xin 85a0dffe0f Made task killing work for standalone cluster schedulers. 2013-09-19 16:41:29 -07:00
Reynold Xin 9f8190c17d Fixed a bug for zero partition in JobWaiter. 2013-09-18 22:42:35 -07:00
Reynold Xin 9332246bd0 Added a hack to kill all active jobs in SparkContext. 2013-09-18 04:38:24 -07:00
Reynold Xin bf515688e7 Allow SparkContext.submitJob to submit a job for only a subset of the partitions. 2013-09-18 04:16:18 -07:00
jerryshao ffa5f8e11d Fix issue when local properties pass from parent to child thread 2013-09-18 17:33:24 +08:00
Reynold Xin 37d8f37a8e Added a submitJob interface that returns a Future of the result. 2013-09-17 21:13:59 -07:00
Reynold Xin 1cb42e6b2d Properly handle job failure when the job gets killed. 2013-09-16 22:10:45 -07:00
Reynold Xin cbc48be13b Initial commit for job killing. 2013-09-16 18:54:06 -07:00
Holden Karau bfcddf4700 Make mapPartitionsWithIndex work with JavaRDDs 2013-09-14 15:53:42 -07:00
Holden Karau 74f710f6cd Start of working on SPARK-615 2013-09-11 22:35:58 -07:00
Mike d34672f668 Set currentMemory to 0 in clear().
Remove unnecessary entries.get() call.
2013-09-11 18:01:19 -07:00
Kay Ousterhout 93c4253275 Changed localProperties to use ThreadLocal (not DynamicVariable).
The fact that DynamicVariable uses an InheritableThreadLocal
can cause problems where the properties end up being shared
across threads in certain circumstances.
2013-09-11 13:01:39 -07:00
Patrick Wendell 91a59e6b10 Merge pull request #919 from mateiz/jets3t
Add explicit jets3t dependency, which is excluded in hadoop-client
2013-09-11 10:21:48 -07:00
Patrick Wendell b9128d34bf Merge pull request #922 from pwendell/port-change
Change default port number from 3030 to 4030.
2013-09-11 10:03:06 -07:00
Patrick Wendell bddf135670 Change port from 3030 to 4040 2013-09-11 10:01:38 -07:00
David McCauley 5dd875c5b5 SPARK-894 - Not all WebUI fields delivered VIA JSON 2013-09-11 10:46:37 +01:00
Mike 293c758cc0 Remove MemoryStore$Entry.dropPending, unused as of 42e0a68082. 2013-09-10 00:24:35 -07:00
Matei Zaharia f117dc6d0d Add explicit jets3t dependency, which is excluded in hadoop-client 2013-09-10 06:39:25 +00:00
Matei Zaharia c81377b9ed Merge pull request #915 from ooyala/master
Get rid of / improve ugly NPE when Utils.deleteRecursively() fails
2013-09-09 20:16:19 -07:00
Evan Chan fdb8b0eec3 Style fix: put body of if within curly braces 2013-09-09 14:29:32 -07:00