Commit graph

1017 commits

Author SHA1 Message Date
Imran Rashid d0bfac3eed taskInfo tracks if a task is run on a preferred host 2013-02-21 15:19:34 -08:00
Imran Rashid 6f62a57858 add runtime breakdowns 2013-02-21 15:19:34 -08:00
Imran Rashid 176cb20703 add task result size; better formatting for time interval distributions; cleanup distribution formatting 2013-02-21 15:19:33 -08:00
Imran Rashid f2fcabf2ea add timing around parts of executor & track result size 2013-02-21 15:19:33 -08:00
Imran Rashid ff127cfcd3 Merge branch 'master' into stageInfo
Conflicts:
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/storage/BlockManager.scala
2013-02-21 15:16:21 -08:00
Imran Rashid baab23abdf TaskContext does not hold a reference to Task; instead, it has a shared instance of TaskMetrics with Task 2013-02-21 14:13:01 -08:00
Matei Zaharia 05bc02e80b Merge pull request #482 from woggling/shutdown-exceptions
Don't call System.exit over uncaught exceptions from shutdown hooks
2013-02-19 20:56:15 -08:00
Charles Reiss 092c631fa8 Pull detection of being in a shutdown hook into utility function. 2013-02-19 17:49:55 -08:00
Reynold Xin 130f704baf Added a method to create PartitionPruningRDD. 2013-02-19 16:03:52 -08:00
Charles Reiss d0588bd6d7 Catch/log errors deleting temp dirs 2013-02-19 13:04:06 -08:00
Charles Reiss 687581c3ec Paranoid uncaught exception handling for exceptions during shutdown 2013-02-19 13:03:02 -08:00
Matei Zaharia 7151e1e4c8 Rename "jobs" to "applications" in the standalone cluster 2013-02-17 23:23:08 -08:00
Matei Zaharia 06e5e6627f Renamed "splits" to "partitions" 2013-02-17 22:13:26 -08:00
Matei Zaharia 340cc54e47 Merge pull request #471 from stephenh/parallelrdd
Move ParallelCollection into spark.rdd package.
2013-02-16 16:39:15 -08:00
Matei Zaharia 3260b6120e Merge pull request #470 from stephenh/morek
Make CoGroupedRDDs explicitly have the same key type.
2013-02-16 16:38:38 -08:00
Stephen Haberman e7713adb99 Move ParallelCollection into spark.rdd package. 2013-02-16 13:20:48 -06:00
Stephen Haberman ae2234687d Make CoGroupedRDDs explicitly have the same key type. 2013-02-16 13:10:31 -06:00
Stephen Haberman 4328873294 Add assertion about dependencies. 2013-02-16 01:16:40 -06:00
Stephen Haberman c34b8ad2c5 Avoid a shuffle if combineByKey is passed the same partitioner. 2013-02-16 00:54:03 -06:00
Imran Rashid bffee929ab Merge branch 'master' into stageInfo
Conflicts:
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/storage/BlockManager.scala
2013-02-15 10:35:04 -08:00
Imran Rashid 893bad9089 use appid instead of frameworkid; simplify stupid condition 2013-02-13 20:30:21 -08:00
Imran Rashid 8f18e7e863 include jobid in Executor commandline args 2013-02-13 13:05:13 -08:00
Matei Zaharia bfeed4725d Merge pull request #465 from pwendell/java-sort-fix
SPARK-696: sortByKey should use 'ascending' parameter
2013-02-11 18:23:12 -08:00
Patrick Wendell 21df6ffc13 SPARK-696: sortByKey should use 'ascending' parameter 2013-02-11 17:43:26 -08:00
Matei Zaharia ea08537143 Fixed an exponential recursion that could happen with doCheckpoint due
to lack of memoization
2013-02-11 13:23:50 -08:00
Imran Rashid e9f53ec0ea undo change to onCompleteCallbacks 2013-02-11 09:36:49 -08:00
Matei Zaharia da8afbc77e Some bug and formatting fixes to FT
Conflicts:
	core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:43:38 -08:00
root 1b47fa2752 Detect hard crashes of workers using a heartbeat mechanism.
Also fixes some issues in the rest of the code with detecting workers this way.

Conflicts:
	core/src/main/scala/spark/deploy/master/Master.scala
	core/src/main/scala/spark/deploy/worker/Worker.scala
	core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:28:28 -08:00
Matei Zaharia 8c66c49962 Tweak web UI so that people don't get confused about master URL format
Conflicts:
	core/src/main/twirl/spark/deploy/master/index.scala.html
	core/src/main/twirl/spark/deploy/worker/index.scala.html
2013-02-10 21:58:34 -08:00
Imran Rashid d9461b15d3 cleanup a bunch of imports 2013-02-10 21:41:40 -08:00
Imran Rashid 383af599bb SparkContext.addSparkListener; "std" listener in StatsReportListener 2013-02-10 14:19:37 -08:00
Imran Rashid b7d9e24394 use TaskMetrics to gather all stats; lots of plumbing to get it all the way back to driver 2013-02-10 14:18:52 -08:00
Matei Zaharia ccb1ca4a23 Merge pull request #448 from squito/fetch_maxBytesInFlight
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-09 18:15:18 -08:00
Matei Zaharia f750daa510 Merge pull request #452 from stephenh/misc
Add RDD.coalesce, clean up some RDDs, other misc.
2013-02-09 18:12:56 -08:00
Stephen Haberman 4619ee0787 Move JavaRDDLike.coalesce into the right places. 2013-02-09 20:05:42 -06:00
Stephen Haberman fb7599870f Fix JavaRDDLike.coalesce return type. 2013-02-09 16:10:52 -06:00
Stephen Haberman 2a18cd826c Add back return types. 2013-02-09 10:12:04 -06:00
Stephen Haberman da52b16b38 Remove RDD.coalesce default arguments. 2013-02-09 10:11:54 -06:00
Imran Rashid 04e828f7c1 general fixes to Distribution, plus some tests 2013-02-08 19:07:36 -08:00
Mark Hamstra b8863a79d3 Merge branch 'master' of https://github.com/mesos/spark into commutative
Conflicts:
	core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Mark Hamstra 934a53c8b6 Change docs on 'reduce' since the merging of local reduces no longer preserves
ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Stephen Haberman a9c8d53cfa Clean up RDDs, mainly to use getSplits.
Also made sure clearDependencies() was calling super, to ensure
the getSplits/getDependencies vars in the RDD base class get
cleaned up.
2013-02-05 22:16:59 -06:00
Stephen Haberman f4d43cb43e Remove unneeded zipWithIndex.
Also rename r->rdd and remove unneeded extra type info.
2013-02-05 21:26:45 -06:00
Stephen Haberman f2bc748013 Add RDD.coalesce. 2013-02-05 21:23:36 -06:00
Stephen Haberman 67df7f2fa2 Add private, minor formatting. 2013-02-05 21:08:21 -06:00
Imran Rashid 379564c7e0 setup plumbing to get task metrics; lots of unfinished parts, but basic flow in place 2013-02-05 18:30:21 -08:00
Matei Zaharia 9cfa068379 Merge pull request #450 from stephenh/inlinemergepair
Inline mergePair to look more like the narrow dep branch.
2013-02-05 18:28:44 -08:00
Stephen Haberman 870b2aaf5d Merge branch 'master' into fixdeathpactexception
Conflicts:
	core/src/main/scala/spark/deploy/worker/Worker.scala
2013-02-05 20:27:09 -06:00
Stephen Haberman 0e19093fd8 Handle Terminated to avoid endless DeathPactExceptions.
Credit to Roland Kuhn, Akka's tech lead, for pointing out this
very obvious fix, but StandaloneExecutorBackend.preStart's
catch block would never (ever) get hit, because all of the
operations in preStart are async.

So, the System.exit in the catch block was skipped, and instead
Akka was sending Terminated messages which, since we didn't
handle them, turned into DeathPactException, which started
a postRestart/preStart infinite loop.
2013-02-05 18:58:00 -06:00
Stephen Haberman 8bd0e888f3 Inline mergePair to look more like the narrow dep branch.
No functionality changes, I think this is just more consistent
given mergePair isn't called multiple times/recursive.

Also added a comment to explain the usual case of having two parent RDDs.
2013-02-05 17:50:25 -06:00