2235 commits

Author SHA1 Message Date
Matei Zaharia 8c66c49962 Tweak web UI so that people don't get confused about master URL format
Conflicts:
	core/src/main/twirl/spark/deploy/master/index.scala.html
	core/src/main/twirl/spark/deploy/worker/index.scala.html
2013-02-10 21:58:34 -08:00
Matei Zaharia 0b788b760b Update Windows scripts to launch daemons with less RAM and fix a few
other issues

Conflicts:
	run2.cmd
2013-02-10 21:51:49 -08:00
Imran Rashid d9461b15d3 cleanup a bunch of imports 2013-02-10 21:41:40 -08:00
Imran Rashid 383af599bb SparkContext.addSparkListener; "std" listener in StatsReportListener 2013-02-10 14:19:37 -08:00
Imran Rashid b7d9e24394 use TaskMetrics to gather all stats; lots of plumbing to get it all the way back to driver 2013-02-10 14:18:52 -08:00
Josh Rosen 131b56afd0 Update issue tracker link in contributing guide. 2013-02-10 13:28:31 -08:00
Matei Zaharia b1d809913b Merge pull request #460 from markhamstra/404
Fixed a 404 in 'Tuning Spark' -- missing '.html'
2013-02-10 13:01:09 -08:00
Mark Hamstra 4975dcdafc Fixed a 404 -- missing '.html' 2013-02-10 12:55:47 -08:00
Matei Zaharia ccb1ca4a23 Merge pull request #448 from squito/fetch_maxBytesInFlight
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-09 18:15:18 -08:00
Matei Zaharia 76ac0ce6c0 Merge pull request #446 from pwendell/olap-example
SPARK-678: Adding an example with an OLAP roll-up
2013-02-09 18:14:44 -08:00
Matei Zaharia f750daa510 Merge pull request #452 from stephenh/misc
Add RDD.coalesce, clean up some RDDs, other misc.
2013-02-09 18:12:56 -08:00
Stephen Haberman 4619ee0787 Move JavaRDDLike.coalesce into the right places. 2013-02-09 20:05:42 -06:00
Josh Rosen fc5b2e8b83 Merge pull request #457 from markhamstra/commutative
Add commutative requirement for 'reduce' to Python docstring.
2013-02-09 15:54:48 -08:00
Stephen Haberman fb7599870f Fix JavaRDDLike.coalesce return type. 2013-02-09 16:10:52 -06:00
Mark Hamstra b7a1fb5c5d Add commutative requirement for 'reduce' to Python docstring. 2013-02-09 12:14:11 -08:00
Matei Zaharia 51db4c1f30 Merge pull request #453 from markhamstra/commutative
Change docs on 'reduce' since the merging of local reduces no longer pre...
2013-02-09 10:36:30 -08:00
Stephen Haberman 2a18cd826c Add back return types. 2013-02-09 10:12:04 -06:00
Stephen Haberman da52b16b38 Remove RDD.coalesce default arguments. 2013-02-09 10:11:54 -06:00
Imran Rashid 04e828f7c1 general fixes to Distribution, plus some tests 2013-02-08 19:07:36 -08:00
Mark Hamstra b8863a79d3 Merge branch 'master' of https://github.com/mesos/spark into commutative
Conflicts:
	core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Matei Zaharia b53174a6f3 Merge pull request #454 from MLnick/ipython
SPARK-685 Adding IPYTHON environment variable support for launching pyspark using ...
2013-02-07 18:29:04 -08:00
Nick Pentreath 21d3946d17 Adding IPYTHON environment variable support for launching pyspark using ipython shell 2013-02-07 16:54:31 +02:00
Mark Hamstra 934a53c8b6 Change docs on 'reduce' since the merging of local reduces no longer preserves
ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Patrick Wendell dab81a8511 Fixing to match Spark styleguide 2013-02-05 20:57:04 -08:00
Stephen Haberman a9c8d53cfa Clean up RDDs, mainly to use getSplits.
Also made sure clearDependencies() was calling super, to ensure
the getSplits/getDependencies vars in the RDD base class get
cleaned up.
2013-02-05 22:16:59 -06:00
Stephen Haberman f4d43cb43e Remove unneeded zipWithIndex.
Also rename r->rdd and remove unneeded extra type info.
2013-02-05 21:26:45 -06:00
Stephen Haberman f2bc748013 Add RDD.coalesce. 2013-02-05 21:23:36 -06:00
Stephen Haberman 67df7f2fa2 Add private, minor formatting. 2013-02-05 21:08:21 -06:00
Imran Rashid 379564c7e0 setup plumbing to get task metrics; lots of unfinished parts, but basic flow in place 2013-02-05 18:30:21 -08:00
Matei Zaharia 9cfa068379 Merge pull request #450 from stephenh/inlinemergepair
Inline mergePair to look more like the narrow dep branch.
2013-02-05 18:28:44 -08:00
Matei Zaharia 03eefbb200 Merge pull request #451 from stephenh/fixdeathpactexception
Handle Terminated to avoid endless DeathPactExceptions.
2013-02-05 18:27:54 -08:00
Stephen Haberman 870b2aaf5d Merge branch 'master' into fixdeathpactexception
Conflicts:
	core/src/main/scala/spark/deploy/worker/Worker.scala
2013-02-05 20:27:09 -06:00
Matei Zaharia a4611d66f0 Merge pull request #449 from stephenh/longerdriversuite
Increase DriverSuite timeout.
2013-02-05 17:58:22 -08:00
Stephen Haberman 0e19093fd8 Handle Terminated to avoid endless DeathPactExceptions.
Credit to Roland Kuhn, Akka's tech lead, for pointing out this
various obvious fix, but StandaloneExecutorBackend.preStart's
catch block would never (ever) get hit, because all of the
operations in preStart are async.

So, the System.exit in the catch block was skipped, and instead
Akka was sending Terminated messages which, since we didn't
handle, it turned into DeathPactException, which started
a postRestart/preStart infinite loop.
2013-02-05 18:58:00 -06:00
Stephen Haberman 1ba3393ceb Increase DriverSuite timeout. 2013-02-05 17:56:50 -06:00
Stephen Haberman 8bd0e888f3 Inline mergePair to look more like the narrow dep branch.
No functionality changes, I think this is just more consistent
given mergePair isn't called multiple times/recursive.

Also added a comment to explain the usual case of having two parent RDDs.
2013-02-05 17:50:25 -06:00
Imran Rashid 1704b124d8 add as many fetch requests as we can, subject to maxBytesInFlight 2013-02-05 14:33:52 -08:00
Imran Rashid cfab1a3528 add as many fetch requests as we can, subject to maxBytesInFlight 2013-02-05 14:31:46 -08:00
Imran Rashid 696e4b2167 track remoteFetchTime 2013-02-05 14:29:16 -08:00
Imran Rashid b29f9cc978 BlockManager.getMultiple returns a custom iterator, to enable tracking of shuffle performance 2013-02-05 14:00:44 -08:00
Matei Zaharia 2d9eca9fbb Merge pull request #447 from pwendell/streaming-constructor
Streaming constructor which takes JavaSparkContext
2013-02-05 11:45:44 -08:00
Patrick Wendell 7eea64aa4c Streaming constructor which takes JavaSparkContext
It's sometimes helpful to directly pass a JavaSparkContext,
and take advantage of the various constructors available for that.
2013-02-05 11:43:16 -08:00
Imran Rashid e319ac74c1 cogrouped RDD stores the amount of time taken to read shuffle data in each task 2013-02-05 10:18:16 -08:00
Imran Rashid 295b534398 task context keeps a handle on Task -- giant hack, temporary for tracking shuffle times & amount 2013-02-05 10:18:16 -08:00
Imran Rashid 9df7e2ae55 Shuffle Fetchers use a timed iterator 2013-02-05 10:18:16 -08:00
Imran Rashid 1ad77c4766 add TimedIterator 2013-02-05 10:18:15 -08:00
Imran Rashid 843084d69d track total bytes written by ShuffleMapTasks 2013-02-05 10:18:15 -08:00
Imran Rashid b430d2359d Merge branch 'master' into stageInfo
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/scheduler/local/LocalScheduler.scala
2013-02-04 21:40:44 -08:00
Patrick Wendell cc37601ecb Adding an example with an OLAP roll-up 2013-02-04 14:18:11 -08:00
Matei Zaharia f6ec547ea7 Small fix to test for distinct 2013-02-04 13:14:54 -08:00