2437 commits

Author SHA1 Message Date
Patrick Wendell 04786d0739 small fix 2013-02-11 10:05:49 -08:00
Patrick Wendell c65988bdc1 Fix for MapPartitions 2013-02-11 10:03:37 -08:00
Patrick Wendell 20cf770545 Fix for flatmap 2013-02-11 10:03:37 -08:00
Patrick Wendell 314d87a038 Indentation fix 2013-02-11 10:03:37 -08:00
Patrick Wendell f0b68c623c Initial cut at replacing K, V in Java files 2013-02-11 10:03:37 -08:00
Imran Rashid e9f53ec0ea undo change to onCompleteCallbacks 2013-02-11 09:36:49 -08:00
Matei Zaharia da8afbc77e Some bug and formatting fixes to FT
Conflicts:
	core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:43:38 -08:00
root 1b47fa2752 Detect hard crashes of workers using a heartbeat mechanism.
Also fixes some issues in the rest of the code with detecting workers this way.

Conflicts:
	core/src/main/scala/spark/deploy/master/Master.scala
	core/src/main/scala/spark/deploy/worker/Worker.scala
	core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:28:28 -08:00
Matei Zaharia 05d2e94838 Use a separate memory setting for standalone cluster daemons
Conflicts:
	docs/_config.yml
2013-02-10 21:59:41 -08:00
Matei Zaharia 8c66c49962 Tweak web UI so that people don't get confused about master URL format
Conflicts:
	core/src/main/twirl/spark/deploy/master/index.scala.html
	core/src/main/twirl/spark/deploy/worker/index.scala.html
2013-02-10 21:58:34 -08:00
Matei Zaharia 0b788b760b Update Windows scripts to launch daemons with less RAM and fix a few
other issues

Conflicts:
	run2.cmd
2013-02-10 21:51:49 -08:00
Imran Rashid d9461b15d3 cleanup a bunch of imports 2013-02-10 21:41:40 -08:00
Tathagata Das fd90daf850 Fixed bugs in FileInputDStream and Scheduler that occasionally failed to reprocess old files after recovering from master failure. Completely modified spark.streaming.FailureTest to test multiple master failures using file input stream. 2013-02-10 19:48:42 -08:00
Tathagata Das 16baea62bc Fixed bug in CheckpointRDD to prevent exception when the original RDD had zero splits. 2013-02-10 19:14:49 -08:00
Imran Rashid 383af599bb SparkContext.addSparkListener; "std" listener in StatsReportListener 2013-02-10 14:19:37 -08:00
Imran Rashid b7d9e24394 use TaskMetrics to gather all stats; lots of plumbing to get it all the way back to driver 2013-02-10 14:18:52 -08:00
Josh Rosen 131b56afd0 Update issue tracker link in contributing guide. 2013-02-10 13:28:31 -08:00
Matei Zaharia b1d809913b Merge pull request #460 from markhamstra/404
Fixed a 404 in 'Tuning Spark' -- missing '.html'
2013-02-10 13:01:09 -08:00
Mark Hamstra 4975dcdafc Fixed a 404 -- missing '.html' 2013-02-10 12:55:47 -08:00
Stephen Haberman 680f42e6cd Change defaultPartitioner to use upstream split size.
Previously it used the SparkContext.defaultParallelism, which occasionally
ended up being a very bad guess. Looking at upstream RDDs seems to make
better use of the context.

Also sorted the upstream RDDs by partition size first, as if we have
a hugely-partitioned RDD and tiny-partitioned RDD, it is unlikely
we want the resulting RDD to be tiny-partitioned.
2013-02-10 02:27:03 -06:00
Patrick Wendell 2ed791fd7f Minor fixes 2013-02-09 22:00:38 -08:00
Patrick Wendell 1859c9f93c Changing to use Timer based on code review 2013-02-09 21:55:17 -08:00
Matei Zaharia ccb1ca4a23 Merge pull request #448 from squito/fetch_maxBytesInFlight
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-09 18:15:18 -08:00
Matei Zaharia 76ac0ce6c0 Merge pull request #446 from pwendell/olap-example
SPARK-678: Adding an example with an OLAP roll-up
2013-02-09 18:14:44 -08:00
Matei Zaharia f750daa510 Merge pull request #452 from stephenh/misc
Add RDD.coalesce, clean up some RDDs, other misc.
2013-02-09 18:12:56 -08:00
Stephen Haberman 4619ee0787 Move JavaRDDLike.coalesce into the right places. 2013-02-09 20:05:42 -06:00
Josh Rosen fc5b2e8b83 Merge pull request #457 from markhamstra/commutative
Add commutative requirement for 'reduce' to Python docstring.
2013-02-09 15:54:48 -08:00
Tathagata Das 99a5fc498a Added an initial spark job to ensure worker nodes are initialized. 2013-02-09 15:18:05 -08:00
Stephen Haberman fb7599870f Fix JavaRDDLike.coalesce return type. 2013-02-09 16:10:52 -06:00
Mark Hamstra b7a1fb5c5d Add commutative requirement for 'reduce' to Python docstring. 2013-02-09 12:14:11 -08:00
Matei Zaharia 51db4c1f30 Merge pull request #453 from markhamstra/commutative
Change docs on 'reduce' since the merging of local reduces no longer pre...
2013-02-09 10:36:30 -08:00
Stephen Haberman 2a18cd826c Add back return types. 2013-02-09 10:12:04 -06:00
Stephen Haberman da52b16b38 Remove RDD.coalesce default arguments. 2013-02-09 10:11:54 -06:00
Imran Rashid 04e828f7c1 general fixes to Distribution, plus some tests 2013-02-08 19:07:36 -08:00
Mark Hamstra b8863a79d3 Merge branch 'master' of https://github.com/mesos/spark into commutative
Conflicts:
	core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Prashant Sharma 291dd47c7f Taking FeederActor out as separate program 2013-02-08 14:34:07 +05:30
Matei Zaharia b53174a6f3 Merge pull request #454 from MLnick/ipython
SPARK-685 Adding IPYTHON environment variable support for launching pyspark using ...
2013-02-07 18:29:04 -08:00
Tathagata Das bcee3cb2db Merge pull request #455 from tdas/streaming
Merging latest master branch changes to the streaming branch
2013-02-07 15:05:20 -08:00
Tathagata Das 4cc223b478 Merge branch 'mesos-master' into streaming 2013-02-07 13:59:31 -08:00
Tathagata Das d55e3aa467 Updated JavaStreamingContext with updated kafkaStream API. 2013-02-07 13:59:18 -08:00
Tathagata Das c6b2f765d3 Merge branch 'mesos-streaming' into streaming 2013-02-07 13:13:53 -08:00
Tathagata Das 12300758cc Merge pull request #372 from Reinvigorate/sm-kafka
Removing offset management code that is non-existent in kafka 0.7.0+
2013-02-07 12:41:07 -08:00
Tathagata Das 915d9931fe Merge pull request #373 from Reinvigorate/sm-updateStateByKey
StateDStream changes to give updateStateByKey consistent behavior
2013-02-07 11:59:19 -08:00
Nick Pentreath 21d3946d17 Adding IPYTHON environment variable support for launching pyspark using ipython shell 2013-02-07 16:54:31 +02:00
Mark Hamstra 934a53c8b6 Change docs on 'reduce' since the merging of local reduces no longer preserves
ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Patrick Wendell dab81a8511 Fixing to match Spark styleguide 2013-02-05 20:57:04 -08:00
Stephen Haberman a9c8d53cfa Clean up RDDs, mainly to use getSplits.
Also made sure clearDependencies() was calling super, to ensure
the getSplits/getDependencies vars in the RDD base class get
cleaned up.
2013-02-05 22:16:59 -06:00
Stephen Haberman f4d43cb43e Remove unneeded zipWithIndex.
Also rename r->rdd and remove unneeded extra type info.
2013-02-05 21:26:45 -06:00
Stephen Haberman f2bc748013 Add RDD.coalesce. 2013-02-05 21:23:36 -06:00
Stephen Haberman 67df7f2fa2 Add private, minor formatting. 2013-02-05 21:08:21 -06:00