2196 commits

Author SHA1 Message Date
Matei Zaharia 455d015076 Clean up EC2 script options a bit 2013-02-17 16:53:12 -08:00
Matei Zaharia 08e444df0e Change EC2 script to use 0.6 AMIs by default, for now 2013-02-17 14:01:48 -08:00
Matei Zaharia 2a907dceb3 Merge pull request #421 from shivaram/spark-ec2-change
Switch spark_ec2.py to use the new spark-ec2 scripts.
2013-02-17 13:48:43 -08:00
Matei Zaharia 340cc54e47 Merge pull request #471 from stephenh/parallelrdd
Move ParallelCollection into spark.rdd package.
2013-02-16 16:39:15 -08:00
Matei Zaharia 3260b6120e Merge pull request #470 from stephenh/morek
Make CoGroupedRDDs explicitly have the same key type.
2013-02-16 16:38:38 -08:00
Stephen Haberman e7713adb99 Move ParallelCollection into spark.rdd package. 2013-02-16 13:20:48 -06:00
Stephen Haberman ae2234687d Make CoGroupedRDDs explicitly have the same key type. 2013-02-16 13:10:31 -06:00
Matei Zaharia 9d979fb630 Merge pull request #469 from stephenh/samepartitionercombine
If combineByKey is using the same partitioner, skip the shuffle.
2013-02-16 10:07:42 -08:00
Stephen Haberman 4328873294 Add assertion about dependencies. 2013-02-16 01:16:40 -06:00
Stephen Haberman c34b8ad2c5 Avoid a shuffle if combineByKey is passed the same partitioner. 2013-02-16 00:54:03 -06:00
Matei Zaharia beb7ab8708 Merge pull request #467 from squito/executor_job_id
include jobid in Executor commandline args
2013-02-15 22:09:24 -08:00
Tathagata Das 3bcc6e5c03 Merge pull request #466 from pwendell/java-stream-transform
STREAMING-50: Support transform workaround in JavaPairDStream
2013-02-14 21:30:55 -08:00
Imran Rashid 893bad9089 use appid instead of frameworkid; simplify stupid condition 2013-02-13 20:30:21 -08:00
Matei Zaharia e8663e0fe5 Merge pull request #461 from JoshRosen/fix/issue-tracker-link
Update issue tracker link in contributing guide
2013-02-13 18:42:17 -08:00
Imran Rashid 8f18e7e863 include jobid in Executor commandline args 2013-02-13 13:05:13 -08:00
Patrick Wendell 3f3e77f28b STREAMING-50: Support transform workaround in JavaPairDStream
This ports a useful workaround (the `transform` function) to
JavaPairDStream. It is necessary to do things like sorting which
are not supported yet in the core streaming API.
2013-02-12 14:02:32 -08:00
Matei Zaharia fd7e414bd0 Merge pull request #464 from pwendell/java-type-fix
SPARK-694: All references to [K, V] in JavaDStreamLike should be changed to [K2, V2]
2013-02-11 19:19:05 -08:00
Matei Zaharia bfeed4725d Merge pull request #465 from pwendell/java-sort-fix
SPARK-696: sortByKey should use 'ascending' parameter
2013-02-11 18:23:12 -08:00
Patrick Wendell 21df6ffc13 SPARK-696: sortByKey should use 'ascending' parameter 2013-02-11 17:43:26 -08:00
Matei Zaharia 582d31dff9 Formatting fixes 2013-02-11 13:24:54 -08:00
Matei Zaharia ea08537143 Fixed an exponential recursion that could happen with doCheckpoint due
to lack of memoization
2013-02-11 13:23:50 -08:00
Patrick Wendell d09c36065c Using tuple swap() 2013-02-11 10:45:45 -08:00
Patrick Wendell 04786d0739 small fix 2013-02-11 10:05:49 -08:00
Patrick Wendell c65988bdc1 Fix for MapPartitions 2013-02-11 10:03:37 -08:00
Patrick Wendell 20cf770545 Fix for flatmap 2013-02-11 10:03:37 -08:00
Patrick Wendell 314d87a038 Indentation fix 2013-02-11 10:03:37 -08:00
Patrick Wendell f0b68c623c Initial cut at replacing K, V in Java files 2013-02-11 10:03:37 -08:00
Matei Zaharia da8afbc77e Some bug and formatting fixes to FT
Conflicts:
	core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:43:38 -08:00
root 1b47fa2752 Detect hard crashes of workers using a heartbeat mechanism.
Also fixes some issues in the rest of the code with detecting workers this way.

Conflicts:
	core/src/main/scala/spark/deploy/master/Master.scala
	core/src/main/scala/spark/deploy/worker/Worker.scala
	core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
2013-02-10 22:28:28 -08:00
Matei Zaharia 05d2e94838 Use a separate memory setting for standalone cluster daemons
Conflicts:
	docs/_config.yml
2013-02-10 21:59:41 -08:00
Matei Zaharia 8c66c49962 Tweak web UI so that people don't get confused about master URL format
Conflicts:
	core/src/main/twirl/spark/deploy/master/index.scala.html
	core/src/main/twirl/spark/deploy/worker/index.scala.html
2013-02-10 21:58:34 -08:00
Matei Zaharia 0b788b760b Update Windows scripts to launch daemons with less RAM and fix a few
other issues

Conflicts:
	run2.cmd
2013-02-10 21:51:49 -08:00
Josh Rosen 131b56afd0 Update issue tracker link in contributing guide. 2013-02-10 13:28:31 -08:00
Matei Zaharia b1d809913b Merge pull request #460 from markhamstra/404
Fixed a 404 in 'Tuning Spark' -- missing '.html'
2013-02-10 13:01:09 -08:00
Mark Hamstra 4975dcdafc Fixed a 404 -- missing '.html' 2013-02-10 12:55:47 -08:00
Matei Zaharia ccb1ca4a23 Merge pull request #448 from squito/fetch_maxBytesInFlight
add as many fetch requests as we can, subject to maxBytesInFlight
2013-02-09 18:15:18 -08:00
Matei Zaharia 76ac0ce6c0 Merge pull request #446 from pwendell/olap-example
SPARK-678: Adding an example with an OLAP roll-up
2013-02-09 18:14:44 -08:00
Matei Zaharia f750daa510 Merge pull request #452 from stephenh/misc
Add RDD.coalesce, clean up some RDDs, other misc.
2013-02-09 18:12:56 -08:00
Stephen Haberman 4619ee0787 Move JavaRDDLike.coalesce into the right places. 2013-02-09 20:05:42 -06:00
Josh Rosen fc5b2e8b83 Merge pull request #457 from markhamstra/commutative
Add commutative requirement for 'reduce' to Python docstring.
2013-02-09 15:54:48 -08:00
Stephen Haberman fb7599870f Fix JavaRDDLike.coalesce return type. 2013-02-09 16:10:52 -06:00
Mark Hamstra b7a1fb5c5d Add commutative requirement for 'reduce' to Python docstring. 2013-02-09 12:14:11 -08:00
Matei Zaharia 51db4c1f30 Merge pull request #453 from markhamstra/commutative
Change docs on 'reduce' since the merging of local reduces no longer pre...
2013-02-09 10:36:30 -08:00
Stephen Haberman 2a18cd826c Add back return types. 2013-02-09 10:12:04 -06:00
Stephen Haberman da52b16b38 Remove RDD.coalesce default arguments. 2013-02-09 10:11:54 -06:00
Mark Hamstra b8863a79d3 Merge branch 'master' of https://github.com/mesos/spark into commutative
Conflicts:
	core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Matei Zaharia b53174a6f3 Merge pull request #454 from MLnick/ipython
SPARK-685 Adding IPYTHON environment variable support for launching pyspark using ...
2013-02-07 18:29:04 -08:00
Nick Pentreath 21d3946d17 Adding IPYTHON environment variable support for launching pyspark using ipython shell 2013-02-07 16:54:31 +02:00
Mark Hamstra 934a53c8b6 Change docs on 'reduce' since the merging of local reduces no longer preserves
ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Patrick Wendell dab81a8511 Fixing to match Spark styleguide 2013-02-05 20:57:04 -08:00
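
Note on commit 934a53c8b6: its message says that merging local reduces no longer preserves ordering, so the function passed to 'reduce' must be commutative as well as associative. The sketch below illustrates that point in plain Python, not with Spark's API. The helper names (`partitioned_reduce`, `results_over_all_merge_orders`) and the sample data are assumptions made up for illustration; the code only models "reduce each partition locally, then merge the partial results in an arbitrary order".

```python
from functools import reduce
from itertools import permutations


def partitioned_reduce(partitions, op, merge_order):
    """Reduce each partition locally, then merge the partial results
    in the order given by merge_order (a sequence of partition indices)."""
    partials = [reduce(op, part) for part in partitions]
    return reduce(op, [partials[i] for i in merge_order])


def results_over_all_merge_orders(partitions, op):
    """Collect the distinct results seen across every possible merge order."""
    orders = permutations(range(len(partitions)))
    return {partitioned_reduce(partitions, op, order) for order in orders}


# Hypothetical data: three partitions of a small collection.
letters = [["a", "b"], ["c"], ["d", "e"]]
numbers = [[1, 2], [3], [4, 5]]

# String concatenation is associative but not commutative,
# so the result depends on the order in which partials are merged.
concat_results = results_over_all_merge_orders(letters, lambda x, y: x + y)

# Integer addition is associative and commutative,
# so every merge order gives the same result.
sum_results = results_over_all_merge_orders(numbers, lambda x, y: x + y)

print(sorted(concat_results))
print(sum_results)
```

With a non-commutative function such as concatenation, each merge order can yield a different answer, whereas addition yields a single answer regardless of order. That is the situation the updated docstring warns about.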