Commit graph

1484 commits

Author SHA1 Message Date
Matei Zaharia 902a608187 Update version to 0.6.1-SNAPSHOT to show this is in development 2012-10-22 11:43:57 -07:00
Matei Zaharia 1be335e8fa Merge branch 'master' into dev 2012-10-21 00:05:02 -07:00
Matei Zaharia 15e95be2fd Merge pull request #285 from tomdz/cdh4-dev
Support for Hadoop 2 distributions such as cdh4
2012-10-20 23:35:01 -07:00
Matei Zaharia 6999724ce8 Fix a path in the web UI 2012-10-20 23:33:37 -07:00
Patrick Wendell 45430f2cb9 Merge pull request #290 from pwendell/dev
Two trivial commits to test JIRA integration
2012-10-19 23:18:44 -07:00
Patrick Wendell cd0936529b SPARK-581 #resolve Removing whitespace to test JIRA 2012-10-19 23:17:44 -07:00
Patrick Wendell d50028b345 Adding whitespace to test JIRA integration 2012-10-19 23:17:44 -07:00
Matei Zaharia bff5ceff53 Merge pull request #287 from rxin/startslave
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:12:05 -07:00
Reynold Xin f67bcbed07 Use SPARK_MASTER_IP if it is set in start-slaves.sh. 2012-10-19 01:08:23 -07:00
Thomas Dudziak d9c2a89c57 Support for Hadoop 2 distributions such as cdh4 2012-10-18 16:08:54 -07:00
Josh Rosen 365a4c1e68 Allow EC2 script to stop/destroy cluster after master/slave failures. 2012-10-18 10:36:50 -07:00
Reynold Xin 4a3fb06ac2 Updated Kryo to 2.20. 2012-10-16 01:10:01 -07:00
Reynold Xin 3b97124604 Changed Spark version back to 0.6.0 2012-10-15 21:39:51 -07:00
Reynold Xin 63fae9bc23 Serialize accumulator updates in TaskResult for local mode. 2012-10-15 21:38:28 -07:00
Reynold Xin 9087a1abef Changed version to 0.6.0-rxin. 2012-10-15 13:54:04 -07:00
Matei Zaharia 388a111153 Fix sbt assembly's merge rules 2012-10-15 10:21:16 -07:00
Reynold Xin 42d20fa8da Added a method to report slave memory status. 2012-10-14 22:30:53 -07:00
Matei Zaharia 63fe4e9d33 Merge pull request #279 from pwendell/dev
Removing credentials line in build.
2012-10-14 19:36:41 -07:00
Patrick Wendell 629dd2691e Removing credentials line in build. 2012-10-14 19:33:39 -07:00
Matei Zaharia f8768da418 Comment out PGP stuff for publish-local to work 2012-10-14 17:37:21 -07:00
Matei Zaharia 1f06445b03 tweak 2012-10-14 12:04:58 -07:00
Matei Zaharia 4947bd0958 tweak 2012-10-14 12:02:58 -07:00
Matei Zaharia 6c766a9187 tweak 2012-10-14 12:02:32 -07:00
Matei Zaharia 8192fe0325 Merge branch 'dev' of github.com:mesos/spark into dev 2012-10-14 12:01:38 -07:00
Matei Zaharia 1c73d8974d Update README 2012-10-14 12:00:25 -07:00
Matei Zaharia 7855bacd26 Merge pull request #278 from pwendell/quickstart-fix
Adding dependency repos in quickstart example
2012-10-14 11:52:24 -07:00
Patrick Wendell 7a03a0e35d Adding dependency repos in quickstart example 2012-10-14 11:48:24 -07:00
Matei Zaharia 64dbf8d372 Made ShuffleDependency automatically find a shuffle ID for itself 2012-10-14 10:00:22 -07:00
Matei Zaharia 64b52166ee Changed default Hadoop version back to 0.20.205 2012-10-14 09:51:34 -07:00
Matei Zaharia 4be12d97ec Some doc fixes, including showing version number in nav bar again 2012-10-13 19:05:11 -07:00
Matei Zaharia 19910c00c3 tweaks 2012-10-13 16:22:39 -07:00
Matei Zaharia 4a3e9cf69c Document how to configure SPARK_MEM & co on a per-job basis 2012-10-13 16:20:25 -07:00
Matei Zaharia ce6b5a3ee5 Uncomment Maven publishing stuff and set version to 0.6.0 2012-10-13 15:55:39 -07:00
Matei Zaharia 8815aeba0c Take executor environment vars as an argument to SparkContext 2012-10-13 15:31:11 -07:00
Josh Rosen 33cd3a0c12 Remove map-side combining from ShuffleMapTask.
This separation of concerns simplifies the 
ShuffleDependency and ShuffledRDD interfaces.

Map-side combining can be performed in a
mapPartitions() call prior to shuffling the RDD.

I don't anticipate this having much of a 
performance impact: in both approaches, each tuple
is hashed twice: once in the bucket partitioning
and once in the combiner's hashtable.  The same
steps are being performed, but in a different
order and through one extra Iterator.
2012-10-13 14:59:20 -07:00
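The idea in the commit message above (combine within each partition first, then shuffle, then merge on the reduce side) can be sketched in plain Python. This is an illustrative sketch only, not Spark's code: the names combine_partition, shuffle, merge_bucket, and word_count are hypothetical, and partitions are modeled as ordinary lists of (key, value) pairs.

```python
def combine_partition(records, create_combiner, merge_value):
    """Map-side combine: collapse one partition's (key, value) pairs per key.

    Plays the role of the function one would pass to a mapPartitions()-style
    call before shuffling. Each key is hashed once here (dict lookup).
    """
    combined = {}
    for key, value in records:
        if key in combined:
            combined[key] = merge_value(combined[key], value)
        else:
            combined[key] = create_combiner(value)
    return iter(combined.items())


def shuffle(partitions, num_buckets):
    """Bucket partitioning: route each (key, combiner) pair by hash of its key.

    This is the second time each key is hashed.
    """
    buckets = [[] for _ in range(num_buckets)]
    for part in partitions:
        for key, combiner in part:
            buckets[hash(key) % num_buckets].append((key, combiner))
    return buckets


def merge_bucket(bucket, merge_combiners):
    """Reduce side: merge the pre-combined values that share a key."""
    merged = {}
    for key, combiner in bucket:
        if key in merged:
            merged[key] = merge_combiners(merged[key], combiner)
        else:
            merged[key] = combiner
    return merged


def word_count(partitions, num_buckets=2):
    """Count occurrences per key using combine -> shuffle -> merge."""
    pre_combined = [
        combine_partition(p, lambda v: v, lambda acc, v: acc + v)
        for p in partitions
    ]
    result = {}
    for bucket in shuffle(pre_combined, num_buckets):
        result.update(merge_bucket(bucket, lambda x, y: x + y))
    return result
```

Calling word_count on two small partitions of (word, 1) pairs returns one total per word. The pre-combine step only reduces how many pairs cross the shuffle; the final counts are the same as without it.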
Josh Rosen 10bcd217d2 Remove mapSideCombine field from Aggregator.
Instead, the presence or absence of a ShuffleDependency's aggregator
will control whether map-side combining is performed.
2012-10-13 14:59:20 -07:00
Josh Rosen 4775c55641 Change ShuffleFetcher to return an Iterator. 2012-10-13 14:59:20 -07:00
Josh Rosen 110832e88f Add helper methods to Aggregator. 2012-10-13 14:57:56 -07:00
Matei Zaharia 84979499db Merge pull request #273 from dennybritz/executorVars
Let the user specify environment variables to be passed to the Executors
2012-10-13 14:52:14 -07:00
Denny 0700d1920a Protect from null env variables in mesos. 2012-10-13 13:57:59 -07:00
Denny 21047d923e Protect from setting null environment variables. 2012-10-13 13:44:24 -07:00
Denny fa41d50f7d Don't use system envs for Mesos. 2012-10-13 13:15:50 -07:00
Denny 67c42a41d0 Let the user specify environment variables to be passed to the Executors.
Also removed unused variables in the ExecutorRunner.
2012-10-13 13:08:44 -07:00
Matei Zaharia 5b7ee173e1 Update EC2 scripts for Spark 0.6 2012-10-12 19:53:03 -07:00
Matei Zaharia b4067cbad4 More doc updates, and moved Serializer to a subpackage. 2012-10-12 18:19:21 -07:00
Matei Zaharia 8d7b77bcb5 Some doc and usability improvements:
- Added a StorageLevels class for easy access to StorageLevel constants
  in Java
- Added doc comments on Function classes in Java
- Updated Accumulator and HadoopWriter docs slightly
2012-10-12 17:53:20 -07:00
Matei Zaharia 682b2d9329 Added a test for when an RDD only partially fits in memory 2012-10-12 14:58:26 -07:00
Matei Zaharia dca496bb77 Document cartesian() operation 2012-10-12 14:46:41 -07:00
Matei Zaharia 1183b30941 Merge branch 'dev' of github.com:mesos/spark into dev 2012-10-12 14:40:07 -07:00
Matei Zaharia 603b419fdf Tweak 2012-10-12 14:40:00 -07:00