1426 commits

Author SHA1 Message Date
Matei Zaharia bff5ceff53 Merge pull request #287 from rxin/startslave
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:12:05 -07:00
Reynold Xin f67bcbed07 Use SPARK_MASTER_IP if it is set in start-slaves.sh. 2012-10-19 01:08:23 -07:00
Thomas Dudziak d9c2a89c57 Support for Hadoop 2 distributions such as cdh4 2012-10-18 16:08:54 -07:00
Josh Rosen 365a4c1e68 Allow EC2 script to stop/destroy cluster after master/slave failures. 2012-10-18 10:36:50 -07:00
Reynold Xin 3b97124604 Changed Spark version back to 0.6.0 2012-10-15 21:39:51 -07:00
Reynold Xin 63fae9bc23 Serialize accumulator updates in TaskResult for local mode. 2012-10-15 21:38:28 -07:00
Reynold Xin 9087a1abef Changed version to 0.6.0-rxin. 2012-10-15 13:54:04 -07:00
Matei Zaharia 388a111153 Fix sbt assembly's merge rules 2012-10-15 10:21:16 -07:00
Reynold Xin 42d20fa8da Added a method to report slave memory status. 2012-10-14 22:30:53 -07:00
Matei Zaharia 63fe4e9d33 Merge pull request #279 from pwendell/dev
Removing credentials line in build.
2012-10-14 19:36:41 -07:00
Patrick Wendell 629dd2691e Removing credentials line in build. 2012-10-14 19:33:39 -07:00
Matei Zaharia f8768da418 Comment out PGP stuff for publish-local to work 2012-10-14 17:37:21 -07:00
Matei Zaharia 1f06445b03 tweak 2012-10-14 12:04:58 -07:00
Matei Zaharia 4947bd0958 tweak 2012-10-14 12:02:58 -07:00
Matei Zaharia 6c766a9187 tweak 2012-10-14 12:02:32 -07:00
Matei Zaharia 8192fe0325 Merge branch 'dev' of github.com:mesos/spark into dev 2012-10-14 12:01:38 -07:00
Matei Zaharia 1c73d8974d Update README 2012-10-14 12:00:25 -07:00
Matei Zaharia 7855bacd26 Merge pull request #278 from pwendell/quickstart-fix
Adding dependency repos in quickstart example
2012-10-14 11:52:24 -07:00
Patrick Wendell 7a03a0e35d Adding dependency repos in quickstart example 2012-10-14 11:48:24 -07:00
Matei Zaharia 64dbf8d372 Made ShuffleDependency automatically find a shuffle ID for itself 2012-10-14 10:00:22 -07:00
Matei Zaharia 64b52166ee Changed default Hadoop version back to 0.20.205 2012-10-14 09:51:34 -07:00
Matei Zaharia 4be12d97ec Some doc fixes, including showing version number in nav bar again 2012-10-13 19:05:11 -07:00
Matei Zaharia 19910c00c3 tweaks 2012-10-13 16:22:39 -07:00
Matei Zaharia 4a3e9cf69c Document how to configure SPARK_MEM & co on a per-job basis 2012-10-13 16:20:25 -07:00
Matei Zaharia ce6b5a3ee5 Uncomment Maven publishing stuff and set version to 0.6.0 2012-10-13 15:55:39 -07:00
Matei Zaharia 8815aeba0c Take executor environment vars as an argument to SparkContext 2012-10-13 15:31:11 -07:00
Josh Rosen 33cd3a0c12 Remove map-side combining from ShuffleMapTask.
This separation of concerns simplifies the 
ShuffleDependency and ShuffledRDD interfaces.

Map-side combining can be performed in a
mapPartitions() call prior to shuffling the RDD.

I don't anticipate this having much of a 
performance impact: in both approaches, each tuple
is hashed twice: once in the bucket partitioning
and once in the combiner's hashtable.  The same
steps are being performed, but in a different
order and through one extra Iterator.
2012-10-13 14:59:20 -07:00
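
The reasoning in the commit message above can be illustrated with a small, self-contained sketch. It is plain Python, not Spark code: `combine_partition`, `shuffle`, and `reduce_by_key` are hypothetical names chosen for illustration, and the partitions are ordinary lists. It only shows the idea that combining inside each partition before the shuffle gives the same result as combining after it, while moving fewer records.

```python
import operator


def combine_partition(records, merge_value):
    """Merge values per key within one partition (the combiner's hashtable)."""
    combined = {}
    for key, value in records:
        if key in combined:
            combined[key] = merge_value(combined[key], value)
        else:
            combined[key] = value
    return list(combined.items())


def shuffle(partitions, num_buckets):
    """Bucket partitioning: route every (key, value) pair by the hash of its key."""
    buckets = [[] for _ in range(num_buckets)]
    for part in partitions:
        for key, value in part:
            buckets[hash(key) % num_buckets].append((key, value))
    return buckets


def reduce_by_key(partitions, merge_value, num_buckets=2, pre_combine=True):
    """Optionally combine per partition first, then shuffle and merge per bucket."""
    if pre_combine:
        # Assumed stand-in for a mapPartitions()-style step before the shuffle.
        partitions = [combine_partition(p, merge_value) for p in partitions]
    buckets = shuffle(partitions, num_buckets)
    result = {}
    for bucket in buckets:
        # A key lands in exactly one bucket, so the per-bucket results are disjoint.
        result.update(combine_partition(bucket, merge_value))
    return result


partitions = [[("a", 1), ("b", 2), ("a", 3)], [("a", 4), ("b", 5)]]
with_combine = reduce_by_key(partitions, operator.add, pre_combine=True)
without_combine = reduce_by_key(partitions, operator.add, pre_combine=False)
```

In both cases each record is hashed once for bucketing and once in a combiner dictionary; only the order of the two steps differs.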
Josh Rosen 10bcd217d2 Remove mapSideCombine field from Aggregator.
Instead, the presence or absence of a ShuffleDependency's aggregator
will control whether map-side combining is performed.
2012-10-13 14:59:20 -07:00
Josh Rosen 4775c55641 Change ShuffleFetcher to return an Iterator. 2012-10-13 14:59:20 -07:00
Josh Rosen 110832e88f Add helper methods to Aggregator. 2012-10-13 14:57:56 -07:00
Matei Zaharia 84979499db Merge pull request #273 from dennybritz/executorVars
Let the user specify environment variables to be passed to the Executors
2012-10-13 14:52:14 -07:00
Denny 0700d1920a Protect from null env variables in mesos. 2012-10-13 13:57:59 -07:00
Denny 21047d923e Protect from setting null environment variables. 2012-10-13 13:44:24 -07:00
Denny fa41d50f7d Don't use system envs for Mesos. 2012-10-13 13:15:50 -07:00
Denny 67c42a41d0 Let the user specify environment variables to be passed to the Executors.
Also removed unused variables in the ExecutorRunner.
2012-10-13 13:08:44 -07:00
Matei Zaharia 5b7ee173e1 Update EC2 scripts for Spark 0.6 2012-10-12 19:53:03 -07:00
Matei Zaharia b4067cbad4 More doc updates, and moved Serializer to a subpackage. 2012-10-12 18:19:21 -07:00
Matei Zaharia 8d7b77bcb5 Some doc and usability improvements:
- Added a StorageLevels class for easy access to StorageLevel constants
  in Java
- Added doc comments on Function classes in Java
- Updated Accumulator and HadoopWriter docs slightly
2012-10-12 17:53:20 -07:00
Matei Zaharia 682b2d9329 Added a test for when an RDD only partially fits in memory 2012-10-12 14:58:26 -07:00
Matei Zaharia dca496bb77 Document cartesian() operation 2012-10-12 14:46:41 -07:00
Matei Zaharia 1183b30941 Merge branch 'dev' of github.com:mesos/spark into dev 2012-10-12 14:40:07 -07:00
Matei Zaharia 603b419fdf Tweak 2012-10-12 14:40:00 -07:00
Matei Zaharia 23015ccac0 Merge pull request #271 from shivaram/block-manager-npe-fix
Change block manager to accept an ArrayBuffer
2012-10-12 14:36:28 -07:00
Shivaram Venkataraman 8577523f37 Add test to verify if RDD is computed even if block manager has insufficient
memory
2012-10-12 14:14:57 -07:00
Matei Zaharia bd78bbb2cf Merge pull request #270 from pwendell/java-javadoc
Adding Java documentation
2012-10-11 12:21:47 -07:00
Patrick Wendell dc8adbd359 Adding Java documentation 2012-10-11 00:49:03 -07:00
Shivaram Venkataraman 2cf40c5fd5 Change block manager to accept an ArrayBuffer instead of an iterator to ensure
that the computation can proceed even if we run out of memory to cache the
block. Update CacheTracker to use this new interface
2012-10-11 00:42:46 -07:00
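
A rough sketch of why a materialized buffer helps here, assuming nothing about the actual block manager code: `TinyCache`, `put_from_iterator`, and `compute_and_cache` are hypothetical names, and capacity is counted in elements rather than bytes. A one-shot iterator that is partly consumed by a failed cache attempt cannot be replayed, whereas a buffer built first remains usable when the cache rejects it.

```python
class TinyCache:
    """Hypothetical cache whose capacity is a maximum number of elements per block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}

    def put(self, block_id, values):
        """Store an already materialized list; return False if it does not fit."""
        if len(values) > self.capacity:
            return False
        self.blocks[block_id] = values
        return True


def put_from_iterator(cache, block_id, iterator):
    """Iterator-based variant: elements consumed before a failure are lost to the caller."""
    taken = []
    for value in iterator:
        if len(taken) == cache.capacity:
            return False  # out of room; `taken` and `value` are dropped
        taken.append(value)
    cache.blocks[block_id] = taken
    return True


def compute_and_cache(cache, block_id, compute):
    """Buffer-based variant: materialize first, then try to cache."""
    values = list(compute())     # buffer the one-shot iterator
    cache.put(block_id, values)  # may fail; the buffer is still intact
    return iter(values)


cache = TinyCache(capacity=2)
results = list(compute_and_cache(cache, "block_0", lambda: iter(range(5))))
```

With the buffer-based variant the caller still receives every computed element even though the block was too large to cache.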
Matei Zaharia cce56835cd Comment out Sonatype publishing stuff so publish-local works 2012-10-10 22:11:31 -07:00
Matei Zaharia 4001cbdec1 Merge pull request #268 from pwendell/sonatype
Adding code for publishing to Sonatype.
2012-10-10 18:57:32 -07:00
Patrick Wendell 6d328f54d0 Changing tabs to spaces 2012-10-10 18:54:22 -07:00