Commit graph

1020 commits

Author SHA1 Message Date
Imran Rashid edc6972f8e move Vector class into core and spark.util package 2012-07-28 20:15:42 -07:00
Imran Rashid 83659af11c Accumulator now inherits from Accumulable, whcih simplifies a bunch of other things (eg., no +:=)
Conflicts:

	core/src/main/scala/spark/Accumulators.scala
2012-07-28 20:13:51 -07:00
Imran Rashid 79d58ed20a improve scaladoc 2012-07-28 20:12:41 -07:00
Imran Rashid ae07f3864c add Accumulatable, add corresponding docs & tests for accumulators 2012-07-28 20:12:41 -07:00
Matei Zaharia dc8763fcf7 Fixed SPARK_MEM not being passed when runner is java 2012-07-28 19:53:31 -07:00
Matei Zaharia f6f917bd00 Add a sleep to prevent a failing test.
The BlockManager's put seems to be slightly asynchronous, which can
cause it to fail this test by not removing stuff from the cache before
we put the next value. We should probably change the semantics of put()
in this case but it's hard right now. It will also be hard for
asynchronously replicated puts.
2012-07-27 16:59:36 -07:00
Matei Zaharia c0c78d2119 Renamed test more descriptively 2012-07-27 16:28:18 -07:00
Matei Zaharia dee8ff1b9d Added a second version of union() without varargs. 2012-07-27 16:27:52 -07:00
Tathagata Das cf429699e1 Updated the new checkpoint RDD to remember partitioning of the original RDD. 2012-07-27 23:16:37 +00:00
Mosharaf Chowdhury b5be936d7c Broadcasts using BlockManager instead of BoundedMemoryCache 2012-07-27 15:38:46 -07:00
Mosharaf Chowdhury 1f19fbb8db Merge remote-tracking branch 'upstream/dev' into dev
Conflicts:
	core/src/main/scala/spark/broadcast/Broadcast.scala
2012-07-27 15:18:23 -07:00
Matei Zaharia b51d733a57 Fixed Java union methods having same erasure.
Changed union() methods on lists to take a separate "first element"
argument in order to differentiate them to the compiler, because Java 7
considered it an error to have them all take Lists parameterized with
different types.
2012-07-27 12:23:27 -07:00
Tathagata Das 3e271c3b61 Merge branch 'dev' of github.com:tdas/spark into dev 2012-07-27 12:01:04 -07:00
Tathagata Das 024905f682 Added BlockRDD and a first-cut version of checkpoint() to RDD class. 2012-07-27 12:00:49 -07:00
Tathagata Das d1eee44a03 Fixed more stuff in BoundedMemoryCache. 2012-07-27 18:33:32 +00:00
Tathagata Das d1b7f41671 Fixed bug in BoundedMemoryCache. 2012-07-27 09:00:45 -07:00
Tathagata Das 435d129bec Fixed bugs in block dropping code of MemoryStore and changed synchronized HashMap to ConcurrentHashMap in BlockManager. 2012-07-27 10:02:26 +00:00
Tathagata Das 0426769f89 Modified the block dropping code for better performance. 2012-07-26 20:53:45 -07:00
Matei Zaharia 5c5aa2ff81 Merge pull request #153 from JoshRosen/new-java-api
Java API
2012-07-26 17:20:52 -07:00
Josh Rosen c5e2810dc7 Add persist(), splits(), glom(), and mapPartitions() to Java API. 2012-07-26 12:46:47 -07:00
Josh Rosen bf61c10072 Detect non-zero exit status from PipedRDD process. 2012-07-26 11:32:59 -07:00
Josh Rosen 2a60c998cc Remove StringOps.split() from Java WordCount. 2012-07-25 10:13:06 -07:00
Josh Rosen 6a78e88237 Minor cleanup and optimizations in Java API.
- Add override keywords.
- Cache RDDs and counts in TC example.
- Clean up JavaRDDLike's abstract methods.
2012-07-24 09:47:00 -07:00
Denny 4f4a34c025 Stlystic changes
Conflicts:

	core/src/test/scala/spark/MesosSchedulerSuite.scala
2012-07-23 16:32:20 -07:00
Denny 866e6949df Always destroy SparkContext in after block for the unit tests.
Conflicts:

	core/src/test/scala/spark/ShuffleSuite.scala
2012-07-23 16:29:17 -07:00
Matei Zaharia 600e99728d Fix a bug where an input path was added to a Hadoop job configuration twice 2012-07-23 16:16:19 -07:00
Josh Rosen 042dcbde33 Add type annotations to Java API methods.
Add missing Scala Map to java.util.Map conversions.
2012-07-22 17:35:29 -07:00
Josh Rosen e23938c3be Use mapValues() in JavaPairRDD.cogroupResultToJava(). 2012-07-22 15:10:01 -07:00
Josh Rosen 460da878fc Improve Java API examples
- Replace JavaLR example with JavaHdfsLR example.
- Use anonymous classes in JavaWordCount; add options.
- Remove @Override annotations.
2012-07-22 14:40:39 -07:00
Josh Rosen 01dce3f569 Add Java API
Add distinct() method to RDD.

Fix bug in DoubleRDDFunctions.
2012-07-18 17:34:29 -07:00
Mosharaf Chowdhury 85cd9979f2 Fix for isLocal 2012-07-13 01:13:14 -07:00
Mosharaf Chowdhury 1c83fd4b66 Merged with Upstream dev 2012-07-13 01:08:28 -07:00
Mosharaf Chowdhury bb4ee580fa Cleaning BitTorrentBroadcast code... 2012-07-13 01:04:01 -07:00
Mosharaf Chowdhury 8ccffe21da Cleaned TreeBroadcast 2012-07-13 00:54:25 -07:00
Matei Zaharia 628bb5ca7f Allow null keys in Spark's reduce and group by 2012-07-12 18:36:02 -07:00
Matei Zaharia e2a67a8024 Fixes to coarse-grained Mesos scheduler in dealing with failed nodes 2012-07-12 18:21:52 -07:00
Matei Zaharia be622cf867 Formatting 2012-07-11 17:31:44 -07:00
Matei Zaharia e8ae77df24 Added more methods for loading/saving with new Hadoop API 2012-07-11 17:31:33 -07:00
Mosharaf Chowdhury 34999d97f5 Added stop() to the Broadcast subsystem 2012-07-10 01:03:47 -07:00
Mosharaf Chowdhury d6a9680604 Slightly better check for isLocal 2012-07-10 00:16:47 -07:00
Mosharaf Chowdhury 701f49e0d9 Refactoring 2012-07-09 22:39:47 -07:00
Mosharaf Chowdhury cf1c60a1de Refactoring 2012-07-09 22:07:46 -07:00
Mosharaf Chowdhury e71f69ad3d Refactoring 2012-07-09 22:07:17 -07:00
Mosharaf Chowdhury ca02a92332 Refactored TrackMultipleValues out. 2012-07-09 21:35:39 -07:00
Mosharaf Chowdhury 654576ef1a Tweaks 2012-07-09 21:12:42 -07:00
Mosharaf Chowdhury 425c247269 Removed some unused stuff 2012-07-08 14:29:04 -07:00
Matei Zaharia 0a47284003 More work to allow Spark to run on the standalone deploy cluster. 2012-07-08 14:00:04 -07:00
Mosharaf Chowdhury c7c5258e25 Compiles without Dfs 2012-07-08 13:22:12 -07:00
Mosharaf Chowdhury 178bb29f05 Removed Chained and Dfs broadcast implementations 2012-07-08 11:57:00 -07:00
Matei Zaharia 1aa63f775b Added back coarse-grained Mesos scheduler based on StandaloneScheduler. 2012-07-08 10:52:13 -07:00