Commit graph

313 commits

Author SHA1 Message Date
Shivaram Venkataraman 0cc6642b7c Rename to zipPartitions and style changes 2013-04-28 05:11:03 -07:00
Shivaram Venkataraman c9c4954d99 Add an interface to zip iterators of multiple RDDs
The current code supports 2, 3 or 4 arguments but can be extended
to more arguments if required.
2013-04-26 16:57:46 -07:00
Mridul Muralidharan 19652a44be Fix issue with FileSuite failing 2013-04-15 19:16:36 +05:30
Mridul Muralidharan d90d2af103 Checkpoint commit - compiles and passes a lot of tests - not all though, looking into FileSuite issues 2013-04-15 18:12:11 +05:30
Stephen Haberman dd854d5b9f Use Boolean in the Java API, and != for assert. 2013-03-23 11:49:45 -05:00
Stephen Haberman 4ca273edc4 Merge branch 'master' into shufflecoalesce
Conflicts:
	core/src/test/scala/spark/RDDSuite.scala
2013-03-23 11:45:45 -05:00
Matei Zaharia fd53f2fc7b Merge pull request #510 from markhamstra/WithThing
mapWith, flatMapWith and filterWith
2013-03-23 07:13:21 -07:00
Stephen Haberman 1c67c7dfd1 Add a shuffle parameter to coalesce.
This is useful for when you want just 1 output file (part-00000) but
still want the upstream RDD to be computed in parallel.
2013-03-22 08:54:44 -05:00
Matei Zaharia 35588490cb Merge pull request #538 from rxin/cogroup
Added mapSideCombine flag to CoGroupedRDD. Added unit test for CoGroupedRDD.
2013-03-20 19:27:47 -07:00
Reynold Xin 00a11304fd Added mapSideCombine flag to CoGroupedRDD. Added unit test for
CoGroupedRDD.
2013-03-20 13:49:51 +08:00
Mark Hamstra 1fb192ef40 Merge branch 'master' of https://github.com/mesos/spark into foldByKey 2013-03-16 12:17:13 -07:00
Mark Hamstra 80fc8c82ed _With[Matei] 2013-03-16 12:16:29 -07:00
Mark Hamstra 38454c4aed Merge branch 'master' of https://github.com/mesos/spark into WithThing 2013-03-16 11:54:44 -07:00
Matei Zaharia c1e9cdc49f Merge pull request #525 from stephenh/subtractByKey
Add PairRDDFunctions.subtractByKey.
2013-03-16 11:47:45 -07:00
Mark Hamstra ef75be3bf7 Merge branch 'master' of https://github.com/mesos/spark into foldByKey 2013-03-15 21:41:24 -07:00
Matei Zaharia cdbfd1e196 Merge pull request #516 from squito/fix_local_metrics
Fix local metrics
2013-03-15 15:13:28 -07:00
Mark Hamstra 1a4070477d whitespace cleanup 2013-03-15 11:28:28 -07:00
Mark Hamstra 16a4ca4537 restrict V type of foldByKey in order to retain ClassManifest; added foldByKey to Java API and test 2013-03-14 13:58:37 -07:00
Stephen Haberman 7d8bb4df3a Allow subtractByKey's other argument to have a different value type. 2013-03-14 14:44:15 -05:00
Stephen Haberman 4632c45af1 Finished subtractByKeys. 2013-03-14 10:35:34 -05:00
Stephen Haberman e7f1a69c6b Add a test for NextIterator. 2013-03-13 10:46:33 -05:00
Mark Hamstra 562893bea3 deleted excess curly braces 2013-03-10 22:43:08 -07:00
Imran Rashid 8a11ac3dc7 increase sleep time 2013-03-10 22:31:44 -07:00
Imran Rashid 9f97f2f9d8 add a small wait to one task to make sure some task runtime really is non-zero 2013-03-10 22:30:18 -07:00
Mark Hamstra 1289e7176b refactored _With API and added foreachPartition 2013-03-10 22:27:13 -07:00
Mark Hamstra b57df1f5e3 Merge branch 'master' of https://github.com/mesos/spark into WithThing 2013-03-10 16:56:31 -07:00
Matei Zaharia 2e1bbc4e7e Merge remote-tracking branch 'woggling/dag-sched-driver-port'
Conflicts:
	core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala
2013-03-10 16:52:54 -07:00
Matei Zaharia 91a9d093bd Merge pull request #512 from patelh/fix-kryo-serializer
Fix reference bug in Kryo serializer, add test, update version
2013-03-10 15:48:23 -07:00
Matei Zaharia a59cc6060f Merge remote-tracking branch 'stephenh/nomocks'
Conflicts:
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
	core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala
2013-03-10 13:39:10 -07:00
Imran Rashid 20f01a0a1b enable task metrics in local mode, add tests 2013-03-09 21:17:31 -08:00
Charles Reiss d0216cb38b Prevent DAGSchedulerSuite from corrupting driver.port.
Use the LocalSparkContext abstraction to properly manage clearing
spark.driver.port.
2013-03-09 10:49:02 -08:00
Hiral Patel 664e5fd24b Fix reference bug in Kryo serializer, add test, update version 2013-03-07 22:16:11 -08:00
Mark Hamstra 5ff0810b11 refactor mapWith, flatMapWith and filterWith to each use two parameter lists 2013-03-05 12:25:44 -08:00
Mark Hamstra d046d8ad32 whitespace formatting 2013-03-05 00:48:13 -08:00
Mark Hamstra 9148b968cf mapWith, flatMapWith and filterWith 2013-03-04 15:48:47 -08:00
Matei Zaharia 04fb81ffe5 Merge pull request #506 from rxin/spark-706
Fixed SPARK-706: Failures in block manager put lead to read task hanging.
2013-03-03 17:20:07 -08:00
Imran Rashid d36abdb053 Merge branch 'master' into stageInfo 2013-03-03 15:20:46 -08:00
Reynold Xin 44134e12bb Fixed SPARK-706: Failures in block manager put lead to read task
hanging.
2013-02-28 15:14:59 -08:00
Stephen Haberman db957e5bd7 Fix MapOutputTrackerSuite. 2013-02-26 01:38:50 -06:00
Stephen Haberman a65aa549ff Override DAGScheduler.runLocally so we can remove the Thread.sleep. 2013-02-25 23:49:32 -06:00
Stephen Haberman a4adeb255c Merge branch 'master' into nomocks
Conflicts:
	core/src/test/scala/spark/scheduler/DAGSchedulerSuite.scala
2013-02-25 23:48:52 -06:00
Tathagata Das c02e064938 Fixed replication bug in BlockManager 2013-02-25 17:27:46 -08:00
Matei Zaharia d6e6abece3 Merge pull request #459 from stephenh/bettersplits
Change defaultPartitioner to use upstream split size.
2013-02-25 09:22:04 -08:00
Stephen Haberman c44ccf2862 Use default parallelism if it's set. 2013-02-24 23:54:03 -06:00
Stephen Haberman 44032bc476 Merge branch 'master' into bettersplits
Conflicts:
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
	core/src/test/scala/spark/ShuffleSuite.scala
2013-02-24 22:08:14 -06:00
Tathagata Das dff53d1b94 Merge branch 'mesos-master' into streaming 2013-02-24 12:17:22 -08:00
Stephen Haberman f442e7d83c Update for split->partition rename. 2013-02-24 00:27:14 -06:00
Stephen Haberman cec87a0653 Merge branch 'master' into subtract 2013-02-23 23:27:55 -06:00
Charles Reiss 50cf8c8b79 Add fault tolerance test that uses replicated RDDs. 2013-02-22 16:11:53 -08:00
Imran Rashid ff127cfcd3 Merge branch 'master' into stageInfo
Conflicts:
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/storage/BlockManager.scala
2013-02-21 15:16:21 -08:00