Commit graph

25030 commits

Author SHA1 Message Date
root c4366eb764 Fixes to ShuffleFetcher 2012-08-31 00:34:24 +00:00
Mosharaf Chowdhury 8f2bd399da Merge remote-tracking branch 'upstream/dev' into dev 2012-08-30 15:21:08 -07:00
Matei Zaharia bf3212615a Merge pull request #184 from rxin/dev: Disable running combiners on map tasks when mergeCombiners function is not specified by the user. 2012-08-30 14:12:40 -07:00
Reynold Xin a8a2a08a1a Added a test for testing map-side combine on/off switch. 2012-08-30 12:34:28 -07:00
Matei Zaharia 62e5326af0 Wording 2012-08-30 08:37:43 -07:00
Matei Zaharia e8ac9221dc Update sbt build command to create JARs 2012-08-30 08:36:39 -07:00
Reynold Xin 5945bcdcc5 Added a new flag in Aggregator to indicate applying map side combiners. 2012-08-29 23:32:08 -07:00
Reynold Xin c68e820b2a Merge branch 'dev' of github.com:mesos/spark into dev 2012-08-29 23:01:19 -07:00
Reynold Xin 940869dfda Disable running combiners on map tasks when mergeCombiners function is not specified by the user. 2012-08-29 23:00:02 -07:00
Tathagata Das 4db3a96766 Made minor changes to reduce compilation errors in Eclipse. Twirl stuff still does not compile in Eclipse. 2012-08-29 13:04:01 -07:00
Matei Zaharia 84bf7924d6 Made region used by spark-ec2 configurable. 2012-08-28 22:40:48 -07:00
Matei Zaharia 47507d69d9 Made region used by spark-ec2 configurable. 2012-08-28 22:40:00 -07:00
root 1f8085b8d0 Compile fixes 2012-08-29 03:20:56 +00:00
Mosharaf Chowdhury c74455f309 Merge remote-tracking branch 'upstream/dev' into dev 2012-08-28 14:56:57 -07:00
Tathagata Das 43e66146f7 Merge branch 'dev' of github.com/radlab/spark into dev 2012-08-28 13:51:05 -07:00
Tathagata Das b5b93a621c Added capability to take streaming input from network. Renamed SparkStreamContext to StreamingContext. 2012-08-28 12:35:19 -07:00
Matei Zaharia bf2e9cb08e Fault tolerance and block store fixes discovered through streaming tests. 2012-08-27 23:07:50 -07:00
Matei Zaharia 17af2df0cd Log levels 2012-08-27 23:07:32 -07:00
Matei Zaharia a0b34d826a Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 22:49:52 -07:00
Matei Zaharia b4a2214218 More fault tolerance fixes to catch lost tasks 2012-08-27 22:49:29 -07:00
Matei Zaharia 291abc2c28 Merge pull request #181 from rxin/dev: Removed the deserialization cache for ShuffleMapTask 2012-08-27 22:38:22 -07:00
Reynold Xin 3a6a95dc24 Removed the deserialization cache for ShuffleMapTask because it was causing concurrency problems (some variables in Shark get set to null). The cost of task deserialization on slaves is trivial compared with the execution time of the task anyway. 2012-08-27 22:33:15 -07:00
Josh Rosen 4143678509 Fix minor bugs in Python API examples. 2012-08-27 00:24:47 -07:00
Josh Rosen bff6a46359 Add pipe(), saveAsTextFile(), sc.union() to Python API. 2012-08-27 00:24:47 -07:00
Josh Rosen 200d248dcc Simplify Python worker; pipeline the map step of partitionBy(). 2012-08-27 00:24:39 -07:00
Josh Rosen 6904cb77d4 Use local combiners in Python API combineByKey(). 2012-08-27 00:19:26 -07:00
Josh Rosen 8b64b7ecd8 Add countByKey(), reduceByKeyLocally() to Python API 2012-08-27 00:19:22 -07:00
Josh Rosen 08b201d810 Add mapPartitions(), glom(), countByValue() to Python API. 2012-08-27 00:19:14 -07:00
Josh Rosen f79a1e4d2a Add broadcast variables to Python API. 2012-08-27 00:16:47 -07:00
Josh Rosen 65e8406029 Implement fold() in Python API. 2012-08-27 00:16:47 -07:00
root e2cf197a0a Made WordCount2 even more configurable 2012-08-27 03:34:15 +00:00
root 9635823947 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 03:08:25 +00:00
Matei Zaharia b914cd0dfa Serialize generation correctly in ShuffleMapTask 2012-08-26 20:07:59 -07:00
root 20f6b0cfc9 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 03:01:03 +00:00
Matei Zaharia 69c2ab0408 logging 2012-08-26 20:00:58 -07:00
root 89c5c03035 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 02:53:07 +00:00
Matei Zaharia 117e3f8c86 Fix a bug that was causing FetchFailedException not to be thrown 2012-08-26 19:52:56 -07:00
root beb6456442 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 02:37:49 +00:00
Matei Zaharia 3c9c44a8d3 More helpful log messages 2012-08-26 19:37:43 -07:00
root 7b59943d79 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 01:57:12 +00:00
Matei Zaharia 26dfd20c9a Detect disconnected slaves in StandaloneScheduler 2012-08-26 18:56:56 -07:00
root b78c5ae803 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 01:16:39 +00:00
Matei Zaharia 29e83f39e9 Fix replication with MEMORY_ONLY_DESER_2 2012-08-26 18:16:25 -07:00
root 9de1c3abf9 Tweaks to WordCount2 2012-08-27 00:57:00 +00:00
Matei Zaharia 57796b183e Code style 2012-08-26 17:25:22 -07:00
Matei Zaharia 22b1a20e61 Made Time and Interval immutable 2012-08-26 17:04:34 -07:00
Matei Zaharia 23a29b6d19 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-26 16:45:37 -07:00
Matei Zaharia b120e24fe0 Add equals and hashCode to Time 2012-08-26 16:45:14 -07:00
root b08ff710af Added sliding word count, and some fixes to reduce window DStream 2012-08-26 23:40:50 +00:00
Matei Zaharia 06ef7c3d1b Less debug info 2012-08-26 16:29:20 -07:00