Commit graph

11302 commits

Author SHA1 Message Date
Josh Rosen f79a1e4d2a Add broadcast variables to Python API. 2012-08-27 00:16:47 -07:00
Josh Rosen 65e8406029 Implement fold() in Python API. 2012-08-27 00:16:47 -07:00
root e2cf197a0a Made WordCount2 even more configurable 2012-08-27 03:34:15 +00:00
root 9635823947 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 03:08:25 +00:00
Matei Zaharia b914cd0dfa Serialize generation correctly in ShuffleMapTask 2012-08-26 20:07:59 -07:00
root 20f6b0cfc9 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 03:01:03 +00:00
Matei Zaharia 69c2ab0408 logging 2012-08-26 20:00:58 -07:00
root 89c5c03035 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 02:53:07 +00:00
Matei Zaharia 117e3f8c86 Fix a bug that was causing FetchFailedException not to be thrown 2012-08-26 19:52:56 -07:00
root beb6456442 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 02:37:49 +00:00
Matei Zaharia 3c9c44a8d3 More helpful log messages 2012-08-26 19:37:43 -07:00
root 7b59943d79 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 01:57:12 +00:00
Matei Zaharia 26dfd20c9a Detect disconnected slaves in StandaloneScheduler 2012-08-26 18:56:56 -07:00
root b78c5ae803 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-27 01:16:39 +00:00
Matei Zaharia 29e83f39e9 Fix replication with MEMORY_ONLY_DESER_2 2012-08-26 18:16:25 -07:00
root 9de1c3abf9 Tweaks to WordCount2 2012-08-27 00:57:00 +00:00
Matei Zaharia 57796b183e Code style 2012-08-26 17:25:22 -07:00
Matei Zaharia 22b1a20e61 Made Time and Interval immutable 2012-08-26 17:04:34 -07:00
Matei Zaharia 23a29b6d19 Merge branch 'dev' of github.com:radlab/spark into dev 2012-08-26 16:45:37 -07:00
Matei Zaharia b120e24fe0 Add equals and hashCode to Time 2012-08-26 16:45:14 -07:00
root b08ff710af Added sliding word count, and some fixes to reduce window DStream 2012-08-26 23:40:50 +00:00
Matei Zaharia 06ef7c3d1b Less debug info 2012-08-26 16:29:20 -07:00
Matei Zaharia ad6537321e Make Time serializable 2012-08-26 16:27:23 -07:00
Matei Zaharia 741899b21e Fix sendMessageReliablySync 2012-08-26 16:26:06 -07:00
Matei Zaharia 51453eb87b Merge pull request #179 from JoshRosen/fix/sparklr-caching: Cache points in SparkLR example 2012-08-26 15:32:50 -07:00
Josh Rosen 566feafe1d Cache points in SparkLR example. 2012-08-26 15:24:43 -07:00
Josh Rosen f3b852ce66 Refactor Python MappedRDD to use iterator pipelines. 2012-08-24 19:44:14 -07:00
Josh Rosen 4b52300487 Fix options parsing in Python pi example. 2012-08-24 19:42:47 -07:00
Matei Zaharia e7a5cbb543 Reduce log4j verbosity for streaming 2012-08-24 16:45:01 -07:00
Matei Zaharia 091b1438f5 Fix WordCount job name 2012-08-24 16:43:59 -07:00
Matei Zaharia 5a8015d2db Merge remote-tracking branch 'public/dev' into dev 2012-08-24 16:11:44 -07:00
Mosharaf Chowdhury edd1a740a6 Merge remote-tracking branch 'upstream/dev' into dev 2012-08-23 20:43:27 -07:00
Matei Zaharia 2c16ae36d7 Set log level in tests to WARN 2012-08-23 20:38:14 -07:00
Matei Zaharia deedb9e7b7 Fix further issues with tests and broadcast. The broadcast fix is to store values as MEMORY_ONLY_DESER instead of MEMORY_ONLY, which will save substantial time on serialization. 2012-08-23 20:31:49 -07:00
Mosharaf Chowdhury 3b1f5480a4 Merge remote-tracking branch 'upstream/dev' into dev 2012-08-23 20:16:50 -07:00
Matei Zaharia 59b831b9d1 Fixed test failures due to broadcast not stopping correctly 2012-08-23 19:59:55 -07:00
Matei Zaharia 7310a6f499 Merge pull request #147 from mosharaf/dev: Broadcast refactoring/cleaning up 2012-08-23 19:38:28 -07:00
Mosharaf Chowdhury 995ad6ba36 Merge remote-tracking branch 'upstream/dev' into dev 2012-08-23 09:51:38 -07:00
Josh Rosen 607b53abfc Use numpy in Python k-means example. 2012-08-22 00:43:55 -07:00
Matei Zaharia 79c82b6cfd Merge pull request #173 from squito/accum_localValue: make accumulator.localValue public, add tests 2012-08-22 00:11:21 -07:00
Josh Rosen fd94e5443c Use only cPickle for serialization in Python API. Objects serialized with JSON can be compared for equality, but JSON can be slow to serialize and only supports a limited range of data types. 2012-08-21 14:01:27 -07:00
Imran Rashid 4d2efe9555 change tests to show utility of localValue 2012-08-20 15:17:31 -07:00
Matei Zaharia 25a6a39e6d Added other SparkContext constructors to JavaSparkContext 2012-08-19 18:59:16 -07:00
Josh Rosen 13b9514966 Bundle cloudpickle with pyspark. 2012-08-19 17:17:42 -07:00
Josh Rosen 886b39de55 Add Python API. 2012-08-18 22:33:51 -07:00
Imran Rashid 823878c77f add accumulators for mutable collections, with correct typing! 2012-08-17 15:52:42 -07:00
Imran Rashid 206a3833ce make accumulator.localValue public, add tests 2012-08-14 14:08:22 -07:00
Matei Zaharia 9a0c128fec Merge pull request #172 from dennybritz/dev: Rsync root directory in EC2 script 2012-08-14 13:05:22 -07:00
Denny 8dc7242544 Use root login in standalone AMI 2012-08-14 10:18:24 -07:00
Denny 7152c7c12d rsync root directory in EC2 script 2012-08-14 09:26:47 -07:00