Josh Rosen | 4143678509 | Fix minor bugs in Python API examples. | 2012-08-27 00:24:47 -07:00
Josh Rosen | bff6a46359 | Add pipe(), saveAsTextFile(), sc.union() to Python API. | 2012-08-27 00:24:47 -07:00
Josh Rosen | 200d248dcc | Simplify Python worker; pipeline the map step of partitionBy(). | 2012-08-27 00:24:39 -07:00
Josh Rosen | 6904cb77d4 | Use local combiners in Python API combineByKey(). | 2012-08-27 00:19:26 -07:00
Josh Rosen | 8b64b7ecd8 | Add countByKey(), reduceByKeyLocally() to Python API | 2012-08-27 00:19:22 -07:00
Josh Rosen | 08b201d810 | Add mapPartitions(), glom(), countByValue() to Python API. | 2012-08-27 00:19:14 -07:00
Josh Rosen | f79a1e4d2a | Add broadcast variables to Python API. | 2012-08-27 00:16:47 -07:00
Josh Rosen | 65e8406029 | Implement fold() in Python API. | 2012-08-27 00:16:47 -07:00
root | e2cf197a0a | Made WordCount2 even more configurable | 2012-08-27 03:34:15 +00:00
root | 9635823947 | Merge branch 'dev' of github.com:radlab/spark into dev | 2012-08-27 03:08:25 +00:00
Matei Zaharia | b914cd0dfa | Serialize generation correctly in ShuffleMapTask | 2012-08-26 20:07:59 -07:00
root | 20f6b0cfc9 | Merge branch 'dev' of github.com:radlab/spark into dev | 2012-08-27 03:01:03 +00:00
Matei Zaharia | 69c2ab0408 | logging | 2012-08-26 20:00:58 -07:00
root | 89c5c03035 | Merge branch 'dev' of github.com:radlab/spark into dev | 2012-08-27 02:53:07 +00:00
Matei Zaharia | 117e3f8c86 | Fix a bug that was causing FetchFailedException not to be thrown | 2012-08-26 19:52:56 -07:00
root | beb6456442 | Merge branch 'dev' of github.com:radlab/spark into dev | 2012-08-27 02:37:49 +00:00
Matei Zaharia | 3c9c44a8d3 | More helpful log messages | 2012-08-26 19:37:43 -07:00
root | 7b59943d79 | Merge branch 'dev' of github.com:radlab/spark into dev | 2012-08-27 01:57:12 +00:00
Matei Zaharia | 26dfd20c9a | Detect disconnected slaves in StandaloneScheduler | 2012-08-26 18:56:56 -07:00
root | b78c5ae803 | Merge branch 'dev' of github.com:radlab/spark into dev | 2012-08-27 01:16:39 +00:00
Matei Zaharia | 29e83f39e9 | Fix replication with MEMORY_ONLY_DESER_2 | 2012-08-26 18:16:25 -07:00
root | 9de1c3abf9 | Tweaks to WordCount2 | 2012-08-27 00:57:00 +00:00
Matei Zaharia | 57796b183e | Code style | 2012-08-26 17:25:22 -07:00
Matei Zaharia | 22b1a20e61 | Made Time and Interval immutable | 2012-08-26 17:04:34 -07:00
Matei Zaharia | 23a29b6d19 | Merge branch 'dev' of github.com:radlab/spark into dev | 2012-08-26 16:45:37 -07:00
Matei Zaharia | b120e24fe0 | Add equals and hashCode to Time | 2012-08-26 16:45:14 -07:00
root | b08ff710af | Added sliding word count, and some fixes to reduce window DStream | 2012-08-26 23:40:50 +00:00
Matei Zaharia | 06ef7c3d1b | Less debug info | 2012-08-26 16:29:20 -07:00
Matei Zaharia | ad6537321e | Make Time serializable | 2012-08-26 16:27:23 -07:00
Matei Zaharia | 741899b21e | Fix sendMessageReliablySync | 2012-08-26 16:26:06 -07:00
Matei Zaharia | 51453eb87b | Merge pull request #179 from JoshRosen/fix/sparklr-caching: Cache points in SparkLR example | 2012-08-26 15:32:50 -07:00
Josh Rosen | 566feafe1d | Cache points in SparkLR example. | 2012-08-26 15:24:43 -07:00
Josh Rosen | f3b852ce66 | Refactor Python MappedRDD to use iterator pipelines. | 2012-08-24 19:44:14 -07:00
Josh Rosen | 4b52300487 | Fix options parsing in Python pi example. | 2012-08-24 19:42:47 -07:00
Matei Zaharia | e7a5cbb543 | Reduce log4j verbosity for streaming | 2012-08-24 16:45:01 -07:00
Matei Zaharia | 091b1438f5 | Fix WordCount job name | 2012-08-24 16:43:59 -07:00
Matei Zaharia | 5a8015d2db | Merge remote-tracking branch 'public/dev' into dev | 2012-08-24 16:11:44 -07:00
Mosharaf Chowdhury | edd1a740a6 | Merge remote-tracking branch 'upstream/dev' into dev | 2012-08-23 20:43:27 -07:00
Matei Zaharia | 2c16ae36d7 | Set log level in tests to WARN | 2012-08-23 20:38:14 -07:00
Matei Zaharia | deedb9e7b7 | Fix further issues with tests and broadcast. The broadcast fix is to store values as MEMORY_ONLY_DESER instead of MEMORY_ONLY, which will save substantial time on serialization. | 2012-08-23 20:31:49 -07:00
Mosharaf Chowdhury | 3b1f5480a4 | Merge remote-tracking branch 'upstream/dev' into dev | 2012-08-23 20:16:50 -07:00
Matei Zaharia | 59b831b9d1 | Fixed test failures due to broadcast not stopping correctly | 2012-08-23 19:59:55 -07:00
Matei Zaharia | 7310a6f499 | Merge pull request #147 from mosharaf/dev: Broadcast refactoring/cleaning up | 2012-08-23 19:38:28 -07:00
Mosharaf Chowdhury | 995ad6ba36 | Merge remote-tracking branch 'upstream/dev' into dev | 2012-08-23 09:51:38 -07:00
Josh Rosen | 607b53abfc | Use numpy in Python k-means example. | 2012-08-22 00:43:55 -07:00
Matei Zaharia | 79c82b6cfd | Merge pull request #173 from squito/accum_localValue: make accumulator.localValue public, add tests | 2012-08-22 00:11:21 -07:00
Josh Rosen | fd94e5443c | Use only cPickle for serialization in Python API. Objects serialized with JSON can be compared for equality, but JSON can be slow to serialize and only supports a limited range of data types. | 2012-08-21 14:01:27 -07:00
Imran Rashid | 4d2efe9555 | change tests to show utility of localValue | 2012-08-20 15:17:31 -07:00
Matei Zaharia | 25a6a39e6d | Added other SparkContext constructors to JavaSparkContext | 2012-08-19 18:59:16 -07:00
Josh Rosen | 13b9514966 | Bundle cloudpickle with pyspark. | 2012-08-19 17:17:42 -07:00