Matei Zaharia
17af2df0cd
Log levels
2012-08-27 23:07:32 -07:00
Matei Zaharia
a0b34d826a
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-27 22:49:52 -07:00
Matei Zaharia
b4a2214218
More fault tolerance fixes to catch lost tasks
2012-08-27 22:49:29 -07:00
Matei Zaharia
291abc2c28
Merge pull request #181 from rxin/dev
Removed the deserialization cache for ShuffleMapTask
2012-08-27 22:38:22 -07:00
Reynold Xin
3a6a95dc24
Removed the deserialization cache for ShuffleMapTask because it was
causing concurrency problems (some variables in Shark get set to null).
The cost of task deserialization on slaves is trivial compared with the
execution time of the task anyway.
2012-08-27 22:33:15 -07:00
Josh Rosen
4143678509
Fix minor bugs in Python API examples.
2012-08-27 00:24:47 -07:00
Josh Rosen
bff6a46359
Add pipe(), saveAsTextFile(), sc.union() to Python API.
2012-08-27 00:24:47 -07:00
Josh Rosen
200d248dcc
Simplify Python worker; pipeline the map step of partitionBy().
2012-08-27 00:24:39 -07:00
Josh Rosen
6904cb77d4
Use local combiners in Python API combineByKey().
2012-08-27 00:19:26 -07:00
Josh Rosen
8b64b7ecd8
Add countByKey(), reduceByKeyLocally() to Python API
2012-08-27 00:19:22 -07:00
Josh Rosen
08b201d810
Add mapPartitions(), glom(), countByValue() to Python API.
2012-08-27 00:19:14 -07:00
Josh Rosen
f79a1e4d2a
Add broadcast variables to Python API.
2012-08-27 00:16:47 -07:00
Josh Rosen
65e8406029
Implement fold() in Python API.
2012-08-27 00:16:47 -07:00
root
e2cf197a0a
Made WordCount2 even more configurable
2012-08-27 03:34:15 +00:00
root
9635823947
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-27 03:08:25 +00:00
Matei Zaharia
b914cd0dfa
Serialize generation correctly in ShuffleMapTask
2012-08-26 20:07:59 -07:00
root
20f6b0cfc9
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-27 03:01:03 +00:00
Matei Zaharia
69c2ab0408
logging
2012-08-26 20:00:58 -07:00
root
89c5c03035
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-27 02:53:07 +00:00
Matei Zaharia
117e3f8c86
Fix a bug that was causing FetchFailedException not to be thrown
2012-08-26 19:52:56 -07:00
root
beb6456442
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-27 02:37:49 +00:00
Matei Zaharia
3c9c44a8d3
More helpful log messages
2012-08-26 19:37:43 -07:00
root
7b59943d79
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-27 01:57:12 +00:00
Matei Zaharia
26dfd20c9a
Detect disconnected slaves in StandaloneScheduler
2012-08-26 18:56:56 -07:00
root
b78c5ae803
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-27 01:16:39 +00:00
Matei Zaharia
29e83f39e9
Fix replication with MEMORY_ONLY_DESER_2
2012-08-26 18:16:25 -07:00
root
9de1c3abf9
Tweaks to WordCount2
2012-08-27 00:57:00 +00:00
Matei Zaharia
57796b183e
Code style
2012-08-26 17:25:22 -07:00
Matei Zaharia
22b1a20e61
Made Time and Interval immutable
2012-08-26 17:04:34 -07:00
Matei Zaharia
23a29b6d19
Merge branch 'dev' of github.com:radlab/spark into dev
2012-08-26 16:45:37 -07:00
Matei Zaharia
b120e24fe0
Add equals and hashCode to Time
2012-08-26 16:45:14 -07:00
root
b08ff710af
Added sliding word count, and some fixes to reduce window DStream
2012-08-26 23:40:50 +00:00
Matei Zaharia
06ef7c3d1b
Less debug info
2012-08-26 16:29:20 -07:00
Matei Zaharia
ad6537321e
Make Time serializable
2012-08-26 16:27:23 -07:00
Matei Zaharia
741899b21e
Fix sendMessageReliablySync
2012-08-26 16:26:06 -07:00
Matei Zaharia
51453eb87b
Merge pull request #179 from JoshRosen/fix/sparklr-caching
Cache points in SparkLR example
2012-08-26 15:32:50 -07:00
Josh Rosen
566feafe1d
Cache points in SparkLR example.
2012-08-26 15:24:43 -07:00
Josh Rosen
f3b852ce66
Refactor Python MappedRDD to use iterator pipelines.
2012-08-24 19:44:14 -07:00
Josh Rosen
4b52300487
Fix options parsing in Python pi example.
2012-08-24 19:42:47 -07:00
Matei Zaharia
e7a5cbb543
Reduce log4j verbosity for streaming
2012-08-24 16:45:01 -07:00
Matei Zaharia
091b1438f5
Fix WordCount job name
2012-08-24 16:43:59 -07:00
Matei Zaharia
5a8015d2db
Merge remote-tracking branch 'public/dev' into dev
2012-08-24 16:11:44 -07:00
Mosharaf Chowdhury
edd1a740a6
Merge remote-tracking branch 'upstream/dev' into dev
2012-08-23 20:43:27 -07:00
Matei Zaharia
2c16ae36d7
Set log level in tests to WARN
2012-08-23 20:38:14 -07:00
Matei Zaharia
deedb9e7b7
Fix further issues with tests and broadcast.
The broadcast fix is to store values as MEMORY_ONLY_DESER instead of
MEMORY_ONLY, which will save substantial time on serialization.
2012-08-23 20:31:49 -07:00
Mosharaf Chowdhury
3b1f5480a4
Merge remote-tracking branch 'upstream/dev' into dev
2012-08-23 20:16:50 -07:00
Matei Zaharia
59b831b9d1
Fixed test failures due to broadcast not stopping correctly
2012-08-23 19:59:55 -07:00
Matei Zaharia
7310a6f499
Merge pull request #147 from mosharaf/dev
Broadcast refactoring/cleaning up
2012-08-23 19:38:28 -07:00
Mosharaf Chowdhury
995ad6ba36
Merge remote-tracking branch 'upstream/dev' into dev
2012-08-23 09:51:38 -07:00
Josh Rosen
607b53abfc
Use numpy in Python k-means example.
2012-08-22 00:43:55 -07:00