Tathagata Das
2c87c853ba
Renamed examples
2012-10-22 15:31:19 -07:00
Thomas Dudziak
f595bb53d1
Tweaked run file to live more happily with Typesafe's Debian package
2012-10-22 13:11:05 -07:00
Matei Zaharia
0967e71a00
Bump up version to 0.7.0-SNAPSHOT for master branch
2012-10-22 11:49:42 -07:00
Matei Zaharia
902a608187
Update version to 0.6.1-SNAPSHOT to show this is in development
2012-10-22 11:43:57 -07:00
Josh Rosen
d4f2e5b0ef
Remove PYTHONPATH from SparkContext's executorEnvs.
...
It makes more sense to pass it in the dictionary
of environment variables that is used to construct
PythonRDD.
2012-10-22 10:28:59 -07:00
Tathagata Das
d85c66636b
Added MapValueDStream, FlatMappedValuesDStream and CoGroupedDStream, and therefore DStream operations mapValue, flatMapValues, cogroup, and join. Also, added tests for DStream operations filter, glom, mapPartitions, groupByKey, mapValues, flatMapValues, cogroup, and join.
2012-10-21 17:40:08 -07:00
Tathagata Das
c4a2b6f636
Fixed some bugs in tests for forgetting RDDs, and made sure that use of manual clock leads to a zeroTime of 0 in the DStreams (more intuitive).
2012-10-21 10:41:25 -07:00
Matei Zaharia
1be335e8fa
Merge branch 'master' into dev
2012-10-21 00:05:02 -07:00
Matei Zaharia
15e95be2fd
Merge pull request #285 from tomdz/cdh4-dev
...
Support for Hadoop 2 distributions such as cdh4
2012-10-20 23:35:01 -07:00
Matei Zaharia
6999724ce8
Fix a path in the web UI
2012-10-20 23:33:37 -07:00
Patrick Wendell
45430f2cb9
Merge pull request #290 from pwendell/dev
...
Two trivial commits to test JIRA integration
2012-10-19 23:18:44 -07:00
Patrick Wendell
cd0936529b
SPARK-581 #resolve Removing whitespace to test JIRA
2012-10-19 23:17:44 -07:00
Patrick Wendell
d50028b345
Adding whitespace to test JIRA integration
2012-10-19 23:17:44 -07:00
Josh Rosen
c23bf1aff4
Add PySpark README and run scripts.
2012-10-20 00:22:27 +00:00
Tathagata Das
6d5eb4b40c
Added functionality to forget RDDs from DStreams.
2012-10-19 12:11:44 -07:00
Josh Rosen
52989c8a2c
Update Python API for v0.6.0 compatibility.
2012-10-19 10:24:49 -07:00
Josh Rosen
e21eb6e00d
Merge tag 'v0.6.0' into python-api
2012-10-19 09:44:32 -07:00
Matei Zaharia
bff5ceff53
Merge pull request #287 from rxin/startslave
...
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:12:05 -07:00
Reynold Xin
f67bcbed07
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:08:23 -07:00
Thomas Dudziak
d9c2a89c57
Support for Hadoop 2 distributions such as cdh4
2012-10-18 16:08:54 -07:00
Josh Rosen
365a4c1e68
Allow EC2 script to stop/destroy cluster after master/slave failures.
2012-10-18 10:36:50 -07:00
Reynold Xin
4a3fb06ac2
Updated Kryo to 2.20.
2012-10-16 01:10:01 -07:00
Reynold Xin
3b97124604
Changed Spark version back to 0.6.0
2012-10-15 21:39:51 -07:00
Reynold Xin
63fae9bc23
Serialize accumulator updates in TaskResult for local mode.
2012-10-15 21:38:28 -07:00
Reynold Xin
9087a1abef
Changed version to 0.6.0-rxin.
2012-10-15 13:54:04 -07:00
Tathagata Das
b760d6426a
Minor modifications.
2012-10-15 12:26:44 -07:00
Matei Zaharia
388a111153
Fix sbt assembly's merge rules
2012-10-15 10:21:16 -07:00
Reynold Xin
42d20fa8da
Added a method to report slave memory status.
2012-10-14 22:30:53 -07:00
Tathagata Das
3f1aae5c71
Refactored DStreamSuiteBase to create CheckpointSuite, a test suite for testing checkpointing under different operations.
2012-10-14 21:39:30 -07:00
Matei Zaharia
63fe4e9d33
Merge pull request #279 from pwendell/dev
...
Removing credentials line in build.
2012-10-14 19:36:41 -07:00
Patrick Wendell
629dd2691e
Removing credentials line in build.
2012-10-14 19:33:39 -07:00
Matei Zaharia
f8768da418
Comment out PGP stuff for publish-local to work
2012-10-14 17:37:21 -07:00
Matei Zaharia
1f06445b03
tweak
2012-10-14 12:04:58 -07:00
Matei Zaharia
4947bd0958
tweak
2012-10-14 12:02:58 -07:00
Matei Zaharia
6c766a9187
tweak
2012-10-14 12:02:32 -07:00
Matei Zaharia
8192fe0325
Merge branch 'dev' of github.com:mesos/spark into dev
2012-10-14 12:01:38 -07:00
Matei Zaharia
1c73d8974d
Update README
2012-10-14 12:00:25 -07:00
Matei Zaharia
7855bacd26
Merge pull request #278 from pwendell/quickstart-fix
...
Adding dependency repos in quickstart example
2012-10-14 11:52:24 -07:00
Patrick Wendell
7a03a0e35d
Adding dependency repos in quickstart example
2012-10-14 11:48:24 -07:00
Matei Zaharia
64dbf8d372
Made ShuffleDependency automatically find a shuffle ID for itself
2012-10-14 10:00:22 -07:00
Matei Zaharia
64b52166ee
Changed default Hadoop version back to 0.20.205
2012-10-14 09:51:34 -07:00
Tathagata Das
b08708e6fc
Fixed bugs in the streaming testsuites.
2012-10-13 21:02:24 -07:00
Tathagata Das
e95ff45b53
Implemented checkpointing of StreamingContext and DStream graph.
2012-10-13 20:10:49 -07:00
Matei Zaharia
4be12d97ec
Some doc fixes, including showing version number in nav bar again
2012-10-13 19:05:11 -07:00
Matei Zaharia
19910c00c3
tweaks
2012-10-13 16:22:39 -07:00
Matei Zaharia
4a3e9cf69c
Document how to configure SPARK_MEM & co on a per-job basis
2012-10-13 16:20:25 -07:00
Matei Zaharia
ce6b5a3ee5
Uncomment Maven publishing stuff and set version to 0.6.0
2012-10-13 15:55:39 -07:00
Matei Zaharia
8815aeba0c
Take executor environment vars as an argument to SparkContext
2012-10-13 15:31:11 -07:00
Josh Rosen
33cd3a0c12
Remove map-side combining from ShuffleMapTask.
...
This separation of concerns simplifies the
ShuffleDependency and ShuffledRDD interfaces.
Map-side combining can be performed in a
mapPartitions() call prior to shuffling the RDD.
I don't anticipate this having much of a
performance impact: in both approaches, each tuple
is hashed twice: once in the bucket partitioning
and once in the combiner's hashtable. The same
steps are being performed, but in a different
order and through one extra Iterator.
2012-10-13 14:59:20 -07:00
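The commit message above argues that map-side combining can move out of the shuffle task and into a per-partition step that runs before the shuffle. A minimal sketch of that ordering follows, in plain Python rather than Spark itself. The names `combine_partition`, `shuffle`, and `reduce_by_key` are hypothetical stand-ins for illustration and are not Spark's API.

```python
from operator import add


def combine_partition(records, combine):
    """Merge values that share a key within a single partition.

    Plays the role of the mapPartitions() pre-combining step (hypothetical helper).
    """
    combined = {}
    for key, value in records:
        if key in combined:
            combined[key] = combine(combined[key], value)
        else:
            combined[key] = value
    return iter(combined.items())


def shuffle(partitions, num_buckets):
    """Hash-partition records into buckets; this step knows nothing about combining."""
    buckets = [[] for _ in range(num_buckets)]
    for partition in partitions:
        for key, value in partition:
            buckets[hash(key) % num_buckets].append((key, value))
    return buckets


def reduce_by_key(partitions, combine, num_buckets):
    """Pre-combine each partition, shuffle, then combine again per bucket."""
    # Map side: each key is hashed once here, in the combiner's dict...
    pre_combined = [combine_partition(p, combine) for p in partitions]
    # ...and once here, in the bucket partitioning.
    buckets = shuffle(pre_combined, num_buckets)
    # Reduce side: merge the partial results that landed in the same bucket.
    return [dict(combine_partition(b, combine)) for b in buckets]


if __name__ == "__main__":
    parts = [[("a", 1), ("b", 2), ("a", 3)], [("a", 4), ("c", 5)]]
    print(reduce_by_key(parts, add, num_buckets=2))
```

Because `shuffle` only moves key-value pairs, the combining logic lives entirely in the caller, which is the separation of concerns the message describes. Which bucket a key lands in depends on Python's string hashing, so the bucket layout can differ between runs while the per-key totals stay the same.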
Josh Rosen
10bcd217d2
Remove mapSideCombine field from Aggregator.
...
Instead, the presence or absence of a ShuffleDependency's aggregator
will control whether map-side combining is performed.
2012-10-13 14:59:20 -07:00