Matei Zaharia
4a47d1a476
Merge pull request #297 from JoshRosen/fix/ec2-spot-instances
...
Cancel spot instance requests when exiting spark-ec2
2012-11-01 11:31:18 -07:00
Shivaram Venkataraman
a7d967a1ca
Remove unnecessary hash-map put in MemoryStore
2012-11-01 10:46:38 -07:00
Tathagata Das
34e569f40e
Added 'synchronized' to RDD serialization to ensure checkpoint-related changes are reflected atomically in the task closure. Added tests to ensure that a job running on an RDD on which checkpointing is in progress does not hurt the result of the job.
2012-10-31 00:56:40 -07:00
Josh Rosen
96c9bcfd8d
Cancel spot instance requests when exiting spark-ec2.
2012-10-30 23:32:38 -07:00
Tathagata Das
0dcd770fdc
Added checkpointing support to all RDDs, along with CheckpointSuite to test checkpointing in them.
2012-10-30 16:09:37 -07:00
Tathagata Das
ac12abc17f
Modified RDD API to make dependencies a var (so it can be changed to a checkpointed Hadoop RDD) and to make other references to parent RDDs go either through dependencies or through a weak reference (to allow finalizing when dependencies no longer refer to them).
2012-10-29 11:55:27 -07:00
Josh Rosen
2ccf3b6652
Fix PySpark hash partitioning bug.
...
A Java array's hashCode is based on its object
identity, not its elements, so this was causing
serialized keys to be hashed incorrectly.
This commit adds a PySpark-specific workaround
and adds more tests.
2012-10-28 22:30:28 -07:00
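Note on 2ccf3b6652: the hashing problem described above can be illustrated with a small Python analogy. This is a sketch only, not the code from the commit. `RawKey`, `identity_partition`, and `content_partition` are hypothetical names, and `zlib.crc32` stands in for whatever content-based hash the actual workaround uses. A plain Python object, like a Java array, hashes by identity by default, so two keys holding the same serialized bytes are not guaranteed to land in the same partition unless the hash is computed from the bytes themselves.

```python
import zlib


class RawKey:
    """Hypothetical stand-in for a Java byte[] holding a serialized key.

    It defines neither __eq__ nor __hash__, so Python falls back to
    identity-based hashing, much like hashCode on a Java array.
    """

    def __init__(self, data: bytes):
        self.data = data


def identity_partition(key: RawKey, num_partitions: int) -> int:
    # Depends on which object this is, not on the bytes it holds.
    return hash(key) % num_partitions


def content_partition(key: RawKey, num_partitions: int) -> int:
    # Depends only on the serialized bytes, so equal keys always agree.
    return zlib.crc32(key.data) % num_partitions


if __name__ == "__main__":
    a = RawKey(b"spark")
    b = RawKey(b"spark")  # same contents, different object
    print("identity-based:", identity_partition(a, 8), identity_partition(b, 8))
    print("content-based: ", content_partition(a, 8), content_partition(b, 8))
```

The identity-based results may or may not match on a given run, which is exactly the problem: grouping by key needs every copy of a key to map to the same partition every time.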
Josh Rosen
7859879aaa
Bump required Py4J version and add test for large broadcast variables.
2012-10-28 16:48:25 -07:00
Tathagata Das
1b900183c8
Added save operations to DStreams.
2012-10-27 18:55:50 -07:00
Matei Zaharia
51477e8874
Merge pull request #294 from JoshRosen/docs/quickstart
...
Fix minor typos in quickstart and Scala programming guides
2012-10-27 16:56:39 -07:00
Josh Rosen
33bea24f8e
Fix Spark groupId in Scala Programming Guide.
2012-10-26 15:01:28 -07:00
root
e782187b4a
Don't throw an error in the block manager when a block is cached on the master due to
...
a locally computed operation
Conflicts:
core/src/main/scala/spark/storage/BlockManagerMaster.scala
2012-10-26 00:33:45 -07:00
Tathagata Das
650d717544
Merge branch 'dev' of github.com:radlab/spark into dev
2012-10-25 13:03:18 -07:00
Matei Zaharia
863a55ae42
Merge remote-tracking branch 'public/master' into dev
...
Conflicts:
core/src/main/scala/spark/BlockStoreShuffleFetcher.scala
core/src/main/scala/spark/KryoSerializer.scala
core/src/main/scala/spark/MapOutputTracker.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/executor/Executor.scala
core/src/main/scala/spark/network/Connection.scala
core/src/main/scala/spark/network/ConnectionManagerTest.scala
core/src/main/scala/spark/rdd/BlockRDD.scala
core/src/main/scala/spark/rdd/NewHadoopRDD.scala
core/src/main/scala/spark/scheduler/ShuffleMapTask.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/BlockMessage.scala
core/src/main/scala/spark/storage/BlockStore.scala
core/src/main/scala/spark/storage/StorageLevel.scala
core/src/main/scala/spark/util/AkkaUtils.scala
project/SparkBuild.scala
run
2012-10-24 23:21:00 -07:00
Tathagata Das
926e05b030
Added tests for the file input stream.
2012-10-24 23:14:37 -07:00
Matei Zaharia
f63a40fd99
Strip leading mesos:// in URLs passed to Mesos
2012-10-24 21:52:13 -07:00
Tathagata Das
ed71df46cd
Minor fixes.
2012-10-24 16:49:40 -07:00
Tathagata Das
1ef6ea2513
Added tests for testing network input stream.
2012-10-24 14:44:20 -07:00
Matei Zaharia
d290e964ea
Merge pull request #281 from rxin/memreport
...
Added a method to report slave memory status; force serialize accumulator update in local mode.
2012-10-23 22:04:35 -07:00
Matei Zaharia
0bd20c63e2
Merge remote-tracking branch 'JoshRosen/shuffle_refactoring' into dev
...
Conflicts:
core/src/main/scala/spark/Dependency.scala
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/rdd/ShuffledRDD.scala
2012-10-23 22:01:45 -07:00
Matei Zaharia
7849216bba
Merge pull request #286 from JoshRosen/ec2-error-handling
...
Allow EC2 script to stop/destroy cluster after master/slave failures
2012-10-23 21:15:43 -07:00
Matei Zaharia
46b87dfc3a
Merge pull request #292 from tomdz/tweaked-run-file
...
Tweaked run file to live more happily with typesafe's debian package
2012-10-23 21:14:06 -07:00
Tathagata Das
020d643484
Renamed the streaming testsuites.
2012-10-23 16:24:05 -07:00
Tathagata Das
0e5d9be4df
Renamed APIs to create queueStream and fileStream.
2012-10-23 15:17:05 -07:00
Tathagata Das
c2731dd3ef
Updated StateDStream api to use Options instead of nulls.
2012-10-23 15:10:27 -07:00
Tathagata Das
19191d178d
Renamed the network input streams.
2012-10-23 14:40:24 -07:00
Josh Rosen
c4aa10154e
Fix minor typos in quick start guide.
2012-10-23 13:49:52 -07:00
Tathagata Das
a6de5758f1
Modified API of NetworkInputDStreams and got ObjectInputDStream and RawInputDStream working.
2012-10-23 01:41:13 -07:00
Tathagata Das
2c87c853ba
Renamed examples
2012-10-22 15:31:19 -07:00
Thomas Dudziak
f595bb53d1
Tweaked run file to live more happily with typesafe's debian package
2012-10-22 13:11:05 -07:00
Matei Zaharia
0967e71a00
Bump up version to 0.7.0-SNAPSHOT for master branch
2012-10-22 11:49:42 -07:00
Matei Zaharia
902a608187
Update version to 0.6.1-SNAPSHOT to show this is in development
2012-10-22 11:43:57 -07:00
Josh Rosen
d4f2e5b0ef
Remove PYTHONPATH from SparkContext's executorEnvs.
...
It makes more sense to pass it in the dictionary
of environment variables that is used to construct
PythonRDD.
2012-10-22 10:28:59 -07:00
Tathagata Das
d85c66636b
Added MapValueDStream, FlatMappedValuesDStream and CoGroupedDStream, and therefore DStream operations mapValues, flatMapValues, cogroup, and join. Also added tests for DStream operations filter, glom, mapPartitions, groupByKey, mapValues, flatMapValues, cogroup, and join.
2012-10-21 17:40:08 -07:00
Tathagata Das
c4a2b6f636
Fixed some bugs in tests for forgetting RDDs, and made sure that use of manual clock leads to a zeroTime of 0 in the DStreams (more intuitive).
2012-10-21 10:41:25 -07:00
Matei Zaharia
1be335e8fa
Merge branch 'master' into dev
2012-10-21 00:05:02 -07:00
Matei Zaharia
15e95be2fd
Merge pull request #285 from tomdz/cdh4-dev
...
Support for Hadoop 2 distributions such as cdh4
2012-10-20 23:35:01 -07:00
Matei Zaharia
6999724ce8
Fix a path in the web UI
2012-10-20 23:33:37 -07:00
Patrick Wendell
45430f2cb9
Merge pull request #290 from pwendell/dev
...
Two trivial commits to test JIRA integration
2012-10-19 23:18:44 -07:00
Patrick Wendell
cd0936529b
SPARK-581 #resolve Removing whitespace to test JIRA
2012-10-19 23:17:44 -07:00
Patrick Wendell
d50028b345
Adding whitespace to test JIRA integration
2012-10-19 23:17:44 -07:00
Josh Rosen
c23bf1aff4
Add PySpark README and run scripts.
2012-10-20 00:22:27 +00:00
Tathagata Das
6d5eb4b40c
Added functionality to forget RDDs from DStreams.
2012-10-19 12:11:44 -07:00
Josh Rosen
52989c8a2c
Update Python API for v0.6.0 compatibility.
2012-10-19 10:24:49 -07:00
Josh Rosen
e21eb6e00d
Merge tag 'v0.6.0' into python-api
2012-10-19 09:44:32 -07:00
Matei Zaharia
bff5ceff53
Merge pull request #287 from rxin/startslave
...
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:12:05 -07:00
Reynold Xin
f67bcbed07
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:08:23 -07:00
Thomas Dudziak
d9c2a89c57
Support for Hadoop 2 distributions such as cdh4
2012-10-18 16:08:54 -07:00
Josh Rosen
365a4c1e68
Allow EC2 script to stop/destroy cluster after master/slave failures.
2012-10-18 10:36:50 -07:00
Reynold Xin
4a3fb06ac2
Updated Kryo to 2.20.
2012-10-16 01:10:01 -07:00