Commit graph

1664 commits

Author SHA1 Message Date
Tathagata Das 34e569f40e Added 'synchronized' to RDD serialization to ensure checkpoint-related changes are reflected atomically in the task closure. Added tests to ensure that running jobs on an RDD on which checkpointing is in progress does not hurt the result of the job. 2012-10-31 00:56:40 -07:00
Josh Rosen 96c9bcfd8d Cancel spot instance requests when exiting spark-ec2. 2012-10-30 23:32:38 -07:00
Tathagata Das 0dcd770fdc Added checkpointing support to all RDDs, along with CheckpointSuite to test checkpointing in them. 2012-10-30 16:09:37 -07:00
Tathagata Das ac12abc17f Modified RDD API to make dependencies a var (therefore can be changed to checkpointed hadoop rdd) and other references to parent RDDs either through dependencies or through a weak reference (to allow finalizing when dependencies do not refer to it any more). 2012-10-29 11:55:27 -07:00
Tathagata Das 1b900183c8 Added save operations to DStreams. 2012-10-27 18:55:50 -07:00
Matei Zaharia 51477e8874 Merge pull request #294 from JoshRosen/docs/quickstart
Fix minor typos in quickstart and Scala programming guides
2012-10-27 16:56:39 -07:00
Josh Rosen 33bea24f8e Fix Spark groupId in Scala Programming Guide. 2012-10-26 15:01:28 -07:00
root e782187b4a Don't throw an error in the block manager when a block is cached on the master due to a locally computed operation

Conflicts:

	core/src/main/scala/spark/storage/BlockManagerMaster.scala
2012-10-26 00:33:45 -07:00
Tathagata Das 650d717544 Merge branch 'dev' of github.com:radlab/spark into dev 2012-10-25 13:03:18 -07:00
Matei Zaharia 863a55ae42 Merge remote-tracking branch 'public/master' into dev
Conflicts:
	core/src/main/scala/spark/BlockStoreShuffleFetcher.scala
	core/src/main/scala/spark/KryoSerializer.scala
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/executor/Executor.scala
	core/src/main/scala/spark/network/Connection.scala
	core/src/main/scala/spark/network/ConnectionManagerTest.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/NewHadoopRDD.scala
	core/src/main/scala/spark/scheduler/ShuffleMapTask.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockMessage.scala
	core/src/main/scala/spark/storage/BlockStore.scala
	core/src/main/scala/spark/storage/StorageLevel.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	project/SparkBuild.scala
	run
2012-10-24 23:21:00 -07:00
Tathagata Das 926e05b030 Added tests for the file input stream. 2012-10-24 23:14:37 -07:00
Matei Zaharia f63a40fd99 Strip leading mesos:// in URLs passed to Mesos 2012-10-24 21:52:13 -07:00
Tathagata Das ed71df46cd Minor fixes. 2012-10-24 16:49:40 -07:00
Tathagata Das 1ef6ea2513 Added tests for testing network input stream. 2012-10-24 14:44:20 -07:00
Matei Zaharia d290e964ea Merge pull request #281 from rxin/memreport
Added a method to report slave memory status; force serialize accumulator update in local mode.
2012-10-23 22:04:35 -07:00
Matei Zaharia 0bd20c63e2 Merge remote-tracking branch 'JoshRosen/shuffle_refactoring' into dev
Conflicts:
	core/src/main/scala/spark/Dependency.scala
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/ShuffledRDD.scala
2012-10-23 22:01:45 -07:00
Matei Zaharia 7849216bba Merge pull request #286 from JoshRosen/ec2-error-handling
Allow EC2 script to stop/destroy cluster after master/slave failures
2012-10-23 21:15:43 -07:00
Matei Zaharia 46b87dfc3a Merge pull request #292 from tomdz/tweaked-run-file
Tweaked run file to live more happily with typesafe's debian package
2012-10-23 21:14:06 -07:00
Tathagata Das 020d643484 Renamed the streaming testsuites. 2012-10-23 16:24:05 -07:00
Tathagata Das 0e5d9be4df Renamed APIs to create queueStream and fileStream. 2012-10-23 15:17:05 -07:00
Tathagata Das c2731dd3ef Updated StateDStream api to use Options instead of nulls. 2012-10-23 15:10:27 -07:00
Tathagata Das 19191d178d Renamed the network input streams. 2012-10-23 14:40:24 -07:00
Josh Rosen c4aa10154e Fix minor typos in quick start guide. 2012-10-23 13:49:52 -07:00
Tathagata Das a6de5758f1 Modified API of NetworkInputDStreams and got ObjectInputDStream and RawInputDStream working. 2012-10-23 01:41:13 -07:00
Tathagata Das 2c87c853ba Renamed examples 2012-10-22 15:31:19 -07:00
Thomas Dudziak f595bb53d1 Tweaked run file to live more happily with typesafe's debian package 2012-10-22 13:11:05 -07:00
Matei Zaharia 0967e71a00 Bump up version to 0.7.0-SNAPSHOT for master branch 2012-10-22 11:49:42 -07:00
Matei Zaharia 902a608187 Update version to 0.6.1-SNAPSHOT to show this is in development 2012-10-22 11:43:57 -07:00
Tathagata Das d85c66636b Added MapValueDStream, FlatMappedValuesDStream and CoGroupedDStream, and therefore DStream operations mapValue, flatMapValues, cogroup, and join. Also, added tests for DStream operations filter, glom, mapPartitions, groupByKey, mapValues, flatMapValues, cogroup, and join. 2012-10-21 17:40:08 -07:00
Tathagata Das c4a2b6f636 Fixed some bugs in tests for forgetting RDDs, and made sure that use of manual clock leads to a zeroTime of 0 in the DStreams (more intuitive). 2012-10-21 10:41:25 -07:00
Matei Zaharia 1be335e8fa Merge branch 'master' into dev 2012-10-21 00:05:02 -07:00
Matei Zaharia 15e95be2fd Merge pull request #285 from tomdz/cdh4-dev
Support for Hadoop 2 distributions such as cdh4
2012-10-20 23:35:01 -07:00
Matei Zaharia 6999724ce8 Fix a path in the web UI 2012-10-20 23:33:37 -07:00
Patrick Wendell 45430f2cb9 Merge pull request #290 from pwendell/dev
Two trivial commits to test JIRA integration
2012-10-19 23:18:44 -07:00
Patrick Wendell cd0936529b SPARK-581 #resolve Removing whitespace to test JIRA 2012-10-19 23:17:44 -07:00
Patrick Wendell d50028b345 Adding whitespace to test JIRA integration 2012-10-19 23:17:44 -07:00
Tathagata Das 6d5eb4b40c Added functionality to forget RDDs from DStreams. 2012-10-19 12:11:44 -07:00
Matei Zaharia bff5ceff53 Merge pull request #287 from rxin/startslave
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:12:05 -07:00
Reynold Xin f67bcbed07 Use SPARK_MASTER_IP if it is set in start-slaves.sh. 2012-10-19 01:08:23 -07:00
Thomas Dudziak d9c2a89c57 Support for Hadoop 2 distributions such as cdh4 2012-10-18 16:08:54 -07:00
Josh Rosen 365a4c1e68 Allow EC2 script to stop/destroy cluster after master/slave failures. 2012-10-18 10:36:50 -07:00
Reynold Xin 4a3fb06ac2 Updated Kryo to 2.20. 2012-10-16 01:10:01 -07:00
Reynold Xin 3b97124604 Changed Spark version back to 0.6.0 2012-10-15 21:39:51 -07:00
Reynold Xin 63fae9bc23 Serialize accumulator updates in TaskResult for local mode. 2012-10-15 21:38:28 -07:00
Reynold Xin 9087a1abef Changed version to 0.6.0-rxin. 2012-10-15 13:54:04 -07:00
Tathagata Das b760d6426a Minor modifications. 2012-10-15 12:26:44 -07:00
Matei Zaharia 388a111153 Fix sbt assembly's merge rules 2012-10-15 10:21:16 -07:00
Reynold Xin 42d20fa8da Added a method to report slave memory status. 2012-10-14 22:30:53 -07:00
Tathagata Das 3f1aae5c71 Refactored DStreamSuiteBase to create CheckpointSuite, a test suite for testing checkpointing under different operations. 2012-10-14 21:39:30 -07:00
Matei Zaharia 63fe4e9d33 Merge pull request #279 from pwendell/dev
Removing credentials line in build.
2012-10-14 19:36:41 -07:00