Commit graph

50 commits

Author SHA1 Message Date
Patrick Wendell b0974e6c1d Remving main method from tests 2013-01-14 09:42:36 -08:00
Patrick Wendell 867a7455e2 Adding some initial tests to streaming API. 2013-01-14 09:42:36 -08:00
Tathagata Das 6cc8592f26 Fixed bug 2013-01-13 21:20:49 -08:00
Tathagata Das 365506fb03 Changed variable name form ***Time to ***Duration to keep things consistent. 2013-01-09 14:29:25 -08:00
Tathagata Das 4719e6d8fe Changed locations for unit test logs. 2013-01-07 16:06:07 -08:00
Tathagata Das e60514d79e Fixed bug 2013-01-07 15:16:16 -08:00
Tathagata Das 237bac36e9 Renamed examples and added documentation. 2013-01-07 14:37:21 -08:00
Tathagata Das 7e0271b438 Refactored a whole lot to push all DStreams into the spark.streaming.dstream package. 2012-12-30 15:19:55 -08:00
Tathagata Das 9e644402c1 Improved jekyll and scala docs. Made many classes and method private to remove them from scala docs. 2012-12-29 18:31:51 -08:00
Tathagata Das 0bc0a60d30 Modifications to make sure LocalScheduler terminate cleanly without errors when SparkContext is shutdown, to minimize spurious exception during master failure tests. 2012-12-27 15:37:33 -08:00
Patrick Wendell 3ff9710265 Adding Flume InputDStream 2012-12-07 16:42:39 -08:00
Denny 15df4b0e52 Merge branch 'dev' into kafka
Conflicts:
	streaming/src/main/scala/spark/streaming/DStream.scala
2012-12-05 10:16:56 -08:00
Tathagata Das 0fe2fc4d5e Merged branch mesos/master to branch dev. 2012-11-26 13:16:59 -08:00
Tathagata Das fd11d23bb3 Modified StreamingContext API to make constructor accept the batch size (since it is always needed, Patrick's suggestion). Added description to DStream and StreamingContext. 2012-11-19 19:04:39 -08:00
Denny 2aceae25be Merge branch 'dev' into kafka
Conflicts:
	streaming/src/main/scala/spark/streaming/DStream.scala
2012-11-13 13:16:18 -08:00
Tathagata Das 8a25d530ed Optimized checkpoint writing by reusing FileSystem object. Fixed bug in updating of checkpoint data in DStream where the checkpointed RDDs, upon recovery, were not recognized as checkpointed RDDs and therefore deleted from HDFS. Made InputStreamsSuite more robust to timing delays. 2012-11-13 02:16:28 -08:00
Denny 255b3e44c1 Merge branch 'dev' into kafka 2012-11-12 19:39:29 -08:00
Tathagata Das 564dd8c3f4 Speeded up CheckpointSuite 2012-11-12 14:22:05 -08:00
Tathagata Das 46222dc56d Fixed bug in FileInputDStream that allowed it to miss new files. Added tests in the InputStreamsSuite to test checkpointing of file and network streams. 2012-11-11 13:20:09 -08:00
Denny d006109e95 Kafka Stream comments. 2012-11-11 11:06:49 -08:00
tdas cc2a65f547 Fixed bug in InputStreamsSuite 2012-11-08 11:17:57 +00:00
Tathagata Das fc3d0b602a Added FailureTestsuite for testing multiple, repeated master failures. 2012-11-06 17:23:31 -08:00
Tathagata Das 395167f2b2 Made more bug fixes for checkpointing. 2012-11-05 16:11:50 -08:00
Tathagata Das 72b2303f99 Fixed major bugs in checkpointing. 2012-11-05 11:41:36 -08:00
Tathagata Das d154238789 Made checkpointing of dstream graph to work with checkpointing of RDDs. For streams requiring checkpointing of its RDD, the default checkpoint interval is set to 10 seconds. 2012-11-04 12:12:06 -08:00
Tathagata Das 3fb5c9ee24 Fixed serialization bug in countByWindow, added countByKey and countByKeyAndWindow, and added testcases for them. 2012-11-02 12:12:25 -07:00
Tathagata Das 650d717544 Merge branch 'dev' of github.com:radlab/spark into dev 2012-10-25 13:03:18 -07:00
Matei Zaharia 863a55ae42 Merge remote-tracking branch 'public/master' into dev
Conflicts:
	core/src/main/scala/spark/BlockStoreShuffleFetcher.scala
	core/src/main/scala/spark/KryoSerializer.scala
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/executor/Executor.scala
	core/src/main/scala/spark/network/Connection.scala
	core/src/main/scala/spark/network/ConnectionManagerTest.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/NewHadoopRDD.scala
	core/src/main/scala/spark/scheduler/ShuffleMapTask.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockMessage.scala
	core/src/main/scala/spark/storage/BlockStore.scala
	core/src/main/scala/spark/storage/StorageLevel.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	project/SparkBuild.scala
	run
2012-10-24 23:21:00 -07:00
Tathagata Das 926e05b030 Added tests for the file input stream. 2012-10-24 23:14:37 -07:00
Tathagata Das ed71df46cd Minor fixes. 2012-10-24 16:49:40 -07:00
Tathagata Das 1ef6ea2513 Added tests for testing network input stream. 2012-10-24 14:44:20 -07:00
Tathagata Das 020d643484 Renamed the streaming testsuites. 2012-10-23 16:24:05 -07:00
Tathagata Das c2731dd3ef Updated StateDStream api to use Options instead of nulls. 2012-10-23 15:10:27 -07:00
Tathagata Das d85c66636b Added MapValueDStream, FlatMappedValuesDStream and CoGroupedDStream, and therefore DStream operations mapValue, flatMapValues, cogroup, and join. Also, added tests for DStream operations filter, glom, mapPartitions, groupByKey, mapValues, flatMapValues, cogroup, and join. 2012-10-21 17:40:08 -07:00
Tathagata Das c4a2b6f636 Fixed some bugs in tests for forgetting RDDs, and made sure that use of manual clock leads to a zeroTime of 0 in the DStreams (more intuitive). 2012-10-21 10:41:25 -07:00
Tathagata Das 6d5eb4b40c Added functionality to forget RDDs from DStreams. 2012-10-19 12:11:44 -07:00
Tathagata Das b760d6426a Minor modifications. 2012-10-15 12:26:44 -07:00
Tathagata Das 3f1aae5c71 Refactored DStreamSuiteBase to create CheckpointSuite- testsuite for testing checkpointing under different operations. 2012-10-14 21:39:30 -07:00
Tathagata Das b08708e6fc Fixed bugs in the streaming testsuites. 2012-10-13 21:02:24 -07:00
Tathagata Das 4a7bde6865 Fixed bugs and added testcases for naive reduceByKeyAndWindow. 2012-09-06 19:06:59 -07:00
Tathagata Das babb7e3ce2 Re-implemented ReducedWindowedDSteam to simplify and fix bugs. Added slice operator to DStream. Also, refactored DStream testsuites and added tests for reduceByKeyAndWindow. 2012-09-06 05:28:29 -07:00
Tathagata Das 7c09ad0e04 Changed DStream member access permissions from private to protected. Updated StateDStream to checkpoint RDDs and forget lineage. 2012-09-04 19:11:49 -07:00
Tathagata Das 7419d2c7ea Added transformRDD DStream operation and TransformedDStream. Added sbt assembly option for streaming project. 2012-09-02 02:35:17 -07:00
Tathagata Das 2d01d38a41 Added StateDStream, corresponding stateful stream operations, and testcases. Also refactored few PairDStreamFunctions methods. 2012-08-31 03:47:34 -07:00
Tathagata Das 43e66146f7 Merge branch 'dev' of github.com/radlab/spark into dev 2012-08-28 13:51:05 -07:00
Tathagata Das b5b93a621c Added capabllity to take streaming input from network. Renamed SparkStreamContext to StreamingContext. 2012-08-28 12:35:19 -07:00
Matei Zaharia e7a5cbb543 Reduce log4j verbosity for streaming 2012-08-24 16:45:01 -07:00
Tathagata Das cae894ee7a Added new Clock interface that is used by RecurringTimer to scheduler events on system time or manually-configured time. 2012-08-06 14:52:46 -07:00
Matei Zaharia 43b81eb271 Renamed RDS to DStream, plus minor style fixes 2012-08-02 14:05:51 -04:00
Tathagata Das 3be54c2a8a 1. Refactored SparkStreamContext, Scheduler, InputRDS, FileInputRDS and a few other files.
2. Modified Time class to represent milliseconds (long) directly, instead of LongTime.
3. Added new files QueueInputRDS, RecurringTimer, etc.
4. Added RDDSuite as the skeleton for testcases.
5. Added two examples in spark.streaming.examples.
6. Removed all past examples and a few unnecessary files. Moved a number of files to spark.streaming.util.
2012-08-01 22:09:27 -07:00