Tathagata Das
|
af8738dfb5
|
Moved Spark Streaming examples to examples sub-project.
|
2013-01-06 19:31:54 -08:00 |
|
Tathagata Das
|
02497f0cd4
|
Updated Streaming Programming Guide.
|
2013-01-01 12:21:32 -08:00 |
|
Tathagata Das
|
18b9b3b99f
|
More classes made private[streaming] to hide from scala docs.
|
2012-12-30 20:00:42 -08:00 |
|
Tathagata Das
|
7e0271b438
|
Refactored a whole lot to push all DStreams into the spark.streaming.dstream package.
|
2012-12-30 15:19:55 -08:00 |
|
Tathagata Das
|
9e644402c1
|
Improved jekyll and scala docs. Made many classes and methods private to remove them from scala docs.
|
2012-12-29 18:31:51 -08:00 |
|
Tathagata Das
|
0bc0a60d30
|
Modifications to make sure LocalScheduler terminates cleanly without errors when SparkContext is shut down, to minimize spurious exceptions during master failure tests.
|
2012-12-27 15:37:33 -08:00 |
|
Tathagata Das
|
8512dd3225
|
Merge branch 'dev' of github.com:radlab/spark into dev-checkpoint
Conflicts:
core/src/main/scala/spark/ParallelCollection.scala
core/src/test/scala/spark/CheckpointSuite.scala
streaming/src/main/scala/spark/streaming/DStream.scala
|
2012-12-20 14:24:19 -08:00 |
|
Tathagata Das
|
8e74fac215
|
Made checkpoint data in RDDs optional to further reduce serialized size.
|
2012-12-11 15:36:12 -08:00 |
|
Patrick Wendell
|
3e796bdd57
|
Changes in response to TD's review.
|
2012-12-07 19:34:05 -08:00 |
|
Patrick Wendell
|
3ff9710265
|
Adding Flume InputDStream
|
2012-12-07 16:42:39 -08:00 |
|
Denny
|
556c38ed91
|
Added kafka JAR
|
2012-12-05 11:54:42 -08:00 |
|
Denny
|
a23462191f
|
Adjust Kafka code to work with new streaming changes.
|
2012-12-05 10:30:40 -08:00 |
|
Denny
|
15df4b0e52
|
Merge branch 'dev' into kafka
Conflicts:
streaming/src/main/scala/spark/streaming/DStream.scala
|
2012-12-05 10:16:56 -08:00 |
|
Tathagata Das
|
21a0852976
|
Refactored RDD checkpointing to minimize extra fields in RDD class.
|
2012-12-04 22:10:25 -08:00 |
|
Tathagata Das
|
609e00d599
|
Minor mods
|
2012-12-02 02:39:08 +00:00 |
|
Tathagata Das
|
b4dba55f78
|
Made RDD checkpoint not create a new thread. Fixed bug in detecting when spark.cleaner.delay is insufficient.
|
2012-12-02 02:03:05 +00:00 |
|
Tathagata Das
|
477de94894
|
Minor modifications.
|
2012-12-01 13:15:06 -08:00 |
|
Tathagata Das
|
62965c5d8e
|
Added ssc.union
|
2012-12-01 08:26:10 -08:00 |
|
Tathagata Das
|
d5e7aad039
|
Bug fixes
|
2012-11-28 08:36:55 +00:00 |
|
Tathagata Das
|
b18d70870a
|
Modified a bunch of HashMaps in Spark to use TimeStampedHashMap and made various modules use CleanupTask to periodically clean up metadata.
|
2012-11-27 15:08:49 -08:00 |
|
Tathagata Das
|
0fe2fc4d5e
|
Merged branch mesos/master to branch dev.
|
2012-11-26 13:16:59 -08:00 |
|
Tathagata Das
|
fd11d23bb3
|
Modified StreamingContext API to make constructor accept the batch size (since it is always needed, Patrick's suggestion). Added description to DStream and StreamingContext.
|
2012-11-19 19:04:39 -08:00 |
|
Tathagata Das
|
c97ebf6437
|
Fixed bug in the number of splits in RDD after checkpointing. Modified reduceByKeyAndWindow (naive) computation from window+reduceByKey to reduceByKey+window+reduceByKey.
|
2012-11-19 23:22:07 +00:00 |
|
Denny
|
5e2b0a3bf6
|
Added Kafka Wordcount producer
|
2012-11-19 10:17:58 -08:00 |
|
Denny
|
6757ed6a40
|
Comment out code for fault-tolerance.
|
2012-11-19 09:42:35 -08:00 |
|
Denny
|
f56befa914
|
Merge branch 'dev' into kafka
|
2012-11-19 09:29:54 -08:00 |
|
Tathagata Das
|
3fd7b8319b
|
Merge branch 'dev' of github.com:radlab/spark into dev
|
2012-11-17 17:27:07 -08:00 |
|
Tathagata Das
|
10c1abcb6a
|
Fixed checkpointing bug in CoGroupedRDD. CoGroupSplits kept around the RDD splits of its parent RDDs, thus checkpointing its parents did not release the references to the parent splits.
|
2012-11-17 17:27:00 -08:00 |
|
Patrick Wendell
|
efa93fd0e6
|
Merge pull request #4 from radlab/streaming-example
A "streaming page view" example.
|
2012-11-16 20:40:27 -08:00 |
|
Patrick Wendell
|
720cb0f467
|
A "streaming page view" example.
|
2012-11-16 12:11:22 -08:00 |
|
Denny
|
2aceae25be
|
Merge branch 'dev' into kafka
Conflicts:
streaming/src/main/scala/spark/streaming/DStream.scala
|
2012-11-13 13:16:18 -08:00 |
|
Denny
|
b6f7ba813e
|
change import for example function
|
2012-11-13 13:15:32 -08:00 |
|
Tathagata Das
|
26fec8f0b8
|
Fixed bug in MappedValuesRDD, and set default graph checkpoint interval to be batch duration.
|
2012-11-13 11:05:57 -08:00 |
|
Tathagata Das
|
c3ccd14cf8
|
Replaced StateRDD in StateDStream with MapPartitionsRDD.
|
2012-11-13 02:43:03 -08:00 |
|
Tathagata Das
|
8a25d530ed
|
Optimized checkpoint writing by reusing FileSystem object. Fixed bug in updating of checkpoint data in DStream where the checkpointed RDDs, upon recovery, were not recognized as checkpointed RDDs and therefore deleted from HDFS. Made InputStreamsSuite more robust to timing delays.
|
2012-11-13 02:16:28 -08:00 |
|
Denny
|
255b3e44c1
|
Merge branch 'dev' into kafka
|
2012-11-12 19:39:29 -08:00 |
|
Tathagata Das
|
564dd8c3f4
|
Sped up CheckpointSuite
|
2012-11-12 14:22:05 -08:00 |
|
Tathagata Das
|
b9bfd1456f
|
Changed default level on calling DStream.persist() to be MEMORY_ONLY_SER. Also changed the persist level of StateDStream to be MEMORY_ONLY_SER.
|
2012-11-12 21:51:42 +00:00 |
|
Tathagata Das
|
ae61ebaee6
|
Fixed bugs in RawNetworkInputDStream and in its examples. Made the ReducedWindowedDStream persist RDDs to MEMORY_ONLY_SER by default. Removed unnecessary examples. Added streaming-env.sh.template to add recommended settings for streaming.
|
2012-11-12 21:45:16 +00:00 |
|
tdas
|
052d0b800f
|
Merge branch 'dev' of github.com:radlab/spark into dev
|
2012-11-11 22:56:14 +00:00 |
|
Tathagata Das
|
46222dc56d
|
Fixed bug in FileInputDStream that allowed it to miss new files. Added tests in the InputStreamsSuite to test checkpointing of file and network streams.
|
2012-11-11 13:20:09 -08:00 |
|
Denny
|
0fd4c93f1c
|
Updated comment.
|
2012-11-11 11:15:31 -08:00 |
|
Denny
|
deb2c4df72
|
Add comment.
|
2012-11-11 11:11:49 -08:00 |
|
Denny
|
d006109e95
|
Kafka Stream comments.
|
2012-11-11 11:06:49 -08:00 |
|
Denny
|
2e8f2ee4ad
|
Merge branch 'dev' of github.com:radlab/spark into kafka
Conflicts:
streaming/src/main/scala/spark/streaming/DStream.scala
|
2012-11-09 12:26:17 -08:00 |
|
Denny
|
e5a0936787
|
Kafka Stream.
|
2012-11-09 12:23:46 -08:00 |
|
tdas
|
52d21cb682
|
Removed unnecessary files.
|
2012-11-08 11:35:40 +00:00 |
|
tdas
|
cc2a65f547
|
Fixed bug in InputStreamsSuite
|
2012-11-08 11:17:57 +00:00 |
|
Tathagata Das
|
fc3d0b602a
|
Added FailureTestsuite for testing multiple, repeated master failures.
|
2012-11-06 17:23:31 -08:00 |
|
Denny
|
485803d740
|
Merge branch 'dev' of github.com:radlab/spark into kafka
|
2012-11-06 09:41:45 -08:00 |
|