Denny
b6f7ba813e
change import for example function
2012-11-13 13:15:32 -08:00
Tathagata Das
26fec8f0b8
Fixed bug in MappedValuesRDD, and set default graph checkpoint interval to be batch duration.
2012-11-13 11:05:57 -08:00
Tathagata Das
c3ccd14cf8
Replaced StateRDD in StateDStream with MapPartitionsRDD.
2012-11-13 02:43:03 -08:00
Tathagata Das
8a25d530ed
Optimized checkpoint writing by reusing FileSystem object. Fixed bug in updating of checkpoint data in DStream where the checkpointed RDDs, upon recovery, were not recognized as checkpointed RDDs and therefore deleted from HDFS. Made InputStreamsSuite more robust to timing delays.
2012-11-13 02:16:28 -08:00
Denny
255b3e44c1
Merge branch 'dev' into kafka
2012-11-12 19:39:29 -08:00
Tathagata Das
564dd8c3f4
Sped up CheckpointSuite
2012-11-12 14:22:05 -08:00
Tathagata Das
b9bfd1456f
Changed default level on calling DStream.persist() to be MEMORY_ONLY_SER. Also changed the persist level of StateDStream to be MEMORY_ONLY_SER.
2012-11-12 21:51:42 +00:00
Tathagata Das
ae61ebaee6
Fixed bugs in RawNetworkInputDStream and in its examples. Made the ReducedWindowedDStream persist RDDs to MEMORY_ONLY_SER by default. Removed unnecessary examples. Added streaming-env.sh.template to add recommended settings for streaming.
2012-11-12 21:45:16 +00:00
Denny
05e3807354
Merge branch 'master' into blockmanagerUI
2012-11-12 10:56:54 -08:00
Denny
4a1be7e0db
Refactor BlockManager UI and adding worker details.
2012-11-12 10:56:35 -08:00
Matei Zaharia
173e0354c0
Detect correctly when one has disconnected from a standalone cluster.
SPARK-617 #resolve
2012-11-11 21:06:57 -08:00
tdas
052d0b800f
Merge branch 'dev' of github.com:radlab/spark into dev
2012-11-11 22:56:14 +00:00
Denny
68e0a88282
Merge branch 'master' into blockmanagerUI
2012-11-11 14:00:02 -08:00
Denny
b829fba749
Merge branch 'master' into blockmanagerUI
Conflicts:
core/src/main/twirl/spark/deploy/worker/index.scala.html
2012-11-11 13:59:40 -08:00
Tathagata Das
46222dc56d
Fixed bug in FileInputDStream that allowed it to miss new files. Added tests in the InputStreamsSuite to test checkpointing of file and network streams.
2012-11-11 13:20:09 -08:00
Denny
0fd4c93f1c
Updated comment.
2012-11-11 11:15:31 -08:00
Denny
deb2c4df72
Add comment.
2012-11-11 11:11:49 -08:00
Denny
d006109e95
Kafka Stream comments.
2012-11-11 11:06:49 -08:00
Tathagata Das
04e9e9d93c
Refactored BlockManagerMaster (not BlockManagerMasterActor) to simplify the code and fix live lock problem in unlimited attempts to contact the master. Also added testcases in the BlockManagerSuite to test BlockManagerMaster methods getPeers and getLocations.
2012-11-11 08:54:21 -08:00
root
acf8272324
Fix K-means example a little
2012-11-10 23:07:21 -08:00
Matei Zaharia
d0f0fc8c1e
Merge pull request #302 from tdas/blockmanager-fix
Blockmanager fix
2012-11-09 20:27:20 -08:00
Tathagata Das
62af376863
Merge branch 'dev' of github.com:radlab/spark into dev
2012-11-09 16:29:11 -08:00
Tathagata Das
355c8e4b17
Fixed deadlock in BlockManager.
2012-11-09 16:28:45 -08:00
Tathagata Das
9915989bfa
Incorporated Matei's suggestions. Tested with 5 producer(consumer) threads each doing 50k puts (gets), took 15 minutes to run, no errors or deadlocks.
2012-11-09 15:46:15 -08:00
Tathagata Das
de00bc63db
Fixed deadlock in BlockManager.
1. Changed the lock structure of BlockManager by replacing the 337 coarse-grained locks with BlockInfo objects used as per-block fine-grained locks.
2. Changed the MemoryStore lock structure by making the block-putting threads lock on a different object (not the memory store), so that putting threads block the getting threads as little as possible.
3. Added spark.storage.ThreadingTest to stress test the BlockManager using 5 block producer and 5 block consumer threads.
2012-11-09 14:09:37 -08:00
Denny
2e8f2ee4ad
Merge branch 'dev' of github.com:radlab/spark into kafka
Conflicts:
streaming/src/main/scala/spark/streaming/DStream.scala
2012-11-09 12:26:17 -08:00
Denny
e5a0936787
Kafka Stream.
2012-11-09 12:23:46 -08:00
Matei Zaharia
6607f546cc
Added an option to spread out jobs in the standalone mode.
2012-11-08 23:13:12 -08:00
Matei Zaharia
66cbdee941
Fix for connections not being reused (from Josh Rosen)
2012-11-08 09:53:40 -08:00
tdas
52d21cb682
Removed unnecessary files.
2012-11-08 11:35:40 +00:00
tdas
cc2a65f547
Fixed bug in InputStreamsSuite
2012-11-08 11:17:57 +00:00
Imran Rashid
809b2bb1fe
fix bug in getting slave id out of mesos
2012-11-08 00:34:28 -08:00
Matei Zaharia
bb1bce7924
Various fixes to standalone mode and web UI:
- Don't report a job as finishing multiple times
- Don't show state of workers as LOADING when they're running
- Show start and finish times in web UI
- Sort web UI tables by ID and time by default
2012-11-07 16:49:53 -08:00
Tathagata Das
fc3d0b602a
Added FailureTestsuite for testing multiple, repeated master failures.
2012-11-06 17:23:31 -08:00
Matei Zaharia
e2b8477487
Made Akka timeout and message frame size configurable, and upped the defaults
2012-11-06 15:58:05 -08:00
Denny
485803d740
Merge branch 'dev' of github.com:radlab/spark into kafka
2012-11-06 09:41:45 -08:00
Denny
0c1de43fc7
Working on kafka.
2012-11-06 09:41:42 -08:00
Tathagata Das
f8bb719cd2
Added a few more comments to the checkpoint-related functions.
2012-11-05 17:53:56 -08:00
Tathagata Das
395167f2b2
Made more bug fixes for checkpointing.
2012-11-05 16:11:50 -08:00
Tathagata Das
72b2303f99
Fixed major bugs in checkpointing.
2012-11-05 11:41:36 -08:00
Tathagata Das
d154238789
Made checkpointing of the DStream graph work with checkpointing of RDDs. For streams that require checkpointing of their RDDs, the default checkpoint interval is set to 10 seconds.
2012-11-04 12:12:06 -08:00
Matei Zaharia
dfce7e74a7
Merge pull request #298 from JoshRosen/fix/ec2-existing-cluster-check
Fix check for existing instances during spark-ec2 launch
2012-11-03 18:35:26 -07:00
Josh Rosen
594eed31c4
Fix check for existing instances during EC2 launch.
2012-11-03 17:02:47 -07:00
Tathagata Das
596154eabe
Merge branch 'dev-checkpoint' into dev
2012-11-02 17:05:22 -07:00
Tathagata Das
3fb5c9ee24
Fixed serialization bug in countByWindow, added countByKey and countByKeyAndWindow, and added testcases for them.
2012-11-02 12:12:25 -07:00
Matei Zaharia
590e4aa9cb
Merge pull request #296 from shivaram/block-manager-fix
Remove unnecessary hash-map put in MemoryStore
2012-11-01 11:54:23 -07:00
Matei Zaharia
4a47d1a476
Merge pull request #297 from JoshRosen/fix/ec2-spot-instances
Cancel spot instance requests when exiting spark-ec2
2012-11-01 11:31:18 -07:00
Shivaram Venkataraman
a7d967a1ca
Remove unnecessary hash-map put in MemoryStore
2012-11-01 10:46:38 -07:00
Tathagata Das
34e569f40e
Added 'synchronized' to RDD serialization to ensure checkpoint-related changes are reflected atomically in the task closure. Added tests to ensure that running jobs on an RDD on which checkpointing is in progress does not hurt the result of the job.
2012-10-31 00:56:40 -07:00
Josh Rosen
96c9bcfd8d
Cancel spot instance requests when exiting spark-ec2.
2012-10-30 23:32:38 -07:00