2626 commits

Author SHA1 Message Date
Josh Rosen 7d71b9a56a Fix NullPointerException caused by unregistered map outputs. 2012-11-27 20:51:51 -08:00
Matei Zaharia 935c468b71 Merge pull request #311 from woggling/map-output-npe
Fix NullPointerException when map output unregistered from MapOutputTracker twice
2012-11-27 20:50:48 -08:00
Reynold Xin bd6dd1a3a6 Added a partition preserving flag to MapPartitionsWithSplitRDD. 2012-11-27 19:43:30 -08:00
Matei Zaharia 60cb3e9380 Merge pull request #312 from rxin/pde_size_compress
For size compression, compress non zero values into non zero values.
2012-11-27 19:25:45 -08:00
Reynold Xin f24bfd2dd1 For size compression, compress non zero values into non zero values. 2012-11-27 19:20:45 -08:00
Thomas Dudziak 3b643e86bc Updated versions in the pom.xml files to match current master 2012-11-27 17:50:42 -08:00
Charles Reiss cf79de425d Fix NullPointerException when unregistering a map output twice. 2012-11-27 16:12:05 -08:00
Charles Reiss 5fa868b98b Tests for MapOutputTracker. 2012-11-27 16:05:36 -08:00
Thomas Dudziak 69297c64be Addressed code review comments 2012-11-27 15:45:16 -08:00
Tathagata Das b18d70870a Modified a bunch of HashMaps in Spark to use TimeStampedHashMap and made various modules use CleanupTask to periodically clean up metadata. 2012-11-27 15:08:49 -08:00
Tathagata Das 0fe2fc4d5e Merged branch mesos/master to branch dev. 2012-11-26 13:16:59 -08:00
Thomas Dudziak 24e1e425cd Include the configuration templates in the debian package 2012-11-20 16:19:56 -08:00
Thomas Dudziak 811a32257b Added maven and debian build files 2012-11-20 16:19:51 -08:00
Tathagata Das fd11d23bb3 Modified StreamingContext API to make constructor accept the batch size (since it is always needed, Patrick's suggestion). Added description to DStream and StreamingContext. 2012-11-19 19:04:39 -08:00
Matei Zaharia cd16eab0db Merge pull request #309 from admobius/use-boto-config-file
Improved use of Boto
2012-11-19 15:51:41 -08:00
Tathagata Das c97ebf6437 Fixed bug in the number of splits in RDD after checkpointing. Modified reduceByKeyAndWindow (naive) computation from window+reduceByKey to reduceByKey+window+reduceByKey. 2012-11-19 23:22:07 +00:00
Peter Sankauskas dc2fb3c4b6 Allow Boto to use the other config options it supports, and gracefully
handling Boto connection exceptions (like AuthFailure)
2012-11-19 14:21:16 -08:00
Matei Zaharia 85ce5f27c1 Merge pull request #308 from admobius/multi-zone
Let EC2 script launch slaves in multiple availability zones
2012-11-19 13:24:09 -08:00
Matei Zaharia 3ff6f4bdee Merge pull request #304 from mbautin/configurable_local_ip
SPARK-624: make the default local IP customizable
2012-11-19 13:23:39 -08:00
mbautin 00f4e3ff9c Addressing Matei's comment: SPARK_LOCAL_IP environment variable 2012-11-19 11:52:10 -08:00
Denny 5e2b0a3bf6 Added Kafka Wordcount producer 2012-11-19 10:17:58 -08:00
Denny 6757ed6a40 Comment out code for fault-tolerance. 2012-11-19 09:42:35 -08:00
Denny f56befa914 Merge branch 'dev' into kafka 2012-11-19 09:29:54 -08:00
Peter Sankauskas 606d252d26 Adding comment about additional bandwidth charges 2012-11-17 23:09:11 -08:00
Tathagata Das 3fd7b8319b Merge branch 'dev' of github.com:radlab/spark into dev 2012-11-17 17:27:07 -08:00
Tathagata Das 10c1abcb6a Fixed checkpointing bug in CoGroupedRDD. CoGroupSplits kept around the RDD splits of its parent RDDs, thus checkpointing its parents did not release the references to the parent splits. 2012-11-17 17:27:00 -08:00
Matei Zaharia 20a1058dd5 Merge pull request #305 from woggling/exit-on-uncaught
Set default uncaught exception handler to exit
2012-11-16 20:56:09 -08:00
Matei Zaharia fcc0ba7da1 Merge pull request #306 from admobius/master
Delete security groups when destroying cluster
2012-11-16 20:54:28 -08:00
Matei Zaharia 6adc7c965f Doc fix 2012-11-16 20:49:02 -08:00
Patrick Wendell efa93fd0e6 Merge pull request #4 from radlab/streaming-example
A "streaming page view" example.
2012-11-16 20:40:27 -08:00
Charles Reiss 12c24e786c Set default uncaught exception handler to exit.
Among other things, should prevent OutOfMemoryErrors in some daemon threads
(such as the network manager) from causing a spark executor to enter a state
where it cannot make progress but does not report an error.
2012-11-16 20:12:31 -08:00
Peter Sankauskas 32442ee1e1 Giving the Spark EC2 script the ability to launch instances spread
across multiple availability zones in order to make the cluster more
resilient to failure
2012-11-16 17:25:28 -08:00
Peter Sankauskas 6d22f7ccb8 Delete security groups when deleting the cluster. As many operations
are done on instances in specific security groups, this seems like a
reasonable thing to clean up.
2012-11-16 14:02:43 -08:00
Patrick Wendell 720cb0f467 A "streaming page view" example. 2012-11-16 12:11:22 -08:00
mbautin 1f5a7e0e64 SPARK-624: make the default local IP customizable 2012-11-15 13:57:47 -08:00
Matei Zaharia c23a74df0a Use DNS names instead of IP addresses in standalone mode, to allow
matching with data locality hints from storage systems.
2012-11-15 00:10:52 -08:00
Matei Zaharia 59e648c081 Fix Java/Scala home having spaces on Windows 2012-11-14 22:37:05 -08:00
Patrick Wendell 9563f7aba9 Merge pull request #3 from radlab/streaming-docs
Streaming programming guide. STREAMING-2 #resolve
2012-11-14 22:00:48 -08:00
Patrick Wendell d39ac5fbc1 Streaming programming guide. STREAMING-2 #resolve 2012-11-13 21:19:58 -08:00
Denny 2aceae25be Merge branch 'dev' into kafka
Conflicts:
	streaming/src/main/scala/spark/streaming/DStream.scala
2012-11-13 13:16:18 -08:00
Denny b6f7ba813e change import for example function 2012-11-13 13:15:32 -08:00
Tathagata Das 26fec8f0b8 Fixed bug in MappedValuesRDD, and set default graph checkpoint interval to be batch duration. 2012-11-13 11:05:57 -08:00
Tathagata Das c3ccd14cf8 Replaced StateRDD in StateDStream with MapPartitionsRDD. 2012-11-13 02:43:03 -08:00
Tathagata Das 8a25d530ed Optimized checkpoint writing by reusing FileSystem object. Fixed bug in updating of checkpoint data in DStream where the checkpointed RDDs, upon recovery, were not recognized as checkpointed RDDs and therefore deleted from HDFS. Made InputStreamsSuite more robust to timing delays. 2012-11-13 02:16:28 -08:00
Denny 255b3e44c1 Merge branch 'dev' into kafka 2012-11-12 19:39:29 -08:00
Tathagata Das 564dd8c3f4 Speeded up CheckpointSuite 2012-11-12 14:22:05 -08:00
Tathagata Das b9bfd1456f Changed default level on calling DStream.persist() to be MEMORY_ONLY_SER. Also changed the persist level of StateDStream to be MEMORY_ONLY_SER. 2012-11-12 21:51:42 +00:00
Tathagata Das ae61ebaee6 Fixed bugs in RawNetworkInputDStream and in its examples. Made the ReducedWindowedDStream persist RDDs to MEMORY_ONLY_SER by default. Removed unnecessary examples. Added streaming-env.sh.template to add recommended setting for streaming. 2012-11-12 21:45:16 +00:00
Denny 05e3807354 Merge branch 'master' into blockmanagerUI 2012-11-12 10:56:54 -08:00
Denny 4a1be7e0db Refactor BlockManager UI and adding worker details. 2012-11-12 10:56:35 -08:00