Commit graph

1987 commits

Author SHA1 Message Date
Tathagata Das 609e00d599 Minor mods 2012-12-02 02:39:08 +00:00
Josh Rosen cdaa0fad51 Use external addresses in standalone WebUI on EC2. 2012-12-01 18:19:13 -08:00
Tathagata Das b4dba55f78 Made RDD checkpoint not create a new thread. Fixed bug in detecting when spark.cleaner.delay is insufficient. 2012-12-02 02:03:05 +00:00
Tathagata Das 477de94894 Minor modifications. 2012-12-01 13:15:06 -08:00
Tathagata Das 62965c5d8e Added ssc.union 2012-12-01 08:26:10 -08:00
Tathagata Das 6fcd09f499 Added TimeStampedHashSet and used that to cleanup the list of registered RDD IDs in CacheTracker. 2012-11-29 02:06:33 -08:00
Tathagata Das c9789751bf Added metadata cleaner to BlockManager to remove old blocks completely. 2012-11-28 23:18:24 -08:00
Matei Zaharia 8d3713c221 Merge pull request #314 from pwendell/quickstart-bugfix
Adding multi-jar constructor in quickstart
2012-11-28 22:59:28 -08:00
Thomas Dudziak 84e584fa8c Code review feedback fix 2012-11-28 19:46:06 -08:00
Tathagata Das 9e9e9e1d89 Renamed CleanupTask to MetadataCleaner. 2012-11-28 18:48:14 -08:00
Tathagata Das e463ae4920 Modified StorageLevel and BlockManagerId to cache common objects and use cached object while deserializing. 2012-11-28 14:05:01 -08:00
Tathagata Das d5e7aad039 Bug fixes 2012-11-28 08:36:55 +00:00
Patrick Wendell 6ceb559994 Adding multi-jar constructor in quickstart 2012-11-27 23:32:24 -08:00
Matei Zaharia f86960cba9 Merge pull request #313 from rxin/pde_size_compress
Added a partition preserving flag to MapPartitionsWithSplitRDD.
2012-11-27 22:39:25 -08:00
Matei Zaharia 3ebd8e1885 Added zip to Java API 2012-11-27 22:38:09 -08:00
Matei Zaharia 27e43abd19 Added a zip() operation for RDDs with the same shape (number of
partitions and number of elements in each partition)
2012-11-27 22:27:47 -08:00
Matei Zaharia 59c0a9ad16 Use hostname instead of IP in deploy scripts to let Akka connect properly 2012-11-27 21:00:04 -08:00
Matei Zaharia f410a111ad Merge branch 'master' of github.com:mesos/spark 2012-11-27 20:51:58 -08:00
Josh Rosen 7d71b9a56a Fix NullPointerException caused by unregistered map outputs. 2012-11-27 20:51:51 -08:00
Matei Zaharia 935c468b71 Merge pull request #311 from woggling/map-output-npe
Fix NullPointerException when map output unregistered from MapOutputTracker twice
2012-11-27 20:50:48 -08:00
Reynold Xin bd6dd1a3a6 Added a partition preserving flag to MapPartitionsWithSplitRDD. 2012-11-27 19:43:30 -08:00
Matei Zaharia 60cb3e9380 Merge pull request #312 from rxin/pde_size_compress
For size compression, compress non zero values into non zero values.
2012-11-27 19:25:45 -08:00
Reynold Xin f24bfd2dd1 For size compression, compress non zero values into non zero values. 2012-11-27 19:20:45 -08:00
Thomas Dudziak 3b643e86bc Updated versions in the pom.xml files to match current master 2012-11-27 17:50:42 -08:00
Charles Reiss cf79de425d Fix NullPointerException when unregistering a map output twice. 2012-11-27 16:12:05 -08:00
Charles Reiss 5fa868b98b Tests for MapOutputTracker. 2012-11-27 16:05:36 -08:00
Thomas Dudziak 69297c64be Addressed code review comments 2012-11-27 15:45:16 -08:00
Tathagata Das b18d70870a Modified bunch HashMaps in Spark to use TimeStampedHashMap and made various modules use CleanupTask to periodically clean up metadata. 2012-11-27 15:08:49 -08:00
Tathagata Das 0fe2fc4d5e Merged branch mesos/master to branch dev. 2012-11-26 13:16:59 -08:00
Thomas Dudziak 24e1e425cd Include the configuration templates in the debian package 2012-11-20 16:19:56 -08:00
Thomas Dudziak 811a32257b Added maven and debian build files 2012-11-20 16:19:51 -08:00
Tathagata Das fd11d23bb3 Modified StreamingContext API to make constructor accept the batch size (since it is always needed, Patrick's suggestion). Added description to DStream and StreamingContext. 2012-11-19 19:04:39 -08:00
Matei Zaharia cd16eab0db Merge pull request #309 from admobius/use-boto-config-file
Improved use of Boto
2012-11-19 15:51:41 -08:00
Tathagata Das c97ebf6437 Fixed bug in the number of splits in RDD after checkpointing. Modified reduceByKeyAndWindow (naive) computation from window+reduceByKey to reduceByKey+window+reduceByKey. 2012-11-19 23:22:07 +00:00
Peter Sankauskas dc2fb3c4b6 Allow Boto to use the other config options it supports, and gracefully
handling Boto connection exceptions (like AuthFailure)
2012-11-19 14:21:16 -08:00
Matei Zaharia 85ce5f27c1 Merge pull request #308 from admobius/multi-zone
Let EC2 script launch slaves in multiple availability zones
2012-11-19 13:24:09 -08:00
Matei Zaharia 3ff6f4bdee Merge pull request #304 from mbautin/configurable_local_ip
SPARK-624: make the default local IP customizable
2012-11-19 13:23:39 -08:00
mbautin 00f4e3ff9c Addressing Matei's comment: SPARK_LOCAL_IP environment variable 2012-11-19 11:52:10 -08:00
Denny 5e2b0a3bf6 Added Kafka Wordcount producer 2012-11-19 10:17:58 -08:00
Denny 6757ed6a40 Comment out code for fault-tolerance. 2012-11-19 09:42:35 -08:00
Denny f56befa914 Merge branch 'dev' into kafka 2012-11-19 09:29:54 -08:00
Peter Sankauskas 606d252d26 Adding comment about additional bandwidth charges 2012-11-17 23:09:11 -08:00
Tathagata Das 3fd7b8319b Merge branch 'dev' of github.com:radlab/spark into dev 2012-11-17 17:27:07 -08:00
Tathagata Das 10c1abcb6a Fixed checkpointing bug in CoGroupedRDD. CoGroupSplits kept around the RDD splits of its parent RDDs, thus checkpointing its parents did not release the references to the parent splits. 2012-11-17 17:27:00 -08:00
Matei Zaharia 20a1058dd5 Merge pull request #305 from woggling/exit-on-uncaught
Set default uncaught exception handler to exit
2012-11-16 20:56:09 -08:00
Matei Zaharia fcc0ba7da1 Merge pull request #306 from admobius/master
Delete security groups when destroying cluster
2012-11-16 20:54:28 -08:00
Matei Zaharia 6adc7c965f Doc fix 2012-11-16 20:49:02 -08:00
Patrick Wendell efa93fd0e6 Merge pull request #4 from radlab/streaming-example
A "streaming page view" example.
2012-11-16 20:40:27 -08:00
Charles Reiss 12c24e786c Set default uncaught exception handler to exit.
Among other things, should prevent OutOfMemoryErrors in some daemon threads
(such as the network manager) from causing a spark executor to enter a state
where it cannot make progress but does not report an error.
2012-11-16 20:12:31 -08:00
Peter Sankauskas 32442ee1e1 Giving the Spark EC2 script the ability to launch instances spread
across multiple availability zones in order to make the cluster more
resilient to failure
2012-11-16 17:25:28 -08:00