2626 commits

Author SHA1 Message Date
Matei Zaharia 597520ae20 Make sure the SSH key we copy to EC2 has permissions 600.
SPARK-539 #resolve
2012-12-10 15:12:06 -08:00
Thomas Dudziak c1d15ae3d5 Shaded repl jar for hadoop1 profile needs to include hadoop classes 2012-12-10 15:06:28 -08:00
Mikhail Bautin 450659079a Bump CDH version for the Hadoop 2 profile to 4.1.2 2012-12-10 11:27:20 -08:00
Matei Zaharia a1a2daa7ef Merge pull request #317 from woggling/block-manager-heartbeat
Implement block manager heartbeat
2012-12-10 11:03:55 -08:00
Matei Zaharia a9ea14d6e7 Merge pull request #318 from tomdz/master
Minor tweaks to the debian build
2012-12-10 10:59:41 -08:00
Matei Zaharia ccff0a089a Use the same output directories that SBT had in subprojects
This will make it easier to make the "run" script work with a Maven build
2012-12-10 10:58:56 -08:00
Thomas Dudziak 0e5b1f7981 Minor tweaks to the debian build 2012-12-10 10:30:30 -08:00
Charles Reiss b6b62d774f Decrease BlockManagerMaster logging verbosity 2012-12-10 00:31:55 -08:00
Charles Reiss 5d3e917d09 Use Akka scheduler for BlockManager heart beats.
Adds required ActorSystem argument to BlockManager constructors.
2012-12-10 00:31:50 -08:00
Charles Reiss b53dd28c90 Changed default block manager heartbeat interval to 5 s 2012-12-09 23:03:34 -08:00
Matei Zaharia beb440089e Merge pull request #310 from tomdz/master-mavenized
Maven build setup
2012-12-09 21:40:05 -08:00
Tathagata Das 4be1cdc8b9 Merge pull request #5 from radlab/flume-integration
Flume integration
2012-12-09 20:31:42 -08:00
Tathagata Das e427216018 Removed unnecessary testcases. 2012-12-08 12:46:59 -08:00
Matei Zaharia e1d7cd2276 Search for a non-loopback address in Utils.getLocalIpAddress 2012-12-08 00:33:11 -08:00
Patrick Wendell 3e796bdd57 Changes in response to TD's review. 2012-12-07 19:34:05 -08:00
Patrick Wendell 3ff9710265 Adding Flume InputDStream 2012-12-07 16:42:39 -08:00
Patrick Wendell c36ca10241 Adding locality aware parallelize 2012-12-07 16:42:36 -08:00
Tathagata Das 1f3a75ae9e Modified checkpoint testsuite to more comprehensively test checkpointing of various RDDs. Fixed checkpoint bug (splits referring to parent RDDs or parent splits) in UnionRDD and CoalescedRDD. Fixed bug in testing ShuffledRDD. Removed unnecessary and useless map-side combining step for narrow dependencies in CoGroupedRDD. Removed unnecessary WeakReference stuff from many other RDDs. 2012-12-07 13:45:52 -08:00
Charles Reiss 714c8d32d5 Don't divide milliseconds by 1000 anymore. 2012-12-06 18:38:34 -08:00
Charles Reiss 8f0819520c map -> foreach 2012-12-06 18:29:50 -08:00
Charles Reiss 7a033fd795 Make LocalSparkCluster use distinct IPs 2012-12-06 00:03:08 -08:00
Charles Reiss a2a94fdbc7 Tests for block manager heartbeats. 2012-12-05 23:36:05 -08:00
Charles Reiss d21ca010ac Add block manager heart beats.
Renames old message called 'HeartBeat' to 'BlockUpdate'.

The BlockManager periodically sends a heart beat message to the master.
The master responds to the heart beat by indicating whether the
BlockManager is currently registered with the master. Additionally, the
master now also responds to block updates by indicating whether the
BlockManager in question is registered.
When the BlockManager detects (by heart beat or failed block update)
that it stopped being registered, it reregisters and sends block
updates for all its blocks.
2012-12-05 23:35:20 -08:00
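
The heart beat and re-registration flow described in the commit message above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and method names (`Master`, `BlockManager`, `heartbeat`, `block_update`, `reregister`) are hypothetical and are not the Spark API, and the real code exchanges these messages over Akka rather than through direct method calls.

```python
class Master:
    """Toy stand-in for the master: tracks registered managers and their blocks."""

    def __init__(self):
        self.blocks = {}  # manager id -> set of block ids reported by it

    def register(self, manager_id):
        self.blocks[manager_id] = set()

    def remove(self, manager_id):
        # Models the master dropping a manager (for example after a timeout).
        self.blocks.pop(manager_id, None)

    def heartbeat(self, manager_id):
        # The reply only says whether the sender is currently registered.
        return manager_id in self.blocks

    def block_update(self, manager_id, block_id):
        # Block updates are answered the same way as heart beats.
        if manager_id not in self.blocks:
            return False
        self.blocks[manager_id].add(block_id)
        return True


class BlockManager:
    """Toy block manager that re-registers when the master has forgotten it."""

    def __init__(self, manager_id, master):
        self.manager_id = manager_id
        self.master = master
        self.local_blocks = set()
        master.register(manager_id)

    def put(self, block_id):
        self.local_blocks.add(block_id)
        if not self.master.block_update(self.manager_id, block_id):
            self.reregister()  # failed block update: we are no longer known

    def send_heartbeat(self):
        if not self.master.heartbeat(self.manager_id):
            self.reregister()  # heart beat reply says "not registered"

    def reregister(self):
        # Register again, then resend updates for every block held locally.
        self.master.register(self.manager_id)
        for block_id in sorted(self.local_blocks):
            self.master.block_update(self.manager_id, block_id)
```

In this model, calling `master.remove("bm-1")` and then `bm.send_heartbeat()` restores the master's view of all blocks held by the manager, which is the recovery behavior the commit message describes.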
Charles Reiss c9e54a6755 Track block managers by hostname; handle manager removal. 2012-12-05 23:35:20 -08:00
Charles Reiss 5afa2ee9e9 Actually put millis in _lastSeenMs 2012-12-05 23:35:20 -08:00
Charles Reiss 813ac71459 Don't use bogus port number in notifyADeadHost(). 2012-12-05 23:35:20 -08:00
Denny 556c38ed91 Added kafka JAR 2012-12-05 11:54:42 -08:00
Denny a23462191f Adjust Kafka code to work with new streaming changes. 2012-12-05 10:30:40 -08:00
Denny 15df4b0e52 Merge branch 'dev' into kafka
Conflicts:
	streaming/src/main/scala/spark/streaming/DStream.scala
2012-12-05 10:16:56 -08:00
Tathagata Das 21a0852976 Refactored RDD checkpointing to minimize extra fields in RDD class. 2012-12-04 22:10:25 -08:00
Matei Zaharia ddf6cd012c Merge pull request #316 from JoshRosen/fix/ec2-web-ui-links
Use external addresses in standalone web UI when running on EC2
2012-12-04 15:34:27 -08:00
Tathagata Das a69a82be26 Added metadata cleaner to HttpBroadcast to clean up old broadcast files. 2012-12-03 22:37:31 -08:00
Tathagata Das 609e00d599 Minor mods 2012-12-02 02:39:08 +00:00
Josh Rosen cdaa0fad51 Use external addresses in standalone WebUI on EC2. 2012-12-01 18:19:13 -08:00
Tathagata Das b4dba55f78 Made RDD checkpoint not create a new thread. Fixed bug in detecting when spark.cleaner.delay is insufficient. 2012-12-02 02:03:05 +00:00
Tathagata Das 477de94894 Minor modifications. 2012-12-01 13:15:06 -08:00
Tathagata Das 62965c5d8e Added ssc.union 2012-12-01 08:26:10 -08:00
Tathagata Das 6fcd09f499 Added TimeStampedHashSet and used that to cleanup the list of registered RDD IDs in CacheTracker. 2012-11-29 02:06:33 -08:00
Tathagata Das c9789751bf Added metadata cleaner to BlockManager to remove old blocks completely. 2012-11-28 23:18:24 -08:00
Matei Zaharia 8d3713c221 Merge pull request #314 from pwendell/quickstart-bugfix
Adding multi-jar constructor in quickstart
2012-11-28 22:59:28 -08:00
Thomas Dudziak 84e584fa8c Code review feedback fix 2012-11-28 19:46:06 -08:00
Tathagata Das 9e9e9e1d89 Renamed CleanupTask to MetadataCleaner. 2012-11-28 18:48:14 -08:00
Tathagata Das e463ae4920 Modified StorageLevel and BlockManagerId to cache common objects and use cached object while deserializing. 2012-11-28 14:05:01 -08:00
Tathagata Das d5e7aad039 Bug fixes 2012-11-28 08:36:55 +00:00
Patrick Wendell 6ceb559994 Adding multi-jar constructor in quickstart 2012-11-27 23:32:24 -08:00
Matei Zaharia f86960cba9 Merge pull request #313 from rxin/pde_size_compress
Added a partition preserving flag to MapPartitionsWithSplitRDD.
2012-11-27 22:39:25 -08:00
Matei Zaharia 3ebd8e1885 Added zip to Java API 2012-11-27 22:38:09 -08:00
Matei Zaharia 27e43abd19 Added a zip() operation for RDDs with the same shape (number of
partitions and number of elements in each partition)
2012-11-27 22:27:47 -08:00
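
The "same shape" requirement in the zip() commit above can be illustrated with a plain Python model, where each dataset is a list of partitions and each partition is a list of elements. The function name `zip_partitions` and the list-of-lists representation are assumptions made for this sketch; it does not call or reproduce the Spark implementation.

```python
def zip_partitions(left, right):
    """Pair up elements of two partitioned datasets, partition by partition.

    Both inputs are lists of partitions (lists). They must have the same
    number of partitions and the same number of elements in each partition.
    """
    if len(left) != len(right):
        raise ValueError("datasets have different numbers of partitions")
    zipped = []
    for index, (left_part, right_part) in enumerate(zip(left, right)):
        if len(left_part) != len(right_part):
            raise ValueError(f"partition {index} has different sizes")
        # Elements are paired by position within the partition.
        zipped.append(list(zip(left_part, right_part)))
    return zipped
```

Checking the shape per partition, rather than only the total element count, is what lets each partition be paired up independently of the others.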
Matei Zaharia 59c0a9ad16 Use hostname instead of IP in deploy scripts to let Akka connect properly 2012-11-27 21:00:04 -08:00
Matei Zaharia f410a111ad Merge branch 'master' of github.com:mesos/spark 2012-11-27 20:51:58 -08:00