Commit graph

591 commits

Author SHA1 Message Date
Josh Rosen cf52d9cade Add try-finally to handle MapOutputTracker timeouts. 2012-12-13 21:53:30 -08:00
Reynold Xin 21b271f5bd Suppress shuffle block updates when a slave node comes back. 2012-12-10 20:36:03 -08:00
Matei Zaharia a1a2daa7ef Merge pull request #317 from woggling/block-manager-heartbeat
Implement block manager heartbeat
2012-12-10 11:03:55 -08:00
Charles Reiss b6b62d774f Decrease BlockManagerMaster logging verbosity 2012-12-10 00:31:55 -08:00
Charles Reiss 5d3e917d09 Use Akka scheduler for BlockManager heart beats.
Adds required ActorSystem argument to BlockManager constructors.
2012-12-10 00:31:50 -08:00
Charles Reiss b53dd28c90 Changed default block manager heartbeat interval to 5 s 2012-12-09 23:03:34 -08:00
Matei Zaharia e1d7cd2276 Search for a non-loopback address in Utils.getLocalIpAddress 2012-12-08 00:33:11 -08:00
Charles Reiss 714c8d32d5 Don't divide milliseconds by 1000 anymore. 2012-12-06 18:38:34 -08:00
Charles Reiss 8f0819520c map -> foreach 2012-12-06 18:29:50 -08:00
Charles Reiss 7a033fd795 Make LocalSparkCluster use distinct IPs 2012-12-06 00:03:08 -08:00
Charles Reiss d21ca010ac Add block manager heart beats.
Renames old message called 'HeartBeat' to 'BlockUpdate'.

The BlockManager periodically sends a heart beat message to the master.
The master responds to the heart beat by indicating whether the
BlockManager is currently registered with the master. Additionally, the
master now also responds to block updates by indicating whether the
BlockManager in question is registered.
When the BlockManager detects (by heart beat or failed block update)
that it stopped being registered, it reregisters and sends block
updates for all its blocks.
2012-12-05 23:35:20 -08:00
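The heart beat protocol described in the commit message above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the Spark code: the classes `Master` and `BlockManager` and all of their methods are hypothetical stand-ins, and the periodic scheduling is replaced by a direct call.

```python
class Master:
    """Hypothetical stand-in for the block manager master."""

    def __init__(self):
        self.registered = set()   # ids of block managers the master knows
        self.blocks = {}          # manager id -> set of block ids

    def register(self, manager_id):
        self.registered.add(manager_id)
        self.blocks[manager_id] = set()

    def remove(self, manager_id):
        # For example, the master decided the manager was dead.
        self.registered.discard(manager_id)
        self.blocks.pop(manager_id, None)

    def heartbeat(self, manager_id):
        # The reply tells the sender whether it is still registered.
        return manager_id in self.registered

    def block_update(self, manager_id, block_id):
        # Block updates receive the same "are you registered?" reply.
        if manager_id not in self.registered:
            return False
        self.blocks[manager_id].add(block_id)
        return True


class BlockManager:
    """Hypothetical stand-in for a worker-side block manager."""

    def __init__(self, manager_id, master):
        self.manager_id = manager_id
        self.master = master
        self.local_blocks = set()
        master.register(manager_id)

    def put(self, block_id):
        self.local_blocks.add(block_id)
        if not self.master.block_update(self.manager_id, block_id):
            self.reregister()   # failed block update: we were dropped

    def send_heartbeat(self):
        # A real system would call this periodically from a scheduler.
        if not self.master.heartbeat(self.manager_id):
            self.reregister()

    def reregister(self):
        # Register again, then resend updates for every local block.
        self.master.register(self.manager_id)
        for block_id in self.local_blocks:
            self.master.block_update(self.manager_id, block_id)


master = Master()
bm = BlockManager("bm-1", master)
bm.put("rdd_0_0")
master.remove("bm-1")   # the master forgets the manager
bm.send_heartbeat()     # the manager notices and reregisters
```

After the final heart beat, the master again lists `bm-1` as registered and knows about its one block.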
Charles Reiss c9e54a6755 Track block managers by hostname; handle manager removal. 2012-12-05 23:35:20 -08:00
Charles Reiss 5afa2ee9e9 Actually put millis in _lastSeenMs 2012-12-05 23:35:20 -08:00
Charles Reiss 813ac71459 Don't use bogus port number in notifyADeadHost(). 2012-12-05 23:35:20 -08:00
Josh Rosen cdaa0fad51 Use external addresses in standalone WebUI on EC2. 2012-12-01 18:19:13 -08:00
Matei Zaharia f86960cba9 Merge pull request #313 from rxin/pde_size_compress
Added a partition preserving flag to MapPartitionsWithSplitRDD.
2012-11-27 22:39:25 -08:00
Matei Zaharia 3ebd8e1885 Added zip to Java API 2012-11-27 22:38:09 -08:00
Matei Zaharia 27e43abd19 Added a zip() operation for RDDs with the same shape (number of
partitions and number of elements in each partition)
2012-11-27 22:27:47 -08:00
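The "same shape" requirement in the zip() commit above can be illustrated with a small sketch. It models a partitioned dataset as a plain Python list of lists, which is an assumption made for illustration only; the function name `zip_partitions` is hypothetical and is not the Spark API.

```python
def zip_partitions(left, right):
    """Pair up elements of two partitioned datasets of identical shape."""
    if len(left) != len(right):
        raise ValueError("different number of partitions")
    zipped = []
    for left_part, right_part in zip(left, right):
        if len(left_part) != len(right_part):
            raise ValueError("partitions differ in number of elements")
        # Elements are paired by position within each partition.
        zipped.append(list(zip(left_part, right_part)))
    return zipped


numbers = [[1, 2], [3]]
letters = [["a", "b"], ["c"]]
pairs = zip_partitions(numbers, letters)
```

Because pairing is positional within each partition, no data has to move between partitions, which is why both inputs need matching partition counts and matching per-partition sizes.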
Matei Zaharia f410a111ad Merge branch 'master' of github.com:mesos/spark 2012-11-27 20:51:58 -08:00
Josh Rosen 7d71b9a56a Fix NullPointerException caused by unregistered map outputs. 2012-11-27 20:51:51 -08:00
Matei Zaharia 935c468b71 Merge pull request #311 from woggling/map-output-npe
Fix NullPointerException when map output unregistered from MapOutputTracker twice
2012-11-27 20:50:48 -08:00
Reynold Xin bd6dd1a3a6 Added a partition preserving flag to MapPartitionsWithSplitRDD. 2012-11-27 19:43:30 -08:00
Reynold Xin f24bfd2dd1 For size compression, compress non zero values into non zero values. 2012-11-27 19:20:45 -08:00
Charles Reiss cf79de425d Fix NullPointerException when unregistering a map output twice. 2012-11-27 16:12:05 -08:00
Matei Zaharia 3ff6f4bdee Merge pull request #304 from mbautin/configurable_local_ip
SPARK-624: make the default local IP customizable
2012-11-19 13:23:39 -08:00
mbautin 00f4e3ff9c Addressing Matei's comment: SPARK_LOCAL_IP environment variable 2012-11-19 11:52:10 -08:00
Charles Reiss 12c24e786c Set default uncaught exception handler to exit.
Among other things, should prevent OutOfMemoryErrors in some daemon threads
(such as the network manager) from causing a spark executor to enter a state
where it cannot make progress but does not report an error.
2012-11-16 20:12:31 -08:00
mbautin 1f5a7e0e64 SPARK-624: make the default local IP customizable 2012-11-15 13:57:47 -08:00
Matei Zaharia c23a74df0a Use DNS names instead of IP addresses in standalone mode, to allow
matching with data locality hints from storage systems.
2012-11-15 00:10:52 -08:00
Matei Zaharia 173e0354c0 Detect correctly when one has disconnected from a standalone cluster.
SPARK-617 #resolve
2012-11-11 21:06:57 -08:00
root acf8272324 Fix K-means example a little 2012-11-10 23:07:21 -08:00
Tathagata Das 9915989bfa Incorporated Matei's suggestions. Tested with 5 producer(consumer) threads each doing 50k puts (gets), took 15 minutes to run, no errors or deadlocks. 2012-11-09 15:46:15 -08:00
Tathagata Das de00bc63db Fixed deadlock in BlockManager.
1. Changed the lock structure of BlockManager by replacing the 337 coarse-grained locks with BlockInfo objects used as per-block fine-grained locks.
2. Changed the MemoryStore lock structure by making the block-putting threads lock on a different object (not the memory store), so that putting threads minimally block the getting threads.
3. Added spark.storage.ThreadingTest to stress test the BlockManager using 5 block producer and 5 block consumer threads.
2012-11-09 14:09:37 -08:00
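The per-block locking idea from the commit above can be sketched as follows. This is a simplified Python illustration, not the BlockManager implementation: `BlockStore`, its methods, and the short-lived index lock are assumptions, and the thread counts only mirror the 5 producers and 5 consumers mentioned in the message.

```python
import threading


class BlockInfo:
    """Per-block record; its lock protects only this block's data."""

    def __init__(self):
        self.lock = threading.Lock()
        self.data = None


class BlockStore:
    """Hypothetical store that locks per block instead of globally."""

    def __init__(self):
        self._index_lock = threading.Lock()  # guards only the dict itself
        self._blocks = {}

    def _info(self, block_id):
        # Held briefly, just long enough to find or create the record.
        with self._index_lock:
            info = self._blocks.get(block_id)
            if info is None:
                info = BlockInfo()
                self._blocks[block_id] = info
            return info

    def put(self, block_id, data):
        info = self._info(block_id)
        with info.lock:          # blocks only users of this block
            info.data = data

    def get(self, block_id):
        info = self._info(block_id)
        with info.lock:
            return info.data


def producer(store, name, count):
    for i in range(count):
        store.put(f"{name}-{i}", i)


def consumer(store, name, count, seen):
    for i in range(count):
        value = store.get(f"{name}-{i}")
        if value is not None:    # None means the block is not written yet
            seen.append(value == i)


store = BlockStore()
seen = []
threads = []
for n in range(5):
    threads.append(threading.Thread(target=producer, args=(store, f"p{n}", 1000)))
    threads.append(threading.Thread(target=consumer, args=(store, f"p{n}", 1000, seen)))
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A slow put on one block delays only readers of that same block; threads working on other blocks contend only for the brief index lookup.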
Matei Zaharia 6607f546cc Added an option to spread out jobs in the standalone mode. 2012-11-08 23:13:12 -08:00
Matei Zaharia 66cbdee941 Fix for connections not being reused (from Josh Rosen) 2012-11-08 09:53:40 -08:00
Imran Rashid 809b2bb1fe fix bug in getting slave id out of mesos 2012-11-08 00:34:28 -08:00
Matei Zaharia bb1bce7924 Various fixes to standalone mode and web UI:
- Don't report a job as finishing multiple times
- Don't show state of workers as LOADING when they're running
- Show start and finish times in web UI
- Sort web UI tables by ID and time by default
2012-11-07 16:49:53 -08:00
Matei Zaharia e2b8477487 Made Akka timeout and message frame size configurable, and upped the defaults 2012-11-06 15:58:05 -08:00
Shivaram Venkataraman a7d967a1ca Remove unnecessary hash-map put in MemoryStore 2012-11-01 10:46:38 -07:00
root e782187b4a Don't throw an error in the block manager when a block is cached on the master due to
a locally computed operation

Conflicts:

	core/src/main/scala/spark/storage/BlockManagerMaster.scala
2012-10-26 00:33:45 -07:00
Matei Zaharia f63a40fd99 Strip leading mesos:// in URLs passed to Mesos 2012-10-24 21:52:13 -07:00
Matei Zaharia d290e964ea Merge pull request #281 from rxin/memreport
Added a method to report slave memory status; force serialize accumulator update in local mode.
2012-10-23 22:04:35 -07:00
Matei Zaharia 0bd20c63e2 Merge remote-tracking branch 'JoshRosen/shuffle_refactoring' into dev
Conflicts:
	core/src/main/scala/spark/Dependency.scala
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/ShuffledRDD.scala
2012-10-23 22:01:45 -07:00
Thomas Dudziak d9c2a89c57 Support for Hadoop 2 distributions such as cdh4 2012-10-18 16:08:54 -07:00
Reynold Xin 63fae9bc23 Serialize accumulator updates in TaskResult for local mode. 2012-10-15 21:38:28 -07:00
Reynold Xin 42d20fa8da Added a method to report slave memory status. 2012-10-14 22:30:53 -07:00
Matei Zaharia 64dbf8d372 Made ShuffleDependency automatically find a shuffle ID for itself 2012-10-14 10:00:22 -07:00
Matei Zaharia 8815aeba0c Take executor environment vars as an argument to SparkContext 2012-10-13 15:31:11 -07:00
Josh Rosen 33cd3a0c12 Remove map-side combining from ShuffleMapTask.
This separation of concerns simplifies the 
ShuffleDependency and ShuffledRDD interfaces.

Map-side combining can be performed in a
mapPartitions() call prior to shuffling the RDD.

I don't anticipate this having much of a 
performance impact: in both approaches, each tuple
is hashed twice: once in the bucket partitioning
and once in the combiner's hashtable.  The same
steps are being performed, but in a different
order and through one extra Iterator.
2012-10-13 14:59:20 -07:00
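The idea in the commit message above, combining within each partition before the shuffle instead of inside the shuffle task, can be sketched like this. It is a plain-Python illustration with hypothetical helper names (`combine_partition`, `shuffle`); partitions are modelled as lists of key/value tuples, and nothing here is the Spark API.

```python
def combine_partition(records, create_combiner, merge_value):
    """Map-side combine: collapse one partition to one entry per key."""
    combined = {}
    for key, value in records:           # first hash: the combiner table
        if key in combined:
            combined[key] = merge_value(combined[key], value)
        else:
            combined[key] = create_combiner(value)
    return list(combined.items())


def shuffle(partitions, num_buckets):
    """Plain shuffle with no combining logic of its own."""
    buckets = [[] for _ in range(num_buckets)]
    for part in partitions:
        for key, value in part:          # second hash: bucket partitioning
            buckets[hash(key) % num_buckets].append((key, value))
    return buckets


partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("a", 1), ("b", 1)],
]

# Step 1: combine inside each partition (the mapPartitions-style step).
pre_combined = [
    combine_partition(p, lambda v: v, lambda c, v: c + v) for p in partitions
]

# Step 2: shuffle the already-combined records, then merge per key.
result = {}
for bucket in shuffle(pre_combined, 2):
    for key, partial in bucket:
        result[key] = result.get(key, 0) + partial
```

Here five input records shrink to four before the shuffle, and the shuffle itself stays unaware of any aggregation logic.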
Josh Rosen 10bcd217d2 Remove mapSideCombine field from Aggregator.
Instead, the presence or absence of a ShuffleDependency's aggregator
will control whether map-side combining is performed.
2012-10-13 14:59:20 -07:00