1467 commits

Author SHA1 Message Date
Charles Reiss 12c24e786c Set default uncaught exception handler to exit.
Among other things, should prevent OutOfMemoryErrors in some daemon threads
(such as the network manager) from causing a spark executor to enter a state
where it cannot make progress but does not report an error.
2012-11-16 20:12:31 -08:00
Peter Sankauskas 32442ee1e1 Giving the Spark EC2 script the ability to launch instances spread
across multiple availability zones in order to make the cluster more
resilient to failure
2012-11-16 17:25:28 -08:00
Peter Sankauskas 6d22f7ccb8 Delete security groups when deleting the cluster. As many operations
are done on instances in specific security groups, this seems like a
reasonable thing to clean up.
2012-11-16 14:02:43 -08:00
mbautin 1f5a7e0e64 SPARK-624: make the default local IP customizable 2012-11-15 13:57:47 -08:00
Matei Zaharia c23a74df0a Use DNS names instead of IP addresses in standalone mode, to allow
matching with data locality hints from storage systems.
2012-11-15 00:10:52 -08:00
Matei Zaharia 59e648c081 Fix Java/Scala home having spaces on Windows 2012-11-14 22:37:05 -08:00
Matei Zaharia 173e0354c0 Detect correctly when one has disconnected from a standalone cluster.
SPARK-617 #resolve
2012-11-11 21:06:57 -08:00
root acf8272324 Fix K-means example a little 2012-11-10 23:07:21 -08:00
Matei Zaharia d0f0fc8c1e Merge pull request #302 from tdas/blockmanager-fix
Blockmanager fix
2012-11-09 20:27:20 -08:00
Tathagata Das 9915989bfa Incorporated Matei's suggestions. Tested with 5 producer(consumer) threads each doing 50k puts (gets), took 15 minutes to run, no errors or deadlocks. 2012-11-09 15:46:15 -08:00
Tathagata Das de00bc63db Fixed deadlock in BlockManager.
1. Changed the lock structure of BlockManager by replacing the 337 coarse-grained locks with BlockInfo objects used as per-block fine-grained locks.
2. Changed the MemoryStore lock structure by making the block-putting threads lock on a different object (not the memory store), so that putting threads minimally block the getting threads.
3. Added spark.storage.ThreadingTest to stress test the BlockManager using 5 block producer and 5 block consumer threads.
2012-11-09 14:09:37 -08:00
Matei Zaharia 6607f546cc Added an option to spread out jobs in the standalone mode. 2012-11-08 23:13:12 -08:00
Matei Zaharia 66cbdee941 Fix for connections not being reused (from Josh Rosen) 2012-11-08 09:53:40 -08:00
Imran Rashid 809b2bb1fe fix bug in getting slave id out of mesos 2012-11-08 00:34:28 -08:00
Matei Zaharia bb1bce7924 Various fixes to standalone mode and web UI:
- Don't report a job as finishing multiple times
- Don't show state of workers as LOADING when they're running
- Show start and finish times in web UI
- Sort web UI tables by ID and time by default
2012-11-07 16:49:53 -08:00
Matei Zaharia e2b8477487 Made Akka timeout and message frame size configurable, and upped the defaults 2012-11-06 15:58:05 -08:00
Matei Zaharia dfce7e74a7 Merge pull request #298 from JoshRosen/fix/ec2-existing-cluster-check
Fix check for existing instances during spark-ec2 launch
2012-11-03 18:35:26 -07:00
Josh Rosen 594eed31c4 Fix check for existing instances during EC2 launch. 2012-11-03 17:02:47 -07:00
Matei Zaharia 590e4aa9cb Merge pull request #296 from shivaram/block-manager-fix
Remove unnecessary hash-map put in MemoryStore
2012-11-01 11:54:23 -07:00
Matei Zaharia 4a47d1a476 Merge pull request #297 from JoshRosen/fix/ec2-spot-instances
Cancel spot instance requests when exiting spark-ec2
2012-11-01 11:31:18 -07:00
Shivaram Venkataraman a7d967a1ca Remove unnecessary hash-map put in MemoryStore 2012-11-01 10:46:38 -07:00
Josh Rosen 96c9bcfd8d Cancel spot instance requests when exiting spark-ec2. 2012-10-30 23:32:38 -07:00
Matei Zaharia 51477e8874 Merge pull request #294 from JoshRosen/docs/quickstart
Fix minor typos in quickstart and Scala programming guides
2012-10-27 16:56:39 -07:00
Josh Rosen 33bea24f8e Fix Spark groupId in Scala Programming Guide. 2012-10-26 15:01:28 -07:00
root e782187b4a Don't throw an error in the block manager when a block is cached on the master due to
a locally computed operation
Conflicts:
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
2012-10-26 00:33:45 -07:00
Matei Zaharia f63a40fd99 Strip leading mesos:// in URLs passed to Mesos 2012-10-24 21:52:13 -07:00
Matei Zaharia d290e964ea Merge pull request #281 from rxin/memreport
Added a method to report slave memory status; force serialize accumulator update in local mode.
2012-10-23 22:04:35 -07:00
Matei Zaharia 0bd20c63e2 Merge remote-tracking branch 'JoshRosen/shuffle_refactoring' into dev
Conflicts:
	core/src/main/scala/spark/Dependency.scala
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/ShuffledRDD.scala
2012-10-23 22:01:45 -07:00
Matei Zaharia 7849216bba Merge pull request #286 from JoshRosen/ec2-error-handling
Allow EC2 script to stop/destroy cluster after master/slave failures
2012-10-23 21:15:43 -07:00
Matei Zaharia 46b87dfc3a Merge pull request #292 from tomdz/tweaked-run-file
Tweaked run file to live more happily with typesafe's debian package
2012-10-23 21:14:06 -07:00
Josh Rosen c4aa10154e Fix minor typos in quick start guide. 2012-10-23 13:49:52 -07:00
Thomas Dudziak f595bb53d1 Tweaked run file to live more happily with typesafe's debian package 2012-10-22 13:11:05 -07:00
Matei Zaharia 0967e71a00 Bump up version to 0.7.0-SNAPSHOT for master branch 2012-10-22 11:49:42 -07:00
Matei Zaharia 902a608187 Update version to 0.6.1-SNAPSHOT to show this is in development 2012-10-22 11:43:57 -07:00
Matei Zaharia 1be335e8fa Merge branch 'master' into dev 2012-10-21 00:05:02 -07:00
Matei Zaharia 15e95be2fd Merge pull request #285 from tomdz/cdh4-dev
Support for Hadoop 2 distributions such as cdh4
2012-10-20 23:35:01 -07:00
Matei Zaharia 6999724ce8 Fix a path in the web UI 2012-10-20 23:33:37 -07:00
Patrick Wendell 45430f2cb9 Merge pull request #290 from pwendell/dev
Two trivial commits to test JIRA integration
2012-10-19 23:18:44 -07:00
Patrick Wendell cd0936529b SPARK-581 #resolve Removing whitespace to test JIRA 2012-10-19 23:17:44 -07:00
Patrick Wendell d50028b345 Adding whitespace to test JIRA integration 2012-10-19 23:17:44 -07:00
Matei Zaharia bff5ceff53 Merge pull request #287 from rxin/startslave
Use SPARK_MASTER_IP if it is set in start-slaves.sh.
2012-10-19 01:12:05 -07:00
Reynold Xin f67bcbed07 Use SPARK_MASTER_IP if it is set in start-slaves.sh. 2012-10-19 01:08:23 -07:00
Thomas Dudziak d9c2a89c57 Support for Hadoop 2 distributions such as cdh4 2012-10-18 16:08:54 -07:00
Josh Rosen 365a4c1e68 Allow EC2 script to stop/destroy cluster after master/slave failures. 2012-10-18 10:36:50 -07:00
Reynold Xin 4a3fb06ac2 Updated Kryo to 2.20. 2012-10-16 01:10:01 -07:00
Reynold Xin 3b97124604 Changed Spark version back to 0.6.0 2012-10-15 21:39:51 -07:00
Reynold Xin 63fae9bc23 Serialize accumulator updates in TaskResult for local mode. 2012-10-15 21:38:28 -07:00
Reynold Xin 9087a1abef Changed version to 0.6.0-rxin. 2012-10-15 13:54:04 -07:00
Matei Zaharia 388a111153 Fix sbt assembly's merge rules 2012-10-15 10:21:16 -07:00
Reynold Xin 42d20fa8da Added a method to report slave memory status. 2012-10-14 22:30:53 -07:00