Update spark_ec2 to use 0.9.0 by default
Backports change from branch-0.9
Author: Shivaram Venkataraman <shivaram@eecs.berkeley.edu>
Closes #598 and squashes the following commits:
f6d3ed0 [Shivaram Venkataraman] Update spark_ec2 to use 0.9.0 by default Backports change from branch-0.9
The number of disks for the c3 instance types was taken from here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html#StorageOnInstanceTypes
Author: Christian Lundgren <christian.lundgren@gameanalytics.com>
Closes #595 from chrisavl/branch-0.9 and squashes the following commits:
c8af5f9 [Christian Lundgren] Add c3 instance types to Spark EC2
(cherry picked from commit 19b4bb2b44)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
ssh commands need the -t argument repeated twice if there is no local
tty, e.g. if the process running spark-ec2 uses nohup and the parent
process exits.
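The fix above can be sketched as follows. This is a minimal illustration, not the actual spark-ec2 code: the `ssh_command` helper and its arguments are hypothetical, but it shows the key point that `-t` must appear twice so ssh forces tty allocation even when stdin is not a local tty.

```python
def ssh_command(host, identity_file, command):
    # "-t -t" (repeated) forces pseudo-tty allocation even when the
    # local process has no tty, e.g. when spark-ec2 runs under nohup
    # and its parent has exited. A single "-t" only requests a tty
    # if stdin is itself a tty.
    return ["ssh", "-t", "-t",
            "-i", identity_file,
            "-o", "StrictHostKeyChecking=no",
            host, command]
```

The returned list can be passed straight to `subprocess.check_call`.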
Under unknown, but occasional, circumstances, reservation.groups is empty
despite reservation.instances each having groups. This means that the
spark_ec2 get_existing_clusters() method would fail to find any instances.
To fix it, we simply use the instances' groups as the source of truth.
Note that this is actually just a revival of PR #827, now that the issue
has been reproduced.
Right now the wording makes it seem like something has gone wrong when this message is printed out. In fact this is a normal condition, so I changed the message a bit.
- Use SPARK_PUBLIC_DNS environment variable if set (for EC2)
- Use a non-ephemeral port (3030 instead of 33000) by default
- Updated test to use non-ephemeral port too
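The address-selection logic described above can be sketched like this. It is a simplified illustration, assuming a helper of this shape rather than quoting the actual implementation: prefer `SPARK_PUBLIC_DNS` when set (on EC2 the externally reachable hostname differs from the one the instance sees locally), and fall back to the local hostname.

```python
import os
import socket

# 3030 is outside the ephemeral port range on common systems,
# unlike the old default of 33000.
DEFAULT_UI_PORT = 3030

def public_address():
    # SPARK_PUBLIC_DNS overrides the locally resolved hostname,
    # which on EC2 is not reachable from outside the instance.
    return os.environ.get("SPARK_PUBLIC_DNS") or socket.gethostname()
```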