[Spark-4848] Allow different Worker configurations in standalone cluster

This redoes #3699 against the latest code.
This fixes SPARK-4848.

I've changed the standalone cluster scripts to allow different workers to run different numbers of instances, with both the port and the web-UI port following along appropriately.

I did this by moving the loop over instances from start-slaves and stop-slaves (on the master) to start-slave and stop-slave (on the worker).

While I was at it, I changed SPARK_WORKER_PORT to work the same way as SPARK_WORKER_WEBUI_PORT, since the new method works fine for both.
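As an example of the intended behavior, the settings below would start three worker instances on one machine, on ports 7078-7080 with web UIs on 8081-8083 (a sketch based on the port arithmetic in the new start-slave.sh; the master URL is illustrative):

    # conf/spark-env.sh on the worker machine
    SPARK_WORKER_INSTANCES=3
    SPARK_WORKER_PORT=7078        # instances get 7078, 7079, 7080
    SPARK_WORKER_WEBUI_PORT=8081  # web UIs get 8081, 8082, 8083

    # then, on that machine:
    sbin/start-slave.sh spark://master-host:7077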

Author: Nathan Kronenfeld <nkronenfeld@oculusinfo.com>

Closes #5140 from nkronenfeld/feature/spark-4848 and squashes the following commits:

cf5f47e [Nathan Kronenfeld] Merge remote branch 'upstream/master' into feature/spark-4848
044ca6f [Nathan Kronenfeld] Documentation and formatting as requested by andrewor14
d739640 [Nathan Kronenfeld] Move looping through instances from the master to the workers, so that each worker respects its own number of instances and web-ui port
Nathan Kronenfeld, 2015-04-13 18:21:16 -07:00 (committed by Andrew Or)
parent 4898dfa464
commit 435b8779df
4 changed files with 103 additions and 22 deletions

sbin/start-slave.sh

@@ -18,15 +18,68 @@
 #
 # Starts a slave on the machine this script is executed on.
 #
+# Environment Variables
+#
+#   SPARK_WORKER_INSTANCES  The number of worker instances to run on this
+#                           slave.  Default is 1.
+#   SPARK_WORKER_PORT       The base port number for the first worker.  If set,
+#                           subsequent workers will increment this number.  If
+#                           unset, Spark will find a valid port number, but
+#                           with no guarantee of a predictable pattern.
+#   SPARK_WORKER_WEBUI_PORT The base port for the web interface of the first
+#                           worker.  Subsequent workers will increment this
+#                           number.  Default is 8081.

-usage="Usage: start-slave.sh <worker#> <spark-master-URL> where <spark-master-URL> is like spark://localhost:7077"
+usage="Usage: start-slave.sh <spark-master-URL> where <spark-master-URL> is like spark://localhost:7077"

-if [ $# -lt 2 ]; then
+if [ $# -lt 1 ]; then
   echo $usage
   echo Called as start-slave.sh $*
   exit 1
 fi

 sbin="`dirname "$0"`"
 sbin="`cd "$sbin"; pwd`"

-"$sbin"/spark-daemon.sh start org.apache.spark.deploy.worker.Worker "$@"
+. "$sbin/spark-config.sh"
+
+. "$SPARK_PREFIX/bin/load-spark-env.sh"
+
+# First argument should be the master; we need to store it aside because we may
+# need to insert arguments between it and the other arguments
+MASTER=$1
+shift
+
+# Determine desired worker port
+if [ "$SPARK_WORKER_WEBUI_PORT" = "" ]; then
+  SPARK_WORKER_WEBUI_PORT=8081
+fi
+
+# Start up the appropriate number of workers on this machine.
+# quick local function to start a worker
+function start_instance {
+  WORKER_NUM=$1
+  shift
+
+  if [ "$SPARK_WORKER_PORT" = "" ]; then
+    PORT_FLAG=
+    PORT_NUM=
+  else
+    PORT_FLAG="--port"
+    PORT_NUM=$(( $SPARK_WORKER_PORT + $WORKER_NUM - 1 ))
+  fi
+  WEBUI_PORT=$(( $SPARK_WORKER_WEBUI_PORT + $WORKER_NUM - 1 ))
+
+  "$sbin"/spark-daemon.sh start org.apache.spark.deploy.worker.Worker $WORKER_NUM \
+     --webui-port "$WEBUI_PORT" $PORT_FLAG $PORT_NUM $MASTER "$@"
+}
+
+if [ "$SPARK_WORKER_INSTANCES" = "" ]; then
+  start_instance 1 "$@"
+else
+  for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do
+    start_instance $(( 1 + $i )) "$@"
+  done
+fi
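To make the port arithmetic concrete: with SPARK_WORKER_PORT=7078 and SPARK_WORKER_WEBUI_PORT=8081, the second instance (WORKER_NUM=2) is launched roughly as below (a sketch; spark://master-host:7077 is an illustrative master URL):

    "$sbin"/spark-daemon.sh start org.apache.spark.deploy.worker.Worker 2 \
       --webui-port 8082 --port 7079 spark://master-host:7077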

sbin/start-slaves.sh

@@ -59,13 +59,4 @@ if [ "$START_TACHYON" == "true" ]; then
 fi

 # Launch the slaves
-if [ "$SPARK_WORKER_INSTANCES" = "" ]; then
-  exec "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" 1 "spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT"
-else
-  if [ "$SPARK_WORKER_WEBUI_PORT" = "" ]; then
-    SPARK_WORKER_WEBUI_PORT=8081
-  fi
-  for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do
-    "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" $(( $i + 1 )) --webui-port $(( $SPARK_WORKER_WEBUI_PORT + $i )) "spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT"
-  done
-fi
+"$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin/start-slave.sh" "spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT"

sbin/stop-slave.sh (new executable file, 43 lines)

@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# A shell script to stop all workers on a single slave
+#
+# Environment variables
+#
+#   SPARK_WORKER_INSTANCES The number of worker instances that should be
+#      running on this slave.  Default is 1.
+
+# Usage: stop-slave.sh
+#   Stops all slaves on this worker machine
+
+sbin="`dirname "$0"`"
+sbin="`cd "$sbin"; pwd`"
+
+. "$sbin/spark-config.sh"
+
+. "$SPARK_PREFIX/bin/load-spark-env.sh"
+
+if [ "$SPARK_WORKER_INSTANCES" = "" ]; then
+  "$sbin"/spark-daemon.sh stop org.apache.spark.deploy.worker.Worker 1
+else
+  for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do
+    "$sbin"/spark-daemon.sh stop org.apache.spark.deploy.worker.Worker $(( $i + 1 ))
+  done
+fi
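Usage is then symmetric with start-slave.sh: running the script with no arguments on a worker machine stops every instance started there (a sketch):

    # on the worker machine; reads SPARK_WORKER_INSTANCES from spark-env.sh
    sbin/stop-slave.sh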

sbin/stop-slaves.sh

@@ -17,8 +17,8 @@
 # limitations under the License.
 #

-sbin=`dirname "$0"`
-sbin=`cd "$sbin"; pwd`
+sbin="`dirname "$0"`"
+sbin="`cd "$sbin"; pwd`"

 . "$sbin/spark-config.sh"
@@ -29,10 +29,4 @@ if [ -e "$sbin"/../tachyon/bin/tachyon ]; then
   "$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin"/../tachyon/bin/tachyon killAll tachyon.worker.Worker
 fi

-if [ "$SPARK_WORKER_INSTANCES" = "" ]; then
-  "$sbin"/spark-daemons.sh stop org.apache.spark.deploy.worker.Worker 1
-else
-  for ((i=0; i<$SPARK_WORKER_INSTANCES; i++)); do
-    "$sbin"/spark-daemons.sh stop org.apache.spark.deploy.worker.Worker $(( $i + 1 ))
-  done
-fi
+"$sbin/slaves.sh" cd "$SPARK_HOME" \; "$sbin"/stop-slave.sh