spark-instrumented-optimizer/sbin/slaves.sh
Aaron Davidson 007a733434 SPARK-1286: Make usage of spark-env.sh idempotent
Various spark scripts load spark-env.sh. This can cause growth of any variables that may be appended to (SPARK_CLASSPATH, SPARK_REPL_OPTS) and it makes the precedence order for options specified in spark-env.sh less clear.

One use-case for the latter is that we want to set options from the command-line of spark-shell, but these options will be overridden by subsequent loading of spark-env.sh. If we were to load the spark-env.sh first and then set our command-line options, we could guarantee correct precedence order.

Note that we use SPARK_CONF_DIR if available to support the sbin/ scripts, which always set this variable from sbin/spark-config.sh. Otherwise, we default to the ../conf/ as usual.

Author: Aaron Davidson <aaron@databricks.com>

Closes #184 from aarondav/idem and squashes the following commits:

e291f91 [Aaron Davidson] Use "private" variables in load-spark-env.sh
8da8360 [Aaron Davidson] Add .sh extension to load-spark-env.sh
93a2471 [Aaron Davidson] SPARK-1286: Make usage of spark-env.sh idempotent
2014-03-24 22:24:21 -07:00

90 lines
2.4 KiB
Bash
Executable file

#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# Run a shell command on all slave hosts.
#
# Environment Variables
#
# SPARK_SLAVES File naming remote hosts.
# Default is ${SPARK_CONF_DIR}/slaves.
# SPARK_CONF_DIR Alternate conf dir. Default is ${SPARK_HOME}/conf.
# SPARK_SLAVE_SLEEP Seconds to sleep between spawning remote commands.
# SPARK_SSH_OPTS Options passed to ssh when running remote commands.
##
usage="Usage: slaves.sh [--config <conf-dir>] command..."
# if no args specified, show usage
if [ $# -le 0 ]; then
echo $usage
exit 1
fi
sbin=`dirname "$0"`
sbin=`cd "$sbin"; pwd`
. "$sbin/spark-config.sh"
# If the slaves file is specified in the command line,
# then it takes precedence over the definition in
# spark-env.sh. Save it here.
HOSTLIST=$SPARK_SLAVES
# Check if --config is passed as an argument. It is an optional parameter.
# Exit if the argument is not a directory.
if [ "$1" == "--config" ]
then
shift
conf_dir=$1
if [ ! -d "$conf_dir" ]
then
echo "ERROR : $conf_dir is not a directory"
echo $usage
exit 1
else
export SPARK_CONF_DIR=$conf_dir
fi
shift
fi
. "$SPARK_PREFIX/bin/load-spark-env.sh"
if [ "$HOSTLIST" = "" ]; then
if [ "$SPARK_SLAVES" = "" ]; then
export HOSTLIST="${SPARK_CONF_DIR}/slaves"
else
export HOSTLIST="${SPARK_SLAVES}"
fi
fi
# By default disable strict host key checking
if [ "$SPARK_SSH_OPTS" = "" ]; then
SPARK_SSH_OPTS="-o StrictHostKeyChecking=no"
fi
for slave in `cat "$HOSTLIST"|sed "s/#.*$//;/^$/d"`; do
ssh $SPARK_SSH_OPTS $slave $"${@// /\\ }" \
2>&1 | sed "s/^/$slave: /" &
if [ "$SPARK_SLAVE_SLEEP" != "" ]; then
sleep $SPARK_SLAVE_SLEEP
fi
done
wait