acb9715779
## What changes were proposed in this pull request?

When running a SparkR job in yarn-cluster mode, it downloads the Spark package from the Apache website, which is not necessary.

```
./bin/spark-submit --master yarn-cluster ./examples/src/main/r/dataframe.R
```

The output is the following:

```
Attaching package: ‘SparkR’

The following objects are masked from ‘package:stats’:

    cov, filter, lag, na.omit, predict, sd, var, window

The following objects are masked from ‘package:base’:

    as.data.frame, colnames, colnames<-, drop, endsWith, intersect,
    rank, rbind, sample, startsWith, subset, summary, transform, union

Spark not found in SPARK_HOME:
Spark not found in the cache directory. Installation will start.
MirrorUrl not provided.
Looking for preferred site from apache website...
......
```

There is no `SPARK_HOME` in yarn-cluster mode, since the R process runs on a remote host of the YARN cluster rather than on the client host. The JVM comes up first and the R process then connects to it, so in such cases we should never have to download Spark, as Spark is already running.

## How was this patch tested?

Offline test.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #15888 from yanboliang/spark-18444.
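To make the intended behavior concrete, here is a minimal R sketch of the decision described above and exercised by the `sparkCheckInstall` tests below. The branch conditions are assumptions inferred from the PR description and the test cases, not the exact SparkR implementation:

```r
# Hedged sketch only: an approximation of the check the tests below exercise.
# The real SparkR code may differ in structure and error messages.
sparkCheckInstall <- function(sparkHome, master, deployMode) {
  if (nzchar(sparkHome)) {
    # SPARK_HOME is set (e.g. local or client mode launched via spark-submit):
    # Spark is already available locally, so there is nothing to install.
    NULL
  } else if (grepl("client", master) || deployMode == "client") {
    # Client-side modes without SPARK_HOME: Spark would have to be
    # downloaded/installed here; the sketch just signals that case.
    stop("Spark not found in SPARK_HOME; an installation would be required.")
  } else {
    # yarn-cluster / mesos-cluster: master and deployMode are not visible to
    # the R process, which runs next to an already running JVM on the
    # cluster, so never attempt to download Spark.
    NULL
  }
}
```

The key point is the final branch: when neither `SPARK_HOME` nor a client-mode indicator is present, the code assumes it is running inside an already provisioned cluster and returns without downloading anything.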
47 lines
1.7 KiB
R
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

context("functions in sparkR.R")

test_that("sparkCheckInstall", {
  # "local, yarn-client, mesos-client" mode, SPARK_HOME was set correctly,
  # and the SparkR job was submitted by "spark-submit"
  sparkHome <- paste0(tempdir(), "/", "sparkHome")
  dir.create(sparkHome)
  master <- ""
  deployMode <- ""
  expect_true(is.null(sparkCheckInstall(sparkHome, master, deployMode)))
  unlink(sparkHome, recursive = TRUE)

  # "yarn-cluster, mesos-cluster" mode, SPARK_HOME was not set,
  # and the SparkR job was submitted by "spark-submit"
  sparkHome <- ""
  master <- ""
  deployMode <- ""
  expect_true(is.null(sparkCheckInstall(sparkHome, master, deployMode)))

  # "yarn-client, mesos-client" mode, SPARK_HOME was not set
  sparkHome <- ""
  master <- "yarn-client"
  deployMode <- ""
  expect_error(sparkCheckInstall(sparkHome, master, deployMode))
  sparkHome <- ""
  master <- ""
  deployMode <- "client"
  expect_error(sparkCheckInstall(sparkHome, master, deployMode))
})
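As a usage note, the helper can also be poked at from an interactive R session; the call below is a hypothetical invocation (`sparkCheckInstall` is not exported, hence the `:::` access), mirroring the yarn-cluster case from the test above:

```r
library(SparkR)

# Hypothetical interactive check of the yarn-cluster case: no SPARK_HOME and
# no master/deployMode visible to the R process; the test above expects NULL.
SparkR:::sparkCheckInstall("", "", "")
```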