Commit graph

2 commits

Author SHA1 Message Date
Felix Cheung fc472bddd1 [SPARK-20543][SPARKR] skip tests when running on CRAN
## What changes were proposed in this pull request?

General rule on whether to skip: skip a test if (a minimal sketch follows the list)
- it is an RDD test
- it could run long or is complicated (streaming, HiveContext)
- it exercises error conditions
- it is unlikely to change/break
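
A minimal sketch of such a guard, using testthat's `skip_on_cran()`, which skips whenever the `NOT_CRAN` environment variable is unset; the test body here is a placeholder, not code from this PR:
```r
library(testthat)

test_that("RDD operations (skipped on CRAN)", {
  skip_on_cran()            # no-op locally; skips under R CMD check --as-cran
  expect_equal(1 + 1, 2)    # placeholder body; the real tests exercise RDDs
})
```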

## How was this patch tested?

unit tests, `R CMD check --as-cran`, `R CMD check`

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #17817 from felixcheung/rskiptest.
2017-05-03 21:40:18 -07:00
Yanbo Liang acb9715779 [SPARK-18444][SPARKR] SparkR running in yarn-cluster mode should not download Spark package.
## What changes were proposed in this pull request?
When a SparkR job runs in yarn-cluster mode, it downloads the Spark package from the Apache website, which is not necessary.
```
./bin/spark-submit --master yarn-cluster ./examples/src/main/r/dataframe.R
```
The following is the output:
```
Attaching package: ‘SparkR’

The following objects are masked from ‘package:stats’:

    cov, filter, lag, na.omit, predict, sd, var, window

The following objects are masked from ‘package:base’:

    as.data.frame, colnames, colnames<-, drop, endsWith, intersect,
    rank, rbind, sample, startsWith, subset, summary, transform, union

Spark not found in SPARK_HOME:
Spark not found in the cache directory. Installation will start.
MirrorUrl not provided.
Looking for preferred site from apache website...
......
```
There is no `SPARK_HOME` in yarn-cluster mode, since the R process runs on a remote host in the YARN cluster rather than on the client host. The JVM comes up first and the R process then connects to it, so in such cases we should never need to download Spark: it is already running.
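
As a hedged sketch of the guard this change implies (the function name here is illustrative, not necessarily the PR's exact code):
```r
# Only download Spark when there is no SPARK_HOME and we are not in
# yarn-cluster mode, where the JVM is already up on a remote host.
maybeInstallSpark <- function(master, deployMode) {
  sparkHome <- Sys.getenv("SPARK_HOME")
  if (nzchar(sparkHome)) {
    return(sparkHome)                 # Spark already available locally
  }
  if (master == "yarn-cluster" ||
      (master == "yarn" && deployMode == "cluster")) {
    return(NULL)                      # remote JVM is running; skip the download
  }
  SparkR::install.spark()             # client mode: fetch and cache the package
}
```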

## How was this patch tested?
Offline test.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #15888 from yanboliang/spark-18444.
2016-11-22 00:05:30 -08:00