[SPARK-18264][SPARKR] build vignettes with package, update vignettes for CRAN release build and add info on release
## What changes were proposed in this pull request?

Changes to DESCRIPTION to build vignettes. Changes the metadata for vignettes to generate the recommended format (which is less than 10% of the previous size). Unfortunately it does not look as nice (before - left, after - right)

![image](https://cloud.githubusercontent.com/assets/8969467/20040492/b75883e6-a40d-11e6-9534-25cdd5d59a8b.png) ![image](https://cloud.githubusercontent.com/assets/8969467/20040490/a40f4d42-a40d-11e6-8c91-af00ddcbdad9.png)

Also adds information on how to run the build/release to CRAN later.

## How was this patch tested?

Manually, and with unit tests.

shivaram We need this for branch-2.1

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #15790 from felixcheung/rpkgvignettes.
This commit is contained in:
parent 6e95325fc3
commit ba23f768f7

R/CRAN_RELEASE.md (new file, 91 lines)
# SparkR CRAN Release

To release SparkR as a package to CRAN, we use the `devtools` package. Please work with the `dev@spark.apache.org` community and the R package maintainer on this.

### Release

First, check that the `Version:` field in the `pkg/DESCRIPTION` file is updated. Also, check for stale files not under source control.

Note that while `check-cran.sh` runs `R CMD check`, it does so with `--no-manual --no-vignettes`, which skips the vignette and PDF checks - it is therefore preferable to run `R CMD check` on the manually built source package before uploading a release.

To upload a release, we need to update `cran-comments.md`. This should generally contain the results from running the `check-cran.sh` script, along with comments on the status of any `WARNING` (there should not be any) or `NOTE`. As part of `check-cran.sh` and the release process, the vignettes are built - make sure `SPARK_HOME` is set and the Spark jars are accessible.
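The build scripts locate the Spark jars by checking for a `RELEASE` file, which marks a binary Spark distribution. Below is a standalone sketch of that check (not part of the build scripts; it uses a temporary directory and a hard-coded Scala version in place of a real `SPARK_HOME`):

```shell
# Sketch of how the build scripts find the Spark jars (illustrative paths).
# A RELEASE file at the top of SPARK_HOME marks a binary distribution.
DEMO_SPARK_HOME=$(mktemp -d)
touch "$DEMO_SPARK_HOME/RELEASE"

if [ -f "$DEMO_SPARK_HOME/RELEASE" ]; then
  SPARK_JARS_DIR="$DEMO_SPARK_HOME/jars"
else
  # In a source build the jars live under assembly/target (Scala version varies).
  SPARK_JARS_DIR="$DEMO_SPARK_HOME/assembly/target/scala-2.11/jars"
fi
echo "$SPARK_JARS_DIR"
```

In a source build, no `RELEASE` file exists and the jars are picked up from the assembly target directory instead.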
Once everything is in place, run in R under the `SPARK_HOME/R` directory:

```R
paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::release(); .libPaths(paths)
```
For more information, please refer to http://r-pkgs.had.co.nz/release.html#release-check

### Testing: build package manually

To build the package manually, such as to inspect the resulting `.tar.gz` file content, we also use the `devtools` package.

The source package is what gets released to CRAN. CRAN then builds platform-specific binary packages from the source package.

#### Build source package

To build the source package locally without releasing to CRAN, run in R under the `SPARK_HOME/R` directory:

```R
paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::build("pkg"); .libPaths(paths)
```
(http://r-pkgs.had.co.nz/vignettes.html#vignette-workflow-2)

Similarly, the source package is also created by `check-cran.sh` with `R CMD build pkg`.

For example, this should be the content of the source package:

```sh
DESCRIPTION  R      inst  tests
NAMESPACE    build  man   vignettes

inst/doc/
  sparkr-vignettes.html
  sparkr-vignettes.Rmd
  sparkr-vignettes.Rman

build/
  vignette.rds

man/
  *.Rd files...

vignettes/
  sparkr-vignettes.Rmd
```
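To inspect a built tarball without unpacking it, `tar -tzf` lists its contents. A sketch against a throwaway archive mimicking the layout above (directory and file names are illustrative, not the real package):

```shell
# Build a tiny archive with a source-package-like layout, then list it,
# as one would with `tar -tzf SparkR_2.1.0.tar.gz`.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/SparkR/vignettes" "$DEMO/SparkR/inst/doc"
touch "$DEMO/SparkR/DESCRIPTION" "$DEMO/SparkR/NAMESPACE" \
      "$DEMO/SparkR/vignettes/sparkr-vignettes.Rmd"
tar -C "$DEMO" -czf "$DEMO/SparkR_demo.tar.gz" SparkR
tar -tzf "$DEMO/SparkR_demo.tar.gz"
```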
#### Test source package

To install, run this:

```sh
R CMD INSTALL SparkR_2.1.0.tar.gz
```

with "2.1.0" replaced by the version of SparkR.
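The version string can be read straight from `pkg/DESCRIPTION`; this is the same `grep`/`awk` extraction that `check-cran.sh` uses, shown here against a small stand-in file (the DESCRIPTION content is illustrative):

```shell
# Extract the Version field as check-cran.sh does.
DEMO=$(mktemp -d)
cat > "$DEMO/DESCRIPTION" <<'EOF'
Package: SparkR
Version: 2.1.0
EOF
VERSION=$(grep Version "$DEMO/DESCRIPTION" | awk '{print $NF}')
echo "SparkR_${VERSION}.tar.gz"   # prints SparkR_2.1.0.tar.gz
```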
This command installs SparkR to the default libPaths. Once that is done, you should be able to start R and run:

```R
library(SparkR)
vignette("sparkr-vignettes", package="SparkR")
```
#### Build binary package

To build the binary package locally, run in R under the `SPARK_HOME/R` directory:

```R
paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); devtools::build("pkg", binary = TRUE); .libPaths(paths)
```

For example, this should be the content of the binary package:

```sh
DESCRIPTION  Meta       R     html     tests
INDEX        NAMESPACE  help  profile  worker
```
````diff
@@ -6,7 +6,7 @@ SparkR is an R package that provides a light-weight frontend to use Spark from R
 
 Libraries of sparkR need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
 By default the above script uses the system wide installation of R. However, this can be changed to any user installed location of R by setting the environment variable `R_HOME` the full path of the base directory where R is installed, before running install-dev.sh script.
 Example:
 ```bash
 # where /home/username/R is where R is installed and /home/username/R/bin contains the files R and RScript
 export R_HOME=/home/username/R
````
````diff
@@ -46,7 +46,7 @@ Sys.setenv(SPARK_HOME="/Users/username/spark")
 # This line loads SparkR from the installed directory
 .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
 library(SparkR)
-sc <- sparkR.init(master="local")
+sparkR.session()
 ```
 
 #### Making changes to SparkR
````
```diff
@@ -54,11 +54,11 @@ sc <- sparkR.init(master="local")
 The [instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) for making contributions to Spark also apply to SparkR.
 If you only make R file changes (i.e. no Scala changes) then you can just re-install the R package using `R/install-dev.sh` and test your changes.
 Once you have made your changes, please include unit tests for them and run existing unit tests using the `R/run-tests.sh` script as described below.
 
 #### Generating documentation
 
 The SparkR documentation (Rd files and HTML files) are not a part of the source repository. To generate them you can run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs and these packages need to be installed on the machine before using the script. Also, you may need to install these [prerequisites](https://github.com/apache/spark/tree/master/docs#prerequisites). See also, `R/DOCUMENTATION.md`
 
 ### Examples, Unit tests
 
 SparkR comes with several sample programs in the `examples/src/main/r` directory.
```
```diff
@@ -36,11 +36,27 @@ if [ ! -z "$R_HOME" ]
 fi
 echo "USING R_HOME = $R_HOME"
 
-# Build the latest docs
+# Build the latest docs, but not vignettes, which is built with the package next
 $FWDIR/create-docs.sh
 
-# Build a zip file containing the source package
-"$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg
+# Build source package with vignettes
+SPARK_HOME="$(cd "${FWDIR}"/..; pwd)"
+. "${SPARK_HOME}"/bin/load-spark-env.sh
+if [ -f "${SPARK_HOME}/RELEASE" ]; then
+  SPARK_JARS_DIR="${SPARK_HOME}/jars"
+else
+  SPARK_JARS_DIR="${SPARK_HOME}/assembly/target/scala-$SPARK_SCALA_VERSION/jars"
+fi
+
+if [ -d "$SPARK_JARS_DIR" ]; then
+  # Build a zip file containing the source package with vignettes
+  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg
+
+  find pkg/vignettes/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' -not -name '*.pdf' -not -name '*.html' -delete
+else
+  echo "Error Spark JARs not found in $SPARK_HOME"
+  exit 1
+fi
 
 # Run check as-cran.
 VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`
```
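The `find ... -delete` line added above cleans the vignettes directory after the build, keeping only sources and rendered output. A standalone sketch of the same pattern on a temporary directory (the filenames are illustrative):

```shell
# Keep only .Rmd/.md/.pdf/.html in a vignettes-like directory; delete build leftovers.
VIG=$(mktemp -d)
touch "$VIG/sparkr-vignettes.Rmd" "$VIG/sparkr-vignettes.html" \
      "$VIG/sparkr-vignettes.R" "$VIG/some-figure.png"
find "$VIG"/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' \
     -not -name '*.pdf' -not -name '*.html' -delete
ls "$VIG"   # only the .Rmd and .html remain
```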
```diff
@@ -54,11 +70,16 @@ fi
 
 if [ -n "$NO_MANUAL" ]
 then
-  CRAN_CHECK_OPTIONS=$CRAN_CHECK_OPTIONS" --no-manual"
+  CRAN_CHECK_OPTIONS=$CRAN_CHECK_OPTIONS" --no-manual --no-vignettes"
 fi
 
 echo "Running CRAN check with $CRAN_CHECK_OPTIONS options"
 
-"$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
+if [ -n "$NO_TESTS" ] && [ -n "$NO_MANUAL" ]
+then
+  "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
+else
+  # This will run tests and/or build vignettes, and require SPARK_HOME
+  SPARK_HOME="${SPARK_HOME}" "$R_SCRIPT_PATH/"R CMD check $CRAN_CHECK_OPTIONS SparkR_"$VERSION".tar.gz
+fi
 
 popd > /dev/null
```
```diff
@@ -20,7 +20,7 @@
 # Script to create API docs and vignettes for SparkR
 # This requires `devtools`, `knitr` and `rmarkdown` to be installed on the machine.
 
 # After running this script the html docs can be found in
 # $SPARK_HOME/R/pkg/html
+# The vignettes can be found in
+# $SPARK_HOME/R/pkg/vignettes/sparkr_vignettes.html
```
```diff
@@ -52,21 +52,4 @@ Rscript -e 'libDir <- "../../lib"; library(SparkR, lib.loc=libDir); library(knit
 
 popd
 
-# Find Spark jars.
-if [ -f "${SPARK_HOME}/RELEASE" ]; then
-  SPARK_JARS_DIR="${SPARK_HOME}/jars"
-else
-  SPARK_JARS_DIR="${SPARK_HOME}/assembly/target/scala-$SPARK_SCALA_VERSION/jars"
-fi
-
-# Only create vignettes if Spark JARs exist
-if [ -d "$SPARK_JARS_DIR" ]; then
-  # render creates SparkR vignettes
-  Rscript -e 'library(rmarkdown); paths <- .libPaths(); .libPaths(c("lib", paths)); Sys.setenv(SPARK_HOME=tools::file_path_as_absolute("..")); render("pkg/vignettes/sparkr-vignettes.Rmd"); .libPaths(paths)'
-
-  find pkg/vignettes/. -not -name '.' -not -name '*.Rmd' -not -name '*.md' -not -name '*.pdf' -not -name '*.html' -delete
-else
-  echo "Skipping R vignettes as Spark JARs not found in $SPARK_HOME"
-fi
-
 popd
```
```diff
@@ -1,8 +1,8 @@
 Package: SparkR
 Type: Package
 Title: R Frontend for Apache Spark
-Version: 2.0.0
-Date: 2016-08-27
+Version: 2.1.0
+Date: 2016-11-06
 Authors@R: c(person("Shivaram", "Venkataraman", role = c("aut", "cre"),
              email = "shivaram@cs.berkeley.edu"),
             person("Xiangrui", "Meng", role = "aut",
@@ -18,7 +18,9 @@ Depends:
 Suggests:
     testthat,
     e1071,
-    survival
+    survival,
+    knitr,
+    rmarkdown
 Description: The SparkR package provides an R frontend for Apache Spark.
 License: Apache License (== 2.0)
 Collate:
@@ -48,3 +50,4 @@ Collate:
     'utils.R'
     'window.R'
 RoxygenNote: 5.0.1
+VignetteBuilder: knitr
```
```diff
@@ -1,12 +1,13 @@
 ---
 title: "SparkR - Practical Guide"
 output:
-  html_document:
-    theme: united
+  rmarkdown::html_vignette:
     toc: true
     toc_depth: 4
-    toc_float: true
-    highlight: textmate
+vignette: >
+  %\VignetteIndexEntry{SparkR - Practical Guide}
+  %\VignetteEngine{knitr::rmarkdown}
+  \usepackage[utf8]{inputenc}
 ---
 
 ## Overview
```