1c403733b8
## What changes were proposed in this pull request?

This PR corrects SparkR to use `shell()` instead of `system2()` on Windows. `system2(...)` on Windows does not process the Windows file separator `\`; `shell(translate = TRUE, ...)` handles this, so the call is now chosen according to the OS.

Existing tests failed on Windows due to this problem, for example:

```
8. Failure: sparkJars tag in SparkContext (test_includeJAR.R#34)
9. Failure: sparkJars tag in SparkContext (test_includeJAR.R#36)
```

The cases above were due to the use of `system2`. In addition, this PR fixes some other tests that failed on Windows:

```
5. Failure: sparkJars sparkPackages as comma-separated strings (test_context.R#128)
6. Failure: sparkJars sparkPackages as comma-separated strings (test_context.R#131)
7. Failure: sparkJars sparkPackages as comma-separated strings (test_context.R#134)
```

These were due to an odd behaviour of `normalizePath()`: on Linux, if the path does not exist, it simply returns the input, but on Windows it returns the input prefixed with the current directory.

```r
# On Linux
path <- normalizePath("aa")
print(path)
[1] "aa"

# On Windows
path <- normalizePath("aa")
print(path)
[1] "C:\\Users\\aa"
```

## How was this patch tested?

Jenkins tests, and manually tested on a Windows machine. Here is the [stdout](https://gist.github.com/HyukjinKwon/4bf35184f3a30f3bce987a58ec2bbbab) of the testing.

Closes #7025

Author: hyukjinkwon <gurwls223@gmail.com>
Author: Hyukjin Kwon <gurwls223@gmail.com>
Author: Prakash PC <prakash.chinnu@gmail.com>

Closes #13165 from HyukjinKwon/pr/7025.
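The OS-dependent dispatch and the `normalizePath()` workaround described above can be sketched roughly as follows. This is a hedged illustration only, not the actual SparkR code; `launchScript` is a hypothetical helper name introduced here:

```r
# Hypothetical sketch of the fix's idea: pick the launcher by OS.
launchScript <- function(script, args) {
  if (.Platform$OS.type == "windows") {
    # shell() runs the command through cmd.exe; translate = TRUE
    # converts "/" to "\" so Windows file separators are handled.
    shell(paste(c(script, args), collapse = " "), translate = TRUE)
  } else {
    system2(script, args)
  }
}

# The normalizePath() quirk: for a nonexistent path, Linux returns the
# input (with a warning), while Windows prepends the current directory.
# mustWork = FALSE keeps the call from erroring on either platform.
p <- suppressWarnings(normalizePath("aa", mustWork = FALSE))
```

The key point is that the branch is decided once by `.Platform$OS.type`, so callers never need to know which launcher is underneath.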
## Building SparkR on Windows
To build SparkR on Windows, the following steps are required:
1. Install R (>= 3.1) and [Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to include Rtools and R in `PATH`.
2. Install [JDK7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html) and set `JAVA_HOME` in the system environment variables.
3. Download and install [Maven](http://maven.apache.org/download.html). Also include Maven's `bin` directory in `PATH`.
4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
5. Open a command shell (`cmd`) in the Spark directory and run `mvn -DskipTests -Psparkr package`.
## Unit tests
To run the SparkR unit tests on Windows, the following steps are required, assuming you are in the Spark root directory and do not have Apache Hadoop installed already:
1. Create a folder to download Hadoop related files for Windows. For example, `cd ..` and `mkdir hadoop`.
2. Download the relevant Hadoop bin package from [steveloughran/winutils](https://github.com/steveloughran/winutils). While these are not official ASF artifacts, they are built from the ASF release git hashes by a Hadoop PMC member on a dedicated Windows VM. For further reading, consult [Windows Problems on the Hadoop wiki](https://wiki.apache.org/hadoop/WindowsProblems).
3. Install the files into `hadoop\bin`; make sure that `winutils.exe` and `hadoop.dll` are present.
4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.
5. Run unit tests for SparkR by running the command below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
```
R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.default.name="file:///" R\pkg\tests\run-all.R
```
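As a concrete illustration of the environment the steps above build up (`JAVA_HOME` from the build steps, `HADOOP_HOME` from the test steps), a `cmd` session might be configured like this. All paths here are hypothetical examples; adjust them to your actual install locations:

```
set JAVA_HOME=C:\Program Files\Java\jdk1.7.0_80
set HADOOP_HOME=C:\hadoop
set PATH=%HADOOP_HOME%\bin;%JAVA_HOME%\bin;%PATH%
```

With `%HADOOP_HOME%\bin` on `PATH`, the `winutils.exe` and `hadoop.dll` installed in step 3 are found by Spark's Hadoop layer when the tests run.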