Commit graph

10 commits

hyukjinkwon c2aeddf9ea [SPARK-22817][R] Use fixed testthat version for SparkR tests in AppVeyor
## What changes were proposed in this pull request?

`testthat` 2.0.0 was released, and AppVeyor has started to use it instead of 1.0.2. As a result, the R tests in AppVeyor started failing. See https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1967-master

```
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) :
  object 'run_tests' not found
Calls: ::: -> get
```

This seems to be because we rely on the internal `testthat:::run_tests` here:

https://github.com/r-lib/testthat/blob/v1.0.2/R/test-package.R#L62-L75

dc4c351837/R/pkg/tests/run-all.R (L49-L52)

However, it seems it was removed in 2.0.0. I tried a few other exposed APIs such as `test_dir`, but I failed to come up with a good compatible fix.

It seems we had better pin the `testthat` version first to make the build pass.
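A minimal sketch of the pinning step (assuming the `devtools` package is available; the exact command used in the AppVeyor scripts may differ):

```r
# Hedged sketch: pin testthat to the last 1.x release so that the internal
# testthat:::run_tests entry point used by run-all.R still exists.
# Assumes devtools is installed; the actual AppVeyor install step may differ.
devtools::install_version("testthat", version = "1.0.2",
                          repos = "https://cloud.r-project.org")
packageVersion("testthat")  # should print '1.0.2'
```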

## How was this patch tested?

Manually tested and AppVeyor tests.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #20003 from HyukjinKwon/SPARK-22817.
2017-12-17 14:40:41 +09:00
Jakub Nowacki b4edafa99b [SPARK-22495] Fix setup of SPARK_HOME variable on Windows
## What changes were proposed in this pull request?

This fixes the way `SPARK_HOME` is resolved on Windows. While the previous version worked with the built release download, the directory layout changed slightly for PySpark `pip` or `conda` installs. This has been reflected in the Linux scripts in `bin` but not in the Windows `cmd` files.

The first fix improves the way the `jars` directory is found, as this was stopping the Windows `pip/conda` install from working; JARs were not found during Session/Context setup.

The second fix adds a `find-spark-home.cmd` script, which, like the Linux version, uses the `find_spark_home.py` script to resolve `SPARK_HOME`. It is based on the `find-spark-home` bash script, though some operations are done in a different order due to the limitations of the `cmd` script language. If the `SPARK_HOME` environment variable is already set, the Python script `find_spark_home.py` will not be run. The process can fail if Python is not installed, but this path is mostly taken when PySpark is installed via `pip/conda`, in which case some Python is present on the system.
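Purely as an illustration of the resolution order (the real logic lives in `find-spark-home.cmd`; `resolve_spark_home` is a hypothetical name), an equivalent sketch in R:

```r
# Hypothetical sketch, not the actual cmd script: respect SPARK_HOME when it
# is already set; otherwise fall back to the bundled find_spark_home.py.
# Assumes a `python` executable is on PATH (usually true for pip/conda installs).
resolve_spark_home <- function() {
  home <- Sys.getenv("SPARK_HOME")
  if (nzchar(home)) {
    return(home)
  }
  system2("python", "find_spark_home.py", stdout = TRUE)
}
```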

## How was this patch tested?

Tested on local installation.

Author: Jakub Nowacki <j.s.nowacki@gmail.com>

Closes #19370 from jsnowacki/fix_spark_cmds.
2017-11-23 12:47:38 +09:00
Felix Cheung 828fab0356 [BUILD][TEST][SPARKR] add sparksubmitsuite to appveyor tests
## What changes were proposed in this pull request?

Extend the file regex so that changes to `SparkSubmitSuite` also trigger the AppVeyor tests.

## How was this patch tested?

Jenkins, AppVeyor

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #19177 from felixcheung/rmoduletotest.
2017-09-11 09:32:25 +09:00
hyukjinkwon 75a6d05853 [MINOR][R] Add knitr and rmarkdown packages/improve output for version info in AppVeyor tests
## What changes were proposed in this pull request?

This PR proposes three things as below:

**Install packages per documentation** - to my knowledge, this does not affect the tests themselves (only CRAN checks, which we are not doing via AppVeyor).

This adds `knitr` and `rmarkdown` per 45824fb608/R/WINDOWS.md (unit-tests) (please see 45824fb608)
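A minimal sketch of the corresponding install step (the exact command in the AppVeyor scripts may differ):

```r
# Hedged sketch: install the two documentation packages per R/WINDOWS.md.
install.packages(c("knitr", "rmarkdown"), repos = "https://cloud.r-project.org")
```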

**Improve logs/shorten logs** - actually, long logs can be a problem on AppVeyor (e.g., see https://github.com/apache/spark/pull/17873)

`R -e ...` repeats printing the R startup information for each invocation, as below:

```
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
```

It looks like reducing the number of invocations is slightly better, and printing the versions together is more readable.

Before:

```
# R information ...
> packageVersion('testthat')
[1] '1.0.2'
>
>

# R information ...
> packageVersion('e1071')
[1] '1.6.8'
>
>
... 3 more times
```

After:

```
# R information ...
> packageVersion('knitr'); packageVersion('rmarkdown'); packageVersion('testthat'); packageVersion('e1071'); packageVersion('survival')
[1] ‘1.16’
[1] ‘1.6’
[1] ‘1.0.2’
[1] ‘1.6.8’
[1] ‘2.41.3’
```

**Add `appveyor.yml`/`dev/appveyor-install-dependencies.ps1` to the files that trigger the test**

Changing these files might break the test, e.g., https://github.com/apache/spark/pull/16927

## How was this patch tested?

Before: please see https://ci.appveyor.com/project/HyukjinKwon/spark/build/169-master
After: please see the AppVeyor build in this PR.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #18336 from HyukjinKwon/minor-add-knitr-and-rmarkdown.
2017-06-18 08:43:47 +01:00
Felix Cheung 7087e01194 [SPARK-20543][SPARKR][FOLLOWUP] Don't skip tests on AppVeyor
## What changes were proposed in this pull request?

Add the environment setting so the tests are not skipped on AppVeyor.

## How was this patch tested?

Waited for the AppVeyor run.

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #17878 from felixcheung/appveyorrcran.
2017-05-07 13:10:10 -07:00
hyukjinkwon b433acae74 [SPARK-20614][PROJECT INFRA] Use the same log4j configuration with Jenkins in AppVeyor
## What changes were proposed in this pull request?

Currently, the logs flood the AppVeyor console. This has been fine because we can download all the logs. However, from my observations so far, logs are truncated when there are too many. The log volume has grown recently, and the logs have started to get truncated. For example, see https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1209-master

Even after the log is downloaded, it looks truncated as below:

```
[00:44:21] 17/05/04 18:56:18 INFO TaskSetManager: Finished task 197.0 in stage 601.0 (TID 9211) in 0 ms on localhost (executor driver) (194/200)
[00:44:21] 17/05/04 18:56:18 INFO Executor: Running task 199.0 in stage 601.0 (TID 9213)
[00:44:21] 17/05/04 18:56:18 INFO Executor: Finished task 198.0 in stage 601.0 (TID 9212). 2473 bytes result sent to driver
...
```

It probably looks better to use the same log4j configuration that we use for the SparkR tests in Jenkins (please see fc472bddd1/R/run-tests.sh (L26) and fc472bddd1/R/log4j.properties):
```
# Set everything to be logged to the file target/unit-tests.log
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.append=true
log4j.appender.file.file=R/target/unit-tests.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss.SSS} %t %p %c{1}: %m%n

# Ignore messages below warning level from Jetty, because it's a bit verbose
log4j.logger.org.eclipse.jetty=WARN
org.eclipse.jetty.LEVEL=WARN
```

## How was this patch tested?

Manually tested with spark-test account
  - https://ci.appveyor.com/project/spark-test/spark/build/672-r-log4j (there is an example for flaky test here)
  - https://ci.appveyor.com/project/spark-test/spark/build/673-r-log4j (I re-ran the build).

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #17873 from HyukjinKwon/appveyor-reduce-logs.
2017-05-05 21:26:55 -07:00
hyukjinkwon 2422c86f2c [SPARK-20092][R][PROJECT INFRA] Add the detection for Scala codes dedicated for R in AppVeyor tests
## What changes were proposed in this pull request?

We currently detect changes in the `R/` directory only and then trigger the AppVeyor tests.

It seems we need to run the tests when there is Scala code dedicated to R in `core/src/main/scala/org/apache/spark/api/r/`, `sql/core/src/main/scala/org/apache/spark/sql/api/r/` and `mllib/src/main/scala/org/apache/spark/ml/r/` too.

This will enable the tests, for example, for SPARK-20088.
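The actual filter lives in the AppVeyor configuration; purely for illustration, an equivalent check in R (with the hypothetical name `triggers_r_tests`) could look like:

```r
# Hypothetical illustration of the path filter (the real one is in the
# AppVeyor config): a change triggers the SparkR tests if the touched file
# falls under R/ or one of the R-dedicated Scala directories.
triggers_r_tests <- function(path) {
  patterns <- c(
    "^R/",
    "^core/src/main/scala/org/apache/spark/api/r/",
    "^sql/core/src/main/scala/org/apache/spark/sql/api/r/",
    "^mllib/src/main/scala/org/apache/spark/ml/r/"
  )
  any(sapply(patterns, grepl, x = path))
}

triggers_r_tests("core/src/main/scala/org/apache/spark/api/r/SerDe.scala")  # TRUE
```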

## How was this patch tested?

Tests with manually created PRs.

- Changes in `sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala` https://github.com/spark-test/spark/pull/13
- Changes in `core/src/main/scala/org/apache/spark/api/r/SerDe.scala` https://github.com/spark-test/spark/pull/12
- Changes in `README.md` https://github.com/spark-test/spark/pull/14

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #17427 from HyukjinKwon/SPARK-20092.
2017-03-25 23:29:02 -07:00
Yuming Wang 9b8eca65dc [SPARK-19660][CORE][SQL] Replace the configuration property names that are deprecated in the version of Hadoop 2.6
## What changes were proposed in this pull request?

Replace all the deprecated Hadoop configuration property names according to [DeprecatedProperties](https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/DeprecatedProperties.html).

except:
https://github.com/apache/spark/blob/v2.1.0/python/pyspark/sql/tests.py#L1533
https://github.com/apache/spark/blob/v2.1.0/sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala#L987
https://github.com/apache/spark/blob/v2.1.0/sql/core/src/main/scala/org/apache/spark/sql/execution/command/SetCommand.scala#L45
https://github.com/apache/spark/blob/v2.1.0/sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L614
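For reference, a few representative renames from the linked DeprecatedProperties table, expressed as an R named vector purely for illustration (a small subset, not the full set touched by this PR):

```r
# Illustrative subset of the Hadoop 2.6 property renames applied here.
renames <- c(
  "mapred.map.tasks"      = "mapreduce.job.maps",
  "mapred.reduce.tasks"   = "mapreduce.job.reduces",
  "mapred.max.split.size" = "mapreduce.input.fileinputformat.split.maxsize",
  "fs.default.name"       = "fs.defaultFS",
  "dfs.block.size"        = "dfs.blocksize"
)
renames["mapred.map.tasks"]  # "mapreduce.job.maps"
```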

## How was this patch tested?

Existing tests

Author: Yuming Wang <wgyumg@gmail.com>

Closes #16990 from wangyum/HadoopDeprecatedProperties.
2017-02-28 10:13:42 +00:00
Sean Owen e8d3fca450 [SPARK-19464][CORE][YARN][TEST-HADOOP2.6] Remove support for Hadoop 2.5 and earlier
## What changes were proposed in this pull request?

- Remove support for Hadoop 2.5 and earlier
- Remove reflection and code constructs only needed to support multiple versions at once
- Update docs to reflect newer versions
- Remove older versions' builds and profiles.

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #16810 from srowen/SPARK-19464.
2017-02-08 12:20:07 +00:00
hyukjinkwon 78d5d4dd5c [SPARK-17200][PROJECT INFRA][BUILD][SPARKR] Automate building and testing on Windows (currently SparkR only)
## What changes were proposed in this pull request?

This PR adds the build automation on Windows with [AppVeyor](https://www.appveyor.com/) CI tool.

Currently, this only runs the tests for SparkR, as we have been having some issues with testing Windows-specific PRs (e.g. https://github.com/apache/spark/pull/14743 and https://github.com/apache/spark/pull/13165) and a hard time verifying them.

One concern is that this build depends on [steveloughran/winutils](https://github.com/steveloughran/winutils) (maintained by a Hadoop PMC member) for the pre-built Hadoop bin package.

## How was this patch tested?

Manually, https://ci.appveyor.com/project/HyukjinKwon/spark/build/88-SPARK-17200-build-profile
This takes roughly 40 mins.

Some tests were already failing; this was found in https://github.com/apache/spark/pull/14743#issuecomment-241405287.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #14859 from HyukjinKwon/SPARK-17200-build.
2016-09-08 08:26:59 -07:00