Commit graph

5 commits

Author SHA1 Message Date
Liang-Chi Hsieh 33107897ad [SPARK-11215][ML] Add multiple columns support to StringIndexer
## What changes were proposed in this pull request?

This takes over #19621 to add multi-column support to StringIndexer:

1. Supports encoding multiple columns.
2. Previously, when specifying `frequencyDesc` or `frequencyAsc` as `stringOrderType` param in `StringIndexer`, in case of equal frequency, the order of strings is undefined. After this change, the strings with equal frequency are further sorted alphabetically.

## How was this patch tested?

Added tests.

Closes #20146 from viirya/SPARK-11215.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-01-29 09:21:25 -06:00
actuaryzhang 55aa4da285 [SPARK-21622][ML][SPARKR] Support offset in SparkR GLM
## What changes were proposed in this pull request?
Support offset in SparkR GLM #16699

Author: actuaryzhang <actuaryzhang10@gmail.com>

Closes #18831 from actuaryzhang/sparkROffset.
2017-08-06 15:14:12 -07:00
actuaryzhang ad459cfb1d [SPARK-20917][ML][SPARKR] SparkR supports string encoding consistent with R
## What changes were proposed in this pull request?

Add `stringIndexerOrderType` to `spark.glm` and `spark.survreg` to support string encoding that is consistent with default R.

## How was this patch tested?
new tests

Author: actuaryzhang <actuaryzhang10@gmail.com>

Closes #18140 from actuaryzhang/sparkRFormula.
2017-06-21 10:35:16 -07:00
Felix Cheung 9f4ff95524 [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move
## What changes were proposed in this pull request?

clean up after big test move

## How was this patch tested?

unit tests, jenkins

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #18267 from felixcheung/rtestset2.
2017-06-11 03:00:44 -07:00
Felix Cheung dc4c351837 [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN
## What changes were proposed in this pull request?

Move all existing tests to non-installed directory so that it will never run by installing SparkR package

For a follow-up PR:
- remove all skip_on_cran() calls in tests
- clean up test timer
- improve or change basic tests that do run on CRAN (if anyone has suggestion)

It looks like `R CMD build pkg` will still put pkg\tests (ie. the full tests) into the source package but `R CMD INSTALL` on such source package does not install these tests (and so `R CMD check` does not run them)

## How was this patch tested?

- [x] unit tests, Jenkins
- [x] AppVeyor
- [x] make a source package, install it, `R CMD check` it - verify the full tests are not installed or run

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #18264 from felixcheung/rtestset.
2017-06-11 00:00:33 -07:00
Renamed from R/pkg/inst/tests/testthat/test_mllib_regression.R (Browse further)