spark-instrumented-optimizer/R/pkg
Liang-Chi Hsieh 33107897ad [SPARK-11215][ML] Add multiple columns support to StringIndexer
## What changes were proposed in this pull request?

This takes over #19621 to add multi-column support to StringIndexer:

1. Supports encoding multiple columns.
2. Previously, when the `stringOrderType` param of `StringIndexer` was set to `frequencyDesc` or `frequencyAsc`, the order of strings with equal frequency was undefined. After this change, strings with equal frequency are further sorted alphabetically.
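
The ordering rule in item 2 can be illustrated with a small standalone sketch. This is not Spark's implementation and does not use the Spark API; the function name `index_labels` is hypothetical, and the sketch assumes the alphabetical tie-break is ascending for both order types.

```python
from collections import Counter


def index_labels(values, order="frequencyDesc"):
    """Assign an index to each distinct string (hypothetical helper, not Spark code).

    Strings are ranked by frequency; strings with equal frequency are
    ranked alphabetically (assumed ascending) so the result is deterministic.
    """
    counts = Counter(values)
    if order == "frequencyDesc":
        # Most frequent first; ties broken by the string itself.
        sort_key = lambda s: (-counts[s], s)
    elif order == "frequencyAsc":
        # Least frequent first; ties broken by the string itself.
        sort_key = lambda s: (counts[s], s)
    else:
        raise ValueError(f"unsupported order type: {order}")
    ordered = sorted(counts, key=sort_key)
    return {label: float(i) for i, label in enumerate(ordered)}


# "a" and "b" both occur twice, so the alphabetical tie-break decides their order.
column = ["b", "a", "c", "a", "b"]
print(index_labels(column, "frequencyDesc"))
print(index_labels(column, "frequencyAsc"))
```

Without the secondary sort key, the relative order of `"a"` and `"b"` would depend on how the counts happened to be collected, which is the nondeterminism the change removes. For multiple columns, the same rule would be applied to each column independently.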

## How was this patch tested?

Added tests.

Closes #20146 from viirya/SPARK-11215.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-01-29 09:21:25 -06:00
inst [MINOR][R] Fix indents of sparkR welcome message to be consistent with pyspark and spark-shell 2018-12-13 20:05:49 +08:00
R [SPARK-25981][R] Enables Arrow optimization from R DataFrame to Spark DataFrame 2019-01-27 10:45:49 +08:00
src-native [SPARK-6811] Copy SparkR lib in make-distribution.sh 2015-05-23 00:04:01 -07:00
tests [SPARK-11215][ML] Add multiple columns support to StringIndexer 2019-01-29 09:21:25 -06:00
vignettes [SPARK-19827][R] spark.ml R API for PIC 2018-12-10 18:28:13 -06:00
.lintr [SPARK-22063][R] Fixes lint check failures in R by latest commit sha1 ID of lint-r 2017-10-01 18:42:45 +09:00
.Rbuildignore [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
DESCRIPTION [SPARK-26014][R] Deprecate R prior to version 3.4 in SparkR 2018-11-15 17:20:49 +08:00
NAMESPACE [SPARK-19827][R] spark.ml R API for PIC 2018-12-10 18:28:13 -06:00