spark-instrumented-optimizer/R/pkg/inst
Yanbo Liang 07be232ea1 [SPARK-18412][SPARKR][ML] Fix exception for some SparkR ML algorithms training on libsvm data
## What changes were proposed in this pull request?
* Fix the following exception, which is thrown when ```spark.randomForest``` (classification), ```spark.gbt``` (classification), ```spark.naiveBayes``` and ```spark.glm``` (binomial family) are fitted on libsvm data.
```
java.lang.IllegalArgumentException: requirement failed: If label column already exists, forceIndexLabel can not be set with true.
```
See [SPARK-18412](https://issues.apache.org/jira/browse/SPARK-18412) for details on how to reproduce this bug.
* Move ```getFeaturesAndLabels``` into RWrapperUtils, since many ML algorithm wrappers use this function.
* Drop some unwanted columns when making predictions.
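
To illustrate why libsvm data triggers the exception quoted above, here is a minimal sketch in plain Python rather than Spark code. It assumes only what the error message states, and that a libsvm dataset is loaded with columns named `label` and `features`. The function `check_label_column` and its parameters are hypothetical stand-ins for the requirement, not the actual Spark implementation.

```python
def check_label_column(columns, label_col="label", force_index_label=False):
    """Hypothetical model of the requirement behind the exception.

    Fails when the label column is already present in the data and
    label indexing is forced at the same time.
    """
    if label_col in columns and force_index_label:
        raise ValueError(
            "requirement failed: If label column already exists, "
            "forceIndexLabel can not be set with true."
        )


# Assumption: libsvm data arrives with these two columns already present.
libsvm_columns = ["label", "features"]

# Without forced label indexing the check passes.
check_label_column(libsvm_columns, force_index_label=False)

# Classification wrappers that force label indexing hit the requirement.
try:
    check_label_column(libsvm_columns, force_index_label=True)
    failed = False
    message = ""
except ValueError as err:
    failed = True
    message = str(err)
```

In this model, the collision comes from the pre-existing `label` column, so any fix has to make sure the forced label index does not target a column name that the input data already uses.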

## How was this patch tested?
Add unit test.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #15851 from yanboliang/spark-18412.
2016-11-13 20:25:12 -08:00
profile [SPARK-15159][SPARKR] SparkR SparkSession API 2016-06-17 21:36:01 -07:00
tests/testthat [SPARK-18412][SPARKR][ML] Fix exception for some SparkR ML algorithms training on libsvm data 2016-11-13 20:25:12 -08:00
worker [SPARK-17919] Make timeout to RBackend configurable in SparkR 2016-10-30 16:17:23 -07:00