spark-instrumented-optimizer/mllib
Weichen Xu d56c262109 [SPARK-21681][ML] fix bug of MLOR do not work correctly when featureStd contains zero
## What changes were proposed in this pull request?

fix bug of MLOR do not work correctly when featureStd contains zero

We can reproduce the bug through such dataset (features including zero variance), will generate wrong result (all coefficients becomes 0)
```
    val multinomialDatasetWithZeroVar = {
      val nPoints = 100
      val coefficients = Array(
        -0.57997, 0.912083, -0.371077,
        -0.16624, -0.84355, -0.048509)

      val xMean = Array(5.843, 3.0)
      val xVariance = Array(0.6856, 0.0)  // including zero variance

      val testData = generateMultinomialLogisticInput(
        coefficients, xMean, xVariance, addIntercept = true, nPoints, seed)

      val df = sc.parallelize(testData, 4).toDF().withColumn("weight", lit(1.0))
      df.cache()
      df
    }
```
## How was this patch tested?

testcase added.

Author: WeichenXu <WeichenXu123@outlook.com>

Closes #18896 from WeichenXu123/fix_mlor_stdvalue_zero_bug.
2017-08-22 16:55:34 -07:00
..
src [SPARK-21681][ML] fix bug of MLOR do not work correctly when featureStd contains zero 2017-08-22 16:55:34 -07:00
pom.xml [SPARK-15526][ML][FOLLOWUP] Make JPMML provided scope to avoid including unshaded JARs, and repromote to compile in MLlib 2017-07-18 09:53:51 -07:00