spark-instrumented-optimizer/mllib
Teng Peng 293a0f29e3
[Spark-24024][ML] Fix poisson deviance calculations in GLM to handle y = 0
## What changes were proposed in this pull request?

Spark users reported that the deviance calculation for Poisson regression does not handle y = 0, so a correct model summary cannot be obtained. The user has confirmed that the issue is in
```
override def deviance(y: Double, mu: Double, weight: Double): Double =
{ 2.0 * weight * (y * math.log(y / mu) - (y - mu)) }
```
when y = 0: there `y * math.log(y / mu)` evaluates to `0 * -Infinity`, which is NaN.
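
As an illustration of the kind of guard this fix needs, here is a minimal, self-contained sketch. It assumes the usual convention that `y * log(y / mu)` takes its limit value 0 at y = 0; the object name `PoissonDevianceSketch` and the helper `ylogy` are illustrative names for this sketch, not necessarily what the patch itself uses.

```scala
// Hypothetical standalone sketch; not the actual Spark source.
object PoissonDevianceSketch {
  // y * log(y / mu), defined as 0 at y == 0 (the limit as y -> 0),
  // which avoids the 0 * -Infinity = NaN of the naive expression.
  def ylogy(y: Double, mu: Double): Double =
    if (y == 0.0) 0.0 else y * math.log(y / mu)

  // Weighted Poisson unit deviance with the y == 0 case handled.
  def deviance(y: Double, mu: Double, weight: Double): Double =
    2.0 * weight * (ylogy(y, mu) - (y - mu))
}
```

With this guard, an observation with y = 0 contributes `2.0 * weight * mu` to the deviance instead of NaN.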

The user also mentioned that there are many other places he believes we should check for the same problem. However, no other changes are needed, including for the Gamma distribution.

## How was this patch tested?
Added a comparison with R's deviance calculation to the existing unit test.

Author: Teng Peng <josephtengpeng@gmail.com>

Closes #21125 from tengpeng/Spark24024GLM.
2018-04-23 10:29:47 -07:00
..
src [Spark-24024][ML] Fix poisson deviance calculations in GLM to handle y = 0 2018-04-23 10:29:47 -07:00
pom.xml [SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT 2018-01-13 00:37:59 +08:00