spark-instrumented-optimizer

Yanbo Liang 1e44dd0044 [SPARK-3181][ML] Implement huber loss for LinearRegression.

## What changes were proposed in this pull request?

MLlib ```LinearRegression``` now supports _huber_ loss in addition to _leastSquares_ loss. The huber loss objective function is:

![image](https://user-images.githubusercontent.com/1962026/29554124-9544d198-8750-11e7-8afa-33579ec419d5.png)

Refer to Eq.(6) and Eq.(8) in [A robust hybrid of lasso and ridge regression](http://statweb.stanford.edu/~owen/reports/hhu.pdf). This objective is jointly convex as a function of (w, σ) ∈ R × (0,∞), so we can use L-BFGS-B to solve it.

The current implementation is a straightforward port of Python scikit-learn's [```HuberRegressor```](http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.HuberRegressor.html). There are some differences:

* We use mean loss (```lossSum/weightSum```), but sklearn uses total loss (```lossSum```).
* We multiply the loss function and L2 regularization by 1/2. Multiplying the whole formula by a constant factor does not affect the result; we do it only to stay consistent with _leastSquares_ loss.

So when fitting w/o regularization, MLlib and sklearn produce the same output. When fitting w/ regularization, MLlib should set ```regParam``` to sklearn's regularization strength divided by the number of instances to match the output of sklearn.

## How was this patch tested?

Unit tests.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #19020 from yanboliang/spark-3181.

2017-12-13 21:19:14 -08:00
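
The scaling rule for ```regParam``` in the commit message can be checked numerically. The sketch below is not the MLlib or sklearn implementation. It assumes the objective form given in sklearn's ```HuberRegressor``` documentation (per-instance term σ + H_ε(r/σ)·σ, with H_ε(z) = z² for |z| < ε and 2ε|z| − ε² otherwise), unit instance weights (so ```weightSum``` equals the number of instances), and no intercept. All function names are hypothetical.

```python
from typing import Sequence


def huber_h(z: float, epsilon: float) -> float:
    """H_eps(z): quadratic near zero, linear in the tails (assumed sklearn form)."""
    if abs(z) < epsilon:
        return z * z
    return 2.0 * epsilon * abs(z) - epsilon * epsilon


def loss_sum(w: Sequence[float], sigma: float,
             X: Sequence[Sequence[float]], y: Sequence[float],
             epsilon: float) -> float:
    """Sum over instances of sigma + H_eps(residual / sigma) * sigma."""
    total = 0.0
    for row, target in zip(X, y):
        residual = sum(wj * xj for wj, xj in zip(w, row)) - target
        total += sigma + huber_h(residual / sigma, epsilon) * sigma
    return total


def total_loss_objective(w, sigma, X, y, alpha, epsilon=1.35):
    """sklearn-style (assumed): lossSum + alpha * ||w||^2."""
    l2 = sum(wj * wj for wj in w)
    return loss_sum(w, sigma, X, y, epsilon) + alpha * l2


def mean_loss_objective(w, sigma, X, y, reg_param, epsilon=1.35):
    """MLlib-style per the commit message: 1/2 * lossSum/weightSum
    + 1/2 * regParam * ||w||^2, with unit weights so weightSum == n."""
    n = len(y)
    l2 = sum(wj * wj for wj in w)
    return 0.5 * loss_sum(w, sigma, X, y, epsilon) / n + 0.5 * reg_param * l2


# Toy data, chosen only for illustration.
X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
y = [1.0, 0.0, 10.0]
w, sigma, alpha = [0.3, -0.2], 1.5, 0.7
n = len(y)

# With reg_param = alpha / n, the two objectives differ only by the
# constant factor 1 / (2n), so they share the same minimizer.
lhs = mean_loss_objective(w, sigma, X, y, reg_param=alpha / n)
rhs = total_loss_objective(w, sigma, X, y, alpha=alpha) / (2 * n)
```

Because the two objectives differ only by a positive constant once ```reg_param = alpha / n```, any (w, σ) that minimizes one also minimizes the other. This is why the two fits agree under that setting.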
build.properties	[SPARK-21709][BUILD] sbt 0.13.16 and some plugin updates	2017-08-12 20:01:20 +01:00
MimaBuild.scala	[SPARK-22485][BUILD] Use `exclude[Problem]` instead `excludePackage` in MiMa	2017-11-09 16:40:19 -08:00
MimaExcludes.scala	[SPARK-3181][ML] Implement huber loss for LinearRegression.	2017-12-13 21:19:14 -08:00
plugins.sbt	[SPARK-21708][BUILD] update some sbt plugins	2017-10-31 08:16:54 +00:00
SparkBuild.scala	[SPARK-18278][SCHEDULER] Spark on Kubernetes - Basic Scheduler Backend	2017-11-28 23:02:09 -08:00