32461d4744
## What changes were proposed in this pull request?

Compute AUC in one pass.

## How was this patch tested?

Existing tests.

Performance tests:

```
import org.apache.spark.mllib.evaluation._

// 100,000 (score, label) pairs with alternating labels, in 4 partitions
val scoreAndLabels = sc.parallelize(Array.range(0, 100000).map { i =>
  (i.toDouble / 100000, (i % 2).toDouble)
}, 4)
scoreAndLabels.persist()
scoreAndLabels.count()

// Time 100 AUC computations (numBins = 0 disables down-sampling)
val tic = System.currentTimeMillis
(0 until 100).foreach { i =>
  val metrics = new BinaryClassificationMetrics(scoreAndLabels, 0)
  val auc = metrics.areaUnderROC
  metrics.unpersist
}
val toc = System.currentTimeMillis
toc - tic
```

| New (ms) | Existing (ms) |
|------|----------|
| 87532 | 103644 |

One-pass AUC saves about 16% of the computation time.

Closes #24648 from zhengruifeng/auc_opt.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
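As a rough illustration of what "one pass" can mean here, the sketch below accumulates the trapezoidal area under the ROC curve while walking the score-sorted data once, rather than first materializing the curve points and then integrating them. It is plain Scala on a local collection, not Spark's actual implementation; the object name `OnePassAuc` and its method are hypothetical, and labels are assumed to be 1.0 for positives and 0.0 for negatives.

```scala
// Hypothetical helper for illustration only; not the code changed in this PR.
object OnePassAuc {
  /** Trapezoidal AUC over (score, label) pairs; label > 0.5 counts as positive. */
  def areaUnderROC(scoreAndLabels: Seq[(Double, Double)]): Double = {
    // Sort by descending score so each step lowers the decision threshold.
    val sorted = scoreAndLabels.toArray.sortBy(-_._1)
    var tp = 0L      // true positives at the current threshold
    var fp = 0L      // false positives at the current threshold
    var prevTp = 0L  // counts at the previous distinct threshold
    var prevFp = 0L
    var area = 0.0   // unnormalized area, in units of (count * count)
    var i = 0
    while (i < sorted.length) {
      val score = sorted(i)._1
      // Consume every record sharing this score so ties form one curve point.
      while (i < sorted.length && sorted(i)._1 == score) {
        if (sorted(i)._2 > 0.5) tp += 1 else fp += 1
        i += 1
      }
      // Add the trapezoid between the previous and the current curve point.
      area += (fp - prevFp) * (tp + prevTp) / 2.0
      prevTp = tp
      prevFp = fp
    }
    // AUC is undefined when one class is missing.
    if (tp == 0 || fp == 0) Double.NaN else area / (tp.toDouble * fp)
  }
}
```

For example, `OnePassAuc.areaUnderROC(Seq((0.9, 1.0), (0.8, 1.0), (0.2, 0.0), (0.1, 0.0)))` evaluates to `1.0`, since every positive outranks every negative. Only the running counts and the previous point are kept, so no intermediate list of curve points is needed.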
| Name |
|------|
| benchmarks |
| src |
| pom.xml |