spark-instrumented-optimizer/R/pkg
Sean Owen 79f5f281bb
[SPARK-18678][ML] Skewed reservoir sampling in SamplingUtils
## What changes were proposed in this pull request?

Fix reservoir sampling bias for small k. An off-by-one error meant that the probability of replacement was slightly too high: k/(l-1) after l elements instead of k/l, which matters for small k.
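
To make the k/l versus k/(l-1) distinction concrete, here is a minimal Python sketch of textbook reservoir sampling (Algorithm R). It is an illustration only, not the code in SamplingUtils; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    items: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to k items drawn uniformly at random from items."""
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = rng or random.Random()
    reservoir: List[T] = []
    # l counts the elements seen so far, including the current one.
    for l, item in enumerate(items, start=1):
        if l <= k:
            # Fill phase: the first k elements are always kept.
            reservoir.append(item)
        else:
            # Draw a slot uniformly from [0, l). It falls inside the
            # reservoir with probability k/l, which is the intended rate.
            # Drawing from [0, l - 1) instead would replace with
            # probability k/(l-1), the kind of bias described above.
            j = rng.randrange(l)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1000), 5, random.Random(42)))
```

With k = 1 and a two-element input, the second element should replace the first half of the time; a k/(l-1) rate would replace it every time, which is why the error is most visible for small k.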

## How was this patch tested?

Existing test plus new test case.

Author: Sean Owen <sowen@cloudera.com>

Closes #16129 from srowen/SPARK-18678.
2016-12-07 17:34:45 +08:00
| Name | Last commit | Date |
|---|---|---|
| inst | [SPARK-18678][ML] Skewed reservoir sampling in SamplingUtils | 2016-12-07 17:34:45 +08:00 |
| R | [SPARK-18686][SPARKR][ML] Several cleanup and improvements for spark.logit. | 2016-12-07 00:31:11 -08:00 |
| src-native | [SPARK-6811] Copy SparkR lib in make-distribution.sh | 2015-05-23 00:04:01 -07:00 |
| tests | [SPARK-12034][SPARKR] Eliminate warnings in SparkR test cases. | 2015-12-07 10:38:17 -08:00 |
| vignettes | [SPARK-18643][SPARKR] SparkR hangs at session start when installed as a package without Spark | 2016-12-04 20:25:11 -08:00 |
| .lintr | [SPARK-12327][SPARKR] fix code for lintr warning for commented code | 2016-01-03 20:53:35 +05:30 |
| .Rbuildignore | [SPARK-16507][SPARKR] Add a CRAN checker, fix Rd aliases | 2016-07-16 17:06:44 -07:00 |
| DESCRIPTION | [SPARK-18073][DOCS][WIP] Migrate wiki to spark.apache.org web site | 2016-11-23 11:25:47 +00:00 |
| NAMESPACE | [SPARK-18239][SPARKR] Gradient Boosted Tree for R | 2016-11-08 16:00:45 -08:00 |