spark-instrumented-optimizer/R/pkg/tests/fulltests
Zhenhua Wang 655f6f86f8 [SPARK-22208][SQL] Improve percentile_approx by not rounding up targetError and starting from index 0
## What changes were proposed in this pull request?

Currently, percentile_approx never returns the first element when the percentile is in (relativeError, 1/N], where relativeError defaults to 1/10000 and N is the total number of elements. Ideally, every percentile in [0, 1/N] should return the first element as the answer.

For example, given input data 1 to 10, if a user queries the 10% percentile (or any lower one), the result should be 1, because the first value, 1, already covers 10% of the data. Currently it returns 2.
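
The expected answer in this example can be checked with a small exact reference. This is only a sketch of the intended semantics (return the smallest element whose cumulative share of the data reaches the requested percentile), not the approximate algorithm Spark uses; the function name `expected_percentile` is hypothetical.

```python
def expected_percentile(values, percentage):
    """Exact reference: smallest element whose cumulative share reaches `percentage`.

    Hypothetical helper for illustration only; it sorts the full input and
    is not how percentile_approx is implemented.
    """
    if not 0.0 <= percentage <= 1.0:
        raise ValueError("percentage must be in [0, 1]")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    n = len(ordered)
    for position, value in enumerate(ordered, start=1):
        # position / n is the share of the data covered up to this element.
        if position / n >= percentage:
            return value
    return ordered[-1]


data = list(range(1, 11))  # input data 1 to 10
print(expected_percentile(data, 0.10))  # the first value already covers 10%
print(expected_percentile(data, 0.05))  # any percentile in [0, 1/N] gives the first element
```

Under this reading, every percentile in [0, 1/N] maps to the first element, which is the behavior the change aims for.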

According to the paper, targetError should not be rounded up, and the search index should start from 0 instead of 1. Following the paper on both points fixes the cases described above.
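
The effect of the two changes can be sketched on a toy summary. The code below is a simplified illustration, not Spark's implementation: the `(value, g, delta)` tuple layout, the function name `query_summary`, and the `legacy` flag are assumptions made for this example. With `legacy=True` it rounds the target error up and starts the scan at index 1; otherwise it keeps the fractional target error and starts at index 0.

```python
import math


def query_summary(samples, count, quantile, relative_error=1e-4, legacy=False):
    """Look up a quantile in a sorted list of (value, g, delta) samples.

    Assumed layout for illustration: g is the rank gap to the previous
    sample and delta is the rank uncertainty of this sample.
    """
    rank = math.ceil(quantile * count)
    target_error = relative_error * count
    if legacy:
        # Behavior before the change: round the target error up.
        target_error = math.ceil(target_error)
    # Behavior before the change: skip the first sample.
    start = 1 if legacy else 0
    min_rank = 0
    for value, g, delta in samples[start:-1]:
        min_rank += g
        max_rank = min_rank + delta
        if max_rank - target_error <= rank <= min_rank + target_error:
            return value
    return samples[-1][0]


# Exact summary of the values 1 to 10: every element kept, no uncertainty.
summary = [(v, 1, 0) for v in range(1, 11)]
print(query_summary(summary, 10, 0.10, legacy=True))   # 2
print(query_summary(summary, 10, 0.10, legacy=False))  # 1
```

In this sketch, skipping the first sample means it can never be returned from the loop, and rounding the target error up from 0.001 to 1 widens the acceptance window enough for the second sample to match rank 1.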

## How was this patch tested?

Added a new test case and fixed existing test cases.

Author: Zhenhua Wang <wzh_zju@163.com>

Closes #19438 from wzhfy/improve_percentile_approx.
2017-10-11 00:16:12 -07:00
..
jarTest.R [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN 2017-06-11 00:00:33 -07:00
packageInAJarTest.R [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN 2017-06-11 00:00:33 -07:00
test_binary_function.R [SPARK-22063][R] Fixes lint check failures in R by latest commit sha1 ID of lint-r 2017-10-01 18:42:45 +09:00
test_binaryFile.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_broadcast.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_client.R [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10 2017-07-13 17:06:24 +08:00
test_context.R [SPARK-21149][R] Add job description API for R 2017-06-23 09:59:24 -07:00
test_includePackage.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_jvm_api.R [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN 2017-06-11 00:00:33 -07:00
test_mllib_classification.R [SPARK-21381][SPARKR] SparkR: pass on setHandleInvalid for classification algorithms 2017-07-31 20:37:06 -07:00
test_mllib_clustering.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_mllib_fpm.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_mllib_recommendation.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_mllib_regression.R [SPARK-21622][ML][SPARKR] Support offset in SparkR GLM 2017-08-06 15:14:12 -07:00
test_mllib_stat.R [SPARK-20877][SPARKR] refactor tests to basic tests only for CRAN 2017-06-11 00:00:33 -07:00
test_mllib_tree.R [SPARK-21801][SPARKR][TEST] unit test randomly fail with randomforest 2017-08-29 10:09:41 -07:00
test_parallelize_collect.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_rdd.R [SPARK-22063][R] Fixes lint check failures in R by latest commit sha1 ID of lint-r 2017-10-01 18:42:45 +09:00
test_Serde.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_shuffle.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_sparkR.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_sparkSQL.R [SPARK-22208][SQL] Improve percentile_approx by not rounding up targetError and starting from index 0 2017-10-11 00:16:12 -07:00
test_streaming.R [SPARK-21224][R] Specify a schema by using a DDL-formatted string when reading in R 2017-06-28 19:36:00 -07:00
test_take.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_textFile.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_utils.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
test_Windows.R [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00