a8d9ec8a60
## What changes were proposed in this pull request?

This PR makes `sample(...)` able to omit `withReplacement`, which then defaults to `FALSE`.

In short, the following examples are allowed:

```r
> df <- createDataFrame(as.list(seq(10)))
> count(sample(df, fraction=0.5, seed=3))
[1] 4
> count(sample(df, fraction=1.0))
[1] 10
```

In addition, this PR also adds some type-checking logic, as shown below:

```r
> sample(df, fraction = "a")
Error in sample(df, fraction = "a") :
  fraction must be numeric; however, got character
> sample(df, fraction = 1, seed = NULL)
Error in sample(df, fraction = 1, seed = NULL) :
  seed must not be NULL or NA; however, got NULL
> sample(df, list(1), 1.0)
Error in sample(df, list(1), 1) :
  withReplacement must be logical; however, got list
> sample(df, fraction = -1.0)
...
Error in sample : illegal argument - requirement failed: Sampling fraction (-1.0) must be on interval [0, 1] without replacement
```

## How was this patch tested?

Manually tested, unit tests added in `R/pkg/tests/fulltests/test_sparkSQL.R`.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19243 from HyukjinKwon/SPARK-21780.

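The argument checks described above can be approximated in plain R without a Spark session. The following is a minimal sketch only: `check_sample_args` is a hypothetical helper written for illustration and is not SparkR's actual implementation. The order of the checks is an assumption, and the range check on `fraction` is left out because the commit message shows that error coming from the JVM side.

```r
# Hypothetical helper (not part of SparkR): mimics the R-side argument
# validation described in the commit message.
check_sample_args <- function(withReplacement = FALSE, fraction, seed) {
  # fraction is required and must be numeric
  if (!is.numeric(fraction)) {
    stop("fraction must be numeric; however, got ", class(fraction)[[1]])
  }
  # withReplacement may be omitted (defaults to FALSE) but must be logical
  if (!is.logical(withReplacement)) {
    stop("withReplacement must be logical; however, got ",
         class(withReplacement)[[1]])
  }
  # seed is optional, but if supplied it must not be NULL or NA
  if (!missing(seed)) {
    if (is.null(seed) || is.na(seed)) {
      stop("seed must not be NULL or NA; however, got ",
           if (is.null(seed)) "NULL" else "NA")
    }
  }
  invisible(TRUE)
}

# Capture the error message of an expression, or NA if it succeeds.
error_message <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, error = function(e) conditionMessage(e))
}

ok_result  <- check_sample_args(fraction = 0.5, seed = 3)
msg_frac   <- error_message(check_sample_args(fraction = "a"))
msg_seed   <- error_message(check_sample_args(fraction = 1, seed = NULL))
msg_withrp <- error_message(check_sample_args(list(1), 1.0))
```

Here `msg_frac`, `msg_seed`, and `msg_withrp` hold the same message text as the first three errors in the commit message, while the call with only `fraction` and `seed` passes because `withReplacement` falls back to `FALSE`.
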
||
---|---|---|
jarTest.R | ||
packageInAJarTest.R | ||
test_binary_function.R | ||
test_binaryFile.R | ||
test_broadcast.R | ||
test_client.R | ||
test_context.R | ||
test_includePackage.R | ||
test_jvm_api.R | ||
test_mllib_classification.R | ||
test_mllib_clustering.R | ||
test_mllib_fpm.R | ||
test_mllib_recommendation.R | ||
test_mllib_regression.R | ||
test_mllib_stat.R | ||
test_mllib_tree.R | ||
test_parallelize_collect.R | ||
test_rdd.R | ||
test_Serde.R | ||
test_shuffle.R | ||
test_sparkR.R | ||
test_sparkSQL.R | ||
test_streaming.R | ||
test_take.R | ||
test_textFile.R | ||
test_utils.R | ||
test_Windows.R |