spark-instrumented-optimizer/R/pkg
HyukjinKwon e1d7321034 [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization
### What changes were proposed in this pull request?

This PR proposes to:

1. Fix the error message shown when the output schema does not match the R DataFrame returned from the given function. For example,

    ```R
    df <- createDataFrame(list(list(a=1L, b="2")))
    count(gapply(df, "a", function(key, group) { group }, structType("a int, b int")))
    ```

    **Before:**

    ```
    Error in handleErrors(returnStatus, conn) :
      ...
      java.lang.UnsupportedOperationException
      ...
    ```

    **After:**

    ```
    Error in handleErrors(returnStatus, conn) :
     ...
     java.lang.AssertionError: assertion failed: Invalid schema from gapply: expected IntegerType, IntegerType, got IntegerType, StringType
        ...
    ```

2. Update documentation about the schema matching for `gapply` and `dapply`.

### Why are the changes needed?

To show which schema is not matched, and let users know what's going on.

### Does this PR introduce _any_ user-facing change?

Yes, the error message is updated as above, and the documentation is updated.

### How was this patch tested?

Manually tested, and unit tests were added.

Closes #29283 from HyukjinKwon/r-vectorized-error.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-07-30 15:16:02 +09:00
..
inst [SPARK-32073][R] Drop R < 3.5 support 2020-06-24 11:05:27 +09:00
R [SPARK-32451][R] Support Apache Arrow 1.0.0 2020-07-26 18:51:25 -07:00
src-native [SPARK-6811] Copy SparkR lib in make-distribution.sh 2015-05-23 00:04:01 -07:00
tests [SPARK-32478][R][SQL] Error message to show the schema mismatch in gapply with Arrow vectorization 2020-07-30 15:16:02 +09:00
vignettes [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR 2020-04-09 19:38:11 -05:00
.lintr [SPARK-29936][R] Fix SparkR lint errors and add lint-r GitHub Action 2019-11-17 21:01:01 -08:00
.Rbuildignore [SPARK-20877][SPARKR][FOLLOWUP] clean up after test move 2017-06-11 03:00:44 -07:00
DESCRIPTION [SPARK-32452][R][SQL] Bump up the minimum Arrow version as 1.0.0 in SparkR 2020-07-27 14:21:15 +09:00
NAMESPACE [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR 2020-04-09 19:38:11 -05:00