spark-instrumented-optimizer/common
ulysses-you 1be1012497 [SPARK-35005][SQL] Improve error msg if UTF8String concatWs length overflow
### What changes were proposed in this pull request?

Add a check for whether the total byte length of the result overflows `int`.

### Why are the changes needed?

We encountered an extreme case with the `concat_ws` expression, where the error message is
```
Caused by: java.lang.NegativeArraySizeException
	at org.apache.spark.unsafe.types.UTF8String.concatWs
```
`UTF8String.concat` already performs this length check since [#21064](https://github.com/apache/spark/pull/21064), so it makes sense to add the same check to `concatWs`.
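
To illustrate the idea, here is a minimal sketch of such a length check. It is not the code from this patch: the class `ConcatWsLengthCheck` and the method `resultSize` are hypothetical names, and the sketch works on plain byte counts rather than `UTF8String` instances. It assumes the result size is accumulated in a `long` and then narrowed with `Math.toIntExact`, which throws an `ArithmeticException` instead of silently wrapping to a negative `int`.

```java
// Hypothetical illustration only; not the actual UTF8String.concatWs code.
final class ConcatWsLengthCheck {

    // Computes the byte length of joining inputs with a separator.
    // Sums in a long so the intermediate value cannot wrap around,
    // then narrows to int with an explicit overflow check.
    public static int resultSize(int separatorBytes, int[] inputBytes) {
        long total = 0L;
        for (int n : inputBytes) {
            total += n;
        }
        if (inputBytes.length > 1) {
            // One separator between each pair of adjacent inputs.
            total += (long) (inputBytes.length - 1) * separatorBytes;
        }
        // Throws ArithmeticException if total does not fit in an int.
        return Math.toIntExact(total);
    }

    public static void main(String[] args) {
        // Two inputs of 2 and 3 bytes plus a 1-byte separator.
        System.out.println(resultSize(1, new int[] {2, 3}));

        try {
            // With plain int arithmetic this sum would wrap to a negative
            // value, and allocating a byte[] of that size would raise
            // NegativeArraySizeException.
            resultSize(1, new int[] {Integer.MAX_VALUE, Integer.MAX_VALUE});
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

With a check like this, the failure surfaces at the point where the size is computed rather than at the array allocation.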

### Does this PR introduce _any_ user-facing change?

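Yes. When the `concat_ws` result length overflows, the error message is clearer than a bare `NegativeArraySizeException`.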

### How was this patch tested?

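No test was added. Triggering the overflow needs inputs totalling more than `Integer.MAX_VALUE` bytes, which is too heavy for a test.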

Closes #32106 from ulysses-you/SPARK-35005.

Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
2021-04-12 14:32:15 +03:00
| Name | Last commit | Date |
|---|---|---|
| .. | | |
| kvstore | [SPARK-33662][BUILD] Setting version to 3.2.0-SNAPSHOT | 2020-12-04 14:10:42 -08:00 |
| network-common | [SPARK-34894][CORE] Use 'io.connectionTimeout' as a hint instead of 'spark.network.timeout' for lost connections | 2021-03-30 09:58:24 +08:00 |
| network-shuffle | [SPARK-34840][SHUFFLE] Fixes cases of corruption in merged shuffle … | 2021-03-25 12:47:46 -05:00 |
| network-yarn | [SPARK-34828][YARN] Make shuffle service name configurable on client side and allow for classpath-based config override on server side | 2021-03-30 10:09:00 -05:00 |
| sketch | [SPARK-33662][BUILD] Setting version to 3.2.0-SNAPSHOT | 2020-12-04 14:10:42 -08:00 |
| tags | [SPARK-34578][SQL][TESTS][TEST-MAVEN] Refactor ORC encryption tests and ignore ORC shim loaded by old Hadoop library | 2021-03-02 16:52:27 +09:00 |
| unsafe | [SPARK-35005][SQL] Improve error msg if UTF8String concatWs length overflow | 2021-04-12 14:32:15 +03:00 |