spark-instrumented-optimizer/sql/core
dgd-contributor c5b0cb2d94 [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong value on overflow
### What changes were proposed in this pull request?
1/ conv() handles invalid strings inconsistently: the returned value changes once the input is longer than 64 characters.

```
scala> spark.sql("select conv(repeat('?', 64), 10, 16)").show
+---------------------------+
|conv(repeat(?, 64), 10, 16)|
+---------------------------+
|                          0|
+---------------------------+

scala> spark.sql("select conv(repeat('?', 65), 10, 16)").show // which should be 0
+---------------------------+
|conv(repeat(?, 65), 10, 16)|
+---------------------------+
|           FFFFFFFFFFFFFFFF|
+---------------------------+

scala> spark.sql("select conv(repeat('?', 65), 10, -16)").show // which should be 0
+----------------------------+
|conv(repeat(?, 65), 10, -16)|
+----------------------------+
|                          -1|
+----------------------------+

scala> spark.sql("select conv(repeat('?', 64), 10, -16)").show
+----------------------------+
|conv(repeat(?, 64), 10, -16)|
+----------------------------+
|                           0|
+----------------------------+
```

2/ On overflow, conv() should return the maximum unsigned long value (2^64 - 1) expressed in base toBase.

```
scala> spark.sql("select conv('aaaaaaa0aaaaaaa0a', 16, 10)").show // which should be 18446744073709551615

+-------------------------------+
|conv(aaaaaaa0aaaaaaa0a, 16, 10)|
+-------------------------------+
|           12297828695278266890|
+-------------------------------+
```
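The intended semantics of both points can be sketched in plain Scala. This is a minimal illustration, not Spark's actual implementation: `ConvSketch` and `convSketch` are hypothetical names, the sketch uses `BigInt` for clarity, it supports only positive `toBase` values, and it assumes that parsing stops at the first character that is not a valid digit in `fromBase`.

```scala
object ConvSketch {
  // 2^64 - 1, the largest unsigned 64-bit value.
  private val MaxUnsigned: BigInt = (BigInt(1) << 64) - 1

  // Hypothetical helper, not Spark's code: parses the longest valid digit
  // prefix of `s` in `fromBase`, clamps the value to 2^64 - 1 on overflow,
  // and renders it in `toBase` (positive bases only in this sketch).
  def convSketch(s: String, fromBase: Int, toBase: Int): String = {
    require(fromBase >= 2 && fromBase <= 36, "fromBase must be in [2, 36]")
    require(toBase >= 2 && toBase <= 36, "toBase must be in [2, 36]")

    // Character.digit returns -1 for characters that are not digits in the
    // given base, so takeWhile keeps only the leading valid digits.
    val digits = s.iterator.map(c => Character.digit(c, fromBase)).takeWhile(_ >= 0)

    // Accumulate the value; once it exceeds 2^64 - 1 it stays clamped there.
    val value = digits.foldLeft(BigInt(0)) { (acc, d) =>
      (acc * fromBase + d).min(MaxUnsigned)
    }
    value.toString(toBase).toUpperCase
  }
}
```

Under these assumptions, `ConvSketch.convSketch("?" * 65, 10, 16)` yields `"0"` regardless of input length, and `ConvSketch.convSketch("aaaaaaa0aaaaaaa0a", 16, 10)` yields `"18446744073709551615"`.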

### Why are the changes needed?
Bug fix. This pull request aims to make conv() behave like the CONV() function in MySQL.
### Does this PR introduce _any_ user-facing change?
Yes. conv() returns different results for invalid strings longer than 64 characters and for inputs that overflow.
### How was this patch tested?
Added tests.

Closes #33459 from dgd-contributor/SPARK-36229_convInconsistencyBehaviorWithMoreThan64Characters.

Authored-by: dgd-contributor <dgd_contributor@viettel.com.vn>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit e1c50ff779)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2021-07-29 00:19:19 +08:00
benchmarks [SPARK-34981][SQL][FOLLOWUP] Use SpecificInternalRow in ApplyFunctionExpression 2021-05-24 17:25:24 +09:00
src [SPARK-36229][SQL] conv() inconsistently handles invalid strings with more than 64 invalid characters and return wrong value on overflow 2021-07-29 00:19:19 +08:00
pom.xml [SPARK-35784][SS] Implementation for RocksDB instance 2021-06-29 17:46:45 -07:00