e1c50ff779
### What changes were proposed in this pull request?

1/ conv() behaves inconsistently: the returned value changes once the input is longer than 64 characters.

```
scala> spark.sql("select conv(repeat('?', 64), 10, 16)").show
+---------------------------+
|conv(repeat(?, 64), 10, 16)|
+---------------------------+
|                          0|
+---------------------------+

scala> spark.sql("select conv(repeat('?', 65), 10, 16)").show // which should be 0
+---------------------------+
|conv(repeat(?, 65), 10, 16)|
+---------------------------+
|           FFFFFFFFFFFFFFFF|
+---------------------------+

scala> spark.sql("select conv(repeat('?', 65), 10, -16)").show // which should be 0
+----------------------------+
|conv(repeat(?, 65), 10, -16)|
+----------------------------+
|                          -1|
+----------------------------+

scala> spark.sql("select conv(repeat('?', 64), 10, -16)").show
+----------------------------+
|conv(repeat(?, 64), 10, -16)|
+----------------------------+
|                           0|
+----------------------------+
```

2/ conv should return the maximum unsigned long value, expressed in base toBase, when there is an overflow.

```
scala> spark.sql("select conv('aaaaaaa0aaaaaaa0a', 16, 10)").show // which should be 18446744073709551615
+-------------------------------+
|conv(aaaaaaa0aaaaaaa0a, 16, 10)|
+-------------------------------+
|           12297828695278266890|
+-------------------------------+
```

### Why are the changes needed?

Bug fix. This pull request aims to make the conv function behave like the conv function in MySQL.

### Does this PR introduce _any_ user-facing change?

Yes, the result of the conv() function changes in the cases shown above.

### How was this patch tested?

Added tests.

Closes #33459 from dgd-contributor/SPARK-36229_convInconsistencyBehaviorWithMoreThan64Characters.

Authored-by: dgd-contributor <dgd_contributor@viettel.com.vn>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
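
The behavior this commit message asks for can be illustrated with a small standalone sketch. It is not Spark's actual implementation. `ConvSketch` and its `conv` method are hypothetical names, and the semantics are assumptions taken from the examples in the message: an input with no valid leading digits yields 0 whatever its length, and a value that does not fit in 64 bits saturates to the maximum unsigned long. Reinterpreting the result as a signed 64-bit value for a negative `toBase` is a further assumption; sign handling on the input is left out.

```scala
object ConvSketch {
  // Largest unsigned 64-bit value, 2^64 - 1.
  private val MaxUnsigned: BigInt = (BigInt(1) << 64) - 1

  // Hypothetical helper: converts `input` from `fromBase` to `toBase`.
  def conv(input: String, fromBase: Int, toBase: Int): String = {
    require(fromBase >= 2 && fromBase <= 36, "fromBase must be in [2, 36]")
    require(math.abs(toBase) >= 2 && math.abs(toBase) <= 36, "|toBase| must be in [2, 36]")

    // Keep only the leading run of characters that are valid digits in fromBase.
    val digits = input.trim.takeWhile(c => Character.digit(c, fromBase) >= 0)

    // An empty digit run folds to 0, so the input length no longer matters.
    val parsed = digits.foldLeft(BigInt(0)) { (acc, c) =>
      acc * fromBase + Character.digit(c, fromBase)
    }

    // Assumption: overflow saturates to the maximum unsigned long.
    val clamped = parsed.min(MaxUnsigned)

    // Assumption: a negative toBase reinterprets the 64-bit pattern as signed.
    val value =
      if (toBase < 0 && clamped > BigInt(Long.MaxValue)) clamped - (BigInt(1) << 64)
      else clamped

    value.toString(math.abs(toBase)).toUpperCase
  }
}
```

Under these assumptions, `ConvSketch.conv("?" * 65, 10, 16)` and `ConvSketch.conv("?" * 64, 10, 16)` give the same result, and the 17-digit hexadecimal input from the second example saturates instead of wrapping around.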
||
---|---|---|
benchmarks | ||
src | ||
pom.xml |