6c6291b3f6
### What changes were proposed in this pull request?

`sha2(input, bit_length)` returns incorrect results for all inputs when `bit_length == 224`. The error can be reproduced by running, for instance, `spark.sql("SELECT sha2('abc', 224)").show()` in spark-shell.

Spark currently returns
```
#\t}"4�"�B�w��U�*��你���l��
```
while the expected result is
```
23097d223405d8228642a477bda255b32aadbce4bda0b3f7e36c9da7
```

This appears to happen because `MessageDigest.digest()` returns bytes intended to be interpreted as a `BigInt` rather than as a string. The output of `MessageDigest.digest()` must therefore first be interpreted as a `BigInt` and then converted to a hex string, rather than being read directly as a string.

### Why are the changes needed?

`sha2(input, bit_length)` with a `bit_length` of `224` previously returned an incorrect result.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added a new test to `HashExpressionsSuite.scala`, which previously failed and now passes.

Closes #34086 from richardc-db/sha224.

Authored-by: Richard Chen <r.chen@databricks.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
| Name |
|---|
| benchmarks |
| src |
| pom.xml |
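The hex-encoding step described in the commit message can be illustrated with a minimal, standalone Java sketch. It is not the code from the patch: the class name `Sha224Hex` and the helpers `toHex` and `sha224Hex` are hypothetical, and the sketch encodes each byte as two hex digits instead of going through a big-integer conversion, so leading zero bytes are preserved. It assumes a JDK that ships the `SHA-224` algorithm (Java 8 or later).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha224Hex {
    // Hypothetical helper: encode every byte as two lowercase hex digits,
    // so a leading zero byte is not dropped.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Hypothetical helper: hash the UTF-8 bytes of the input with SHA-224
    // and return the digest as a hex string rather than as raw bytes.
    static String sha224Hex(String input) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-224");
        byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
        return toHex(digest);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Should match the expected value quoted in the commit message.
        System.out.println(sha224Hex("abc"));
    }
}
```

Decoding the 28 raw digest bytes directly as text is what produces unreadable output like the string quoted in the commit message; hex-encoding them yields the 56-character value that `sha2('abc', 224)` is expected to return.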