spark-instrumented-optimizer


src	[SPARK-32892][CORE][SQL] Fix hash functions on big-endian platforms	2020-09-23 12:36:46 -05:00
pom.xml	[SPARK-30950][BUILD] Setting version to 3.1.0-SNAPSHOT	2020-02-25 19:44:31 -08:00

Michael Munday 383bb4af00 [SPARK-32892][CORE][SQL] Fix hash functions on big-endian platforms

MurmurHash3 and xxHash64 interpret sequences of bytes as integers
encoded in little-endian byte order. This requires a byte reversal
on big-endian platforms.
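
As an illustration of the byte reversal described above, here is a minimal sketch rather than the code from this patch. The class and method names (`LittleEndianWords`, `readNative`, `readLittleEndian`) are hypothetical, and `ByteBuffer` stands in for whatever raw memory access the real hash functions use.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class LittleEndianWords {
    // Reads 4 bytes in the platform's native byte order, as a raw memory load would.
    static int readNative(byte[] data, int offset) {
        return ByteBuffer.wrap(data, offset, 4).order(ByteOrder.nativeOrder()).getInt();
    }

    // Returns the 4 bytes interpreted as a little-endian integer on any platform.
    static int readLittleEndian(byte[] data, int offset) {
        int word = readNative(data, offset);
        if (ByteOrder.nativeOrder() == ByteOrder.BIG_ENDIAN) {
            // A native load on a big-endian machine yields the bytes in the opposite order.
            word = Integer.reverseBytes(word);
        }
        return word;
    }

    public static void main(String[] args) {
        byte[] data = {0x01, 0x02, 0x03, 0x04};
        // Prints 4030201 regardless of the platform's byte order.
        System.out.println(Integer.toHexString(readLittleEndian(data, 0)));
    }
}
```

On a little-endian platform the branch is never taken, so the extra check costs nothing beyond a comparison against a constant.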

I've left the hashInt and hashLong functions as-is for now. My
interpretation of these functions is that they perform the hash on
the integer value as if it were serialized in little-endian byte
order. Therefore no byte reversal is necessary.
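
The reasoning for leaving hashInt and hashLong alone can be checked with a small sketch. The class `IntValueRoundTrip` below is hypothetical and is not part of the patch; it only shows that an integer serialized in little-endian order and then read back as a little-endian word is the original value, so a function that hashes the value directly already matches the byte-sequence interpretation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, for illustration only.
final class IntValueRoundTrip {
    // Serializes the value as 4 little-endian bytes, then reads those bytes
    // back as a little-endian word. The result does not depend on the
    // platform's native byte order.
    static int viaLittleEndianBytes(int value) {
        byte[] bytes = ByteBuffer.allocate(4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }
}
```

Because the round trip is the identity for every `int`, hashing the in-register value is equivalent to hashing its little-endian serialization, which is why no reversal is applied in that path.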

### What changes were proposed in this pull request?
Modify hash functions to produce correct results on big-endian platforms.

### Why are the changes needed?
Hash functions produce incorrect results on big-endian platforms, which, among other potential issues, causes test failures.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing tests were run on the IBM Z (s390x) platform, which uses a big-endian byte order.

Closes #29762 from mundaym/fix-hashes.

Authored-by: Michael Munday <mike.munday@ibm.com>
Signed-off-by: Sean Owen <srowen@gmail.com>

2020-09-23 12:36:46 -05:00
