[MINOR][SQL] Re-use binaryToSQLTimestamp() in ParquetRowConverter

### What changes were proposed in this pull request?
The function `binaryToSQLTimestamp()` is used by the Parquet vectorized reader. The Parquet MR reader has similar code for deserializing INT96 timestamps. In this PR, I propose to remove the duplicated code and re-use `binaryToSQLTimestamp()` in `ParquetRowConverter`.

### Why are the changes needed?
This should simplify maintenance and help avoid errors when changing the vectorized and regular Parquet readers, since both now share one INT96 decoding routine.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By existing test suites, for instance `ParquetIOSuite`.

Closes #30069 from MaxGekk/int96-common-serde.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Max Gekk 2020-10-16 14:27:27 -07:00 committed by Dongjoon Hyun
parent ab0bad9544
commit acb79f52db


@@ -300,15 +300,7 @@ private[parquet] class ParquetRowConverter(
         new ParquetPrimitiveConverter(updater) {
           // Converts nanosecond timestamps stored as INT96
           override def addBinary(value: Binary): Unit = {
-            assert(
-              value.length() == 12,
-              "Timestamps (with nanoseconds) are expected to be stored in 12-byte long binaries, " +
-              s"but got a ${value.length()}-byte binary.")
-            val buf = value.toByteBuffer.order(ByteOrder.LITTLE_ENDIAN)
-            val timeOfDayNanos = buf.getLong
-            val julianDay = buf.getInt
-            val rawTime = DateTimeUtils.fromJulianDay(julianDay, timeOfDayNanos)
+            val rawTime = ParquetRowConverter.binaryToSQLTimestamp(value)
             val adjTime = convertTz.map(DateTimeUtils.convertTz(rawTime, _, ZoneOffset.UTC))
               .getOrElse(rawTime)
             updater.setLong(adjTime)
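
For readers unfamiliar with the INT96 layout, the decoding that this change consolidates can be sketched roughly as follows. This is a standalone illustration, not the Spark implementation: the object `Int96Sketch` and its methods `int96ToMicros` and `encode` are hypothetical names. It assumes the layout visible in the diff (8 little-endian bytes of nanoseconds within the day, followed by a 4-byte Julian day number) and assumes the result is expressed in microseconds since the Unix epoch.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical, self-contained sketch; names are not part of Spark.
object Int96Sketch {
  // Julian day number of 1970-01-01 (the Unix epoch).
  val JulianDayOfEpoch: Int = 2440588
  val MicrosPerDay: Long = 86400L * 1000000L

  // Decodes a 12-byte INT96 timestamp into microseconds since the epoch.
  // Assumed layout: 8 bytes nanos-of-day, then 4 bytes Julian day, little-endian.
  def int96ToMicros(bytes: Array[Byte]): Long = {
    require(bytes.length == 12, s"expected a 12-byte binary, got ${bytes.length} bytes")
    val buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
    val timeOfDayNanos = buf.getLong
    val julianDay = buf.getInt
    (julianDay - JulianDayOfEpoch).toLong * MicrosPerDay + timeOfDayNanos / 1000L
  }

  // Builds a 12-byte INT96 value in the same layout, only for the usage example.
  def encode(julianDay: Int, timeOfDayNanos: Long): Array[Byte] =
    ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN)
      .putLong(timeOfDayNanos).putInt(julianDay).array()
}
```

For example, `Int96Sketch.int96ToMicros(Int96Sketch.encode(2440588, 0L))` evaluates to `0L`, since Julian day 2440588 at midnight is the Unix epoch. Keeping this logic in one helper means a length check or a byte-order fix only has to be made once for both readers.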