[MINOR][SQL] Re-use binaryToSQLTimestamp() in ParquetRowConverter
### What changes were proposed in this pull request?
The function `binaryToSQLTimestamp()` is used by the Parquet vectorized reader. The Parquet MR reader has similar code for de-serializing INT96 timestamps. In this PR, I propose to de-duplicate the code and re-use `binaryToSQLTimestamp()`.

### Why are the changes needed?
This should make the code easier to maintain and help avoid errors when changing the vectorized and regular Parquet readers.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By existing test suites, for instance `ParquetIOSuite`.

Closes #30069 from MaxGekk/int96-common-serde.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
parent ab0bad9544
commit acb79f52db
@@ -300,15 +300,7 @@ private[parquet] class ParquetRowConverter(
         new ParquetPrimitiveConverter(updater) {
           // Converts nanosecond timestamps stored as INT96
           override def addBinary(value: Binary): Unit = {
-            assert(
-              value.length() == 12,
-              "Timestamps (with nanoseconds) are expected to be stored in 12-byte long binaries, " +
-              s"but got a ${value.length()}-byte binary.")
-
-            val buf = value.toByteBuffer.order(ByteOrder.LITTLE_ENDIAN)
-            val timeOfDayNanos = buf.getLong
-            val julianDay = buf.getInt
-            val rawTime = DateTimeUtils.fromJulianDay(julianDay, timeOfDayNanos)
+            val rawTime = ParquetRowConverter.binaryToSQLTimestamp(value)
             val adjTime = convertTz.map(DateTimeUtils.convertTz(rawTime, _, ZoneOffset.UTC))
               .getOrElse(rawTime)
             updater.setLong(adjTime)