[SPARK-32594][SQL] Fix serialization of dates inserted to Hive tables

### What changes were proposed in this pull request?
Fix `DaysWritable` by overriding the parent method `def get(doesTimeMatter: Boolean): Date` of `DateWritable`, not only `def get(): Date`, because the Hive write path calls the former. The bug occurs because `HiveOutputWriter.write()` transitively calls `def get(doesTimeMatter: Boolean): Date`, hitting the default implementation in the parent class `DateWritable`, which does not respect date rebasing and reads the uninitialized `daysSinceEpoch` field (0, i.e. `1970-01-01`).
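
To make the failure mode concrete, here is a minimal, self-contained sketch with hypothetical stand-in classes (not the actual Hive `DateWritable` or Spark `DaysWritable` sources): the parent keeps its own day counter that the subclass never updates, so any accessor the subclass does not override falls back to day 0, i.e. `1970-01-01`.
```scala
import java.time.LocalDate

// Hypothetical stand-in for Hive's DateWritable: keeps its own day counter.
class ParentDateWritable {
  protected var daysSinceEpoch: Int = 0
  def get(): LocalDate = LocalDate.ofEpochDay(daysSinceEpoch)
  // Overload reached (transitively) from the write path; the default reads daysSinceEpoch.
  def get(doesTimeMatter: Boolean): LocalDate = LocalDate.ofEpochDay(daysSinceEpoch)
}

// Hypothetical stand-in for Spark's DaysWritable: keeps its own, already rebased day count.
class ChildDaysWritable(julianDays: Int) extends ParentDateWritable {
  override def get(): LocalDate = LocalDate.ofEpochDay(julianDays)
  // The essence of the fix: override the Boolean overload too; otherwise the
  // parent's version above is used and returns 1970-01-01.
  override def get(doesTimeMatter: Boolean): LocalDate = LocalDate.ofEpochDay(julianDays)
}

object Demo extends App {
  val w = new ChildDaysWritable(julianDays = 18485) // 18485 days after 1970-01-01 = 2020-08-11
  println(w.get(true))                              // 2020-08-11 with the override; 1970-01-01 without it
}
```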

### Why are the changes needed?
The changes fix the bug:
```sql
spark-sql> CREATE TABLE table1 (d date);
spark-sql> INSERT INTO table1 VALUES (date '2020-08-11');
spark-sql> SELECT * FROM table1;
1970-01-01
```
The last SQL statement should return **2020-08-11**, but it returns **1970-01-01**.

### Does this PR introduce _any_ user-facing change?
Yes. After the fix, `INSERT` works correctly:
```sql
spark-sql> SELECT * FROM table1;
2020-08-11
```

### How was this patch tested?
Added a new test to `HiveSerDeReadWriteSuite`.

Closes #29409 from MaxGekk/insert-date-into-hive-table.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
Max Gekk authored on 2020-08-12 13:32:16 +09:00; committed by HyukjinKwon
parent 5d130f0360, commit 0477d23467
2 changed files with 11 additions and 1 deletion

In `DaysWritable.scala`:
```diff
@@ -54,7 +54,9 @@ class DaysWritable(
   }
   override def getDays: Int = julianDays
   override def get(): Date = new Date(DateWritable.daysToMillis(julianDays))
+  override def get(doesTimeMatter: Boolean): Date = {
+    new Date(DateWritable.daysToMillis(julianDays, doesTimeMatter))
+  }
   override def set(d: Int): Unit = {
     gregorianDays = d
```

In `HiveSerDeReadWriteSuite.scala`:
```diff
@@ -184,4 +184,12 @@ class HiveSerDeReadWriteSuite extends QueryTest with SQLTestUtils with TestHiveSingleton
       checkComplexTypes(fileFormat)
     }
   }
+
+  test("SPARK-32594: insert dates to a Hive table") {
+    withTable("table1") {
+      sql("CREATE TABLE table1 (d date)")
+      sql("INSERT INTO table1 VALUES (date '2020-08-11')")
+      checkAnswer(spark.table("table1"), Row(Date.valueOf("2020-08-11")))
+    }
+  }
 }
```