spark-instrumented-optimizer


main	[SPARK-18220][SQL] read Hive orc table with varchar column should not fail	2016-11-30 09:47:30 -08:00
test	[SPARK-18220][SQL] read Hive orc table with varchar column should not fail	2016-11-30 09:47:30 -08:00

Wenchen Fan 3f03c90a80 [SPARK-18220][SQL] read Hive orc table with varchar column should not fail

## What changes were proposed in this pull request?

Spark SQL has only `StringType`, so when reading a Hive table with a varchar column, we read that column as `StringType`. However, we still need to use the varchar `ObjectInspector` to read a varchar column in a Hive table, which means we need to know the actual column type on the Hive side.

In Spark 2.1, after https://github.com/apache/spark/pull/14363 , we parse the Hive type string into a Catalyst type, which erases the actual column type on the Hive side. We may then use the string `ObjectInspector` to read a varchar column and fail.

This PR keeps the original Hive column type string in the metadata of `StructField` and uses it when we convert the field to a Hive column.
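
The idea can be illustrated with a small, self-contained Python model. This is only a sketch of the technique, not Spark's actual code: the `Field` class, the `HIVE_TYPE_KEY` metadata key, and both helper functions are hypothetical names introduced here, and the type mapping is reduced to the varchar case.

```python
import re
from dataclasses import dataclass, field

# Hypothetical metadata key; the key Spark actually uses is not stated here.
HIVE_TYPE_KEY = "HIVE_TYPE_STRING"


@dataclass
class Field:
    """Toy stand-in for a Catalyst StructField: name, Catalyst type, metadata."""
    name: str
    data_type: str
    metadata: dict = field(default_factory=dict)


def from_hive_column(name: str, hive_type: str) -> Field:
    """Map a Hive column to a Field, keeping the raw Hive type in metadata."""
    # In this model varchar(n) has no Catalyst counterpart, so it becomes "string".
    is_varchar = re.fullmatch(r"varchar\(\d+\)", hive_type) is not None
    catalyst_type = "string" if is_varchar else hive_type
    return Field(name, catalyst_type, {HIVE_TYPE_KEY: hive_type})


def to_hive_column(f: Field) -> tuple:
    """Convert back to (name, hive_type), preferring the preserved type string."""
    # Without the metadata entry, only the erased Catalyst type is available.
    return (f.name, f.metadata.get(HIVE_TYPE_KEY, f.data_type))


col = from_hive_column("c", "varchar(10)")
print(col.data_type)                         # string
print(to_hive_column(col))                   # ('c', 'varchar(10)')
print(to_hive_column(Field("c", "string")))  # ('c', 'string')
```

The last line shows the failure mode described above: once the Hive type string is gone, the conversion can only produce `string`, so a string `ObjectInspector` would be chosen for a varchar column.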

## How was this patch tested?

Newly added regression test.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #16060 from cloud-fan/varchar.

2016-11-30 09:47:30 -08:00
