spark-instrumented-optimizer

History

Herman van Hovell de8a03e682 [SPARK-19459][SQL] Add Hive datatype (char/varchar) to StructField metadata ## What changes were proposed in this pull request? Reading from an existing ORC table which contains `char` or `varchar` columns can fail with a `ClassCastException` if the table metadata has been created using Spark. This is caused by the fact that spark internally replaces `char` and `varchar` columns with a `string` column. This PR fixes this by adding the hive type to the `StructField's` metadata under the `HIVE_TYPE_STRING` key. This is picked up by the `HiveClient` and the ORC reader, see https://github.com/apache/spark/pull/16060 for more details on how the metadata is used. ## How was this patch tested? Added a regression test to `OrcSourceSuite`. Author: Herman van Hovell <hvanhovell@databricks.com> Closes #16804 from hvanhovell/SPARK-19459.	2017-02-10 11:06:57 -08:00
..
main	[SPARK-19459][SQL] Add Hive datatype (char/varchar) to StructField metadata	2017-02-10 11:06:57 -08:00
test	[SPARK-19543] from_json fails when the input row is empty	2017-02-10 12:55:06 +01:00

Herman van Hovell de8a03e682 [SPARK-19459][SQL] Add Hive datatype (char/varchar) to StructField metadata

## What changes were proposed in this pull request?
Reading from an existing ORC table which contains `char` or `varchar` columns can fail with a `ClassCastException` if the table metadata has been created using Spark. This is caused by the fact that spark internally replaces `char` and `varchar` columns with a `string` column.

This PR fixes this by adding the hive type to the `StructField's` metadata under the `HIVE_TYPE_STRING` key. This is picked up by the `HiveClient` and the ORC reader, see https://github.com/apache/spark/pull/16060 for more details on how the metadata is used.

## How was this patch tested?
Added a regression test to `OrcSourceSuite`.

Author: Herman van Hovell <hvanhovell@databricks.com>

Closes #16804 from hvanhovell/SPARK-19459.

2017-02-10 11:06:57 -08:00

main

[SPARK-19459][SQL] Add Hive datatype (char/varchar) to StructField metadata

2017-02-10 11:06:57 -08:00

test

[SPARK-19543] from_json fails when the input row is empty

2017-02-10 12:55:06 +01:00