spark-instrumented-optimizer

History

Sameer Agarwal 0a64294fcb [SPARK-14015][SQL] Support TimestampType in vectorized parquet reader ## What changes were proposed in this pull request? This PR adds support for TimestampType in the vectorized parquet reader ## How was this patch tested? 1. `VectorizedColumnReader` initially had a gating condition on `primitiveType.getPrimitiveTypeName() == PrimitiveType.PrimitiveTypeName.INT96)` that made us fall back on parquet-mr for handling timestamps. This condition is now removed. 2. The `ParquetHadoopFsRelationSuite` (that tests for all supported hive types -- including `TimestampType`) fails when the gating condition is removed (https://github.com/apache/spark/pull/11808) and should now pass with this change. Similarly, the `ParquetHiveCompatibilitySuite.SPARK-10177 timestamp` test that fails when the gating condition is removed, should now pass as well. 3. Added tests in `HadoopFsRelationTest` that test both the dictionary encoded and non-encoded versions across all supported datatypes. Author: Sameer Agarwal <sameer@databricks.com> Closes #11882 from sameeragarwal/timestamp-parquet.	2016-03-23 12:13:32 -07:00
..
src	[SPARK-14015][SQL] Support TimestampType in vectorized parquet reader	2016-03-23 12:13:32 -07:00
pom.xml	[SPARK-13780][SQL] Add missing dependency to build.	2016-03-11 10:27:38 -08:00

Sameer Agarwal 0a64294fcb [SPARK-14015][SQL] Support TimestampType in vectorized parquet reader

## What changes were proposed in this pull request?

This PR adds support for TimestampType in the vectorized parquet reader

## How was this patch tested?

1. `VectorizedColumnReader` initially had a gating condition on `primitiveType.getPrimitiveTypeName() == PrimitiveType.PrimitiveTypeName.INT96)` that made us fall back on parquet-mr for handling timestamps. This condition is now removed.
2. The `ParquetHadoopFsRelationSuite` (that tests for all supported hive types -- including `TimestampType`) fails when the gating condition is removed (https://github.com/apache/spark/pull/11808) and should now pass with this change. Similarly, the `ParquetHiveCompatibilitySuite.SPARK-10177 timestamp` test that fails when the gating condition is removed, should now pass as well.
3.  Added tests in `HadoopFsRelationTest` that test both the dictionary encoded and non-encoded versions across all supported datatypes.

Author: Sameer Agarwal <sameer@databricks.com>

Closes #11882 from sameeragarwal/timestamp-parquet.

2016-03-23 12:13:32 -07:00

src

[SPARK-14015][SQL] Support TimestampType in vectorized parquet reader

2016-03-23 12:13:32 -07:00

pom.xml

[SPARK-13780][SQL] Add missing dependency to build.

2016-03-11 10:27:38 -08:00