spark-instrumented-optimizer/sql/hive
Cheng Lian 90f304b0c9 [SPARK-7567] [SQL] Migrating Parquet data source to FSBasedRelation
This PR migrates the Parquet data source to the newly introduced `FSBasedRelation`. `FSBasedParquetRelation` is created to replace `ParquetRelation2`. The major differences are:

1. Partition discovery code has been factored out to `FSBasedRelation`
1. `AppendingParquetOutputFormat` is no longer used. Instead, an anonymous subclass of `ParquetOutputFormat` handles appending and writing dynamic partitions
1. When scanning partitioned tables, `FSBasedParquetRelation.buildScan` only builds an `RDD[Row]` for a single selected partition
1. `FSBasedParquetRelation` doesn't rely on Catalyst expressions for filter push-down, so it no longer extends `CatalystScan`

   After migrating `JSONRelation` (which extends `CatalystScan`), we can remove `CatalystScan`.
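
As a rough illustration of what partition discovery involves, the sketch below parses Hive-style `key=value` directory segments out of a file path. It is not the code in `FSBasedRelation`: the object `PartitionPathSketch`, the function `parsePartitionValues`, and the sample path are all hypothetical, and the real implementation also has to infer column types and validate that the directory layout is consistent.

```scala
// Hypothetical sketch, not Spark's implementation: extract partition
// column/value pairs from Hive-style "key=value" path segments.
object PartitionPathSketch {
  // Returns the (column, value) pairs found in the path, in order.
  // Segments without "=" (base directory, file name) are skipped.
  def parsePartitionValues(path: String): Seq[(String, String)] =
    path.split("/").toSeq.map(_.split("=", 2)).collect {
      case Array(key, value) if key.nonEmpty => key -> value
    }

  def main(args: Array[String]): Unit = {
    // Assumed example layout: <base>/year=.../month=.../<file>
    val path = "/data/table/year=2015/month=05/part-00000.parquet"
    parsePartitionValues(path).foreach { case (column, value) =>
      println(s"$column -> $value")
    }
  }
}
```

With partition values recovered from paths like this, a scan can be restricted to the files of one selected partition, which is the shape of work `buildScan` is described as doing above.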

Author: Cheng Lian <lian@databricks.com>

Closes #6090 from liancheng/parquet-migration and squashes the following commits:

6063f87 [Cheng Lian] Casts to OutputCommitter rather than FileOutputCommitter
bfd1cf0 [Cheng Lian] Fixes compilation error introduced while rebasing
f9ea56e [Cheng Lian] Adds ParquetRelation2 related classes to MiMa check whitelist
261d8c1 [Cheng Lian] Minor bug fix and more tests
db65660 [Cheng Lian] Migrates Parquet data source to FSBasedRelation

(cherry picked from commit 7ff16e8abe)
Signed-off-by: Michael Armbrust <michael@databricks.com>
2015-05-13 11:04:21 -07:00
compatibility/src/test/scala/org/apache/spark/sql/hive/execution [SPARK-6908] [SQL] Use isolated Hive client 2015-05-07 19:36:41 -07:00
src [SPARK-7567] [SQL] Migrating Parquet data source to FSBasedRelation 2015-05-13 11:04:21 -07:00
v0.12.0/src/main/scala/org/apache/spark/sql/hive [SPARK-6638] [SQL] Improve performance of StringType in SQL 2015-04-15 13:06:38 -07:00
v0.13.1/src/main/scala/org/apache/spark/sql/hive [SPARK-6505] [SQL] Remove the reflection call in HiveFunctionWrapper 2015-04-27 14:08:05 +08:00
pom.xml [SPARK-7168] [BUILD] Update plugin versions in Maven build and centralize versions 2015-04-28 07:48:34 -04:00