spark-instrumented-optimizer

History

Cheng Lian 90f304b0c9 [SPARK-7567] [SQL] Migrating Parquet data source to FSBasedRelation

This PR migrates Parquet data source to the newly introduced `FSBasedRelation`. `FSBasedParquetRelation` is created to replace `ParquetRelation2`. Major differences are:

1. Partition discovery code has been factored out to `FSBasedRelation`
2. `AppendingParquetOutputFormat` is not used now. Instead, an anonymous subclass of `ParquetOutputFormat` is used to handle appending and writing dynamic partitions
3. When scanning partitioned tables, `FSBasedParquetRelation.buildScan` only builds an `RDD[Row]` for a single selected partition
4. `FSBasedParquetRelation` doesn't rely on Catalyst expressions for filter push down, thus it doesn't extend `CatalystScan` anymore

After migrating `JSONRelation` (which extends `CatalystScan`), we can remove `CatalystScan`.

Author: Cheng Lian <lian@databricks.com>

Closes #6090 from liancheng/parquet-migration and squashes the following commits:

6063f87 [Cheng Lian] Casts to OutputCommitter rather than FileOutputCommtter
bfd1cf0 [Cheng Lian] Fixes compilation error introduced while rebasing
f9ea56e [Cheng Lian] Adds ParquetRelation2 related classes to MiMa check whitelist
261d8c1 [Cheng Lian] Minor bug fix and more tests
db65660 [Cheng Lian] Migrates Parquet data source to FSBasedRelation

(cherry picked from commit `7ff16e8abe`)
Signed-off-by: Michael Armbrust <michael@databricks.com>

2015-05-13 11:04:21 -07:00
..
compatibility/src/test/scala/org/apache/spark/sql/hive/execution	[SPARK-6908] [SQL] Use isolated Hive client	2015-05-07 19:36:41 -07:00
src	[SPARK-7567] [SQL] Migrating Parquet data source to FSBasedRelation	2015-05-13 11:04:21 -07:00
v0.12.0/src/main/scala/org/apache/spark/sql/hive	[SPARK-6638] [SQL] Improve performance of StringType in SQL	2015-04-15 13:06:38 -07:00
v0.13.1/src/main/scala/org/apache/spark/sql/hive	[SPARK-6505] [SQL] Remove the reflection call in HiveFunctionWrapper	2015-04-27 14:08:05 +08:00
pom.xml	[SPARK-7168] [BUILD] Update plugin versions in Maven build and centralize versions	2015-04-28 07:48:34 -04:00