spark-instrumented-optimizer/sql/hive
Liang-Chi Hsieh 00d176d2fe [SPARK-20392][SQL] Set barrier to prevent re-entering a tree
## What changes were proposed in this pull request?

The SQL `Analyzer` traverses the whole query plan even when most of it has already been analyzed. This increases the time spent on query analysis, especially for long pipelines in ML.

This patch adds a logical node called `AnalysisBarrier` that wraps an analyzed logical plan to prevent it from being analyzed again. The barrier is applied to the analyzed logical plan in `Dataset`. It does not change the output of the wrapped logical plan; it only acts as a wrapper that hides the plan from the analyzer. New operations on the dataset are placed on top of the barrier, so only the newly created nodes are analyzed.

This analysis barrier is removed at the end of the analysis stage.
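
The idea can be illustrated with a toy model. The sketch below does not use Spark's classes: `Plan`, `Relation`, `Filter`, `Barrier`, and `ToyAnalyzer` are hypothetical names invented for illustration, and the real `AnalysisBarrier` and `Analyzer` are considerably more involved. It only shows why wrapping an analyzed subtree lets the analyzer skip it, and how the wrapper is stripped once analysis finishes.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's classes.
sealed trait Plan { def analyzed: Boolean }
case class Relation(name: String, analyzed: Boolean = false) extends Plan
case class Filter(cond: String, child: Plan, analyzed: Boolean = false) extends Plan
// Wraps an already-analyzed subtree so the analyzer treats it as a leaf.
case class Barrier(child: Plan) extends Plan { def analyzed: Boolean = true }

object ToyAnalyzer {
  // Resolves nodes bottom-up and returns (resolved plan, number of nodes visited).
  // Like the behaviour described above, it revisits every node it can reach,
  // whether or not that node was analyzed before. It never descends into a Barrier.
  private def resolve(plan: Plan): (Plan, Int) = plan match {
    case b: Barrier => (b, 0)
    case Relation(name, _) => (Relation(name, analyzed = true), 1)
    case Filter(cond, child, _) =>
      val (newChild, visited) = resolve(child)
      (Filter(cond, newChild, analyzed = true), visited + 1)
  }

  // Strips every Barrier, mirroring "removed at the end of the analysis stage".
  private def removeBarriers(plan: Plan): Plan = plan match {
    case Barrier(child) => removeBarriers(child)
    case Filter(cond, child, a) => Filter(cond, removeBarriers(child), a)
    case r: Relation => r
  }

  def analyze(plan: Plan): (Plan, Int) = {
    val (resolved, visited) = resolve(plan)
    (removeBarriers(resolved), visited)
  }
}

object BarrierDemo {
  def main(args: Array[String]): Unit = {
    // Analyze a base plan once, as a Dataset would when it is created.
    val (base, _) = ToyAnalyzer.analyze(Filter("a > 1", Relation("t")))

    // Without a barrier, a new operation causes the whole tree to be revisited.
    val (_, withoutBarrier) = ToyAnalyzer.analyze(Filter("b < 2", base))

    // With a barrier, only the newly added node is visited.
    val (plan, withBarrier) = ToyAnalyzer.analyze(Filter("b < 2", Barrier(base)))

    println(s"visited without barrier: $withoutBarrier, with barrier: $withBarrier")
    println(s"final plan: $plan")
  }
}
```

In this toy model the cost of analyzing a new operation depends only on the nodes above the barrier, not on the depth of the plan beneath it, which is the effect the patch aims for in long pipelines.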

## How was this patch tested?

Added tests.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #19873 from viirya/SPARK-20392-reopen.
2017-12-05 21:43:41 -08:00
compatibility/src/test/scala/org/apache/spark/sql/hive/execution [SPARK-22675][SQL] Refactoring PropagateTypes in TypeCoercion 2017-12-05 20:43:02 +08:00
src [SPARK-20392][SQL] Set barrier to prevent re-entering a tree 2017-12-05 21:43:41 -08:00
pom.xml [SPARK-21936][SQL] backward compatibility test framework for HiveExternalCatalog 2017-09-07 23:21:49 -07:00