spark-instrumented-optimizer/sql/core
Cheng Lian 61a99f6a11 [SPARK-4937][SQL] Normalizes conjunctions and disjunctions to eliminate common predicates
This PR is a simplified version of several filter optimization rules introduced in #3778 authored by scwf. Newly introduced optimizations include:

1. `a && a` => `a`
2. `a || a` => `a`
3. `(a || b || c || ...) && (a || b || d || ...)` => `a && b && (c || d || ...)`

The 3rd rule is particularly useful for optimizing the following query, which is planned into a cartesian product

```sql
SELECT *
  FROM t1, t2
 WHERE (t1.key = t2.key AND t1.value > 10)
    OR (t1.key = t2.key AND t2.value < 20)
```

to the following one, which is planned into an equi-join:

```sql
SELECT *
  FROM t1, t2
 WHERE t1.key = t2.key
   AND (t1.value > 10 OR t2.value < 20)
```

The example above is quite artificial, but common predicates are likely to appear in real life complex queries (like the one mentioned in #3778).

A difference between this PR and #3778 is that these optimizations are not limited to `Filter`, but are generalized to all logical plan nodes. Thanks to scwf for bringing up these optimizations, and chenghao-intel for the generalization suggestion.

<!-- Reviewable:start -->
[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/apache/spark/3784)
<!-- Reviewable:end -->

Author: Cheng Lian <lian@databricks.com>

Closes #3784 from liancheng/normalize-filters and squashes the following commits:

caca560 [Cheng Lian] Moves filter normalization into BooleanSimplification rule
4ab3a58 [Cheng Lian] Fixes test failure, adds more tests
5d54349 [Cheng Lian] Fixes typo in comment
2abbf8e [Cheng Lian] Forgot our sacred Apache licence header...
cf95639 [Cheng Lian] Adds an optimization rule for filter normalization
2014-12-30 13:38:27 -08:00
..
src [SPARK-4937][SQL] Normalizes conjunctions and disjunctions to eliminate common predicates 2014-12-30 13:38:27 -08:00
pom.xml Bumping version to 1.3.0-SNAPSHOT. 2014-11-18 21:24:18 -08:00