spark-instrumented-optimizer/sql/catalyst/src/main
jiangxingbo 4ba63b193c [SPARK-17142][SQL] Complex query triggers binding error in HashAggregateExec
## What changes were proposed in this pull request?

In `ReorderAssociativeOperator` rule, we extract foldable expressions with Add/Multiply arithmetics, and replace with eval literal. For example, `(a + 1) + (b + 2)` is optimized to `(a + b + 3)` by this rule.
For aggregate operator, output expressions should be derived from groupingExpressions, current implemenation of `ReorderAssociativeOperator` rule may break this promise. A instance could be:
```
SELECT
  ((t1.a + 1) + (t2.a + 2)) AS out_col
FROM
  testdata2 AS t1
INNER JOIN
  testdata2 AS t2
ON
  (t1.a = t2.a)
GROUP BY (t1.a + 1), (t2.a + 2)
```
`((t1.a + 1) + (t2.a + 2))` is optimized to `(t1.a + t2.a + 3)`, which could not be derived from `ExpressionSet((t1.a +1), (t2.a + 2))`.
Maybe we should improve the rule of `ReorderAssociativeOperator` by adding a GroupingExpressionSet to keep Aggregate.groupingExpressions, and respect these expressions during the optimize stage.

## How was this patch tested?

Add new test case in `ReorderAssociativeOperatorSuite`.

Author: jiangxingbo <jiangxb1987@gmail.com>

Closes #14917 from jiangxb1987/rao.
2016-09-13 17:04:51 +02:00
..
antlr4/org/apache/spark/sql/catalyst/parser [SPARK-17296][SQL] Simplify parser join processing. 2016-09-07 00:44:07 +02:00
java/org/apache/spark/sql [SPARK-17405] RowBasedKeyValueBatch should use default page size to prevent OOMs 2016-09-08 16:47:18 -07:00
scala/org/apache/spark/sql [SPARK-17142][SQL] Complex query triggers binding error in HashAggregateExec 2016-09-13 17:04:51 +02:00