spark-instrumented-optimizer

History

Andrew Ray 37cff1b1a7 [SPARK-11275][SQL] Incorrect results when using rollup/cube	2015-11-19 15:11:30 -08:00
java/org/apache/spark/sql	[SPARK-11787][SQL] Improve Parquet scan performance when using flat schemas.	2015-11-18 18:38:45 -08:00
scala/org/apache/spark/sql	[SPARK-11275][SQL] Incorrect results when using rollup/cube	2015-11-19 15:11:30 -08:00

Andrew Ray 37cff1b1a7 [SPARK-11275][SQL] Incorrect results when using rollup/cube

Fixes bug with grouping sets (including cube/rollup) where aggregates that included grouping expressions would return the wrong (null) result.

Also simplifies the analyzer rule a bit and leaves column pruning to the optimizer.

Added multiple unit tests to DataFrameAggregateSuite and verified that it passes the Hive compatibility suite:
```
build/sbt -Phive -Dspark.hive.whitelist='groupby.*_grouping.*' 'test-only org.apache.spark.sql.hive.execution.HiveCompatibilitySuite'
```

This is an alternative to PR https://github.com/apache/spark/pull/9419, but I think it's better as it simplifies the analyzer rule instead of adding another special case to it.

Author: Andrew Ray <ray.andrew@gmail.com>

Closes #9815 from aray/groupingset-agg-fix.

2015-11-19 15:11:30 -08:00
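
To make the behavior described in the commit message concrete, here is a minimal plain-Python sketch of what a rollup is expected to return when the aggregate itself refers to a grouping column. It is an illustration of the expected semantics only, not Spark's analyzer rule or its implementation; the helper names `rollup_sets` and `rollup_sum` and the sample rows are hypothetical. It also assumes the input contains no real null values, so `None` in a key always means "rolled up".

```python
from collections import defaultdict


def rollup_sets(cols):
    """Grouping sets for ROLLUP: every prefix of cols, longest first."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def rollup_sum(rows, group_cols, agg_col):
    """Sum agg_col for each ROLLUP grouping set.

    The output key holds None for columns that are rolled up in a given
    grouping set, but the aggregate always reads the original row value,
    even when agg_col is itself one of the grouping columns.
    """
    totals = defaultdict(int)
    for active in rollup_sets(group_cols):
        for row in rows:
            key = tuple(row[c] if c in active else None for c in group_cols)
            totals[key] += row[agg_col]
    return dict(totals)


# Hypothetical sample data.
rows = [
    {"a": 1, "b": 10},
    {"a": 1, "b": 20},
    {"a": 2, "b": 10},
]

# Aggregate over "a", which is also a grouping column.
result = rollup_sum(rows, ["a", "b"], "a")
```

In this sketch the grand-total key `(None, None)` still carries a real sum of `a` rather than a null, because the aggregate is computed from the input rows and not from the nulled-out grouping key. That is the outcome the commit message describes as the correct one.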