spark-instrumented-optimizer


Wenchen Fan 5cb2e33609 [SPARK-14675][SQL] ClassFormatError when use Seq as Aggregator buffer type	2016-04-19 10:51:58 -07:00

## What changes were proposed in this pull request?

After https://github.com/apache/spark/pull/12067, we use expressions to do the aggregation in `TypedAggregateExpression`. To implement buffer merge, we produce a new buffer deserializer expression by replacing `AttributeReference` with the right-side buffer attribute, as other `DeclarativeAggregate`s do, and finally combine the left and right buffer deserializers with `Invoke`.

However, after https://github.com/apache/spark/pull/12338, we add the loop variable to the class members when generating code for `MapObjects`. If the `Aggregator` buffer type is `Seq`, which is implemented by the `MapObjects` expression, the same loop variable is added to the class members twice (once by the left and once by the right buffer deserializer), which causes the `ClassFormatError`.

This PR fixes the issue by calling `distinct` before declaring the class members.

## How was this patch tested?

New regression test in `DatasetAggregatorSuite`.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #12468 from cloud-fan/bug.
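The fix described in the commit message can be illustrated with a minimal sketch. This is not Spark's actual code: `SketchCodegenContext`, `addMutableState`, `declareNaively`, and `declareDistinct` are hypothetical names chosen for illustration, and the variable names are made up. It only shows why registering the same member twice yields duplicate field declarations, and how calling `distinct` before declaring them avoids that.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified stand-in for a code generation context.
// It is NOT Spark's real class; it only models the idea in the commit message.
final class SketchCodegenContext {
  // (javaType, variableName) pairs registered by expressions during codegen.
  private val mutableStates = ArrayBuffer.empty[(String, String)]

  def addMutableState(javaType: String, name: String): Unit =
    mutableStates += ((javaType, name))

  // Emits one field per registration, so a repeated registration
  // produces a repeated field declaration in the generated class.
  def declareNaively(): Seq[String] =
    mutableStates.map { case (t, n) => s"private $t $n;" }.toSeq

  // Drops repeated registrations first; `distinct` keeps first-occurrence order.
  def declareDistinct(): Seq[String] =
    mutableStates.distinct.map { case (t, n) => s"private $t $n;" }.toSeq
}

object DistinctMembersSketch {
  def main(args: Array[String]): Unit = {
    val ctx = new SketchCodegenContext
    // Assumed scenario: the left and right buffer deserializers both
    // register the same loop variable (the name is made up).
    ctx.addMutableState("int", "loopVar1")
    ctx.addMutableState("int", "loopVar1")
    ctx.addMutableState("Object", "bufferValue")

    println(ctx.declareNaively().mkString("\n"))
    println("---")
    println(ctx.declareDistinct().mkString("\n"))
  }
}
```

In the naive output the field `loopVar1` is declared twice, which is the kind of duplicate member the commit message identifies as the cause of the `ClassFormatError`; the deduplicated output declares each member once.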
antlr4/org/apache/spark/sql/catalyst/parser	[SPARK-14398][SQL] Audit non-reserved keyword list in ANTLR4 parser	2016-04-19 09:09:58 +02:00
java/org/apache/spark/sql	[SPARK-14426][SQL] Merge ParserUtils and ParseUtils	2016-04-06 10:57:46 -07:00
scala/org/apache/spark/sql	[SPARK-14675][SQL] ClassFormatError when use Seq as Aggregator buffer type	2016-04-19 10:51:58 -07:00