spark-instrumented-optimizer

Wenchen Fan 7432e7ded4 [SPARK-24935][SQL][FOLLOWUP] support INIT -> UPDATE -> MERGE -> FINISH in Hive UDAF adapter	2019-04-30 10:35:23 +08:00

## What changes were proposed in this pull request?

This is a followup of https://github.com/apache/spark/pull/24144 .

#24144 missed one case: when hash aggregate falls back to sort aggregate, the life cycle of the UDAF is INIT -> UPDATE -> MERGE -> FINISH. However, not all Hive UDAFs can support it.

A Hive UDAF knows the aggregation mode when it creates the aggregation buffer, so it can create different buffers for different inputs: the original data or the aggregation buffer. Please see an example in the [sketches library](`7f9e76e9e0/src/main/java/com/yahoo/sketches/hive/cpc/DataToSketchUDAF.java (L107)`). The buffer for UPDATE may not support MERGE.

This PR updates the Hive UDAF adapter in Spark to support INIT -> UPDATE -> MERGE -> FINISH, by turning it into INIT -> UPDATE -> FINISH + INIT -> MERGE -> FINISH.

## How was this patch tested?

a new test case

Closes #24459 from cloud-fan/hive-udaf.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
benchmarks	[SPARK-26584][SQL] Remove `spark.sql.orc.copyBatchToSpark` internal conf	2019-01-10 08:42:23 -08:00
compatibility/src/test/scala/org/apache/spark/sql/hive/execution	Revert [SPARK-19355][SPARK-25352]	2018-09-20 20:18:31 +08:00
src	[SPARK-24935][SQL][FOLLOWUP] support INIT -> UPDATE -> MERGE -> FINISH in Hive UDAF adapter	2019-04-30 10:35:23 +08:00
pom.xml	[SPARK-27176][SQL] Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4	2019-04-08 08:42:21 -07:00
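
The life-cycle split described in the latest commit message can be illustrated with a small, language-neutral sketch. This is not Spark's adapter code: every class and function name below (`UpdateOnlyBuffer`, `MergeOnlyBuffer`, `AdapterBuffer`, `update`, `merge`, `finish`) is hypothetical, and a simple average stands in for a real Hive UDAF. The sketch assumes only what the message states: the buffer created for UPDATE may not support MERGE, so the adapter finishes the update phase first and feeds its intermediate result into a separate merge-mode buffer.

```python
class UpdateOnlyBuffer:
    """Hypothetical buffer created for raw input rows; it cannot merge."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def finish(self):
        # FINISH of the update phase emits an intermediate (partial) result.
        return (self.total, self.count)


class MergeOnlyBuffer:
    """Hypothetical buffer created for partial results; it cannot update."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def merge(self, partial):
        total, count = partial
        self.total += total
        self.count += count

    def finish(self):
        # FINISH of the merge phase emits the final value.
        return self.total / self.count if self.count else None


class AdapterBuffer:
    """Holds at most one buffer per phase (illustrative only)."""

    def __init__(self):
        self.update_buf = None
        self.merge_buf = None


def update(buf, value):
    # INIT -> UPDATE: lazily create the update-mode buffer.
    if buf.update_buf is None:
        buf.update_buf = UpdateOnlyBuffer()
    buf.update_buf.update(value)


def _fold_pending_updates(buf):
    # Turn "UPDATE -> MERGE" into "UPDATE -> FINISH" + "INIT -> MERGE".
    if buf.merge_buf is None:
        buf.merge_buf = MergeOnlyBuffer()
    if buf.update_buf is not None:
        buf.merge_buf.merge(buf.update_buf.finish())
        buf.update_buf = None


def merge(buf, partial):
    _fold_pending_updates(buf)
    buf.merge_buf.merge(partial)


def finish(buf):
    _fold_pending_updates(buf)
    return buf.merge_buf.finish()


# Usage: two raw rows arrive, then a partial result from elsewhere is merged.
state = AdapterBuffer()
update(state, 1)
update(state, 3)
merge(state, (6, 2))
result = finish(state)
```

The point of the design is that neither underlying buffer is ever asked to do something outside its mode: the update-mode buffer only sees raw values, and the merge-mode buffer only sees intermediate results.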