spark-instrumented-optimizer

History

Takeshi Yamamuro 0cb91b8c18 [SPARK-32704][SQL] Logging plan changes for execution	2020-08-28 16:35:47 +00:00

### What changes were proposed in this pull request?

Since we only log plan changes for analyzer/optimizer now, this PR intends to add code to log plan changes in the preparation phase in `QueryExecution` for execution.

```
scala> spark.sql("SET spark.sql.optimizer.planChangeLog.level=WARN")
scala> spark.range(10).groupBy("id").count().queryExecution.executedPlan
...
20/08/26 09:32:36 WARN PlanChangeLogger:
=== Applying Rule org.apache.spark.sql.execution.CollapseCodegenStages ===
!HashAggregate(keys=[id#19L], functions=[count(1)], output=[id#19L, count#23L])              (1) HashAggregate(keys=[id#19L], functions=[count(1)], output=[id#19L, count#23L])
!+- HashAggregate(keys=[id#19L], functions=[partial_count(1)], output=[id#19L, count#27L])   +- (1) HashAggregate(keys=[id#19L], functions=[partial_count(1)], output=[id#19L, count#27L])
!   +- Range (0, 10, step=1, splits=4)                                                          +- (1) Range (0, 10, step=1, splits=4)

20/08/26 09:32:36 WARN PlanChangeLogger:
=== Result of Batch Preparations ===
!HashAggregate(keys=[id#19L], functions=[count(1)], output=[id#19L, count#23L])              (1) HashAggregate(keys=[id#19L], functions=[count(1)], output=[id#19L, count#23L])
!+- HashAggregate(keys=[id#19L], functions=[partial_count(1)], output=[id#19L, count#27L])   +- (1) HashAggregate(keys=[id#19L], functions=[partial_count(1)], output=[id#19L, count#27L])
!   +- Range (0, 10, step=1, splits=4)                                                          +- (1) Range (0, 10, step=1, splits=4)
```

### Why are the changes needed?

Easy debugging for executed plans

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added unit tests.

Closes #29544 from maropu/PlanLoggingInPreparations.

Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
..
benchmarks	[SPARK-30413][SQL] Avoid WrappedArray roundtrip in GenericArrayData constructor, plus related optimization in ParquetMapConverter	2020-01-19 19:12:19 -08:00
src	[SPARK-32704][SQL] Logging plan changes for execution	2020-08-28 16:35:47 +00:00
pom.xml	[SPARK-30950][BUILD] Setting version to 3.1.0-SNAPSHOT	2020-02-25 19:44:31 -08:00