spark-instrumented-optimizer

History

ulysses-you ed7c81dfaa	2021-07-05 17:10:57 +08:00

[SPARK-35989][SQL] Only remove redundant shuffle if shuffle origin is REPARTITION_BY_COL in AQE

### What changes were proposed in this pull request?
Skip removing the shuffle in AQE if its shuffle origin is not `REPARTITION_BY_COL`.

### Why are the changes needed?
`REPARTITION_BY_COL` doesn't guarantee the output partitioning number, so we can remove it safely in AQE.
For `REPARTITION_BY_NUM`, we should retain the shuffle, whose partition number is specified by the user.
For `REBALANCE_PARTITIONS_BY_COL`, it is a special shuffle used to rebalance partitions, so we should not remove it.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Added a test.

Closes #33188 from ulysses-you/SPARK-35989.

Lead-authored-by: ulysses-you <ulyssesyou18@gmail.com>
Co-authored-by: ulysses <ulyssesyou18@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit `7fe4c4a9ad`)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
benchmarks	[SPARK-34981][SQL][FOLLOWUP] Use SpecificInternalRow in ApplyFunctionExpression	2021-05-24 17:25:24 +09:00
src	[SPARK-35989][SQL] Only remove redundant shuffle if shuffle origin is REPARTITION_BY_COL in AQE	2021-07-05 17:10:57 +08:00
pom.xml	[SPARK-35784][SS] Implementation for RocksDB instance	2021-06-29 17:46:45 -07:00