spark-instrumented-optimizer/sql/core
allisonwang-db 31bb9e04ad [SPARK-36063][SQL] Optimize OneRowRelation subqueries
### What changes were proposed in this pull request?
This PR adds optimization for scalar and lateral subqueries with OneRowRelation as leaf nodes. It inlines such subqueries before decorrelation to avoid rewriting them as left outer joins. It also introduces a flag to turn on/off this optimization: `spark.sql.optimizer.optimizeOneRowRelationSubquery` (default: True).

For example:
```sql
select (select c1) from t
```
Analyzed plan:
```
Project [scalar-subquery#17 [c1#18] AS scalarsubquery(c1)#22]
:  +- Project [outer(c1#18)]
:     +- OneRowRelation
+- LocalRelation [c1#18, c2#19]
```

Optimized plan before this PR:
```
Project [c1#18#25 AS scalarsubquery(c1)#22]
+- Join LeftOuter, (c1#24 <=> c1#18)
   :- LocalRelation [c1#18]
   +- Aggregate [c1#18], [c1#18 AS c1#18#25, c1#18 AS c1#24]
      +- LocalRelation [c1#18]
```

Optimized plan after this PR:
```
LocalRelation [scalarsubquery(c1)#22]
```

### Why are the changes needed?
To optimize query plans.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Added new unit tests.

Closes #33284 from allisonwang-db/spark-36063-optimize-subquery-one-row-relation.

Authored-by: allisonwang-db <allison.wang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit de8e4be92c)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2021-07-22 10:48:48 +08:00
..
benchmarks [SPARK-34981][SQL][FOLLOWUP] Use SpecificInternalRow in ApplyFunctionExpression 2021-05-24 17:25:24 +09:00
src [SPARK-36063][SQL] Optimize OneRowRelation subqueries 2021-07-22 10:48:48 +08:00
pom.xml [SPARK-35784][SS] Implementation for RocksDB instance 2021-06-29 17:46:45 -07:00