9ec8696f11
### What changes were proposed in this pull request?

When we do self join with transform in a CTE, spark will throw AnalysisException.

A simple way to reproduce is

```
create temporary view t as select * from values 0, 1, 2 as t(a);

WITH temp AS (
  SELECT TRANSFORM(a) USING 'cat' AS (b string) FROM t
)
SELECT t1.b FROM temp t1 JOIN temp t2 ON t1.b = t2.b
```

before this patch, it throws

```
org.apache.spark.sql.AnalysisException: cannot resolve '`t1.b`' given input columns: [t1.b]; line 6 pos 41;
'Project ['t1.b]
+- 'Join Inner, ('t1.b = 't2.b)
   :- SubqueryAlias t1
   :  +- SubqueryAlias temp
   :     +- ScriptTransformation [a#1], cat, [b#2], ScriptInputOutputSchema(List(),List(),Some(org.apache.hadoop.hive.serde2.DelimitedJSONSerDe),Some(org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe),List((field.delim, )),List((field.delim, )),Some(org.apache.hadoop.hive.ql.exec.TextRecordReader),Some(org.apache.hadoop.hive.ql.exec.TextRecordWriter),false)
   :        +- SubqueryAlias t
   :           +- Project [a#1]
   :              +- SubqueryAlias t
   :                 +- LocalRelation [a#1]
   +- SubqueryAlias t2
      +- SubqueryAlias temp
         +- ScriptTransformation [a#1], cat, [b#2], ScriptInputOutputSchema(List(),List(),Some(org.apache.hadoop.hive.serde2.DelimitedJSONSerDe),Some(org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe),List((field.delim, )),List((field.delim, )),Some(org.apache.hadoop.hive.ql.exec.TextRecordReader),Some(org.apache.hadoop.hive.ql.exec.TextRecordWriter),false)
            +- SubqueryAlias t
               +- Project [a#1]
                  +- SubqueryAlias t
                     +- LocalRelation [a#1]
```

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

Add a UT

Closes #31752 from WangGuangxin/selfjoin-with-transform.

Authored-by: wangguangxin.cn <wangguangxin.cn@bytedance.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

| Name |
|---|
| benchmarks |
| src |
| pom.xml |