0342dcb628
### What changes were proposed in this pull request?

This patch implements the `canonicalized` method for `HigherOrderFunction`. It canonicalizes the names and `ExprId`s of all `NamedLambdaVariable`s.

The name and `ExprId` of each `NamedLambdaVariable` are unique, but they can be canonicalized so that `HigherOrderFunction`s can be compared for semantic equality.

### Why are the changes needed?

The default `canonicalized` method does not work for `HigherOrderFunction`. As a result, subexpression elimination does not work for higher-order functions.

The generated code was checked manually for:

```scala
val df = Seq(Seq(1, 2, 3)).toDF("a")
df.select(transform($"a", x => x + 1), transform($"a", x => x + 1)).collect()
```

The following code was generated by `GenerateUnsafeProjection` for `transform(input[0, array<int>, true], lambdafunction((lambda x_20#19041 + 1), lambda x_20#19041, false)),transform(input[0, array<int>, true], lambdafunction((lambda x_21#19042 + 1), lambda x_21#19042, false))`.

Before:

```java
/* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
...
/* 028 */   public UnsafeRow apply(InternalRow i) {
...
/* 034 */     Object obj_0 = ((Expression) references[0]).eval(i);
...
/* 062 */     Object obj_1 = ((Expression) references[1]).eval(i);
...
/* 093 */ }
```

After:

```java
/* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
...
/* 031 */   public UnsafeRow apply(InternalRow i) {
...
/* 033 */     subExpr_0(i);
...
/* 086 */   private void subExpr_0(InternalRow i) {
/* 087 */     Object obj_0 = ((Expression) references[0]).eval(i);
/* 088 */     boolean isNull_0 = obj_0 == null;
/* 089 */     ArrayData value_0 = null;
/* 090 */     if (!isNull_0) {
/* 091 */       value_0 = (ArrayData) obj_0;
/* 092 */     }
/* 093 */     subExprIsNull_0 = isNull_0;
/* 094 */     mutableStateArray_0[0] = value_0;
/* 095 */   }
/* 096 */
/* 097 */ }
```

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit test and manual check of the generated code.

Closes #32735 from viirya/higher-func-canonicalize.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
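
To illustrate why canonicalizing lambda variable names and ids makes two otherwise identical higher-order functions compare equal, here is a minimal, self-contained sketch. It is a toy model and not Spark's implementation: `Expr`, `LambdaVar`, `Lambda`, `Transform`, and `Canonicalize` are hypothetical stand-ins for the Catalyst classes, and the positional `arg_N` renaming scheme is an assumption chosen for illustration.

```scala
// Toy expression tree; every type here is hypothetical, not a Catalyst class.
sealed trait Expr
case class Column(name: String) extends Expr
case class Literal(value: Int) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
// Like NamedLambdaVariable, each lambda variable has a unique name and id.
case class LambdaVar(name: String, id: Long) extends Expr
case class Lambda(body: Expr, args: Seq[LambdaVar]) extends Expr
case class Transform(input: Expr, function: Lambda) extends Expr

object Canonicalize {
  // Rewrite every bound lambda variable to a positional placeholder.
  def apply(e: Expr): Expr = rewrite(e, Map.empty)

  private def rewrite(e: Expr, env: Map[Long, LambdaVar]): Expr = e match {
    case v: LambdaVar     => env.getOrElse(v.id, v) // unbound variables stay as-is
    case Add(l, r)        => Add(rewrite(l, env), rewrite(r, env))
    case f: Lambda        => rewriteLambda(f, env)
    case Transform(in, f) => Transform(rewrite(in, env), rewriteLambda(f, env))
    case other            => other
  }

  private def rewriteLambda(f: Lambda, env: Map[Long, LambdaVar]): Lambda = {
    // Number the arguments by binding position, continuing after enclosing lambdas.
    val renamed = f.args.zipWithIndex.map { case (arg, i) =>
      val slot = env.size + i
      arg.id -> LambdaVar(s"arg_$slot", slot.toLong)
    }
    Lambda(rewrite(f.body, env ++ renamed), renamed.map(_._2))
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // Two copies of transform(a, x => x + 1) with distinct variable names and ids.
    val x20 = LambdaVar("x_20", 19041L)
    val x21 = LambdaVar("x_21", 19042L)
    val first  = Transform(Column("a"), Lambda(Add(x20, Literal(1)), Seq(x20)))
    val second = Transform(Column("a"), Lambda(Add(x21, Literal(1)), Seq(x21)))

    println(first == second)                             // false
    println(Canonicalize(first) == Canonicalize(second)) // true
  }
}
```

In this sketch, structural equality on the raw trees fails only because of the variable names and ids. Once they are replaced by positional placeholders, the two trees are identical, which is the property that a subexpression-elimination pass keyed on canonical forms relies on.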

| Name |
|---|
| benchmarks |
| src |
| pom.xml |