9643eab53e
### What changes were proposed in this pull request?

This patch adds support for subexpression elimination in interpreted predicates.

### Why are the changes needed?

As with interpreted projection, there are cases where the codegen predicate cannot be used, e.g. a schema that is too complex or a non-codegen expression. When the same expressions (subexpressions) occur repeatedly within a predicate expression, performance is poor because each occurrence is recomputed. Interpreted predicates should support subexpression elimination, as interpreted projection already does.

### Does this PR introduce _any_ user-facing change?

No, this doesn't change user behavior.

### How was this patch tested?

Unit test and benchmark.

Closes #30497 from viirya/SPARK-33540.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
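To illustrate why recomputing a shared subexpression is costly in an interpreted predicate, here is a minimal Python sketch. It is not Spark's implementation: the `Expr` node, the `evaluate` interpreter, the operator names, and the `CALLS` counter are all hypothetical and exist only for this example. The idea shown is a per-row cache keyed by structurally equal expression nodes, so a costly node (a JSON parse standing in for `from_json`) is evaluated once per row even when the predicate refers to it several times.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass(frozen=True)
class Expr:
    """Hypothetical immutable expression node; equal structure means equal hash."""
    op: str                      # one of: col, lit, parse, field, gt, and
    args: Tuple[Any, ...] = ()


CALLS = {"parse": 0}  # counts how often the costly operator actually runs


def evaluate(e: Expr, row: Dict[str, str], cache: Optional[dict] = None) -> Any:
    """Interpret `e` against `row`; reuse earlier results when `cache` is given."""
    if cache is not None and e in cache:
        return cache[e]
    if e.op == "col":
        out = row[e.args[0]]
    elif e.op == "lit":
        out = e.args[0]
    elif e.op == "parse":        # stand-in for an expensive expression
        CALLS["parse"] += 1
        out = json.loads(evaluate(e.args[0], row, cache))
    elif e.op == "field":
        out = evaluate(e.args[0], row, cache)[e.args[1]]
    elif e.op == "gt":
        out = evaluate(e.args[0], row, cache) > evaluate(e.args[1], row, cache)
    elif e.op == "and":
        out = all(evaluate(a, row, cache) for a in e.args)
    else:
        raise ValueError(f"unknown operator: {e.op}")
    if cache is not None:
        cache[e] = out
    return out


# Predicate: parse(json).a > 0 AND parse(json).b > 0 (the parse node is shared)
parsed = Expr("parse", (Expr("col", ("json",)),))
pred = Expr("and", (
    Expr("gt", (Expr("field", (parsed, "a")), Expr("lit", (0,)))),
    Expr("gt", (Expr("field", (parsed, "b")), Expr("lit", (0,)))),
))
row = {"json": '{"a": 1, "b": 2}'}


def count_parses(use_cache: bool) -> Tuple[bool, int]:
    """Evaluate the predicate once and report (result, number of parses)."""
    CALLS["parse"] = 0
    result = evaluate(pred, row, {} if use_cache else None)  # fresh cache per row
    return result, CALLS["parse"]


print(count_parses(use_cache=False))  # (True, 2): the parse runs for each reference
print(count_parses(use_cache=True))   # (True, 1): the parse runs once per row
```

The cache must be created fresh for every input row; reusing it across rows would return stale results.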
26 lines
1.9 KiB
Plaintext
================================================================================================
Benchmark for performance of subexpression elimination
================================================================================================

Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.9+11 on Mac OS X 10.15.6
Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
from_json as subExpr in Project:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
subExprElimination false, codegen: true           24827          25398         562          0.0   248271027.2       1.0X
subExprElimination false, codegen: false          25052          25704         625          0.0   250518603.6       1.0X
subExprElimination true, codegen: true             1540           1606          92          0.0    15403083.7      16.1X
subExprElimination true, codegen: false            1487           1535          53          0.0    14865051.6      16.7X

Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.9+11 on Mac OS X 10.15.6
Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
from_json as subExpr in Filter:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
subexpressionElimination off, codegen on          37327          38261         809          0.0   373266387.0       1.0X
subexpressionElimination off, codegen on          36126          37445        1575          0.0   361263987.0       1.0X
subexpressionElimination off, codegen on          20152          21596        1263          0.0   201522903.8       1.9X
subexpressionElimination off, codegen on          20799          20940         233          0.0   207993923.0       1.8X

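The Relative column in both tables is consistent with dividing the first row's best time by each row's best time and printing one decimal place. The short Python check below reproduces the reported values under that assumption; the `relative` helper is hypothetical and is not part of the benchmark code.

```python
def relative(baseline_best_ms: int, case_best_ms: int) -> str:
    """Speedup of a case over the baseline (first row), formatted like the tables."""
    return f"{baseline_best_ms / case_best_ms:.1f}X"


# Best Time(ms) values copied from the tables; the first entry is the baseline.
project_best_ms = [24827, 25052, 1540, 1487]
filter_best_ms = [37327, 36126, 20152, 20799]

print([relative(project_best_ms[0], t) for t in project_best_ms])
# ['1.0X', '1.0X', '16.1X', '16.7X']
print([relative(filter_best_ms[0], t) for t in filter_best_ms])
# ['1.0X', '1.0X', '1.9X', '1.8X']
```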