spark-instrumented-optimizer/sql/core/benchmarks/V2FunctionBenchmark-results.txt
Chao Sun 78221bda95 [SPARK-35361][SQL] Improve performance for ApplyFunctionExpression
### What changes were proposed in this pull request?

In `ApplyFunctionExpression`, move the `zipWithIndex` call out of the per-row loop so that it runs once instead of once for each input row.
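
To make the pattern concrete, here is a minimal sketch of hoisting `zipWithIndex` out of a per-row loop. It is not the actual Spark code: `HoistZipWithIndex`, `Row`, `Expr`, `evalPerRow`, and `evalHoisted` are hypothetical names, and the "function" is simply the sum of the arguments.

```scala
// Hypothetical stand-ins for illustration only; not Spark's actual classes.
object HoistZipWithIndex {
  type Row = Array[Long]
  // A child expression reads one value out of the input row.
  type Expr = Row => Long

  // Before: zipWithIndex runs for every row, so a fresh collection of
  // (expression, index) pairs is allocated per row.
  def evalPerRow(children: Seq[Expr], rows: Seq[Row]): Seq[Long] =
    rows.map { row =>
      val args = new Array[Long](children.length)
      children.zipWithIndex.foreach { case (child, i) => args(i) = child(row) }
      args.sum // trivial function body: add the arguments
    }

  // After: the pairs are built once and reused for every row.
  def evalHoisted(children: Seq[Expr], rows: Seq[Row]): Seq[Long] = {
    val indexed = children.zipWithIndex
    rows.map { row =>
      val args = new Array[Long](children.length)
      indexed.foreach { case (child, i) => args(i) = child(row) }
      args.sum
    }
  }
}
```

Both methods return the same results; only the number of `zipWithIndex` calls differs, which matters most when the function body itself is cheap.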

### Why are the changes needed?

When the `ScalarFunction` is trivial, `zipWithIndex` could incur significant costs, as shown below:

<img width="899" alt="Screen Shot 2021-05-11 at 10 03 42 AM" src="https://user-images.githubusercontent.com/506679/117866421-fb19de80-b24b-11eb-8c94-d5e8c8b1eda9.png">

By moving it out of the loop, I sometimes see up to a 2x speedup in `V2FunctionBenchmark`. For instance:

Before:
```
scalar function (long + long) -> long, result_nullable = false codegen = false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
native_long_add                                                                         32437          32896         434         15.4          64.9       1.0X
java_long_add_default                                                                   85675          97045         NaN          5.8         171.3       0.4X
```

After:
```
scalar function (long + long) -> long, result_nullable = false codegen = false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
native_long_add                                                                         30182          30387         279         16.6          60.4       1.0X
java_long_add_default                                                                   42862          43009         209         11.7          85.7       0.7X
```
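
As a quick check of the "2x" figure, the sketch below recomputes the speedup for `java_long_add_default` from the best times quoted above. The `Speedup` object and its method names are hypothetical, and the assumption that the `Relative` column is the baseline best time divided by the case's best time is inferred from the numbers rather than stated in the source.

```scala
// Hypothetical helper for illustration; times are copied from the tables above.
object Speedup {
  // Ratio of the old best time to the new best time.
  def speedup(beforeMs: Double, afterMs: Double): Double = beforeMs / afterMs

  // Assumed definition of the Relative column: baseline best / case best,
  // rounded to one decimal place.
  def relative(baselineMs: Double, caseMs: Double): Double =
    math.round(baselineMs / caseMs * 10) / 10.0
}

object SpeedupDemo extends App {
  // java_long_add_default: 85675 ms before, 42862 ms after.
  println(f"speedup: ${Speedup.speedup(85675, 42862)}%.2f")
  // Relative to native_long_add in each run.
  println(s"relative before: ${Speedup.relative(32437, 85675)}")
  println(s"relative after:  ${Speedup.relative(30182, 42862)}")
}
```

Under that assumption the computed values agree with the `0.4X` and `0.7X` entries in the two tables.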

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests

Closes #32507 from sunchao/SPARK-35361.

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-05-12 10:16:35 +09:00


OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
scalar function (long + long) -> long, result_nullable = true codegen = true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------------------------------------------
native_long_add                                                                        9723          11619        1643         51.4          19.4       1.0X
java_long_add_default                                                                 38003          38591         513         13.2          76.0       0.3X
java_long_add_magic                                                                   12398          13007         792         40.3          24.8       0.8X
java_long_add_static_magic                                                            11551          11711         138         43.3          23.1       0.8X
scala_long_add_default                                                                39482          39762         275         12.7          79.0       0.2X
scala_long_add_magic                                                                  12794          12830          33         39.1          25.6       0.8X

OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
scalar function (long + long) -> long, result_nullable = false codegen = true:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------------------------------------
native_long_add                                                                         9984          10285         303         50.1          20.0       1.0X
java_long_add_default                                                                  36510          36989         570         13.7          73.0       0.3X
java_long_add_magic                                                                    13391          13764         332         37.3          26.8       0.7X
java_long_add_static_magic                                                             10033          10462         388         49.8          20.1       1.0X
scala_long_add_default                                                                 35104          35480         375         14.2          70.2       0.3X
scala_long_add_magic                                                                   13587          13899         366         36.8          27.2       0.7X

OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
scalar function (long + long) -> long, result_nullable = true codegen = false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------------------------------------------------------------------------
native_long_add                                                                        32473          32622         247         15.4          64.9       1.0X
java_long_add_default                                                                  44108          44120          11         11.3          88.2       0.7X
java_long_add_magic                                                                   166139         167629        1828          3.0         332.3       0.2X
java_long_add_static_magic                                                            181452         183355        1668          2.8         362.9       0.2X
scala_long_add_default                                                                 42405          42652         330         11.8          84.8       0.8X
scala_long_add_magic                                                                  196868         198003        1033          2.5         393.7       0.2X

OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
scalar function (long + long) -> long, result_nullable = false codegen = false:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------
native_long_add                                                                         30182          30387         279         16.6          60.4       1.0X
java_long_add_default                                                                   42862          43009         209         11.7          85.7       0.7X
java_long_add_magic                                                                    218295         219387        1078          2.3         436.6       0.1X
java_long_add_static_magic                                                             211812         213150        1898          2.4         423.6       0.1X
scala_long_add_default                                                                  42401          42642         234         11.8          84.8       0.7X
scala_long_add_magic                                                                   214497         214760         307          2.3         429.0       0.1X