spark-instrumented-optimizer/sql/core/benchmarks/DatasetBenchmark-results.txt
HyukjinKwon ebf01ec3c1 [SPARK-34950][TESTS] Update benchmark results to the ones created by GitHub Actions machines
### What changes were proposed in this pull request?

https://github.com/apache/spark/pull/32015 added a way to run benchmarks much more easily in the same GitHub Actions build. This PR updates the benchmark results by using the way.

**NOTE** that looks like GitHub Actions use four types of CPU given my observations:

- Intel(R) Xeon(R) Platinum 8171M CPU  2.60GHz
- Intel(R) Xeon(R) CPU E5-2673 v4  2.30GHz
- Intel(R) Xeon(R) CPU E5-2673 v3  2.40GHz
- Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz

Given my quick research, seems like they perform roughly similarly:

![Screen Shot 2021-04-03 at 9 31 23 PM](https://user-images.githubusercontent.com/6477701/113478478-f4b57b80-94c3-11eb-9047-f81ca8c59672.png)

I couldn't find enough information about Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz but the performance seems roughly similar given the numbers.

So shouldn't be a big deal especially given that this way is much easier, encourages contributors to run more and guarantee the same number of cores and same memory with the same softwares.

### Why are the changes needed?

To have a base line of the benchmarks accordingly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

It was generated from:

- [Run benchmarks: * (JDK 11)](https://github.com/HyukjinKwon/spark/actions/runs/713575465)
- [Run benchmarks: * (JDK 8)](https://github.com/HyukjinKwon/spark/actions/runs/713154337)

Closes #32044 from HyukjinKwon/SPARK-34950.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
2021-04-03 23:02:56 +03:00

47 lines
3.8 KiB
Plaintext

================================================================================================
Dataset Benchmark
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
back-to-back map long: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 12276 12399 175 8.1 122.8 1.0X
DataFrame 2017 2094 110 49.6 20.2 6.1X
Dataset 3034 3044 15 33.0 30.3 4.0X
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
back-to-back map: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 16325 16361 50 6.1 163.2 1.0X
DataFrame 8463 8468 6 11.8 84.6 1.9X
Dataset 22525 23091 801 4.4 225.2 0.7X
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
back-to-back filter Long: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 3133 3136 3 31.9 31.3 1.0X
DataFrame 1194 1535 482 83.8 11.9 2.6X
Dataset 3146 3156 13 31.8 31.5 1.0X
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
back-to-back filter: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 5334 5352 25 18.7 53.3 1.0X
DataFrame 190 221 21 527.1 1.9 28.1X
Dataset 10536 10630 133 9.5 105.4 0.5X
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
aggregate: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD sum 4970 5099 181 20.1 49.7 1.0X
DataFrame sum 67 81 9 1483.8 0.7 73.8X
Dataset sum using Aggregator 9474 9771 420 10.6 94.7 0.5X
Dataset complex Aggregator 13975 14701 1028 7.2 139.7 0.4X