spark-instrumented-optimizer/sql/core/benchmarks/DatasetBenchmark-jdk11-results.txt
HyukjinKwon ebf01ec3c1 [SPARK-34950][TESTS] Update benchmark results to the ones created by GitHub Actions machines
### What changes were proposed in this pull request?

https://github.com/apache/spark/pull/32015 added a way to run benchmarks much more easily in the same GitHub Actions build. This PR updates the benchmark results by using the way.

**NOTE** that looks like GitHub Actions use four types of CPU given my observations:

- Intel(R) Xeon(R) Platinum 8171M CPU  2.60GHz
- Intel(R) Xeon(R) CPU E5-2673 v4  2.30GHz
- Intel(R) Xeon(R) CPU E5-2673 v3  2.40GHz
- Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz

Given my quick research, seems like they perform roughly similarly:

![Screen Shot 2021-04-03 at 9 31 23 PM](https://user-images.githubusercontent.com/6477701/113478478-f4b57b80-94c3-11eb-9047-f81ca8c59672.png)

I couldn't find enough information about Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz but the performance seems roughly similar given the numbers.

So shouldn't be a big deal especially given that this way is much easier, encourages contributors to run more and guarantee the same number of cores and same memory with the same softwares.

### Why are the changes needed?

To have a base line of the benchmarks accordingly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

It was generated from:

- [Run benchmarks: * (JDK 11)](https://github.com/HyukjinKwon/spark/actions/runs/713575465)
- [Run benchmarks: * (JDK 8)](https://github.com/HyukjinKwon/spark/actions/runs/713154337)

Closes #32044 from HyukjinKwon/SPARK-34950.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
2021-04-03 23:02:56 +03:00

47 lines
3.8 KiB
Plaintext

================================================================================================
Dataset Benchmark
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
back-to-back map long: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 13660 13836 249 7.3 136.6 1.0X
DataFrame 2103 2125 30 47.5 21.0 6.5X
Dataset 2899 2910 16 34.5 29.0 4.7X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
back-to-back map: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 14939 14940 2 6.7 149.4 1.0X
DataFrame 5377 5529 216 18.6 53.8 2.8X
Dataset 15861 15923 88 6.3 158.6 0.9X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
back-to-back filter Long: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 3803 3842 56 26.3 38.0 1.0X
DataFrame 1359 1369 14 73.6 13.6 2.8X
Dataset 3667 3668 1 27.3 36.7 1.0X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
back-to-back filter: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD 4572 4595 33 21.9 45.7 1.0X
DataFrame 212 261 45 471.6 2.1 21.6X
Dataset 5629 5776 208 17.8 56.3 0.8X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
aggregate: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
RDD sum 3528 3563 50 28.3 35.3 1.0X
DataFrame sum 81 111 23 1240.3 0.8 43.8X
Dataset sum using Aggregator 5140 5164 34 19.5 51.4 0.7X
Dataset complex Aggregator 9815 9921 150 10.2 98.1 0.4X