spark-instrumented-optimizer/sql/core/benchmarks/DataSourceReadBenchmark-jdk11-results.txt
HyukjinKwon ebf01ec3c1 [SPARK-34950][TESTS] Update benchmark results to the ones created by GitHub Actions machines
### What changes were proposed in this pull request?

https://github.com/apache/spark/pull/32015 added a way to run benchmarks much more easily in the same GitHub Actions build. This PR updates the benchmark results by using the way.

**NOTE** that looks like GitHub Actions use four types of CPU given my observations:

- Intel(R) Xeon(R) Platinum 8171M CPU  2.60GHz
- Intel(R) Xeon(R) CPU E5-2673 v4  2.30GHz
- Intel(R) Xeon(R) CPU E5-2673 v3  2.40GHz
- Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz

Given my quick research, seems like they perform roughly similarly:

![Screen Shot 2021-04-03 at 9 31 23 PM](https://user-images.githubusercontent.com/6477701/113478478-f4b57b80-94c3-11eb-9047-f81ca8c59672.png)

I couldn't find enough information about Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz but the performance seems roughly similar given the numbers.

So shouldn't be a big deal especially given that this way is much easier, encourages contributors to run more and guarantee the same number of cores and same memory with the same softwares.

### Why are the changes needed?

To have a base line of the benchmarks accordingly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

It was generated from:

- [Run benchmarks: * (JDK 11)](https://github.com/HyukjinKwon/spark/actions/runs/713575465)
- [Run benchmarks: * (JDK 8)](https://github.com/HyukjinKwon/spark/actions/runs/713154337)

Closes #32044 from HyukjinKwon/SPARK-34950.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
2021-04-03 23:02:56 +03:00

253 lines
22 KiB
Plaintext

================================================================================================
SQL Single Numeric Column Scan
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
SQL Single TINYINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 13405 13422 24 1.2 852.3 1.0X
SQL Json 10723 10788 92 1.5 681.7 1.3X
SQL Parquet Vectorized 164 217 50 95.9 10.4 81.8X
SQL Parquet MR 2349 2440 129 6.7 149.3 5.7X
SQL ORC Vectorized 312 346 23 50.4 19.8 43.0X
SQL ORC MR 1610 1659 69 9.8 102.4 8.3X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Parquet Reader Single TINYINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------------------------------------
ParquetReader Vectorized 187 209 20 84.3 11.9 1.0X
ParquetReader Vectorized -> Row 89 95 5 177.6 5.6 2.1X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
SQL Single SMALLINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 14214 14549 474 1.1 903.7 1.0X
SQL Json 11866 11934 95 1.3 754.4 1.2X
SQL Parquet Vectorized 294 342 53 53.6 18.7 48.4X
SQL Parquet MR 2929 3004 107 5.4 186.2 4.9X
SQL ORC Vectorized 312 328 15 50.4 19.8 45.5X
SQL ORC MR 2037 2097 84 7.7 129.5 7.0X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Parquet Reader Single SMALLINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
--------------------------------------------------------------------------------------------------------------------------
ParquetReader Vectorized 249 266 18 63.1 15.8 1.0X
ParquetReader Vectorized -> Row 192 247 36 82.1 12.2 1.3X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
SQL Single INT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 15502 15817 446 1.0 985.6 1.0X
SQL Json 12638 12646 11 1.2 803.5 1.2X
SQL Parquet Vectorized 193 256 44 81.7 12.2 80.5X
SQL Parquet MR 2943 2953 14 5.3 187.1 5.3X
SQL ORC Vectorized 324 370 34 48.5 20.6 47.8X
SQL ORC MR 2110 2163 75 7.5 134.1 7.3X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Parquet Reader Single INT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
ParquetReader Vectorized 276 287 14 57.0 17.6 1.0X
ParquetReader Vectorized -> Row 309 320 9 50.9 19.6 0.9X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
SQL Single BIGINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 20156 20694 761 0.8 1281.5 1.0X
SQL Json 15228 15380 214 1.0 968.2 1.3X
SQL Parquet Vectorized 325 346 20 48.4 20.7 62.0X
SQL Parquet MR 3144 3228 118 5.0 199.9 6.4X
SQL ORC Vectorized 516 526 7 30.5 32.8 39.0X
SQL ORC MR 2353 2367 19 6.7 149.6 8.6X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Parquet Reader Single BIGINT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
ParquetReader Vectorized 372 396 24 42.3 23.6 1.0X
ParquetReader Vectorized -> Row 437 462 25 36.0 27.8 0.9X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
SQL Single FLOAT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 17413 17599 263 0.9 1107.1 1.0X
SQL Json 14416 14453 53 1.1 916.5 1.2X
SQL Parquet Vectorized 181 225 35 86.8 11.5 96.1X
SQL Parquet MR 2940 2996 78 5.3 186.9 5.9X
SQL ORC Vectorized 470 494 29 33.5 29.9 37.1X
SQL ORC MR 2351 2379 39 6.7 149.5 7.4X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Parquet Reader Single FLOAT Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
ParquetReader Vectorized 268 282 14 58.7 17.0 1.0X
ParquetReader Vectorized -> Row 298 321 18 52.8 18.9 0.9X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
SQL Single DOUBLE Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 21666 21697 43 0.7 1377.5 1.0X
SQL Json 18307 18363 79 0.9 1163.9 1.2X
SQL Parquet Vectorized 310 337 22 50.7 19.7 69.9X
SQL Parquet MR 3089 3103 19 5.1 196.4 7.0X
SQL ORC Vectorized 589 617 31 26.7 37.5 36.8X
SQL ORC MR 2307 2377 98 6.8 146.7 9.4X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Parquet Reader Single DOUBLE Column Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
ParquetReader Vectorized 400 415 18 39.3 25.4 1.0X
ParquetReader Vectorized -> Row 393 406 11 40.1 25.0 1.0X
================================================================================================
Int and String Scan
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Int and String Scan: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 17703 17719 22 0.6 1688.3 1.0X
SQL Json 13095 13168 103 0.8 1248.9 1.4X
SQL Parquet Vectorized 2253 2266 19 4.7 214.8 7.9X
SQL Parquet MR 4913 4977 91 2.1 468.5 3.6X
SQL ORC Vectorized 2457 2467 14 4.3 234.3 7.2X
SQL ORC MR 4433 4464 44 2.4 422.8 4.0X
================================================================================================
Repeated String Scan
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Repeated String: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 9741 9804 89 1.1 929.0 1.0X
SQL Json 8230 8401 241 1.3 784.9 1.2X
SQL Parquet Vectorized 618 650 31 17.0 58.9 15.8X
SQL Parquet MR 2258 2311 75 4.6 215.4 4.3X
SQL ORC Vectorized 608 629 15 17.3 58.0 16.0X
SQL ORC MR 2466 2479 18 4.3 235.2 4.0X
================================================================================================
Partitioned Table Scan
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Partitioned Table: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Data column - CSV 24195 24573 534 0.7 1538.3 1.0X
Data column - Json 14746 14883 194 1.1 937.5 1.6X
Data column - Parquet Vectorized 352 385 34 44.7 22.4 68.7X
Data column - Parquet MR 3674 3694 27 4.3 233.6 6.6X
Data column - ORC Vectorized 480 505 26 32.8 30.5 50.4X
Data column - ORC MR 2913 3004 128 5.4 185.2 8.3X
Partition column - CSV 7527 7544 23 2.1 478.6 3.2X
Partition column - Json 11955 12051 135 1.3 760.1 2.0X
Partition column - Parquet Vectorized 65 92 29 242.5 4.1 373.0X
Partition column - Parquet MR 1614 1628 21 9.7 102.6 15.0X
Partition column - ORC Vectorized 71 99 29 220.1 4.5 338.5X
Partition column - ORC MR 1761 1769 11 8.9 112.0 13.7X
Both columns - CSV 24077 24127 70 0.7 1530.8 1.0X
Both columns - Json 15286 15479 273 1.0 971.9 1.6X
Both columns - Parquet Vectorized 376 412 40 41.9 23.9 64.4X
Both columns - Parquet MR 3808 3826 26 4.1 242.1 6.4X
Both columns - ORC Vectorized 560 604 42 28.1 35.6 43.2X
Both columns - ORC MR 3046 3080 49 5.2 193.7 7.9X
================================================================================================
String with Nulls Scan
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
String with Nulls Scan (0.0%): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 11805 12021 306 0.9 1125.8 1.0X
SQL Json 12051 12105 77 0.9 1149.3 1.0X
SQL Parquet Vectorized 1474 1545 100 7.1 140.6 8.0X
SQL Parquet MR 4488 4492 4 2.3 428.1 2.6X
ParquetReader Vectorized 1140 1140 1 9.2 108.7 10.4X
SQL ORC Vectorized 1164 1178 20 9.0 111.0 10.1X
SQL ORC MR 3745 3817 102 2.8 357.1 3.2X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
String with Nulls Scan (50.0%): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 9814 9837 33 1.1 936.0 1.0X
SQL Json 9317 9445 182 1.1 888.5 1.1X
SQL Parquet Vectorized 1117 1155 52 9.4 106.6 8.8X
SQL Parquet MR 3463 3538 106 3.0 330.3 2.8X
ParquetReader Vectorized 1033 1039 8 10.1 98.6 9.5X
SQL ORC Vectorized 1307 1353 65 8.0 124.7 7.5X
SQL ORC MR 3644 3690 65 2.9 347.5 2.7X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
String with Nulls Scan (95.0%): Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 8145 8270 176 1.3 776.8 1.0X
SQL Json 5714 5764 71 1.8 544.9 1.4X
SQL Parquet Vectorized 235 264 15 44.6 22.4 34.7X
SQL Parquet MR 2398 2412 19 4.4 228.7 3.4X
ParquetReader Vectorized 248 262 11 42.3 23.6 32.9X
SQL ORC Vectorized 430 462 37 24.4 41.0 18.9X
SQL ORC MR 1983 1993 14 5.3 189.1 4.1X
================================================================================================
Single Column Scan From Wide Columns
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Single Column Scan from 10 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 2448 2461 18 0.4 2334.3 1.0X
SQL Json 3332 3370 53 0.3 3177.6 0.7X
SQL Parquet Vectorized 51 87 25 20.7 48.2 48.4X
SQL Parquet MR 239 278 35 4.4 227.5 10.3X
SQL ORC Vectorized 60 82 19 17.5 57.3 40.8X
SQL ORC MR 197 219 26 5.3 188.3 12.4X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Single Column Scan from 50 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 6034 6061 39 0.2 5754.0 1.0X
SQL Json 12232 12315 118 0.1 11665.4 0.5X
SQL Parquet Vectorized 73 120 30 14.4 69.6 82.6X
SQL Parquet MR 316 368 44 3.3 301.1 19.1X
SQL ORC Vectorized 76 122 36 13.7 72.9 79.0X
SQL ORC MR 206 261 47 5.1 196.5 29.3X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Single Column Scan from 100 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
SQL CSV 10307 10309 4 0.1 9829.0 1.0X
SQL Json 23412 23539 180 0.0 22327.7 0.4X
SQL Parquet Vectorized 105 151 23 10.0 99.9 98.4X
SQL Parquet MR 295 325 29 3.6 281.5 34.9X
SQL ORC Vectorized 85 112 31 12.4 81.0 121.4X
SQL ORC MR 212 255 66 4.9 202.3 48.6X