spark-instrumented-optimizer/sql/core/benchmarks/JsonBenchmark-jdk11-results.txt
HyukjinKwon ebf01ec3c1 [SPARK-34950][TESTS] Update benchmark results to the ones created by GitHub Actions machines
### What changes were proposed in this pull request?

https://github.com/apache/spark/pull/32015 added a way to run benchmarks much more easily in the same GitHub Actions build. This PR updates the benchmark results by using the way.

**NOTE** that looks like GitHub Actions use four types of CPU given my observations:

- Intel(R) Xeon(R) Platinum 8171M CPU  2.60GHz
- Intel(R) Xeon(R) CPU E5-2673 v4  2.30GHz
- Intel(R) Xeon(R) CPU E5-2673 v3  2.40GHz
- Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz

Given my quick research, seems like they perform roughly similarly:

![Screen Shot 2021-04-03 at 9 31 23 PM](https://user-images.githubusercontent.com/6477701/113478478-f4b57b80-94c3-11eb-9047-f81ca8c59672.png)

I couldn't find enough information about Intel(R) Xeon(R) Platinum 8272CL CPU  2.60GHz but the performance seems roughly similar given the numbers.

So shouldn't be a big deal especially given that this way is much easier, encourages contributors to run more and guarantee the same number of cores and same memory with the same softwares.

### Why are the changes needed?

To have a base line of the benchmarks accordingly.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

It was generated from:

- [Run benchmarks: * (JDK 11)](https://github.com/HyukjinKwon/spark/actions/runs/713575465)
- [Run benchmarks: * (JDK 8)](https://github.com/HyukjinKwon/spark/actions/runs/713154337)

Closes #32044 from HyukjinKwon/SPARK-34950.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
2021-04-03 23:02:56 +03:00

121 lines
10 KiB
Plaintext

================================================================================================
Benchmark for performance of JSON parsing
================================================================================================
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
JSON schema inferring: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 4180 4300 122 1.2 836.1 1.0X
UTF-8 is set 5506 5566 70 0.9 1101.3 0.8X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
count a short column: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 2878 2926 58 1.7 575.6 1.0X
UTF-8 is set 4189 4239 43 1.2 837.8 0.7X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
count a wide column: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 6729 6876 128 0.1 6728.7 1.0X
UTF-8 is set 10313 10402 126 0.1 10312.6 0.7X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
select wide row: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 15375 15551 201 0.0 307498.9 1.0X
UTF-8 is set 18257 18476 190 0.0 365135.8 0.8X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Select a subset of 10 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns 2664 2673 11 0.4 2664.2 1.0X
Select 1 column 2335 2353 16 0.4 2335.3 1.1X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
creation of JSON parser per line: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Short column without encoding 845 852 7 1.2 845.0 1.0X
Short column with UTF-8 1149 1161 12 0.9 1148.8 0.7X
Wide column without encoding 9971 9991 29 0.1 9971.1 0.1X
Wide column with UTF-8 14047 14059 14 0.1 14047.3 0.1X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
JSON functions: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Text read 90 91 1 11.1 90.4 1.0X
from_json 2265 2291 25 0.4 2265.3 0.0X
json_tuple 2585 2607 36 0.4 2584.7 0.0X
get_json_object 2381 2388 10 0.4 2381.0 0.0X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Dataset of json strings: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Text read 397 399 2 12.6 79.4 1.0X
schema inferring 3722 3770 43 1.3 744.4 0.1X
parsing 3265 3282 21 1.5 653.0 0.1X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Json files in the per-line mode: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Text read 1030 1037 9 4.9 206.0 1.0X
Schema inferring 4515 4560 78 1.1 902.9 0.2X
Parsing without charset 3714 3772 64 1.3 742.7 0.3X
Parsing with UTF-8 5370 5476 97 0.9 1074.1 0.2X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Write dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps 174 178 5 5.7 174.4 1.0X
to_json(timestamp) 1354 1368 12 0.7 1353.8 0.1X
write timestamps to files 1215 1226 16 0.8 1214.5 0.1X
Create a dataset of dates 184 188 5 5.4 184.0 0.9X
to_json(date) 898 922 24 1.1 898.5 0.2X
write dates to files 708 716 10 1.4 708.1 0.2X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Read dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
read timestamp text from files 265 285 23 3.8 265.0 1.0X
read timestamps from files 3107 3132 23 0.3 3107.1 0.1X
infer timestamps from files 6316 6365 43 0.2 6315.5 0.0X
read date text from files 241 259 19 4.2 240.6 1.1X
read date from files 1259 1278 20 0.8 1259.4 0.2X
timestamp strings 290 293 4 3.4 290.3 0.9X
parse timestamps from Dataset[String] 3324 3359 34 0.3 3324.4 0.1X
infer timestamps from Dataset[String] 6868 6979 113 0.1 6867.7 0.0X
date strings 380 384 7 2.6 379.6 0.7X
parse dates from Dataset[String] 1650 1672 20 0.6 1649.8 0.2X
from_json(timestamp) 4944 4969 33 0.2 4943.7 0.1X
from_json(date) 3188 3251 57 0.3 3188.0 0.1X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Filters pushdown: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters 24601 24817 219 0.0 246012.5 1.0X
pushdown disabled 24029 24183 137 0.0 240289.2 1.0X
w/ filters 782 794 12 0.1 7822.7 31.4X