spark-instrumented-optimizer/sql/core/benchmarks/DateTimeRebaseBenchmark-jdk11-results.txt
Max Gekk bbf2d6f6df [SPARK-33160][SQL][FOLLOWUP] Update benchmarks of INT96 type rebasing
### What changes were proposed in this pull request?
1. Turn the SQL config `spark.sql.legacy.parquet.int96RebaseModeInWrite`, which was added by https://github.com/apache/spark/pull/30056, off/on in `DateTimeRebaseBenchmark`. The Parquet readers should infer the correct rebasing mode automatically from the metadata.
2. Regenerate benchmark results of `DateTimeRebaseBenchmark` in the environment:

| Item | Description |
| ---- | ----|
| Region | us-west-2 (Oregon) |
| Instance | r3.xlarge (spot instance) |
| AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) |
| Java | OpenJDK8/11 installed by `sudo add-apt-repository ppa:openjdk-r/ppa` & `sudo apt install openjdk-11-jdk` |

### Why are the changes needed?
To have up-to-date info about the performance of INT96, which is the default type for Catalyst's timestamp type.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By updating benchmark results:
```
$ SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.DateTimeRebaseBenchmark"
```

Closes #30118 from MaxGekk/int96-rebase-benchmark.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-10-22 10:03:41 +09:00


================================================================================================
Rebasing dates/timestamps in Parquet datasource
================================================================================================
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Save DATE to parquet:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1582, noop                                  21041          21041           0          4.8         210.4       1.0X
before 1582, noop                                 11202          11202           0          8.9         112.0       1.9X
after 1582, rebase EXCEPTION                      32810          32810           0          3.0         328.1       0.6X
after 1582, rebase LEGACY                         32530          32530           0          3.1         325.3       0.6X
after 1582, rebase CORRECTED                      32849          32849           0          3.0         328.5       0.6X
before 1582, rebase LEGACY                        23537          23537           0          4.2         235.4       0.9X
before 1582, rebase CORRECTED                     22870          22870           0          4.4         228.7       0.9X
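
The derived columns in these tables can be cross-checked from `Best Time(ms)`. The following is a minimal Python sketch, assuming each case processes 100,000,000 rows (a value inferred from 21041 ms / 210.4 ns per row, not stated in this file) and that `Relative` is the first row's best time divided by the current row's best time. `derive` is a hypothetical helper name, not part of the Spark benchmark code.

```python
# Assumption: row count inferred from 21041 ms / 210.4 ns per row; it is not stated in the results.
ROWS = 100_000_000

def derive(best_ms, baseline_ms, rows=ROWS):
    """Return (Rate(M/s), Per Row(ns), Relative) as strings with one decimal place."""
    rate = rows / (best_ms / 1000.0) / 1e6   # million rows per second
    per_row_ns = best_ms * 1e6 / rows        # nanoseconds spent per row
    relative = baseline_ms / best_ms         # speed relative to the first row
    return f"{rate:.1f}", f"{per_row_ns:.1f}", f"{relative:.1f}X"

# Best times copied from the "Save DATE to parquet" rows.
print(derive(21041, 21041))  # after 1582, noop (baseline)
print(derive(11202, 21041))  # before 1582, noop
print(derive(32810, 21041))  # after 1582, rebase EXCEPTION
```

If the inferred row count is right, the printed values should match the `Rate(M/s)`, `Per Row(ns)` and `Relative` columns of the corresponding rows.
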
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Load DATE from parquet:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1582, vec off, rebase EXCEPTION             13114          13225         104          7.6         131.1       1.0X
after 1582, vec off, rebase LEGACY                13175          13189          15          7.6         131.8       1.0X
after 1582, vec off, rebase CORRECTED             13080          13115          34          7.6         130.8       1.0X
after 1582, vec on, rebase EXCEPTION               3698           3726          29         27.0          37.0       3.5X
after 1582, vec on, rebase LEGACY                  3730           3745          17         26.8          37.3       3.5X
after 1582, vec on, rebase CORRECTED               3714           3758          75         26.9          37.1       3.5X
before 1582, vec off, rebase LEGACY               13519          13575          63          7.4         135.2       1.0X
before 1582, vec off, rebase CORRECTED            13210          13309         108          7.6         132.1       1.0X
before 1582, vec on, rebase LEGACY                 4459           4488          44         22.4          44.6       2.9X
before 1582, vec on, rebase CORRECTED              3661           3718          88         27.3          36.6       3.6X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Save TIMESTAMP_INT96 to parquet:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, noop                                   2900           2900           0         34.5          29.0       1.0X
before 1900, noop                                  2848           2848           0         35.1          28.5       1.0X
after 1900, rebase EXCEPTION                      27623          27623           0          3.6         276.2       0.1X
after 1900, rebase LEGACY                         27305          27305           0          3.7         273.0       0.1X
after 1900, rebase CORRECTED                      27715          27715           0          3.6         277.2       0.1X
before 1900, rebase LEGACY                        30911          30911           0          3.2         309.1       0.1X
before 1900, rebase CORRECTED                     27944          27944           0          3.6         279.4       0.1X
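
One way to read the `noop` rows is as a baseline that excludes the file write, so subtracting it approximates the per-row cost of the Parquet write itself. This interpretation is an assumption (the results file does not define `noop`), and `write_only_ns` is a hypothetical helper used only for illustration.

```python
def write_only_ns(case_ns_per_row, noop_ns_per_row):
    """Approximate per-row write cost by subtracting the noop baseline (both in ns)."""
    return round(case_ns_per_row - noop_ns_per_row, 1)

# Per Row(ns) values copied from the "Save TIMESTAMP_INT96 to parquet" rows.
print(write_only_ns(276.2, 29.0))  # after 1900, rebase EXCEPTION minus after 1900, noop
print(write_only_ns(309.1, 28.5))  # before 1900, rebase LEGACY minus before 1900, noop
```
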
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Load TIMESTAMP_INT96 from parquet:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, vec off, rebase EXCEPTION             16853          16885          41          5.9         168.5       1.0X
after 1900, vec off, rebase LEGACY                16804          16816          21          6.0         168.0       1.0X
after 1900, vec off, rebase CORRECTED             16985          17020          58          5.9         169.9       1.0X
after 1900, vec on, rebase EXCEPTION               7044           7063          19         14.2          70.4       2.4X
after 1900, vec on, rebase LEGACY                  7183           7255          94         13.9          71.8       2.3X
after 1900, vec on, rebase CORRECTED               7047           7137          86         14.2          70.5       2.4X
before 1900, vec off, rebase LEGACY               20371          20458          81          4.9         203.7       0.8X
before 1900, vec off, rebase CORRECTED            17484          17541          54          5.7         174.8       1.0X
before 1900, vec on, rebase LEGACY                10284          10327          45          9.7         102.8       1.6X
before 1900, vec on, rebase CORRECTED              7044           7073          37         14.2          70.4       2.4X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Save TIMESTAMP_MICROS to parquet:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, noop                                   2848           2848           0         35.1          28.5       1.0X
before 1900, noop                                  2855           2855           0         35.0          28.6       1.0X
after 1900, rebase EXCEPTION                      15622          15622           0          6.4         156.2       0.2X
after 1900, rebase LEGACY                         16148          16148           0          6.2         161.5       0.2X
after 1900, rebase CORRECTED                      16946          16946           0          5.9         169.5       0.2X
before 1900, rebase LEGACY                        19486          19486           0          5.1         194.9       0.1X
before 1900, rebase CORRECTED                     17029          17029           0          5.9         170.3       0.2X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Load TIMESTAMP_MICROS from parquet:       Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, vec off, rebase EXCEPTION             15785          15848          56          6.3         157.9       1.0X
after 1900, vec off, rebase LEGACY                15935          15954          17          6.3         159.3       1.0X
after 1900, vec off, rebase CORRECTED             15976          16046          62          6.3         159.8       1.0X
after 1900, vec on, rebase EXCEPTION               4925           4941          20         20.3          49.3       3.2X
after 1900, vec on, rebase LEGACY                  5033           5041          11         19.9          50.3       3.1X
after 1900, vec on, rebase CORRECTED               4946           4972          29         20.2          49.5       3.2X
before 1900, vec off, rebase LEGACY               18619          18782         176          5.4         186.2       0.8X
before 1900, vec off, rebase CORRECTED            15956          16018          56          6.3         159.6       1.0X
before 1900, vec on, rebase LEGACY                 8461           8472          14         11.8          84.6       1.9X
before 1900, vec on, rebase CORRECTED              4953           4962          12         20.2          49.5       3.2X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Save TIMESTAMP_MILLIS to parquet:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, noop                                   3019           3019           0         33.1          30.2       1.0X
before 1900, noop                                  2896           2896           0         34.5          29.0       1.0X
after 1900, rebase EXCEPTION                      15525          15525           0          6.4         155.2       0.2X
after 1900, rebase LEGACY                         15903          15903           0          6.3         159.0       0.2X
after 1900, rebase CORRECTED                      16468          16468           0          6.1         164.7       0.2X
before 1900, rebase LEGACY                        19620          19620           0          5.1         196.2       0.2X
before 1900, rebase CORRECTED                     16470          16470           0          6.1         164.7       0.2X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Load TIMESTAMP_MILLIS from parquet:       Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, vec off, rebase EXCEPTION             16329          16357          26          6.1         163.3       1.0X
after 1900, vec off, rebase LEGACY                16609          16659          51          6.0         166.1       1.0X
after 1900, vec off, rebase CORRECTED             16659          16765          91          6.0         166.6       1.0X
after 1900, vec on, rebase EXCEPTION               6132           6162          28         16.3          61.3       2.7X
after 1900, vec on, rebase LEGACY                  6344           6397          61         15.8          63.4       2.6X
after 1900, vec on, rebase CORRECTED               6023           6024           2         16.6          60.2       2.7X
before 1900, vec off, rebase LEGACY               19611          19626          13          5.1         196.1       0.8X
before 1900, vec off, rebase CORRECTED            16765          16784          19          6.0         167.7       1.0X
before 1900, vec on, rebase LEGACY                 9136           9158          19         10.9          91.4       1.8X
before 1900, vec on, rebase CORRECTED              6023           6042          30         16.6          60.2       2.7X
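
In the Parquet load results, LEGACY is noticeably slower than CORRECTED mainly in the `before ...` rows. The sketch below computes that slowdown from the `Best Time(ms)` values of the `before ..., vec on` rows in the four load tables. The dictionary layout and the name `legacy_overhead_pct` are illustrative choices, not part of the benchmark.

```python
# (LEGACY best ms, CORRECTED best ms) copied from the "before ..., vec on" load rows.
BEFORE_VEC_ON = {
    "DATE": (4459, 3661),
    "TIMESTAMP_INT96": (10284, 7044),
    "TIMESTAMP_MICROS": (8461, 4953),
    "TIMESTAMP_MILLIS": (9136, 6023),
}

def legacy_overhead_pct(legacy_ms, corrected_ms):
    """Extra time of LEGACY over CORRECTED, as a percentage of the CORRECTED time."""
    return (legacy_ms - corrected_ms) / corrected_ms * 100.0

for name, (legacy, corrected) in BEFORE_VEC_ON.items():
    print(f"{name}: +{legacy_overhead_pct(legacy, corrected):.1f}%")
```
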
================================================================================================
Rebasing dates/timestamps in ORC datasource
================================================================================================
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Save DATE to ORC:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1582, noop                                  20934          20934           0          4.8         209.3       1.0X
before 1582, noop                                 11098          11098           0          9.0         111.0       1.9X
after 1582                                        29249          29249           0          3.4         292.5       0.7X
before 1582                                       20059          20059           0          5.0         200.6       1.0X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Load DATE from ORC:                       Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1582, vec off                               10751          10802          56          9.3         107.5       1.0X
after 1582, vec on                                 3815           3870          62         26.2          38.1       2.8X
before 1582, vec off                              11144          11174          37          9.0         111.4       1.0X
before 1582, vec on                                4120           4126           8         24.3          41.2       2.6X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Save TIMESTAMP to ORC:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, noop                                   2858           2858           0         35.0          28.6       1.0X
before 1900, noop                                  2859           2859           0         35.0          28.6       1.0X
after 1900                                        17098          17098           0          5.8         171.0       0.2X
before 1900                                       20639          20639           0          4.8         206.4       0.1X
OpenJDK 64-Bit Server VM 11.0.8+10-post-Ubuntu-0ubuntu118.04.1 on Linux 5.3.0-1034-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Load TIMESTAMP from ORC:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
after 1900, vec off                               12292          12318          23          8.1         122.9       1.0X
after 1900, vec on                                 5198           5271          95         19.2          52.0       2.4X
before 1900, vec off                              15108          15145          53          6.6         151.1       0.8X
before 1900, vec on                                8085           8277         245         12.4          80.8       1.5X
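
To compare result files programmatically, each data row can be split from the right, since the case label contains spaces and commas while the six metric columns do not. This is a small sketch with a hypothetical `parse_row` helper and illustrative field names; it assumes every data row ends with exactly six whitespace-separated metric fields.

```python
def parse_row(line):
    """Split one benchmark result row into a dict; the case label may contain spaces."""
    # Split at most 6 times from the right so the label stays in one piece.
    label, best, avg, stdev, rate, per_row, relative = line.rsplit(None, 6)
    return {
        "case": label.strip(),
        "best_ms": int(best),
        "avg_ms": int(avg),
        "stdev_ms": int(stdev),
        "rate_m_per_s": float(rate),
        "per_row_ns": float(per_row),
        "relative": float(relative.rstrip("X")),  # drop the trailing "X" marker
    }

row = parse_row("before 1900, vec on 8085 8277 245 12.4 80.8 1.5X")
print(row["case"], row["best_ms"], row["relative"])
```
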