854a0f752e
### What changes were proposed in this pull request?

This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the results. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered a rough comparison.

**A. EXPECTED CASES (JDK11 is faster in general)**
- [x] BloomFilterBenchmark (JDK11 is faster except one case)
- [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC)
- [x] CSVBenchmark (JDK11 is faster except five cases)
- [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`)
- [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type)
- [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases)
- [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS)
- [x] HashedRelationMetricsBenchmark (JDK11 is faster)
- [x] JSONBenchmark (JDK11 is much faster except eight cases)
- [x] JoinBenchmark (JDK11 is faster except five cases)
- [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases)
- [x] PrimitiveArrayBenchmark (JDK11 is faster)
- [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case)
- [x] UDFBenchmark (N/A, values are too small)
- [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case)
- [x] WideTableBenchmark (JDK11 is faster except two cases)

**B. CASES WE NEED TO INVESTIGATE MORE LATER**
- [x] AggregateBenchmark (JDK11 is slower in general)
- [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`)
- [x] DataSourceReadBenchmark (JDK11 is slower in general)
- [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`)
- [x] MakeDateTimeBenchmark (JDK11 is slower except two cases)
- [x] MiscBenchmark (JDK11 is slower except ten cases)
- [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower)
- [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases)
- [x] RangeBenchmark (JDK11 is slower except one case)

`FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because they took a long time.

### Why are the changes needed?

According to the results, there are some differences between JDK8 and JDK11. This will be a baseline for future improvement and comparison. Also, as a reproducible environment, the following environment is used.
- Instance: `r3.xlarge`
- OS: `CentOS Linux release 7.5.1804 (Core)`
- JDK:
  - `OpenJDK Runtime Environment (build 1.8.0_222-b10)`
  - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)`

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

This is a test-only PR. We need to run the benchmarks.

Closes #26003 from dongjoon-hyun/SPARK-29320.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
61 lines
4.6 KiB
Plaintext
================================================================================================
Parquet writer benchmark
================================================================================================

OpenJDK 64-Bit Server VM 11.0.4+11-LTS on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Parquet writer benchmark:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Output Single Int Column                           2552           2690         195          6.2         162.2       1.0X
Output Single Double Column                        2865           2892          38          5.5         182.2       0.9X
Output Int and String Column                       7876           7885          12          2.0         500.7       0.3X
Output Partitions                                  5079           5871        1120          3.1         322.9       0.5X
Output Buckets                                     6980           6994          20          2.3         443.8       0.4X

================================================================================================
ORC writer benchmark
================================================================================================

OpenJDK 64-Bit Server VM 11.0.4+11-LTS on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
ORC writer benchmark:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Output Single Int Column                           1799           1902         146          8.7         114.4       1.0X
Output Single Double Column                        2268           2276          11          6.9         144.2       0.8X
Output Int and String Column                       6650           6670          28          2.4         422.8       0.3X
Output Partitions                                  4697           4719          31          3.3         298.6       0.4X
Output Buckets                                     6394           6436          60          2.5         406.5       0.3X

================================================================================================
JSON writer benchmark
================================================================================================

OpenJDK 64-Bit Server VM 11.0.4+11-LTS on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
JSON writer benchmark:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Output Single Int Column                           2778           3522        1052          5.7         176.6       1.0X
Output Single Double Column                        4222           4269          67          3.7         268.4       0.7X
Output Int and String Column                      10822          10845          33          1.5         688.0       0.3X
Output Partitions                                  5450           5523         104          2.9         346.5       0.5X
Output Buckets                                    10827          11622        1123          1.5         688.4       0.3X

================================================================================================
CSV writer benchmark
================================================================================================

OpenJDK 64-Bit Server VM 11.0.4+11-LTS on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
CSV writer benchmark:                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Output Single Int Column                           3649           3698          68          4.3         232.0       1.0X
Output Single Double Column                        4612           4696         120          3.4         293.2       0.8X
Output Int and String Column                       7334           7517         258          2.1         466.3       0.5X
Output Partitions                                  6386           6541         220          2.5         406.0       0.6X
Output Buckets                                     8692           9439        1057          1.8         552.6       0.4X
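
The derived columns in these tables can be cross-checked with a few lines of Python. The sketch below is not part of the Spark benchmark harness; it uses the rows of the Parquet writer table in this file and two assumptions inferred from the printed numbers: that `Relative` is the first row's best time divided by each row's best time, and that `Rate(M/s)` is 1000 divided by `Per Row(ns)`. The names `PARQUET_ROWS`, `recompute_relative`, and `recompute_rate` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from the Parquet writer table in this file:
# (case name, best time in ms, per-row time in ns, printed rate in M/s, printed "Relative")
PARQUET_ROWS = [
    ("Output Single Int Column", 2552, 162.2, 6.2, "1.0X"),
    ("Output Single Double Column", 2865, 182.2, 5.5, "0.9X"),
    ("Output Int and String Column", 7876, 500.7, 2.0, "0.3X"),
    ("Output Partitions", 5079, 322.9, 3.1, "0.5X"),
    ("Output Buckets", 6980, 443.8, 2.3, "0.4X"),
]


def recompute_relative(rows):
    """Assumption: Relative = best time of the first row / best time of this row."""
    baseline_best_ms = rows[0][1]
    return [f"{baseline_best_ms / best_ms:.1f}X" for _, best_ms, *_ in rows]


def recompute_rate(rows):
    """Millions of rows per second follows from nanoseconds per row: 1000 / ns."""
    return [round(1000.0 / per_row_ns, 1) for _, _, per_row_ns, *_ in rows]


if __name__ == "__main__":
    relatives = recompute_relative(PARQUET_ROWS)
    rates = recompute_rate(PARQUET_ROWS)
    for (name, *_), rel, rate in zip(PARQUET_ROWS, relatives, rates):
        print(f"{name:<30} rate={rate} M/s relative={rel}")
```

Under these assumptions the recomputed values match the printed `Rate(M/s)` and `Relative` columns for all five Parquet rows, which suggests `Relative` is based on best time rather than on the average time used for the JDK8/JDK11 comparison in the commit message.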