spark-instrumented-optimizer/sql/core/benchmarks/CSVBenchmark-results.txt
HyukjinKwon ec70467d4d [SPARK-34815][SQL] Update CSVBenchmark
### What changes were proposed in this pull request?

This PR updates the CSVBenchmark results, especially because a fix such as https://github.com/apache/spark/pull/31858 could potentially improve performance.

### Why are the changes needed?

To have the updated benchmark results.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually ran the benchmark.

Closes #31917 from HyukjinKwon/SPARK-34815.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
2021-03-22 10:49:53 +03:00


================================================================================================
Benchmark to measure CSV read/write performance
================================================================================================
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Parsing quoted values:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
One quoted string                                 24185          24195          10          0.0      483694.2       1.0X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Wide rows with 1000 columns:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 1000 columns                               61793          62388         532          0.0       61793.4       1.0X
Select 100 columns                                21958          21993          34          0.0       21957.9       2.8X
Select one column                                 18215          18515         505          0.1       18215.0       3.4X
count()                                            5865           6168         296          0.2        5865.1      10.5X
Select 100 columns, one bad input field           39638          39739         124          0.0       39637.5       1.6X
Select 100 columns, corrupt record field          47290          48133         741          0.0       47290.0       1.3X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Count a dataset with 10 columns:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns + count()                        9935          10460         461          1.0         993.5       1.0X
Select 1 column + count()                          6786           7179         342          1.5         678.6       1.5X
count()                                            2281           2458         165          4.4         228.1       4.4X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps                      812            826          14         12.3          81.2       1.0X
to_csv(timestamp)                                  7548           7764         192          1.3         754.8       0.1X
write timestamps to files                          7052           7193         141          1.4         705.2       0.1X
Create a dataset of dates                           897            909          13         11.1          89.7       0.9X
to_csv(date)                                       4778           4787          10          2.1         477.8       0.2X
write dates to files                               3853           3891          33          2.6         385.3       0.2X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Read dates and timestamps:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
read timestamp text from files                     1259           1262           4          7.9         125.9       1.0X
read timestamps from files                        20030          20105          80          0.5        2003.0       0.1X
infer timestamps from files                       39621          39674          61          0.3        3962.1       0.0X
read date text from files                          1039           1068          40          9.6         103.9       1.2X
read date from files                               9352           9363          10          1.1         935.2       0.1X
infer date from files                             11465          11485          23          0.9        1146.5       0.1X
timestamp strings                                  1759           1812          59          5.7         175.9       0.7X
parse timestamps from Dataset[String]             20806          20858          75          0.5        2080.6       0.1X
infer timestamps from Dataset[String]             40537          40821         258          0.2        4053.7       0.0X
date strings                                       1808           1816          12          5.5         180.8       0.7X
parse dates from Dataset[String]                  12080          12311         245          0.8        1208.0       0.1X
from_csv(timestamp)                               20120          21503        1224          0.5        2012.0       0.1X
from_csv(date)                                    10607          10768         246          0.9        1060.7       0.1X
Java HotSpot(TM) 64-Bit Server VM 1.8.0_202-b08 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Filters pushdown:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters                                       13109          13249         151          0.0      131086.4       1.0X
pushdown disabled                                 12951          12994          63          0.0      129509.7       1.0X
w/ filters                                         1095           1113          15          0.1       10953.7      12.0X
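
The derived columns in the tables above can be cross-checked with a little arithmetic. The sketch below is not part of the benchmark itself. It assumes that Relative is the first row's best time divided by the current row's best time, and that Rate(M/s) is 1000 divided by Per Row(ns); both assumptions agree with the printed figures. The helper names `relative` and `rate_m_per_s` are hypothetical, and because the inputs are the rounded values shown in the tables, a result can occasionally differ from the printed one in the last digit.

```python
def relative(first_best_ms: float, best_ms: float) -> str:
    """Speed of a row relative to the first row of the same table (assumed definition)."""
    return f"{first_best_ms / best_ms:.1f}X"


def rate_m_per_s(per_row_ns: float) -> str:
    """Rows per microsecond, i.e. millions of rows per second (assumed definition)."""
    return f"{1000.0 / per_row_ns:.1f}"


# "Filters pushdown": w/o filters (13109 ms) vs. w/ filters (1095 ms)
print(relative(13109, 1095))   # 12.0X
# "Wide rows with 1000 columns": Select 1000 columns (61793 ms) vs. count() (5865 ms)
print(relative(61793, 5865))   # 10.5X
# "Count a dataset with 10 columns": count() at 228.1 ns per row
print(rate_m_per_s(228.1))     # 4.4
```

Read this way, the 12.0X in the "Filters pushdown" table means the filtered read finished in about one twelfth of the unfiltered best time.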