spark-instrumented-optimizer/sql/core/benchmarks/JsonBenchmark-results.txt
Dongjoon Hyun 4e0e4e51c4 [MINOR][TESTS] Rename JSONBenchmark to JsonBenchmark
### What changes were proposed in this pull request?

This PR renames `object JSONBenchmark` to `object JsonBenchmark` and the benchmark result file `JSONBenchmark-results.txt` to `JsonBenchmark-results.txt`.

### Why are the changes needed?

Since the file name doesn't match with `object JSONBenchmark`, it makes a confusion when we run the benchmark. In addition, this makes the automation difficult.
```
$ find . -name JsonBenchmark.scala
./sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonBenchmark.scala
```
```
$ build/sbt "sql/test:runMain org.apache.spark.sql.execution.datasources.json.JsonBenchmark"
[info] Running org.apache.spark.sql.execution.datasources.json.JsonBenchmark
[error] Error: Could not find or load main class org.apache.spark.sql.execution.datasources.json.JsonBenchmark
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

This is just renaming.

Closes #26008 from dongjoon-hyun/SPARK-RENAME-JSON.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 09:02:06 -07:00

113 lines
9.5 KiB
Plaintext

================================================================================================
Benchmark for performance of JSON parsing
================================================================================================
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
JSON schema inferring: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 69387 69850 407 1.4 693.9 1.0X
UTF-8 is set 112131 112205 83 0.9 1121.3 0.6X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
count a short column: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 44542 44671 122 2.2 445.4 1.0X
UTF-8 is set 71793 71945 146 1.4 717.9 0.6X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
count a wide column: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 58615 59011 347 0.2 5861.5 1.0X
UTF-8 is set 102542 102719 153 0.1 10254.2 0.6X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
select wide row: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
No encoding 168861 170014 1552 0.0 337722.0 1.0X
UTF-8 is set 191140 191250 112 0.0 382280.3 0.9X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Select a subset of 10 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns 28017 28066 47 0.4 2801.7 1.0X
Select 1 column 24590 24641 67 0.4 2459.0 1.1X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
creation of JSON parser per line: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Short column without encoding 17179 17465 487 0.6 1717.9 1.0X
Short column with UTF-8 25173 25255 139 0.4 2517.3 0.7X
Wide column without encoding 146956 147069 104 0.1 14695.6 0.1X
Wide column with UTF-8 196626 197233 549 0.1 19662.6 0.1X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
JSON functions: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Text read 1206 1212 5 8.3 120.6 1.0X
from_json 27641 27680 34 0.4 2764.1 0.0X
json_tuple 43404 44377 860 0.2 4340.4 0.0X
get_json_object 26821 27239 619 0.4 2682.1 0.0X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Dataset of json strings: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Text read 5842 5865 33 8.6 116.8 1.0X
schema inferring 69673 70082 478 0.7 1393.5 0.1X
parsing 78805 81812 NaN 0.6 1576.1 0.1X
Preparing data for benchmarking ...
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Json files in the per-line mode: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Text read 10782 10801 18 4.6 215.6 1.0X
Schema inferring 76817 77205 623 0.7 1536.3 0.1X
Parsing without charset 90638 91395 794 0.6 1812.8 0.1X
Parsing with UTF-8 120085 121975 1705 0.4 2401.7 0.1X
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Write dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps 4706 4717 9 2.1 470.6 1.0X
to_json(timestamp) 29447 29615 226 0.3 2944.7 0.2X
write timestamps to files 20251 20673 503 0.5 2025.1 0.2X
Create a dataset of dates 4157 4172 18 2.4 415.7 1.1X
to_json(date) 21267 21301 53 0.5 2126.7 0.2X
write dates to files 13477 13897 485 0.7 1347.7 0.3X
OpenJDK 64-Bit Server VM 1.8.0_222-b10 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Read dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
read timestamp text from files 2666 2687 29 3.8 266.6 1.0X
read timestamps from files 46967 47354 438 0.2 4696.7 0.1X
infer timestamps from files 97693 97745 65 0.1 9769.3 0.0X
read date text from files 2594 2599 5 3.9 259.4 1.0X
read date from files 35796 36008 195 0.3 3579.6 0.1X
timestamp strings 6367 6424 84 1.6 636.7 0.4X
parse timestamps from Dataset[String] 58863 59255 340 0.2 5886.3 0.0X
infer timestamps from Dataset[String] 114148 114820 836 0.1 11414.8 0.0X
date strings 7847 7863 22 1.3 784.7 0.3X
parse dates from Dataset[String] 49085 49289 212 0.2 4908.5 0.1X
from_json(timestamp) 77030 77335 395 0.1 7703.0 0.0X
from_json(date) 63275 63290 15 0.2 6327.5 0.0X