spark-instrumented-optimizer/sql/core/benchmarks/AggregateBenchmark-jdk11-results.txt
Cheng Su c4ad86f311 [SPARK-35235][SQL][TEST] Add row-based hash map into aggregate benchmark
### What changes were proposed in this pull request?

`AggregateBenchmark` is only testing the performance for vectorized fast hash map, but not row-based hash map (which is used by default). We should add the row-based hash map into the benchmark.

java 8 benchmark run - https://github.com/c21/spark/actions/runs/787731549
java 11 benchmark run - https://github.com/c21/spark/actions/runs/787742858

### Why are the changes needed?

To have and track a basic sense of benchmarking different fast hash map used in hash aggregate.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing unit test, as this only touches benchmark code.

Closes #32357 from c21/agg-benchmark.

Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2021-04-27 06:53:42 +00:00

149 lines
11 KiB
Plaintext

================================================================================================
aggregate without grouping
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
agg w/o group: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
agg w/o group wholestage off 82274 82877 853 25.5 39.2 1.0X
agg w/o group wholestage on 1322 1358 37 1586.7 0.6 62.2X
================================================================================================
stat functions
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
stddev: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
stddev wholestage off 8975 9129 219 11.7 85.6 1.0X
stddev wholestage on 1424 1444 34 73.6 13.6 6.3X
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
kurtosis: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
kurtosis wholestage off 42273 42424 213 2.5 403.1 1.0X
kurtosis wholestage on 1492 1528 27 70.3 14.2 28.3X
================================================================================================
aggregate with linear keys
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 10873 10998 176 7.7 129.6 1.0X
codegen = T, hashmap = F 5906 6005 95 14.2 70.4 1.8X
codegen = T, row-based hashmap = T 2325 2410 94 36.1 27.7 4.7X
codegen = T, vectorized hashmap = T 1185 1259 78 70.8 14.1 9.2X
================================================================================================
aggregate with randomized keys
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 12385 12470 120 6.8 147.6 1.0X
codegen = T, hashmap = F 7734 8110 378 10.8 92.2 1.6X
codegen = T, row-based hashmap = T 3663 3702 37 22.9 43.7 3.4X
codegen = T, vectorized hashmap = T 2532 2621 54 33.1 30.2 4.9X
================================================================================================
aggregate with string key
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w string key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 4465 4517 73 4.7 212.9 1.0X
codegen = T, hashmap = F 2667 2825 208 7.9 127.2 1.7X
codegen = T, row-based hashmap = T 1436 1466 21 14.6 68.5 3.1X
codegen = T, vectorized hashmap = T 1297 1301 5 16.2 61.8 3.4X
================================================================================================
aggregate with decimal key
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w decimal key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 3722 3746 34 5.6 177.5 1.0X
codegen = T, hashmap = F 2229 2297 96 9.4 106.3 1.7X
codegen = T, row-based hashmap = T 927 957 28 22.6 44.2 4.0X
codegen = T, vectorized hashmap = T 772 796 22 27.2 36.8 4.8X
================================================================================================
aggregate with multiple key types
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w multiple keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 7013 7060 67 3.0 334.4 1.0X
codegen = T, hashmap = F 3750 3894 205 5.6 178.8 1.9X
codegen = T, row-based hashmap = T 2948 2952 5 7.1 140.6 2.4X
codegen = T, vectorized hashmap = T 2986 3145 226 7.0 142.4 2.3X
================================================================================================
max function bytecode size of wholestagecodegen
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
max function bytecode size: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 567 620 37 1.2 864.6 1.0X
codegen = T, hugeMethodLimit = 10000 283 316 26 2.3 431.9 2.0X
codegen = T, hugeMethodLimit = 1500 275 324 40 2.4 420.2 2.1X
================================================================================================
cube
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
cube: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
cube wholestage off 3389 3476 123 1.5 646.4 1.0X
cube wholestage on 1692 1726 34 3.1 322.7 2.0X
================================================================================================
hash and BytesToBytesMap
================================================================================================
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
BytesToBytesMap: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
UnsafeRowhash 302 306 4 69.5 14.4 1.0X
murmur3 hash 125 129 3 167.8 6.0 2.4X
fast hash 69 73 3 304.1 3.3 4.4X
arrayEqual 192 195 3 109.0 9.2 1.6X
Java HashMap (Long) 133 187 53 157.2 6.4 2.3X
Java HashMap (two ints) 156 230 62 134.3 7.4 1.9X
Java HashMap (UnsafeRow) 807 812 6 26.0 38.5 0.4X
LongToUnsafeRowMap (opt=false) 502 529 24 41.8 23.9 0.6X
LongToUnsafeRowMap (opt=true) 148 164 20 141.7 7.1 2.0X
BytesToBytesMap (off Heap) 936 950 23 22.4 44.6 0.3X
BytesToBytesMap (on Heap) 954 956 2 22.0 45.5 0.3X
Aggregate HashMap 46 54 11 455.4 2.2 6.6X