[SPARK-35235][SQL][TEST] Add row-based hash map into aggregate benchmark

### What changes were proposed in this pull request?

`AggregateBenchmark` currently measures only the vectorized fast hash map, not the row-based fast hash map (which is used by default). This PR adds the row-based hash map to the benchmark as a separate case.

Java 8 benchmark run: https://github.com/c21/spark/actions/runs/787731549
Java 11 benchmark run: https://github.com/c21/spark/actions/runs/787742858
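
As a rough orientation, the settings behind the codegen-enabled cases can be summarized in a dependency-free sketch. The object name `AggregateBenchmarkCases` is hypothetical, and the string keys are what the `SQLConf` entries in the diff are assumed to resolve to. The diff context ends before the last setting of the vectorized case, so its value here is inferred from the case name.

```scala
// Hypothetical summary of the SQL settings used by each codegen-enabled case.
// Not part of the patch; it only mirrors the withSQLConf(...) calls in the diff.
object AggregateBenchmarkCases {
  // Assumed string keys for the SQLConf entries referenced in the diff.
  val WholeStageCodegen = "spark.sql.codegen.wholeStage"
  val TwoLevelAggMap = "spark.sql.codegen.aggregate.map.twolevel.enabled"
  val VectorizedHashMap = "spark.sql.codegen.aggregate.map.vectorized.enable"

  val cases: Seq[(String, Map[String, String])] = Seq(
    // Only the settings visible in the diff context are listed for this case.
    "codegen = T, hashmap = F" -> Map(
      WholeStageCodegen -> "true",
      TwoLevelAggMap -> "false"),
    // New case added by this PR: fast hash map on, vectorized variant off.
    "codegen = T, row-based hashmap = T" -> Map(
      WholeStageCodegen -> "true",
      TwoLevelAggMap -> "true",
      VectorizedHashMap -> "false"),
    // Assumption: "true" for the vectorized flag is inferred from the case name.
    "codegen = T, vectorized hashmap = T" -> Map(
      WholeStageCodegen -> "true",
      TwoLevelAggMap -> "true",
      VectorizedHashMap -> "true"))

  def main(args: Array[String]): Unit =
    cases.foreach { case (name, conf) =>
      println(name)
      conf.foreach { case (key, value) => println(s"  $key=$value") }
    }
}
```

The row-based and vectorized cases differ only in the vectorized flag, which is why the PR can reuse the same benchmark body `f()` for both.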

### Why are the changes needed?

To establish and track a baseline for how the different fast hash maps used in hash aggregate perform.
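
One way to read the results is through the `Relative` column, which is the best time of the first case divided by the best time of each case. The sketch below recomputes it from the Java 11 "aggregate with linear keys" best times reported in this commit. The object and method names are hypothetical and only serve the illustration.

```scala
// Hypothetical helper: recompute the "Relative" column from "Best Time(ms)".
object RelativeSpeedup {
  // Best Time(ms) for "aggregate with linear keys" on Java 11, copied from this commit.
  val bestMs: Seq[(String, Double)] = Seq(
    "codegen = F" -> 10873.0,
    "codegen = T, hashmap = F" -> 5906.0,
    "codegen = T, row-based hashmap = T" -> 2325.0,
    "codegen = T, vectorized hashmap = T" -> 1185.0)

  // Speedup of each case relative to the first (baseline) case.
  def relative(cases: Seq[(String, Double)]): Seq[(String, Double)] = {
    val baseline = cases.head._2
    cases.map { case (name, ms) => name -> baseline / ms }
  }

  def main(args: Array[String]): Unit =
    relative(bestMs).foreach { case (name, ratio) =>
      println(f"$name%-40s $ratio%.1fX")
    }
}
```

Rounded to one decimal place, the ratios agree with the 1.0X, 1.8X, 4.7X and 9.2X shown in the results file for that run.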

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing unit tests, as this only touches benchmark code.

Closes #32357 from c21/agg-benchmark.

Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Cheng Su 2021-04-27 06:53:42 +00:00 committed by Wenchen Fan
parent eb08b9010a
commit c4ad86f311
3 changed files with 187 additions and 132 deletions

@ -2,142 +2,147 @@
aggregate without grouping
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
agg w/o group: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
agg w/o group wholestage off 63666 64021 502 32.9 30.4 1.0X
agg w/o group wholestage on 882 912 37 2376.9 0.4 72.2X
agg w/o group wholestage off 82274 82877 853 25.5 39.2 1.0X
agg w/o group wholestage on 1322 1358 37 1586.7 0.6 62.2X
================================================================================================
stat functions
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
stddev: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
stddev wholestage off 7370 7688 450 14.2 70.3 1.0X
stddev wholestage on 931 997 50 112.6 8.9 7.9X
stddev wholestage off 8975 9129 219 11.7 85.6 1.0X
stddev wholestage on 1424 1444 34 73.6 13.6 6.3X
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
kurtosis: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
kurtosis wholestage off 30901 31209 436 3.4 294.7 1.0X
kurtosis wholestage on 950 996 33 110.4 9.1 32.5X
kurtosis wholestage off 42273 42424 213 2.5 403.1 1.0X
kurtosis wholestage on 1492 1528 27 70.3 14.2 28.3X
================================================================================================
aggregate with linear keys
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 8845 8874 41 9.5 105.4 1.0X
codegen = T hashmap = F 5804 5854 47 14.5 69.2 1.5X
codegen = T hashmap = T 954 1001 35 87.9 11.4 9.3X
codegen = F 10873 10998 176 7.7 129.6 1.0X
codegen = T, hashmap = F 5906 6005 95 14.2 70.4 1.8X
codegen = T, row-based hashmap = T 2325 2410 94 36.1 27.7 4.7X
codegen = T, vectorized hashmap = T 1185 1259 78 70.8 14.1 9.2X
================================================================================================
aggregate with randomized keys
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 10398 10788 552 8.1 124.0 1.0X
codegen = T hashmap = F 7426 7520 84 11.3 88.5 1.4X
codegen = T hashmap = T 1883 1917 31 44.5 22.4 5.5X
codegen = F 12385 12470 120 6.8 147.6 1.0X
codegen = T, hashmap = F 7734 8110 378 10.8 92.2 1.6X
codegen = T, row-based hashmap = T 3663 3702 37 22.9 43.7 3.4X
codegen = T, vectorized hashmap = T 2532 2621 54 33.1 30.2 4.9X
================================================================================================
aggregate with string key
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w string key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 3615 3888 386 5.8 172.4 1.0X
codegen = T hashmap = F 2253 2381 168 9.3 107.4 1.6X
codegen = T hashmap = T 1242 1316 59 16.9 59.2 2.9X
codegen = F 4465 4517 73 4.7 212.9 1.0X
codegen = T, hashmap = F 2667 2825 208 7.9 127.2 1.7X
codegen = T, row-based hashmap = T 1436 1466 21 14.6 68.5 3.1X
codegen = T, vectorized hashmap = T 1297 1301 5 16.2 61.8 3.4X
================================================================================================
aggregate with decimal key
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w decimal key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 3437 3534 137 6.1 163.9 1.0X
codegen = T hashmap = F 2122 2226 147 9.9 101.2 1.6X
codegen = T hashmap = T 638 678 36 32.9 30.4 5.4X
codegen = F 3722 3746 34 5.6 177.5 1.0X
codegen = T, hashmap = F 2229 2297 96 9.4 106.3 1.7X
codegen = T, row-based hashmap = T 927 957 28 22.6 44.2 4.0X
codegen = T, vectorized hashmap = T 772 796 22 27.2 36.8 4.8X
================================================================================================
aggregate with multiple key types
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Aggregate w multiple keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 6549 6648 140 3.2 312.3 1.0X
codegen = T hashmap = F 3591 3693 144 5.8 171.2 1.8X
codegen = T hashmap = T 2822 2922 141 7.4 134.6 2.3X
codegen = F 7013 7060 67 3.0 334.4 1.0X
codegen = T, hashmap = F 3750 3894 205 5.6 178.8 1.9X
codegen = T, row-based hashmap = T 2948 2952 5 7.1 140.6 2.4X
codegen = T, vectorized hashmap = T 2986 3145 226 7.0 142.4 2.3X
================================================================================================
max function bytecode size of wholestagecodegen
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
max function bytecode size: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 531 571 36 1.2 810.7 1.0X
codegen = T hugeMethodLimit = 10000 223 282 36 2.9 340.1 2.4X
codegen = T hugeMethodLimit = 1500 264 308 27 2.5 402.2 2.0X
codegen = F 567 620 37 1.2 864.6 1.0X
codegen = T, hugeMethodLimit = 10000 283 316 26 2.3 431.9 2.0X
codegen = T, hugeMethodLimit = 1500 275 324 40 2.4 420.2 2.1X
================================================================================================
cube
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
cube: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
cube wholestage off 2963 3099 193 1.8 565.1 1.0X
cube wholestage on 1624 1767 98 3.2 309.8 1.8X
cube wholestage off 3389 3476 123 1.5 646.4 1.0X
cube wholestage on 1692 1726 34 3.1 322.7 2.0X
================================================================================================
hash and BytesToBytesMap
================================================================================================
OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
BytesToBytesMap: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
UnsafeRowhash 247 268 19 84.8 11.8 1.0X
murmur3 hash 99 123 40 211.3 4.7 2.5X
fast hash 56 66 5 374.0 2.7 4.4X
arrayEqual 186 200 8 113.0 8.8 1.3X
Java HashMap (Long) 121 207 65 173.5 5.8 2.0X
Java HashMap (two ints) 147 233 61 142.8 7.0 1.7X
Java HashMap (UnsafeRow) 733 778 45 28.6 34.9 0.3X
LongToUnsafeRowMap (opt=false) 489 504 15 42.8 23.3 0.5X
LongToUnsafeRowMap (opt=true) 125 154 29 168.2 5.9 2.0X
BytesToBytesMap (off Heap) 840 895 48 25.0 40.1 0.3X
BytesToBytesMap (on Heap) 853 904 60 24.6 40.7 0.3X
Aggregate HashMap 38 46 8 546.3 1.8 6.4X
UnsafeRowhash 302 306 4 69.5 14.4 1.0X
murmur3 hash 125 129 3 167.8 6.0 2.4X
fast hash 69 73 3 304.1 3.3 4.4X
arrayEqual 192 195 3 109.0 9.2 1.6X
Java HashMap (Long) 133 187 53 157.2 6.4 2.3X
Java HashMap (two ints) 156 230 62 134.3 7.4 1.9X
Java HashMap (UnsafeRow) 807 812 6 26.0 38.5 0.4X
LongToUnsafeRowMap (opt=false) 502 529 24 41.8 23.9 0.6X
LongToUnsafeRowMap (opt=true) 148 164 20 141.7 7.1 2.0X
BytesToBytesMap (off Heap) 936 950 23 22.4 44.6 0.3X
BytesToBytesMap (on Heap) 954 956 2 22.0 45.5 0.3X
Aggregate HashMap 46 54 11 455.4 2.2 6.6X

@ -2,142 +2,147 @@
aggregate without grouping
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
agg w/o group: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
agg w/o group wholestage off 47798 50190 NaN 43.9 22.8 1.0X
agg w/o group wholestage on 1091 1128 28 1922.6 0.5 43.8X
agg w/o group wholestage off 53440 63455 NaN 39.2 25.5 1.0X
agg w/o group wholestage on 1157 1216 39 1812.5 0.6 46.2X
================================================================================================
stat functions
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
stddev: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
stddev wholestage off 7884 7959 106 13.3 75.2 1.0X
stddev wholestage on 1012 1072 34 103.6 9.6 7.8X
stddev wholestage off 7920 7947 39 13.2 75.5 1.0X
stddev wholestage on 1147 1160 11 91.4 10.9 6.9X
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
kurtosis: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
kurtosis wholestage off 34023 34576 783 3.1 324.5 1.0X
kurtosis wholestage on 1092 1121 30 96.1 10.4 31.2X
kurtosis wholestage off 35143 35319 250 3.0 335.1 1.0X
kurtosis wholestage on 1239 1258 20 84.6 11.8 28.4X
================================================================================================
aggregate with linear keys
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 9309 9379 99 9.0 111.0 1.0X
codegen = T hashmap = F 5453 5643 223 15.4 65.0 1.7X
codegen = T hashmap = T 1084 1110 16 77.4 12.9 8.6X
codegen = F 9147 9183 50 9.2 109.0 1.0X
codegen = T, hashmap = F 5794 5949 226 14.5 69.1 1.6X
codegen = T, row-based hashmap = T 1378 1397 14 60.9 16.4 6.6X
codegen = T, vectorized hashmap = T 996 1034 25 84.3 11.9 9.2X
================================================================================================
aggregate with randomized keys
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 10707 10950 344 7.8 127.6 1.0X
codegen = T hashmap = F 7295 7423 145 11.5 87.0 1.5X
codegen = T hashmap = T 2057 2199 199 40.8 24.5 5.2X
codegen = F 9356 9425 98 9.0 111.5 1.0X
codegen = T, hashmap = F 5787 5912 176 14.5 69.0 1.6X
codegen = T, row-based hashmap = T 2569 2602 49 32.7 30.6 3.6X
codegen = T, vectorized hashmap = T 2094 2128 27 40.1 25.0 4.5X
================================================================================================
aggregate with string key
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Aggregate w string key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 4570 4573 4 4.6 217.9 1.0X
codegen = T hashmap = F 3600 3686 74 5.8 171.7 1.3X
codegen = T hashmap = T 2384 2432 45 8.8 113.7 1.9X
codegen = F 4270 4322 75 4.9 203.6 1.0X
codegen = T, hashmap = F 3241 3264 30 6.5 154.6 1.3X
codegen = T, row-based hashmap = T 2196 2247 32 9.6 104.7 1.9X
codegen = T, vectorized hashmap = T 2291 2306 14 9.2 109.3 1.9X
================================================================================================
aggregate with decimal key
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Aggregate w decimal key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 2966 3011 64 7.1 141.4 1.0X
codegen = T hashmap = F 1857 1908 73 11.3 88.5 1.6X
codegen = T hashmap = T 695 702 8 30.2 33.2 4.3X
codegen = F 2993 3010 23 7.0 142.7 1.0X
codegen = T, hashmap = F 1940 1945 7 10.8 92.5 1.5X
codegen = T, row-based hashmap = T 738 752 20 28.4 35.2 4.1X
codegen = T, vectorized hashmap = T 620 650 21 33.8 29.6 4.8X
================================================================================================
aggregate with multiple key types
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Aggregate w multiple keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 7361 7385 35 2.8 351.0 1.0X
codegen = T hashmap = F 4525 4688 231 4.6 215.8 1.6X
codegen = T hashmap = T 3865 3977 159 5.4 184.3 1.9X
codegen = F 6635 6636 2 3.2 316.4 1.0X
codegen = T, hashmap = F 4236 4269 47 5.0 202.0 1.6X
codegen = T, row-based hashmap = T 3118 3158 57 6.7 148.7 2.1X
codegen = T, vectorized hashmap = T 3259 3278 27 6.4 155.4 2.0X
================================================================================================
max function bytecode size of wholestagecodegen
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
max function bytecode size: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
codegen = F 451 489 23 1.5 688.5 1.0X
codegen = T hugeMethodLimit = 10000 211 229 19 3.1 322.4 2.1X
codegen = T hugeMethodLimit = 1500 203 226 20 3.2 309.5 2.2X
codegen = F 467 492 33 1.4 712.4 1.0X
codegen = T, hugeMethodLimit = 10000 216 231 19 3.0 329.7 2.2X
codegen = T, hugeMethodLimit = 1500 209 221 9 3.1 319.0 2.2X
================================================================================================
cube
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
cube: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
cube wholestage off 2479 2548 97 2.1 472.9 1.0X
cube wholestage on 1487 1567 62 3.5 283.7 1.7X
cube wholestage off 2490 2529 56 2.1 474.8 1.0X
cube wholestage on 1401 1416 22 3.7 267.3 1.8X
================================================================================================
hash and BytesToBytesMap
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
BytesToBytesMap: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
UnsafeRowhash 826 837 16 25.4 39.4 1.0X
murmur3 hash 537 553 11 39.1 25.6 1.5X
fast hash 559 572 14 37.5 26.6 1.5X
arrayEqual 1665 1728 90 12.6 79.4 0.5X
Java HashMap (Long) 732 739 7 28.7 34.9 1.1X
Java HashMap (two ints) 682 694 15 30.7 32.5 1.2X
Java HashMap (UnsafeRow) 1486 1499 19 14.1 70.9 0.6X
LongToUnsafeRowMap (opt=false) 1235 1240 8 17.0 58.9 0.7X
LongToUnsafeRowMap (opt=true) 718 736 17 29.2 34.2 1.2X
BytesToBytesMap (off Heap) 945 965 20 22.2 45.1 0.9X
BytesToBytesMap (on Heap) 870 895 28 24.1 41.5 0.9X
Aggregate HashMap 64 71 5 325.6 3.1 12.8X
UnsafeRowhash 259 264 5 81.0 12.3 1.0X
murmur3 hash 113 121 3 185.7 5.4 2.3X
fast hash 84 87 2 249.8 4.0 3.1X
arrayEqual 172 180 4 121.9 8.2 1.5X
Java HashMap (Long) 155 161 5 135.2 7.4 1.7X
Java HashMap (two ints) 147 157 8 142.6 7.0 1.8X
Java HashMap (UnsafeRow) 739 742 4 28.4 35.2 0.4X
LongToUnsafeRowMap (opt=false) 489 491 3 42.9 23.3 0.5X
LongToUnsafeRowMap (opt=true) 93 100 6 224.8 4.4 2.8X
BytesToBytesMap (off Heap) 882 896 16 23.8 42.1 0.3X
BytesToBytesMap (on Heap) 833 863 36 25.2 39.7 0.3X
Aggregate HashMap 66 69 1 317.0 3.2 3.9X

@ -80,7 +80,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = F", numIters = 3) { _ =>
benchmark.addCase("codegen = T, hashmap = F", numIters = 3) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@ -89,7 +89,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = T", numIters = 5) { _ =>
benchmark.addCase("codegen = T, row-based hashmap = T", numIters = 5) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
f()
}
}
benchmark.addCase("codegen = T, vectorized hashmap = T", numIters = 5) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@ -116,7 +125,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = F", numIters = 3) { _ =>
benchmark.addCase("codegen = T, hashmap = F", numIters = 3) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@ -125,7 +134,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = T", numIters = 5) { _ =>
benchmark.addCase("codegen = T, row-based hashmap = T", numIters = 5) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
f()
}
}
benchmark.addCase("codegen = T, vectorized hashmap = T", numIters = 5) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@ -151,7 +169,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = F", numIters = 3) { _ =>
benchmark.addCase("codegen = T, hashmap = F", numIters = 3) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@ -160,7 +178,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = T", numIters = 5) { _ =>
benchmark.addCase("codegen = T, row-based hashmap = T", numIters = 5) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
f()
}
}
benchmark.addCase("codegen = T, vectorized hashmap = T", numIters = 5) { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@ -186,7 +213,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = F") { _ =>
benchmark.addCase("codegen = T, hashmap = F") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@ -195,7 +222,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = T") { _ =>
benchmark.addCase("codegen = T, row-based hashmap = T") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
f()
}
}
benchmark.addCase("codegen = T, vectorized hashmap = T") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@ -231,7 +267,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = F") { _ =>
benchmark.addCase("codegen = T, hashmap = F") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@ -240,7 +276,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hashmap = T") { _ =>
benchmark.addCase("codegen = T, row-based hashmap = T") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
f()
}
}
benchmark.addCase("codegen = T, vectorized hashmap = T") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@ -291,7 +336,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hugeMethodLimit = 10000") { _ =>
benchmark.addCase("codegen = T, hugeMethodLimit = 10000") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key -> "10000") {
@ -299,7 +344,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
}
}
benchmark.addCase("codegen = T hugeMethodLimit = 1500") { _ =>
benchmark.addCase("codegen = T, hugeMethodLimit = 1500") { _ =>
withSQLConf(
SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key -> "1500") {