[SPARK-35235][SQL][TEST] Add row-based hash map into aggregate benchmark
### What changes were proposed in this pull request?

`AggregateBenchmark` only tests the performance of the vectorized fast hash map, not the row-based hash map (which is used by default). We should add the row-based hash map to the benchmark.

java 8 benchmark run - https://github.com/c21/spark/actions/runs/787731549
java 11 benchmark run - https://github.com/c21/spark/actions/runs/787742858

### Why are the changes needed?

To establish and track baseline benchmark numbers for the different fast hash maps used in hash aggregate.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing unit tests, as this only touches benchmark code.

Closes #32357 from c21/agg-benchmark.

Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
parent eb08b9010a
commit c4ad86f311
@@ -2,142 +2,147 @@
 aggregate without grouping
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 agg w/o group:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-agg w/o group wholestage off                      63666          64021         502         32.9          30.4       1.0X
-agg w/o group wholestage on                         882            912          37       2376.9           0.4      72.2X
+agg w/o group wholestage off                      82274          82877         853         25.5          39.2       1.0X
+agg w/o group wholestage on                        1322           1358          37       1586.7           0.6      62.2X


 ================================================================================================
 stat functions
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 stddev:                                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-stddev wholestage off                              7370           7688         450         14.2          70.3       1.0X
-stddev wholestage on                                931            997          50        112.6           8.9       7.9X
+stddev wholestage off                              8975           9129         219         11.7          85.6       1.0X
+stddev wholestage on                               1424           1444          34         73.6          13.6       6.3X

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 kurtosis:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-kurtosis wholestage off                           30901          31209         436          3.4         294.7       1.0X
-kurtosis wholestage on                              950            996          33        110.4           9.1      32.5X
+kurtosis wholestage off                           42273          42424         213          2.5         403.1       1.0X
+kurtosis wholestage on                             1492           1528          27         70.3          14.2      28.3X


 ================================================================================================
 aggregate with linear keys
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 Aggregate w keys:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        8845           8874          41          9.5         105.4       1.0X
-codegen = T hashmap = F                            5804           5854          47         14.5          69.2       1.5X
-codegen = T hashmap = T                             954           1001          35         87.9          11.4       9.3X
+codegen = F                                       10873          10998         176          7.7         129.6       1.0X
+codegen = T, hashmap = F                           5906           6005          95         14.2          70.4       1.8X
+codegen = T, row-based hashmap = T                 2325           2410          94         36.1          27.7       4.7X
+codegen = T, vectorized hashmap = T                1185           1259          78         70.8          14.1       9.2X


 ================================================================================================
 aggregate with randomized keys
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 Aggregate w keys:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                       10398          10788         552          8.1         124.0       1.0X
-codegen = T hashmap = F                            7426           7520          84         11.3          88.5       1.4X
-codegen = T hashmap = T                            1883           1917          31         44.5          22.4       5.5X
+codegen = F                                       12385          12470         120          6.8         147.6       1.0X
+codegen = T, hashmap = F                           7734           8110         378         10.8          92.2       1.6X
+codegen = T, row-based hashmap = T                 3663           3702          37         22.9          43.7       3.4X
+codegen = T, vectorized hashmap = T                2532           2621          54         33.1          30.2       4.9X


 ================================================================================================
 aggregate with string key
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 Aggregate w string key:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        3615           3888         386          5.8         172.4       1.0X
-codegen = T hashmap = F                            2253           2381         168          9.3         107.4       1.6X
-codegen = T hashmap = T                            1242           1316          59         16.9          59.2       2.9X
+codegen = F                                        4465           4517          73          4.7         212.9       1.0X
+codegen = T, hashmap = F                           2667           2825         208          7.9         127.2       1.7X
+codegen = T, row-based hashmap = T                 1436           1466          21         14.6          68.5       3.1X
+codegen = T, vectorized hashmap = T                1297           1301           5         16.2          61.8       3.4X


 ================================================================================================
 aggregate with decimal key
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 Aggregate w decimal key:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        3437           3534         137          6.1         163.9       1.0X
-codegen = T hashmap = F                            2122           2226         147          9.9         101.2       1.6X
-codegen = T hashmap = T                             638            678          36         32.9          30.4       5.4X
+codegen = F                                        3722           3746          34          5.6         177.5       1.0X
+codegen = T, hashmap = F                           2229           2297          96          9.4         106.3       1.7X
+codegen = T, row-based hashmap = T                  927            957          28         22.6          44.2       4.0X
+codegen = T, vectorized hashmap = T                 772            796          22         27.2          36.8       4.8X


 ================================================================================================
 aggregate with multiple key types
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 Aggregate w multiple keys:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        6549           6648         140          3.2         312.3       1.0X
-codegen = T hashmap = F                            3591           3693         144          5.8         171.2       1.8X
-codegen = T hashmap = T                            2822           2922         141          7.4         134.6       2.3X
+codegen = F                                        7013           7060          67          3.0         334.4       1.0X
+codegen = T, hashmap = F                           3750           3894         205          5.6         178.8       1.9X
+codegen = T, row-based hashmap = T                 2948           2952           5          7.1         140.6       2.4X
+codegen = T, vectorized hashmap = T                2986           3145         226          7.0         142.4       2.3X


 ================================================================================================
 max function bytecode size of wholestagecodegen
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 max function bytecode size:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                         531            571          36          1.2         810.7       1.0X
-codegen = T hugeMethodLimit = 10000                 223            282          36          2.9         340.1       2.4X
-codegen = T hugeMethodLimit = 1500                  264            308          27          2.5         402.2       2.0X
+codegen = F                                         567            620          37          1.2         864.6       1.0X
+codegen = T, hugeMethodLimit = 10000                283            316          26          2.3         431.9       2.0X
+codegen = T, hugeMethodLimit = 1500                 275            324          40          2.4         420.2       2.1X


 ================================================================================================
 cube
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 cube:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-cube wholestage off                                2963           3099         193          1.8         565.1       1.0X
-cube wholestage on                                 1624           1767          98          3.2         309.8       1.8X
+cube wholestage off                                3389           3476         123          1.5         646.4       1.0X
+cube wholestage on                                 1692           1726          34          3.1         322.7       2.0X


 ================================================================================================
 hash and BytesToBytesMap
 ================================================================================================

-OpenJDK 64-Bit Server VM 11.0.10+9-LTS on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
 BytesToBytesMap:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-UnsafeRowhash                                       247            268          19         84.8          11.8       1.0X
-murmur3 hash                                         99            123          40        211.3           4.7       2.5X
-fast hash                                            56             66           5        374.0           2.7       4.4X
-arrayEqual                                          186            200           8        113.0           8.8       1.3X
-Java HashMap (Long)                                 121            207          65        173.5           5.8       2.0X
-Java HashMap (two ints)                             147            233          61        142.8           7.0       1.7X
-Java HashMap (UnsafeRow)                            733            778          45         28.6          34.9       0.3X
-LongToUnsafeRowMap (opt=false)                      489            504          15         42.8          23.3       0.5X
-LongToUnsafeRowMap (opt=true)                       125            154          29        168.2           5.9       2.0X
-BytesToBytesMap (off Heap)                          840            895          48         25.0          40.1       0.3X
-BytesToBytesMap (on Heap)                           853            904          60         24.6          40.7       0.3X
-Aggregate HashMap                                    38             46           8        546.3           1.8       6.4X
+UnsafeRowhash                                       302            306           4         69.5          14.4       1.0X
+murmur3 hash                                        125            129           3        167.8           6.0       2.4X
+fast hash                                            69             73           3        304.1           3.3       4.4X
+arrayEqual                                          192            195           3        109.0           9.2       1.6X
+Java HashMap (Long)                                 133            187          53        157.2           6.4       2.3X
+Java HashMap (two ints)                             156            230          62        134.3           7.4       1.9X
+Java HashMap (UnsafeRow)                            807            812           6         26.0          38.5       0.4X
+LongToUnsafeRowMap (opt=false)                      502            529          24         41.8          23.9       0.6X
+LongToUnsafeRowMap (opt=true)                       148            164          20        141.7           7.1       2.0X
+BytesToBytesMap (off Heap)                          936            950          23         22.4          44.6       0.3X
+BytesToBytesMap (on Heap)                           954            956           2         22.0          45.5       0.3X
+Aggregate HashMap                                    46             54          11        455.4           2.2       6.6X


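As a reading aid for the `Relative` column, here is a minimal Scala sketch that recomputes it for the new rows of the JDK 11 "aggregate with linear keys" table. It assumes `Relative` is the first case's best time divided by each case's best time, rounded to one decimal; the object and method names (`RelativeSpeedup`, `relative`) are hypothetical and not part of the benchmark code.

```scala
object RelativeSpeedup {
  // Best Time(ms) values copied from the new rows of the JDK 11
  // "aggregate with linear keys" table in this commit.
  val bestTimeMs: Seq[(String, Double)] = Seq(
    "codegen = F" -> 10873.0,
    "codegen = T, hashmap = F" -> 5906.0,
    "codegen = T, row-based hashmap = T" -> 2325.0,
    "codegen = T, vectorized hashmap = T" -> 1185.0
  )

  // Assumption: Relative = baseline best time / case best time,
  // rounded to one decimal place.
  def relative(baselineMs: Double, caseMs: Double): Double =
    math.round(baselineMs / caseMs * 10) / 10.0

  def main(args: Array[String]): Unit = {
    val baseline = bestTimeMs.head._2
    bestTimeMs.foreach { case (name, ms) =>
      println(s"$name: ${relative(baseline, ms)}X")
    }
  }
}
```

Under that assumption the sketch reproduces the 1.0X, 1.8X, 4.7X and 9.2X values reported in the table.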
@@ -2,142 +2,147 @@
 aggregate without grouping
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 agg w/o group:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-agg w/o group wholestage off                      47798          50190         NaN         43.9          22.8       1.0X
-agg w/o group wholestage on                        1091           1128          28       1922.6           0.5      43.8X
+agg w/o group wholestage off                      53440          63455         NaN         39.2          25.5       1.0X
+agg w/o group wholestage on                        1157           1216          39       1812.5           0.6      46.2X


 ================================================================================================
 stat functions
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 stddev:                                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-stddev wholestage off                              7884           7959         106         13.3          75.2       1.0X
-stddev wholestage on                               1012           1072          34        103.6           9.6       7.8X
+stddev wholestage off                              7920           7947          39         13.2          75.5       1.0X
+stddev wholestage on                               1147           1160          11         91.4          10.9       6.9X

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 kurtosis:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-kurtosis wholestage off                           34023          34576         783          3.1         324.5       1.0X
-kurtosis wholestage on                             1092           1121          30         96.1          10.4      31.2X
+kurtosis wholestage off                           35143          35319         250          3.0         335.1       1.0X
+kurtosis wholestage on                             1239           1258          20         84.6          11.8      28.4X


 ================================================================================================
 aggregate with linear keys
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Aggregate w keys:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        9309           9379          99          9.0         111.0       1.0X
-codegen = T hashmap = F                            5453           5643         223         15.4          65.0       1.7X
-codegen = T hashmap = T                            1084           1110          16         77.4          12.9       8.6X
+codegen = F                                        9147           9183          50          9.2         109.0       1.0X
+codegen = T, hashmap = F                           5794           5949         226         14.5          69.1       1.6X
+codegen = T, row-based hashmap = T                 1378           1397          14         60.9          16.4       6.6X
+codegen = T, vectorized hashmap = T                 996           1034          25         84.3          11.9       9.2X


 ================================================================================================
 aggregate with randomized keys
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Aggregate w keys:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                       10707          10950         344          7.8         127.6       1.0X
-codegen = T hashmap = F                            7295           7423         145         11.5          87.0       1.5X
-codegen = T hashmap = T                            2057           2199         199         40.8          24.5       5.2X
+codegen = F                                        9356           9425          98          9.0         111.5       1.0X
+codegen = T, hashmap = F                           5787           5912         176         14.5          69.0       1.6X
+codegen = T, row-based hashmap = T                 2569           2602          49         32.7          30.6       3.6X
+codegen = T, vectorized hashmap = T                2094           2128          27         40.1          25.0       4.5X


 ================================================================================================
 aggregate with string key
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Aggregate w string key:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        4570           4573           4          4.6         217.9       1.0X
-codegen = T hashmap = F                            3600           3686          74          5.8         171.7       1.3X
-codegen = T hashmap = T                            2384           2432          45          8.8         113.7       1.9X
+codegen = F                                        4270           4322          75          4.9         203.6       1.0X
+codegen = T, hashmap = F                           3241           3264          30          6.5         154.6       1.3X
+codegen = T, row-based hashmap = T                 2196           2247          32          9.6         104.7       1.9X
+codegen = T, vectorized hashmap = T                2291           2306          14          9.2         109.3       1.9X


 ================================================================================================
 aggregate with decimal key
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Aggregate w decimal key:                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        2966           3011          64          7.1         141.4       1.0X
-codegen = T hashmap = F                            1857           1908          73         11.3          88.5       1.6X
-codegen = T hashmap = T                             695            702           8         30.2          33.2       4.3X
+codegen = F                                        2993           3010          23          7.0         142.7       1.0X
+codegen = T, hashmap = F                           1940           1945           7         10.8          92.5       1.5X
+codegen = T, row-based hashmap = T                  738            752          20         28.4          35.2       4.1X
+codegen = T, vectorized hashmap = T                 620            650          21         33.8          29.6       4.8X


 ================================================================================================
 aggregate with multiple key types
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 Aggregate w multiple keys:                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                        7361           7385          35          2.8         351.0       1.0X
-codegen = T hashmap = F                            4525           4688         231          4.6         215.8       1.6X
-codegen = T hashmap = T                            3865           3977         159          5.4         184.3       1.9X
+codegen = F                                        6635           6636           2          3.2         316.4       1.0X
+codegen = T, hashmap = F                           4236           4269          47          5.0         202.0       1.6X
+codegen = T, row-based hashmap = T                 3118           3158          57          6.7         148.7       2.1X
+codegen = T, vectorized hashmap = T                3259           3278          27          6.4         155.4       2.0X


 ================================================================================================
 max function bytecode size of wholestagecodegen
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 max function bytecode size:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-codegen = F                                         451            489          23          1.5         688.5       1.0X
-codegen = T hugeMethodLimit = 10000                 211            229          19          3.1         322.4       2.1X
-codegen = T hugeMethodLimit = 1500                  203            226          20          3.2         309.5       2.2X
+codegen = F                                         467            492          33          1.4         712.4       1.0X
+codegen = T, hugeMethodLimit = 10000                216            231          19          3.0         329.7       2.2X
+codegen = T, hugeMethodLimit = 1500                 209            221           9          3.1         319.0       2.2X


 ================================================================================================
 cube
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 cube:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-cube wholestage off                                2479           2548          97          2.1         472.9       1.0X
-cube wholestage on                                 1487           1567          62          3.5         283.7       1.7X
+cube wholestage off                                2490           2529          56          2.1         474.8       1.0X
+cube wholestage on                                 1401           1416          22          3.7         267.3       1.8X


 ================================================================================================
 hash and BytesToBytesMap
 ================================================================================================

-OpenJDK 64-Bit Server VM 1.8.0_282-b08 on Linux 5.4.0-1043-azure
-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
+OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
+Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
 BytesToBytesMap:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-UnsafeRowhash                                       826            837          16         25.4          39.4       1.0X
-murmur3 hash                                        537            553          11         39.1          25.6       1.5X
-fast hash                                           559            572          14         37.5          26.6       1.5X
-arrayEqual                                         1665           1728          90         12.6          79.4       0.5X
-Java HashMap (Long)                                 732            739           7         28.7          34.9       1.1X
-Java HashMap (two ints)                             682            694          15         30.7          32.5       1.2X
-Java HashMap (UnsafeRow)                           1486           1499          19         14.1          70.9       0.6X
-LongToUnsafeRowMap (opt=false)                     1235           1240           8         17.0          58.9       0.7X
-LongToUnsafeRowMap (opt=true)                       718            736          17         29.2          34.2       1.2X
-BytesToBytesMap (off Heap)                          945            965          20         22.2          45.1       0.9X
-BytesToBytesMap (on Heap)                           870            895          28         24.1          41.5       0.9X
-Aggregate HashMap                                    64             71           5        325.6           3.1      12.8X
+UnsafeRowhash                                       259            264           5         81.0          12.3       1.0X
+murmur3 hash                                        113            121           3        185.7           5.4       2.3X
+fast hash                                            84             87           2        249.8           4.0       3.1X
+arrayEqual                                          172            180           4        121.9           8.2       1.5X
+Java HashMap (Long)                                 155            161           5        135.2           7.4       1.7X
+Java HashMap (two ints)                             147            157           8        142.6           7.0       1.8X
+Java HashMap (UnsafeRow)                            739            742           4         28.4          35.2       0.4X
+LongToUnsafeRowMap (opt=false)                      489            491           3         42.9          23.3       0.5X
+LongToUnsafeRowMap (opt=true)                        93            100           6        224.8           4.4       2.8X
+BytesToBytesMap (off Heap)                          882            896          16         23.8          42.1       0.3X
+BytesToBytesMap (on Heap)                           833            863          36         25.2          39.7       0.3X
+Aggregate HashMap                                    66             69           1        317.0           3.2       3.9X


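To make the row-based versus vectorized comparison in the JDK 8 results easier to scan, the following Scala sketch computes the ratio of the two best times per key type. The numbers are copied from the new rows above; the names `RowBasedVsVectorized`, `ratio` and `fasterMap` are hypothetical, and the gaps for the string-key and multiple-key cases are small enough that they may be within run-to-run noise.

```scala
object RowBasedVsVectorized {
  // (benchmark, row-based best ms, vectorized best ms), copied from the
  // new rows of the JDK 8 results in this commit.
  val rows: Seq[(String, Int, Int)] = Seq(
    ("aggregate with linear keys", 1378, 996),
    ("aggregate with randomized keys", 2569, 2094),
    ("aggregate with string key", 2196, 2291),
    ("aggregate with decimal key", 738, 620),
    ("aggregate with multiple key types", 3118, 3259)
  )

  // A ratio above 1.0 means the row-based map took longer than the vectorized one.
  def ratio(rowBasedMs: Int, vectorizedMs: Int): Double =
    rowBasedMs.toDouble / vectorizedMs

  // Names the map with the lower best time for one benchmark.
  def fasterMap(rowBasedMs: Int, vectorizedMs: Int): String =
    if (ratio(rowBasedMs, vectorizedMs) > 1.0) "vectorized" else "row-based"

  def main(args: Array[String]): Unit =
    rows.foreach { case (name, rowBased, vectorized) =>
      val r = math.round(ratio(rowBased, vectorized) * 100) / 100.0
      println(s"$name: row-based/vectorized = $r (${fasterMap(rowBased, vectorized)} faster)")
    }
}
```

This only compares single best-time measurements, so it is a way to read the table rather than a statistical claim about either map.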
@@ -80,7 +80,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = F", numIters = 3) { _ =>
+      benchmark.addCase("codegen = T, hashmap = F", numIters = 3) { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@@ -89,7 +89,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = T", numIters = 5) { _ =>
+      benchmark.addCase("codegen = T, row-based hashmap = T", numIters = 5) { _ =>
+        withSQLConf(
+          SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
+          SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
+          SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
+          f()
+        }
+      }
+
+      benchmark.addCase("codegen = T, vectorized hashmap = T", numIters = 5) { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@@ -116,7 +125,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = F", numIters = 3) { _ =>
+      benchmark.addCase("codegen = T, hashmap = F", numIters = 3) { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@@ -125,7 +134,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = T", numIters = 5) { _ =>
+      benchmark.addCase("codegen = T, row-based hashmap = T", numIters = 5) { _ =>
+        withSQLConf(
+          SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
+          SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
+          SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
+          f()
+        }
+      }
+
+      benchmark.addCase("codegen = T, vectorized hashmap = T", numIters = 5) { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@@ -151,7 +169,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = F", numIters = 3) { _ =>
+      benchmark.addCase("codegen = T, hashmap = F", numIters = 3) { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@@ -160,7 +178,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = T", numIters = 5) { _ =>
+      benchmark.addCase("codegen = T, row-based hashmap = T", numIters = 5) { _ =>
+        withSQLConf(
+          SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
+          SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
+          SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
+          f()
+        }
+      }
+
+      benchmark.addCase("codegen = T, vectorized hashmap = T", numIters = 5) { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@@ -186,7 +213,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = F") { _ =>
+      benchmark.addCase("codegen = T, hashmap = F") { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@@ -195,7 +222,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = T") { _ =>
+      benchmark.addCase("codegen = T, row-based hashmap = T") { _ =>
+        withSQLConf(
+          SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
+          SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
+          SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
+          f()
+        }
+      }
+
+      benchmark.addCase("codegen = T, vectorized hashmap = T") { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@@ -231,7 +267,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = F") { _ =>
+      benchmark.addCase("codegen = T, hashmap = F") { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "false",
@@ -240,7 +276,16 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hashmap = T") { _ =>
+      benchmark.addCase("codegen = T, row-based hashmap = T") { _ =>
+        withSQLConf(
+          SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
+          SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
+          SQLConf.ENABLE_VECTORIZED_HASH_MAP.key -> "false") {
+          f()
+        }
+      }
+
+      benchmark.addCase("codegen = T, vectorized hashmap = T") { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.ENABLE_TWOLEVEL_AGG_MAP.key -> "true",
@@ -291,7 +336,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hugeMethodLimit = 10000") { _ =>
+      benchmark.addCase("codegen = T, hugeMethodLimit = 10000") { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key -> "10000") {
@@ -299,7 +344,7 @@ object AggregateBenchmark extends SqlBasedBenchmark {
         }
       }

-      benchmark.addCase("codegen = T hugeMethodLimit = 1500") { _ =>
+      benchmark.addCase("codegen = T, hugeMethodLimit = 1500") { _ =>
         withSQLConf(
           SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true",
           SQLConf.WHOLESTAGE_HUGE_METHOD_LIMIT.key -> "1500") {
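The three hash-map cases in the hunks above differ only in the settings passed to `withSQLConf`. The sketch below models that pattern without Spark: `conf` and `withConf` are hypothetical stand-ins (set the values, run the body, restore the previous state), and the string keys simply reuse the `SQLConf` constant names from the diff as placeholders rather than real configuration keys. The `"true"` value for `ENABLE_VECTORIZED_HASH_MAP` in the vectorized case is an assumption, because the diff context ends before that line, and the "hashmap = F" case may set a third value that is likewise not visible here.

```scala
import scala.collection.mutable

object HashMapCases {
  // Hypothetical stand-in for a session's SQL configuration (not Spark's SQLConf).
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Hypothetical helper modeled on withSQLConf: apply the settings,
  // run the body, then restore whatever was there before.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(v)) => conf(k) = v
      case (k, None) => conf.remove(k)
    }
  }

  // Case labels are taken from the diff; keys are placeholder names.
  val cases: Seq[(String, Seq[(String, String)])] = Seq(
    "codegen = T, hashmap = F" -> Seq(
      "WHOLESTAGE_CODEGEN_ENABLED" -> "true",
      "ENABLE_TWOLEVEL_AGG_MAP" -> "false"),
    "codegen = T, row-based hashmap = T" -> Seq(
      "WHOLESTAGE_CODEGEN_ENABLED" -> "true",
      "ENABLE_TWOLEVEL_AGG_MAP" -> "true",
      "ENABLE_VECTORIZED_HASH_MAP" -> "false"),
    "codegen = T, vectorized hashmap = T" -> Seq(
      "WHOLESTAGE_CODEGEN_ENABLED" -> "true",
      "ENABLE_TWOLEVEL_AGG_MAP" -> "true",
      // Assumption: not visible in the diff context above.
      "ENABLE_VECTORIZED_HASH_MAP" -> "true")
  )

  def main(args: Array[String]): Unit =
    cases.foreach { case (label, settings) =>
      withConf(settings: _*) {
        println(s"$label -> ${conf.toMap}")
      }
    }
}
```

Keeping the cases as data makes it clear that the row-based case is the two-level map with the vectorized map switched off, which is the configuration this commit adds to the benchmark.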