spark-instrumented-optimizer/sql/core/benchmarks/CompressionSchemeBenchmark-results.txt
Chao Sun a6d6ea3efe [SPARK-32802][SQL] Avoid using SpecificInternalRow in RunLengthEncoding#Encoder
### What changes were proposed in this pull request?

Currently `RunLengthEncoding#Encoder` uses `SpecificInternalRow` as a holder for the current value when calculating compression stats and doing the actual compression. It calls `ColumnType.copyField` and `ColumnType.getField` on the internal row, which incurs extra cost compared to operating directly on the internal type. This PR replaces the `SpecificInternalRow` with `T#InternalType` to avoid that cost.
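
For readers unfamiliar with run-length encoding, here is a minimal, language-neutral sketch in Python of the bookkeeping such an encoder does. It is not Spark's Scala implementation; the function name `run_length_encode` and the variables `last_value` and `last_run` are hypothetical. It only illustrates the idea of holding the current value in a plain local variable rather than in a row-like wrapper that has to be read and written through accessor calls.

```python
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def run_length_encode(values: Iterable[T]) -> List[Tuple[T, int]]:
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs: List[Tuple[T, int]] = []
    have_value = False   # becomes True once the first element has been seen
    last_value = None    # plain local holder for the current run's value
    last_run = 0         # length of the current run

    for value in values:
        if have_value and value == last_value:
            # Same value as the current run: extend the run.
            last_run += 1
        else:
            # A new run starts: flush the previous one, if any.
            if have_value:
                runs.append((last_value, last_run))
            last_value = value
            last_run = 1
            have_value = True

    # Flush the final run.
    if have_value:
        runs.append((last_value, last_run))
    return runs


if __name__ == "__main__":
    print(run_length_encode([1, 1, 2, 2, 2, 3]))
```

The comparison and assignment on `last_value` run once per input element, so any overhead attached to reading or copying the held value is paid for every row. That is presumably why removing the row wrapper shows up in the per-row numbers.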

### Why are the changes needed?

Operating on `SpecificInternalRow` carries a cost that hurts performance when `RunLengthEncoding` is used for compression.

With the change I see some improvements through `CompressionSchemeBenchmark`:

```diff
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 BOOLEAN Encode:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                    1              1           0      51957.0           0.0       1.0X
-RunLengthEncoding(2.502)                            549            555           9        122.2           8.2       0.0X
-BooleanBitSet(0.125)                                296            301           3        226.6           4.4       0.0X
+PassThrough(1.000)                                    2              2           0      42985.4           0.0       1.0X
+RunLengthEncoding(2.517)                            487            500          10        137.7           7.3       0.0X
+BooleanBitSet(0.125)                                348            353           4        192.8           5.2       0.0X

 OpenJDK 64-Bit Server VM 11.0.8+10-LTS on Mac OS X 10.15.5
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 SHORT Encode (Lower Skew):                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                    3              3           0      22779.9           0.0       1.0X
-RunLengthEncoding(1.520)                           1186           1192           9         56.6          17.7       0.0X
+PassThrough(1.000)                                    3              4           0      21216.6           0.0       1.0X
+RunLengthEncoding(1.493)                            882            931          50         76.1          13.1       0.0X

 OpenJDK 64-Bit Server VM 11.0.8+10-LTS on Mac OS X 10.15.5
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 SHORT Encode (Higher Skew):               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                    3              4           0      21352.2           0.0       1.0X
-RunLengthEncoding(2.009)                           1173           1175           3         57.2          17.5       0.0X
+PassThrough(1.000)                                    3              3           0      22388.6           0.0       1.0X
+RunLengthEncoding(2.015)                            924            941          23         72.6          13.8       0.0X

 OpenJDK 64-Bit Server VM 11.0.8+10-LTS on Mac OS X 10.15.5
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 INT Encode (Lower Skew):                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                    9             10           1       7410.1           0.1       1.0X
-RunLengthEncoding(1.000)                           1499           1502           4         44.8          22.3       0.0X
-DictionaryEncoding(0.500)                           621            630          11        108.0           9.3       0.0X
-IntDelta(0.250)                                     134            149          10        502.0           2.0       0.1X
+PassThrough(1.000)                                    9             10           1       7575.9           0.1       1.0X
+RunLengthEncoding(1.002)                            952            966          12         70.5          14.2       0.0X
+DictionaryEncoding(0.500)                           561            567           6        119.7           8.4       0.0X
+IntDelta(0.250)                                     129            134           3        521.9           1.9       0.1X

 OpenJDK 64-Bit Server VM 11.0.8+10-LTS on Mac OS X 10.15.5
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 INT Encode (Higher Skew):                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                    9             10           1       7668.3           0.1       1.0X
-RunLengthEncoding(1.332)                           1561           1685         175         43.0          23.3       0.0X
-DictionaryEncoding(0.501)                           616            642          21        108.9           9.2       0.0X
-IntDelta(0.250)                                     126            131           2        533.4           1.9       0.1X
+PassThrough(1.000)                                    9             10           1       7494.1           0.1       1.0X
+RunLengthEncoding(1.336)                            974            987          13         68.9          14.5       0.0X
+DictionaryEncoding(0.501)                           709            719          10         94.6          10.6       0.0X
+IntDelta(0.250)                                     127            132           4        528.4           1.9       0.1X

 OpenJDK 64-Bit Server VM 11.0.8+10-LTS on Mac OS X 10.15.5
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 LONG Encode (Lower Skew):                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                   18             19           1       3803.0           0.3       1.0X
-RunLengthEncoding(0.754)                           1526           1540          20         44.0          22.7       0.0X
-DictionaryEncoding(0.250)                           735            759          33         91.3          11.0       0.0X
-LongDelta(0.125)                                    126            129           2        530.8           1.9       0.1X
+PassThrough(1.000)                                   19             21           1       3543.5           0.3       1.0X
+RunLengthEncoding(0.747)                           1049           1058          12         63.9          15.6       0.0X
+DictionaryEncoding(0.250)                           620            634          17        108.2           9.2       0.0X
+LongDelta(0.125)                                    129            132           2        520.1           1.9       0.1X

 OpenJDK 64-Bit Server VM 11.0.8+10-LTS on Mac OS X 10.15.5
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 LONG Encode (Higher Skew):                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                   18             20           1       3705.4           0.3       1.0X
-RunLengthEncoding(1.002)                           1665           1669           6         40.3          24.8       0.0X
-DictionaryEncoding(0.251)                           890            901          11         75.4          13.3       0.0X
-LongDelta(0.125)                                    125            130           3        537.2           1.9       0.1X
+PassThrough(1.000)                                   18             20           2       3726.8           0.3       1.0X
+RunLengthEncoding(0.999)                           1076           1077           2         62.4          16.0       0.0X
+DictionaryEncoding(0.251)                           904            919          19         74.3          13.5       0.0X
+LongDelta(0.125)                                    125            131           4        536.5           1.9       0.1X

 OpenJDK 64-Bit Server VM 11.0.8+10-LTS on Mac OS X 10.15.5
 Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
 STRING Encode:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
 ------------------------------------------------------------------------------------------------------------------------
-PassThrough(1.000)                                   27             30           2       2497.1           0.4       1.0X
-RunLengthEncoding(0.892)                           3443           3587         204         19.5          51.3       0.0X
-DictionaryEncoding(0.167)                          2286           2290           6         29.4          34.1       0.0X
+PassThrough(1.000)                                   28             31           2       2430.2           0.4       1.0X
+RunLengthEncoding(0.889)                           1798           1800           3         37.3          26.8       0.0X
+DictionaryEncoding(0.167)                          1956           1959           4         34.3          29.1       0.0X
```

In the diff above, the `+` lines are the results with this PR applied. `RunLengthEncoding` encode time improves for every type, most of all for strings, where the best time drops from 3443 ms to 1798 ms.
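
To make the size of the improvement concrete, the following Python sketch computes the before/after ratio of the `RunLengthEncoding` best times listed in the diff. The numbers are copied from the `-` and `+` rows; the dictionary and function names are hypothetical and used only for illustration.

```python
# Best Time(ms) of RunLengthEncoding per benchmark, copied from the diff:
# (before this PR, after this PR).
rle_best_ms = {
    "BOOLEAN": (549, 487),
    "SHORT (Lower Skew)": (1186, 882),
    "SHORT (Higher Skew)": (1173, 924),
    "INT (Lower Skew)": (1499, 952),
    "INT (Higher Skew)": (1561, 974),
    "LONG (Lower Skew)": (1526, 1049),
    "LONG (Higher Skew)": (1665, 1076),
    "STRING": (3443, 1798),
}


def speedup(before_ms: float, after_ms: float) -> float:
    """Ratio of old to new time; values above 1.0 mean the new code is faster."""
    return before_ms / after_ms


speedups = {name: speedup(b, a) for name, (b, a) in rle_best_ms.items()}

if __name__ == "__main__":
    # Print the benchmarks ordered from largest to smallest speedup.
    for name, ratio in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<20} {ratio:.2f}x")
```

On these numbers the string case comes out at roughly 1.9x and the boolean case at roughly 1.1x. These are single runs on one machine, so the ratios are indicative rather than exact.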

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Relies on existing unit tests.

Closes #29654 from sunchao/SPARK-32802.

Authored-by: Chao Sun <sunchao@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-09-12 22:19:30 -07:00

================================================================================================
Compression Scheme Benchmark
================================================================================================
OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BOOLEAN Encode:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                    1              2           0      49671.6           0.0       1.0X
RunLengthEncoding(2.501)                            470            487          25        142.7           7.0       0.0X
BooleanBitSet(0.125)                                358            362           4        187.6           5.3       0.0X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BOOLEAN Decode:                           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                          90             95           5        746.2           1.3       1.0X
RunLengthEncoding                                   550            559           8        122.0           8.2       0.2X
BooleanBitSet                                      1082           1087           7         62.0          16.1       0.1X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
SHORT Encode (Lower Skew):                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                    3              4           0      20595.0           0.0       1.0X
RunLengthEncoding(1.495)                           1074           1087          19         62.5          16.0       0.0X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
SHORT Decode (Lower Skew):                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                         807            844          33         83.1          12.0       1.0X
RunLengthEncoding                                  1077           1078           1         62.3          16.0       0.7X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
SHORT Encode (Higher Skew):               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                    3              3           0      23144.6           0.0       1.0X
RunLengthEncoding(2.001)                           1067           1073           8         62.9          15.9       0.0X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
SHORT Decode (Higher Skew):               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                         793            811          16         84.7          11.8       1.0X
RunLengthEncoding                                  1099           1123          33         61.1          16.4       0.7X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
INT Encode (Lower Skew):                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                   10             11           1       6979.9           0.1       1.0X
RunLengthEncoding(1.000)                            985            994           9         68.1          14.7       0.0X
DictionaryEncoding(0.500)                           896            903          10         74.9          13.4       0.0X
IntDelta(0.250)                                     237            244           6        283.5           3.5       0.0X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
INT Decode (Lower Skew):                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                         791            795           3         84.8          11.8       1.0X
RunLengthEncoding                                  1111           1114           5         60.4          16.6       0.7X
DictionaryEncoding                                  641            650          17        104.7           9.6       1.2X
IntDelta                                            560            575          24        119.8           8.4       1.4X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
INT Encode (Higher Skew):                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                    9             10           1       7181.9           0.1       1.0X
RunLengthEncoding(1.336)                           1006           1006           1         66.7          15.0       0.0X
DictionaryEncoding(0.501)                          1034           1045          15         64.9          15.4       0.0X
IntDelta(0.250)                                     235            238           2        285.7           3.5       0.0X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
INT Decode (Higher Skew):                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                         829            832           3         81.0          12.3       1.0X
RunLengthEncoding                                  1199           1207          11         56.0          17.9       0.7X
DictionaryEncoding                                  725            726           1         92.6          10.8       1.1X
IntDelta                                            680            683           5         98.6          10.1       1.2X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
LONG Encode (Lower Skew):                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                   20             22           1       3405.6           0.3       1.0X
RunLengthEncoding(0.747)                           1097           1102           7         61.2          16.3       0.0X
DictionaryEncoding(0.250)                           854            933          74         78.6          12.7       0.0X
LongDelta(0.125)                                    322            328          11        208.5           4.8       0.1X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
LONG Decode (Lower Skew):                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                         839            843           4         80.0          12.5       1.0X
RunLengthEncoding                                  1234           1234           1         54.4          18.4       0.7X
DictionaryEncoding                                  806            809           3         83.3          12.0       1.0X
LongDelta                                           550            558           6        122.0           8.2       1.5X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
LONG Encode (Higher Skew):                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                   20             22           1       3319.5           0.3       1.0X
RunLengthEncoding(1.005)                           1153           1169          24         58.2          17.2       0.0X
DictionaryEncoding(0.251)                           923            930           9         72.7          13.7       0.0X
LongDelta(0.125)                                    327            332           4        205.0           4.9       0.1X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
LONG Decode (Higher Skew):                Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                         854            864          16         78.6          12.7       1.0X
RunLengthEncoding                                  1242           1244           3         54.0          18.5       0.7X
DictionaryEncoding                                  823            823           1         81.6          12.3       1.0X
LongDelta                                           640            651           8        104.8           9.5       1.3X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
STRING Encode:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough(1.000)                                   29             32           1       2279.8           0.4       1.0X
RunLengthEncoding(0.886)                           1723           1734          15         38.9          25.7       0.0X
DictionaryEncoding(0.167)                          2667           2690          33         25.2          39.7       0.0X

OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X 10.15.5
Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
STRING Decode:                            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
PassThrough                                        1847           1892          64         36.3          27.5       1.0X
RunLengthEncoding                                  2305           2332          38         29.1          34.3       0.8X
DictionaryEncoding                                 2134           2150          22         31.5          31.8       0.9X