f4d5aa4213
### What changes were proposed in this pull request?

Instead of using GZIP to compress the serialized `MapStatuses`, this PR uses ZStd, which provides a better compression ratio and faster compression. The original approach serializes and writes data directly into `GZIPOutputStream` in one step; however, compression is faster when the codec processes a bigger chunk of the data at once. As a result, in this PR the serialized data is first written into an uncompressed byte array, and then the data is compressed. For smaller `MapStatuses`, we find it's about 2x faster than the one-step approach.

Here is the benchmark result.

#### 20k map outputs, and each has 500 blocks

1. ZStd two steps in this PR: 0.402 ops/ms, 89,066 bytes
2. ZStd one step as the original approach: 0.370 ops/ms, 89,069 bytes
3. GZip: 0.092 ops/ms, 217,345 bytes

#### 20k map outputs, and each has 5 blocks

1. ZStd two steps in this PR: 0.9 ops/ms, 75,449 bytes
2. ZStd one step as the original approach: 0.38 ops/ms, 75,452 bytes
3. GZip: 0.21 ops/ms, 160,094 bytes

### Why are the changes needed?

Decrease the time for serializing the `MapStatuses` in large-scale jobs.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #26085 from dbtsai/mapStatus.

Lead-authored-by: DB Tsai <d_tsai@apple.com>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
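The one-step and two-step approaches described in the PR can be sketched as follows. This is a minimal illustration and not Spark's implementation: it is written in Python, uses `gzip` from the standard library as a stand-in for ZStd, uses `pickle` as a stand-in for Spark's serializer, and all function names are hypothetical. It shows only the structural difference between the two approaches and makes no claim about timing.

```python
import gzip
import io
import pickle


def serialize_one_step(statuses):
    """Write each serialized record straight into the compressing stream."""
    out = io.BytesIO()
    with gzip.GzipFile(fileobj=out, mode="wb") as gz:
        for status in statuses:
            # The codec sees many small writes.
            pickle.dump(status, gz)
    return out.getvalue()


def serialize_two_steps(statuses):
    """Serialize into a plain byte buffer first, then compress it in one call."""
    raw = io.BytesIO()
    for status in statuses:
        pickle.dump(status, raw)
    # The codec sees one large chunk.
    return gzip.compress(raw.getvalue())


def deserialize(blob, count):
    """Decompress and read back `count` records (works for either variant)."""
    stream = io.BytesIO(gzip.decompress(blob))
    return [pickle.load(stream) for _ in range(count)]
```

Both variants produce output that the same `deserialize` helper can read, so the two-step version changes only how the bytes reach the codec.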
67 lines
4.3 KiB
Plaintext
OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10 on Linux 4.15.0-1044-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
200000 MapOutputs, 10 blocks w/ broadcast:     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Serialization                                            236            245          18          0.8        1179.1       1.0X
Deserialization                                          842            885          37          0.2        4211.4       0.3X

Compressed Serialized MapStatus sizes: 400 bytes
Compressed Serialized Broadcast MapStatus sizes: 2 MB

OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10 on Linux 4.15.0-1044-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
200000 MapOutputs, 10 blocks w/o broadcast:    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Serialization                                            213            219           8          0.9        1065.1       1.0X
Deserialization                                          846            870          33          0.2        4228.6       0.3X

Compressed Serialized MapStatus sizes: 2 MB
Compressed Serialized Broadcast MapStatus sizes: 0 bytes

OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10 on Linux 4.15.0-1044-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
200000 MapOutputs, 100 blocks w/ broadcast:    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Serialization                                            624            709         167          0.3        3121.1       1.0X
Deserialization                                          885            908          22          0.2        4427.0       0.7X

Compressed Serialized MapStatus sizes: 418 bytes
Compressed Serialized Broadcast MapStatus sizes: 14 MB

OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10 on Linux 4.15.0-1044-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
200000 MapOutputs, 100 blocks w/o broadcast:   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Serialization                                            603            604           2          0.3        3014.9       1.0X
Deserialization                                          892            895           5          0.2        4458.7       0.7X

Compressed Serialized MapStatus sizes: 14 MB
Compressed Serialized Broadcast MapStatus sizes: 0 bytes

OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10 on Linux 4.15.0-1044-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
200000 MapOutputs, 1000 blocks w/ broadcast:   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Serialization                                           4612           4945         471          0.0       23061.0       1.0X
Deserialization                                         1493           1495           2          0.1        7466.3       3.1X

Compressed Serialized MapStatus sizes: 546 bytes
Compressed Serialized Broadcast MapStatus sizes: 123 MB

OpenJDK 64-Bit Server VM 1.8.0_222-8u222-b10-1ubuntu1~18.04.1-b10 on Linux 4.15.0-1044-aws
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
200000 MapOutputs, 1000 blocks w/o broadcast:  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Serialization                                           4452           4595         202          0.0       22261.4       1.0X
Deserialization                                         1464           1477          18          0.1        7321.4       3.0X

Compressed Serialized MapStatus sizes: 123 MB
Compressed Serialized Broadcast MapStatus sizes: 0 bytes
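
The derived columns in the tables above can be approximately reproduced from the Best Time column. The sketch below is an illustration in Python, not part of the benchmark itself. It assumes that the row count equals the 200000 map outputs in each table title, that Rate(M/s) and Per Row(ns) are computed from the best time, and that Relative is the best time of the first row divided by the best time of the given row. The helper names are hypothetical. Because the printed best times are rounded to whole milliseconds, results match the tables only after rounding.

```python
def per_row_ns(best_ms, rows):
    """Nanoseconds spent per row, from the best time in milliseconds."""
    return best_ms * 1e6 / rows


def rate_m_per_s(best_ms, rows):
    """Millions of rows processed per second."""
    return rows / (best_ms / 1000.0) / 1e6


def relative(baseline_best_ms, case_best_ms):
    """Speed relative to the first (baseline) row; above 1.0 means faster."""
    return baseline_best_ms / case_best_ms


ROWS = 200_000  # assumed: the number of map outputs in each table title

# First table (10 blocks w/ broadcast): Serialization 236 ms, Deserialization 842 ms.
print(round(per_row_ns(236, ROWS)))        # table reports 1179.1
print(round(rate_m_per_s(236, ROWS), 1))   # table reports 0.8
print(round(relative(236, 842), 1))        # table reports 0.3X
```

Read this way, deserialization is slower than serialization at 10 and 100 blocks (0.3X and 0.7X) and faster at 1000 blocks (about 3X), since serialization time grows with the block count much more steeply than deserialization time does.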