From eb5558ed35d51215de2ab12156e477cd7130a66a Mon Sep 17 00:00:00 2001
From: Dongjoon Hyun
Date: Sat, 6 Feb 2021 23:55:17 -0800
Subject: [PATCH] [SPARK-34390][CORE] Enable Zstandard buffer pool by default

### What changes were proposed in this pull request?

This PR aims to enable ZStandard JNI BufferPool by default in Apache Spark 3.2.0.

### Why are the changes needed?

**1. SPEED UP**

SPARK-34387 added [ZStandardBenchmark](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/io/ZStandardBenchmark.scala), which shows the speed-up on both Java8 and Java11.

**2. MEMORY USAGE**

The following are the memory usage graphs from running [ZStandardBenchmark](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/io/ZStandardBenchmark.scala) on Java11, with N increased to 100000 to make the difference easier to see. In the charts, the first half is the memory consumption without the buffer pool and the second half is the consumption with the buffer pool. The difference is noticeable.

```scala
- val N = 10000
+ val N = 100000
```

- Compression
![Screenshot from 2021-02-06 18-41-17](https://user-images.githubusercontent.com/9700541/107134909-0c4cfb00-68ab-11eb-9273-82cbecdebfba.png)

- Decompression
![Screenshot from 2021-02-06 18-43-05](https://user-images.githubusercontent.com/9700541/107134927-2edf1400-68ab-11eb-97c4-5cd101e91bb0.png)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the existing UTs.

Closes #31502 from dongjoon-hyun/SPARK-34390.
Authored-by: Dongjoon Hyun
Signed-off-by: Dongjoon Hyun
---
 .../main/scala/org/apache/spark/internal/config/package.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/internal/config/package.scala b/core/src/main/scala/org/apache/spark/internal/config/package.scala
index 0e15354274..1afad3084b 100644
--- a/core/src/main/scala/org/apache/spark/internal/config/package.scala
+++ b/core/src/main/scala/org/apache/spark/internal/config/package.scala
@@ -1685,7 +1685,7 @@ package object config {
       .doc("If true, enable buffer pool of ZSTD JNI library.")
       .version("3.2.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   private[spark] val IO_COMPRESSION_ZSTD_LEVEL =
     ConfigBuilder("spark.io.compression.zstd.level")
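
Because this change flips a default, an application that wants the previous behavior has to opt out explicitly. The sketch below is not part of the patch. It assumes the config key is `spark.io.compression.zstd.bufferPool.enabled` (the hunk does not show the key, so the name is taken from the Spark configuration documentation), that `spark-core` 3.2.0 or later is on the classpath, and the object name `ZstdBufferPoolOptOut` is hypothetical.

```scala
import org.apache.spark.SparkConf

// Hypothetical example object; not part of the Spark code base.
object ZstdBufferPoolOptOut {
  def main(args: Array[String]): Unit = {
    // loadDefaults = false: ignore spark.* system properties so the
    // example only reflects the values set here.
    val conf = new SparkConf(false)
      // Use Zstandard for Spark's internal I/O compression.
      .set("spark.io.compression.codec", "zstd")
      // Assumed key name (not visible in the diff): restore the
      // pre-3.2.0 behavior by disabling the ZSTD JNI buffer pool.
      .set("spark.io.compression.zstd.bufferPool.enabled", "false")

    // Read the value back to confirm what the application will see.
    println(conf.get("spark.io.compression.zstd.bufferPool.enabled"))
  }
}
```

The same setting could presumably be passed on the command line with `--conf`, which avoids a code change when comparing runs with and without the pool.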