[SPARK-34390][CORE] Enable Zstandard buffer pool by default

### What changes were proposed in this pull request?

This PR aims to enable ZStandard JNI BufferPool by default in Apache Spark 3.2.0.
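Because the default flips from `false` to `true`, a deployment that wants the old behavior has to opt out explicitly. The fragment below is a minimal sketch of such an opt-out. It assumes the flag is `spark.io.compression.zstd.bufferPool.enabled`, a key name that is not visible in the hunk shown in this commit and should be checked against `package object config`. It also assumes `spark-core` is on the classpath, and `ZstdPoolOptOut` is a hypothetical wrapper name.

```scala
import org.apache.spark.SparkConf

object ZstdPoolOptOut {
  // Assumption: the key name below is taken from Spark's configuration docs,
  // not from the hunk in this commit.
  val conf: SparkConf = new SparkConf()
    .set("spark.io.compression.codec", "zstd")
    // Restore the pre-3.2.0 default (no buffer pool).
    .set("spark.io.compression.zstd.bufferPool.enabled", "false")
}
```

The same key should also be settable through `--conf` on `spark-submit`.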

### Why are the changes needed?

**1. SPEED UP**
SPARK-34387 added [ZStandardBenchmark](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/io/ZStandardBenchmark.scala), which shows the speed-up on both Java 8 and Java 11.

**2. MEMORY USAGE**
The following memory usage graphs were captured while running [ZStandardBenchmark](https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/io/ZStandardBenchmark.scala) on Java 11, with N increased to 100000 to make the difference easier to see. In each chart, the first half shows memory consumption without the buffer pool and the second half shows memory consumption with the buffer pool. The difference is noticeable.
```scala
-  val N = 10000
+  val N = 100000
```

- Compression
![Screenshot from 2021-02-06 18-41-17](https://user-images.githubusercontent.com/9700541/107134909-0c4cfb00-68ab-11eb-9273-82cbecdebfba.png)

- Decompression
![Screenshot from 2021-02-06 18-43-05](https://user-images.githubusercontent.com/9700541/107134927-2edf1400-68ab-11eb-97c4-5cd101e91bb0.png)
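The general idea behind a buffer pool can be sketched in plain Scala. This is a hypothetical, simplified pool written only for illustration. It is not the ZSTD JNI implementation, and the names `SimpleBufferPool` and `BufferPoolDemo` are made up. It shows only that many short-lived streams can share one buffer once buffers are returned to a pool instead of being allocated anew for each stream.

```scala
import java.nio.ByteBuffer

// Hypothetical, simplified pool; the real ZSTD JNI pool is more elaborate.
final class SimpleBufferPool {
  private val free = new java.util.ArrayDeque[ByteBuffer]()
  // Counts fresh allocations, for illustration only.
  var allocations: Int = 0

  def get(capacity: Int): ByteBuffer = synchronized {
    val cached = free.peekFirst() // null when the pool is empty
    if (cached != null && cached.capacity() >= capacity) {
      // Reuse the most recently released buffer.
      free.pollFirst()
      cached.clear()
      cached
    } else {
      allocations += 1
      ByteBuffer.allocate(capacity)
    }
  }

  def release(buf: ByteBuffer): Unit = synchronized {
    free.push(buf)
  }
}

object BufferPoolDemo {
  // Simulates `streams` short-lived streams that each need one buffer
  // and returns how many buffers were actually allocated.
  def allocationsWithPool(streams: Int, capacity: Int): Int = {
    val pool = new SimpleBufferPool
    (1 to streams).foreach { _ =>
      val buf = pool.get(capacity)
      buf.put(0, 1.toByte) // stand-in for real compression work
      pool.release(buf)
    }
    pool.allocations
  }

  def main(args: Array[String]): Unit = {
    println(allocationsWithPool(streams = 1000, capacity = 128 * 1024))
  }
}
```

Without a pool, the same loop would allocate one buffer per stream.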

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the existing UTs.

Closes #31502 from dongjoon-hyun/SPARK-34390.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Dongjoon Hyun 2021-02-06 23:55:17 -08:00
parent 6e05e99143
commit eb5558ed35

@@ -1685,7 +1685,7 @@ package object config {
       .doc("If true, enable buffer pool of ZSTD JNI library.")
       .version("3.2.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   private[spark] val IO_COMPRESSION_ZSTD_LEVEL =
     ConfigBuilder("spark.io.compression.zstd.level")