spark-instrumented-optimizer/dev/deps
yangjie01 01cf6f4c6b [SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache
### What changes were proposed in this pull request?
There are 3 ways to use Guava cache in spark code:

1. `Loadingcache` is the main way to use Guava cache in spark code and the key usages are as follows:
  a. `LoadingCache` with `maximumsize` data eviction policy, such as `appCache` in `ApplicationCache`, `cache` in `Codegenerator`
  b. `LoadingCache` with `maximumWeight` data eviction policy, such as `shuffleIndexCache` in `ExternalShuffleBlockResolver`
  c. `LoadingCache` with 'expireAfterWrite' data eviction policy, such as `tableRelationCache` in `SessionCatalog`
2. `ManualCache` is another way to use Guava cache in spark code and the key usage is `cache` in `SharedInMemoryCache`, it use to caches partition file statuses in memory

3. The last use way is `hadoopJobMetadata` in `SparkEnv`, it uses Guava Cache to build a `soft-reference map`.

The goal of this pr is use `Caffeine` instead of `Guava Cache` because `Caffeine` is faster than `Guava Cache` from benchmarks, the main changes as follows:

1. Add `Caffeine` deps to maven `pom.xml`

2. Use `Caffeine` instead of Guava `LoadingCache`, `ManualCache` and soft-reference map in `SparkEnv`

3. Add `LocalCacheBenchmark` to compare performance of `Loadingcache` between `Guava Cache` and `Caffeine`

### Why are the changes needed?
`Caffeine` is faster than `Guava Cache` from benchmarks

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Pass the Jenkins or GitHub Action
- Add `LocalCacheBenchmark` to compare performance of `Loadingcache` between `Guava Cache` and `Caffeine`

Closes #31517 from LuciferYang/guava-cache-to-caffeine.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Holden Karau <hkarau@netflix.com>
2021-08-04 12:01:44 -07:00
..
spark-deps-hadoop-2.7-hive-2.3 [SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache 2021-08-04 12:01:44 -07:00
spark-deps-hadoop-3.2-hive-2.3 [SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache 2021-08-04 12:01:44 -07:00