spark-instrumented-optimizer/core
Baohe Zhang 90b0c26b22 [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster
### What changes were proposed in this pull request?
Add a new class, HybridStore, to make the history server faster when loading event log files. When rebuilding application state from event logs, HybridStore first writes data to InMemoryStore, then uses a background thread to dump the data to LevelDB once writing to InMemoryStore has completed. HybridStore serves content faster at the cost of higher memory use, so it is only safe to enable when the cluster is not under heavy load.
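
The write-then-dump flow described above can be sketched roughly as follows. This is a minimal illustration, not the HybridStore code added by this PR: the class name `HybridStoreSketch`, its methods, and the use of plain maps in place of InMemoryStore and LevelDB are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a hybrid key-value store; not Spark's HybridStore. */
class HybridStoreSketch {
    // Fast store that receives every write while the event log is replayed.
    private final Map<String, String> inMemory = new ConcurrentHashMap<>();
    // Stand-in for the disk-backed store (LevelDB in the PR); a plain map here.
    private final Map<String, String> disk = new ConcurrentHashMap<>();
    // Set to true only after the background copy has finished.
    private volatile boolean useDisk = false;
    private final CountDownLatch dumped = new CountDownLatch(1);

    void write(String key, String value) {
        if (useDisk) {
            disk.put(key, value);
        } else {
            inMemory.put(key, value);
        }
    }

    String read(String key) {
        // Check memory first; once it has been cleared, the disk copy is complete.
        String value = inMemory.get(key);
        return value != null ? value : disk.get(key);
    }

    /**
     * Call once replay is complete. Copies everything to the disk store on a
     * background thread. Assumes no writes arrive while the copy is running.
     */
    void switchToDisk() {
        Thread dumper = new Thread(() -> {
            disk.putAll(inMemory);   // copy first ...
            useDisk = true;          // ... then route new writes to disk ...
            inMemory.clear();        // ... and only then release the memory
            dumped.countDown();
        }, "hybrid-store-dump");
        dumper.setDaemon(true);
        dumper.start();
    }

    /** Waits for the background copy; returns false on timeout or interrupt. */
    boolean awaitDump(long timeoutMs) {
        try {
            return dumped.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    boolean servingFromDisk() {
        return useDisk;
    }
}
```

In this sketch, reads stay available during the copy because the in-memory data is cleared only after the disk copy is complete. A real implementation would also need to handle failures during the dump and to bound memory use.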

### Why are the changes needed?
HybridStore can greatly reduce event log loading time, especially for large log files. In general, the UI loads 4x to 6x faster for large log files. Detailed results are shown in the comments.

### Does this PR introduce any user-facing change?
This PR adds new configs `spark.history.store.hybridStore.enabled` and `spark.history.store.hybridStore.maxMemoryUsage`.
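
As a rough illustration, the two configs might be set in the history server's `spark-defaults.conf` as shown below. The values and the store path are assumptions chosen for the example, not recommendations from this PR. The sketch also assumes a local disk store directory is configured through the existing `spark.history.store.path` setting.

```
# Illustrative values only; the path below is hypothetical
spark.history.store.path                          /var/spark/history-store
spark.history.store.hybridStore.enabled           true
spark.history.store.hybridStore.maxMemoryUsage    2g
```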

### How was this patch tested?
A test suite for HybridStore is added. I also tested it manually with 3.1.0 on macOS.

This is a follow-up for the work done by Hieu Huynh in 2019.

Closes #28412 from baohe-zhang/SPARK-31608.

Authored-by: Baohe Zhang <baohe.zhang@verizonmedia.com>
Signed-off-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
2020-07-15 07:51:13 +09:00
benchmarks [SPARK-29576][CORE] Use Spark's CompressionCodec for Ser/Deser of MapOutputStatus 2019-10-23 18:17:37 -07:00
src [SPARK-31608][CORE][WEBUI] Add a new type of KVStore to make loading UI faster 2020-07-15 07:51:13 +09:00
pom.xml [SPARK-31765][WEBUI][TEST-MAVEN] Upgrade HtmlUnit >= 2.37.0 2020-06-11 18:27:53 -05:00