spark-instrumented-optimizer

History

Shixiong Zhu 4e3cb7a5d9 [SPARK-15317][CORE] Don't store accumulators for every task in listeners

## What changes were proposed in this pull request?

In general, the Web UI doesn't need to store the Accumulator/AccumulableInfo for every task. It only needs the Accumulator values. This PR creates new UIData classes to store the necessary fields and makes `JobProgressListener` store only these new classes, so that `JobProgressListener` no longer stores Accumulator/AccumulableInfo and its size becomes much smaller. This PR also eliminates `AccumulableInfo` from `SQLListener` so that no references to those unused `AccumulableInfo`s are kept.

## How was this patch tested?

I ran two tests reported in JIRA locally.

The first one is:

```
val data = spark.range(0, 10000, 1, 10000)
data.cache().count()
```

The retained size of JobProgressListener decreases from 60.7M to 6.9M.

The second one is:

```
import org.apache.spark.ml.CC
import org.apache.spark.sql.SQLContext
val sqlContext = SQLContext.getOrCreate(sc)
CC.runTest(sqlContext)
```

This test won't cause OOM after applying this patch.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #13153 from zsxwing/memory.

2016-05-19 12:05:17 -07:00
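The idea described in the commit message, keeping only the fields the UI displays instead of the full per-task accumulator objects, can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `FullAccumulableInfo`, `AccumulatorUIData`, and `SlimListener` are hypothetical stand-ins, and the real Spark classes have different fields and signatures.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a heavyweight per-task accumulator record.
final case class FullAccumulableInfo(
    id: Long,
    name: Option[String],
    update: Option[Any],
    value: Option[Any])

// Hypothetical slim record: only what a UI page would render.
final case class AccumulatorUIData(name: String, value: String)

// Hypothetical listener that retains slim records rather than the full objects.
final class SlimListener {
  private val byTask = mutable.HashMap.empty[Long, Seq[AccumulatorUIData]]

  // Copy the displayable fields; unnamed or valueless accumulators are dropped.
  private def toUIData(info: FullAccumulableInfo): Option[AccumulatorUIData] =
    for {
      name <- info.name
      value <- info.value
    } yield AccumulatorUIData(name, value.toString)

  // Called once per finished task; the full infos are not referenced afterwards.
  def onTaskEnd(taskId: Long, infos: Seq[FullAccumulableInfo]): Unit =
    byTask(taskId) = infos.flatMap(toUIData)

  def accumulators(taskId: Long): Seq[AccumulatorUIData] =
    byTask.getOrElse(taskId, Seq.empty)
}
```

Because the listener holds only the copied strings, the original records become eligible for garbage collection once the task-end event has been handled, which is the kind of saving the retained-size measurement above reflects.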
java/org/apache/spark	[SPARK-15357] Cooperative spilling should check consumer memory mode	2016-05-18 09:44:21 -07:00
resources/org/apache/spark	[SPARK-15373][WEB UI] Spark UI should show consistent timezones.	2016-05-18 23:19:55 +01:00
scala/org/apache/spark	[SPARK-15317][CORE] Don't store accumulators for every task in listeners	2016-05-19 12:05:17 -07:00