[SPARK-30481][DOCS][FOLLOWUP] Document event log compaction into new section of monitoring.md
### What changes were proposed in this pull request?

This is a follow-up PR for a review comment on #27208: https://github.com/apache/spark/pull/27208#pullrequestreview-347451714

This PR documents the new feature `Eventlog Compaction` in a new section of `monitoring.md`, as it only has one configuration on the SHS side and it's hard to explain everything in the description of that single configuration.

### Why are the changes needed?

Event log compaction lacks documentation of what it is and how it helps. This PR explains it.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Built docs via jekyll.

> change on the new section

<img width="951" alt="Screen Shot 2020-02-16 at 2 23 18 PM" src="https://user-images.githubusercontent.com/1317309/74599587-eb9efa80-50c7-11ea-942c-f7744268e40b.png">

> change on the table

<img width="1126" alt="Screen Shot 2020-01-30 at 5 08 12 PM" src="https://user-images.githubusercontent.com/1317309/73431190-2e9c6680-4383-11ea-8ce0-815f10917ddd.png">

Closes #27398 from HeartSaVioR/SPARK-30481-FOLLOWUP-document-new-feature.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
parent 8f247e5d36
commit 02f8165343
@@ -95,6 +95,48 @@ The history server can be configured as follows:
</tr>
</table>
### Applying compaction on rolling event log files

A long-running application (e.g. streaming) can bring a huge single event log file, which may cost a lot to maintain and
also requires a lot of resources to replay on each update in the Spark History Server.

Enabling <code>spark.eventLog.rolling.enabled</code> and <code>spark.eventLog.rolling.maxFileSize</code> would
let you have rolling event log files instead of a single huge event log file, which may help some scenarios on its own,
but it still doesn't help you reduce the overall size of the logs.
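
As an illustrative sketch (the log directory path is a made-up example, and `128m` is just a sample size, not a recommendation), rolling event logs would be enabled on the application side with configuration like:

```
spark.eventLog.enabled              true
spark.eventLog.dir                  hdfs://namenode/shared/spark-logs
spark.eventLog.rolling.enabled      true
spark.eventLog.rolling.maxFileSize  128m
```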

Spark History Server can apply compaction on the rolling event log files to reduce the overall size of the
logs, by setting the configuration <code>spark.history.fs.eventLog.rolling.maxFilesToRetain</code> on the
Spark History Server.
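
For illustration, the History Server side might then be configured as below (a hedged sketch: the directory path is hypothetical, and `2` is only an example value; the default of `Int.MaxValue` effectively keeps all files non-compacted):

```
spark.history.fs.logDirectory                       hdfs://namenode/shared/spark-logs
spark.history.fs.eventLog.rolling.maxFilesToRetain  2
```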

Details will be described below, but please note up front that compaction is a LOSSY operation.
Compaction will discard some events which will no longer be seen on the UI - you may want to check which events will be discarded
before enabling the option.

When the compaction happens, the History Server lists all the available event log files for the application, and considers
the event log files with an index lower than the smallest index among the files to be retained as the target of compaction.
For example, if application A has 5 event log files and <code>spark.history.fs.eventLog.rolling.maxFilesToRetain</code> is set to 2, then the first 3 log files will be selected to be compacted.
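
The selection rule above can be sketched in a few lines of Python (a plain illustration of the arithmetic, not Spark's actual implementation; the function name is made up):

```python
def select_compaction_targets(event_log_indices, max_files_to_retain):
    """Return indices of event log files selected for compaction,
    keeping the last `max_files_to_retain` files non-compacted."""
    indices = sorted(event_log_indices)
    if len(indices) <= max_files_to_retain:
        return []  # nothing to compact; every file is retained
    # The smallest index among the retained files; everything below it is compacted.
    smallest_retained = indices[-max_files_to_retain]
    return [i for i in indices if i < smallest_retained]

# Application A: 5 rolling event log files with maxFilesToRetain = 2
print(select_compaction_targets([1, 2, 3, 4, 5], 2))  # [1, 2, 3]
```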

Once it selects the targets, it analyzes them to figure out which events can be excluded, and rewrites them
into one compact file, discarding the events which are decided to be excluded.

The compaction tries to exclude the events which point to outdated data. As of now, the below describes the candidate events to be excluded:

* Events for jobs which are finished, and related stage/task events
* Events for executors which are terminated
* Events for SQL executions which are finished, and related job/stage/task events

Once rewriting is done, the original log files will be deleted in a best-effort manner. The History Server may not be able to delete
the original log files, but this will not affect the operation of the History Server.

Please note that the Spark History Server may not compact the old event log files if it figures out that not a lot of space
would be reduced during compaction. For a streaming query we normally expect compaction
to run, as each micro-batch will trigger one or more jobs which will be finished shortly, but in many cases compaction won't run
for a batch query.

Please also note that this is a new feature introduced in Spark 3.0, and may not be completely stable. Under some circumstances,
the compaction may exclude more events than you expect, leading to some UI issues on the History Server for the application.
Use it with caution.

### Spark History Server Configuration Options

Security options for the Spark History Server are covered in more detail in the
@@ -303,19 +345,8 @@ Security options for the Spark History Server are covered more detail in the
<td>Int.MaxValue</td>
<td>
The maximum number of event log files which will be retained as non-compacted. By default,
all event log files will be retained. The lowest value is 1 for technical reasons.<br/>
Please read the section "Applying compaction on rolling event log files" for more details.
</td>
</tr>
</table>