[SPARK-7296] Add timeline visualization for stages in the UI.

This PR builds on #2342 by adding a timeline view for the Stage page,
showing how tasks spend their time.

With this timeline, we can understand following things of a Stage.

* When/where each task ran
* Total duration of each task
* Proportion of the time each task spends

Also, this timeline view can scrollable and zoomable.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #5843 from sarutak/stage-page-timeline and squashes the following commits:

4ba9604 [Kousuke Saruta] Fixed the order of legends
16bb552 [Kousuke Saruta] Removed border of legend area
2e5d605 [Kousuke Saruta] Modified warning message
16cb2e6 [Kousuke Saruta] Merge branch 'master' of https://github.com/apache/spark into stage-page-timeline
7ae328f [Kousuke Saruta] Modified code style
d5f794a [Kousuke Saruta] Fixed performance issues more
64e6642 [Kousuke Saruta] Merge branch 'master' of https://github.com/apache/spark into stage-page-timeline
e4a3354 [Kousuke Saruta] minor code style change
878e3b8 [Kousuke Saruta] Fixed a bug that tooltip remains
b9d8f1b [Kousuke Saruta] Fixed performance issue
ac8842b [Kousuke Saruta] Fixed layout
2319739 [Kousuke Saruta] Modified appearances more
81903ab [Kousuke Saruta] Modified appearances
a79dcc3 [Kousuke Saruta] Modified appearance
55a390c [Kousuke Saruta] Ignored scalastyle for a line-comment
29eae3e [Kousuke Saruta] limited to longest 1000 tasks
2a9e376 [Kousuke Saruta] Minor cleanup
385b6d2 [Kousuke Saruta] Added link feature
ba1ac3e [Kousuke Saruta] Fixed style
2ae8520 [Kousuke Saruta] Updated bootstrap-tooltip.js from 2.2.2 to 2.3.2
af430f1 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into stage-page-timeline
e694b8e [Kousuke Saruta] Added timeline view to StagePage
8f6610c [Kousuke Saruta] Fixed conflict
b587cf2 [Kousuke Saruta] initial commit
11fe67d [Kousuke Saruta] Fixed conflict
79ac03d [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
a91abd3 [Kousuke Saruta] Merge branch 'master' of https://github.com/apache/spark into timeline-viewer-feature
ef34a5b [Kousuke Saruta] Implement tooltip using bootstrap
b09d0c5 [Kousuke Saruta] Move `stroke` and `fill` attribute of rect elements to css
d3c63c8 [Kousuke Saruta] Fixed a little bit bugs
a36291b [Kousuke Saruta] Merge branch 'master' of https://github.com/apache/spark into timeline-viewer-feature
28714b6 [Kousuke Saruta] Fixed highlight issue
0dc4278 [Kousuke Saruta] Addressed most of Patrics's feedbacks
8110acf [Kousuke Saruta] Added scroll limit to Job timeline
974a64a [Kousuke Saruta] Removed unused function
ee7a7f0 [Kousuke Saruta] Refactored
6a91872 [Kousuke Saruta] Temporary commit
6693f34 [Kousuke Saruta] Added link to job/stage box in the timeline in order to move to corresponding row when we click
8f88222 [Kousuke Saruta] Added job/stage description
aeed4b1 [Kousuke Saruta] Removed stage timeline
fc1696c [Kousuke Saruta] Merge branch 'timeline-viewer-feature' of github.com:sarutak/spark into timeline-viewer-feature
999ccd4 [Kousuke Saruta] Improved scalability
0fc6a31 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
19815ae [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
68b7540 [Kousuke Saruta] Merge branch 'timeline-viewer-feature' of github.com:sarutak/spark into timeline-viewer-feature
52b5f0b [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
dec85db [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
fcdab7d [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
dab7cc1 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
09cce97 [Kousuke Saruta] Cleanuped
16f82cf [Kousuke Saruta] Cleanuped
9fb522e [Kousuke Saruta] Cleanuped
d05f2c2 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into timeline-viewer-feature
e85e9aa [Kousuke Saruta] Cleanup: Added TimelineViewUtils.scala
a76e569 [Kousuke Saruta] Removed unused setting in timeline-view.css
5ce1b21 [Kousuke Saruta] Added vis.min.js, vis.min.css and vis.map to .rat-exclude
082f709 [Kousuke Saruta] Added Timeline-View feature for Applications, Jobs and Stages

(cherry picked from commit 9b6cf285d0)
Signed-off-by: Kay Ousterhout <kayousterhout@gmail.com>
This commit is contained in:
Kousuke Saruta 2015-05-15 13:54:09 -07:00 committed by Kay Ousterhout
parent 7dc0ff3f12
commit a5f7b3b9c7
4 changed files with 348 additions and 10 deletions

View file

@ -24,6 +24,65 @@ div#application-timeline, div#job-timeline {
margin-top: 5px;
}
#task-assignment-timeline div.legend-area {
width: 574px;
}
#task-assignment-timeline .legend-area > svg {
width: 100%;
height: 55px;
}
#task-assignment-timeline div.item.range {
padding: 0px;
height: 26px;
border-width: 0;
}
.task-assignment-timeline-content {
width: 100%;
}
.task-assignment-timeline-duration-bar {
width: 100%;
height: 26px;
}
rect.scheduler-delay-proportion {
fill: #80B1D3;
stroke: #6B94B0;
}
rect.deserialization-time-proportion {
fill: #FB8072;
stroke: #D26B5F;
}
rect.shuffle-read-time-proportion {
fill: #FDB462;
stroke: #D39651;
}
rect.executor-runtime-proportion {
fill: #B3DE69;
stroke: #95B957;
}
rect.shuffle-write-time-proportion {
fill: #FFED6F;
stroke: #D5C65C;
}
rect.serialization-time-proportion {
fill: #BC80BD;
stroke: #9D6B9E;
}
rect.getting-result-time-proportion {
fill: #8DD3C7;
stroke: #75B0A6;
}
.vis.timeline {
line-height: 14px;
}
@ -178,6 +237,10 @@ tr.corresponding-item-hover > td, tr.corresponding-item-hover > th {
display: none;
}
#task-assignment-timeline.collapsed {
display: none;
}
.control-panel {
margin-bottom: 5px;
}
@ -186,7 +249,8 @@ tr.corresponding-item-hover > td, tr.corresponding-item-hover > th {
margin: 0;
}
span.expand-application-timeline, span.expand-job-timeline {
span.expand-application-timeline, span.expand-job-timeline,
span.expand-task-assignment-timeline {
cursor: pointer;
}

View file

@ -133,6 +133,73 @@ function drawJobTimeline(groupArray, eventObjArray, startTime) {
});
}
function drawTaskAssignmentTimeline(groupArray, eventObjArray, minLaunchTime, zoomMax) {
var groups = new vis.DataSet(groupArray);
var items = new vis.DataSet(eventObjArray);
var container = $("#task-assignment-timeline")[0]
var options = {
groupOrder: function(a, b) {
return a.value - b.value
},
editable: false,
align: 'left',
selectable: false,
showCurrentTime: false,
min: minLaunchTime,
zoomable: false,
zoomMax: zoomMax
};
var taskTimeline = new vis.Timeline(container)
taskTimeline.setOptions(options);
taskTimeline.setGroups(groups);
taskTimeline.setItems(items);
taskTimeline.on("rangechange", function(prop) {
if (currentDisplayedTooltip !== null) {
$(currentDisplayedTooltip).tooltip("hide");
}
});
function getTaskIdxAndAttempt(selector) {
var taskIdxText = $(selector).attr("data-title");
var taskIdxAndAttempt = taskIdxText.match("Task (\\d+) \\(attempt (\\d+)");
var taskIdx = taskIdxAndAttempt[1];
var taskAttempt = taskIdxAndAttempt[2];
return taskIdx + "-" + taskAttempt;
}
// If we zoom up and a box moves away when the corresponding tooltip is shown,
// the tooltip can be remain.
// So, we need to hide tooltips using another mechanism.
var currentDisplayedTooltip = null;
$("#task-assignment-timeline").on({
"mouseenter": function() {
var taskIdxAndAttempt = getTaskIdxAndAttempt(this);
$("#task-" + taskIdxAndAttempt).addClass("corresponding-item-hover");
$(this).tooltip("show");
currentDisplayedTooltip = this;
},
"mouseleave" : function() {
var taskIdxAndAttempt = getTaskIdxAndAttempt(this);
$("#task-" + taskIdxAndAttempt).removeClass("corresponding-item-hover");
$(this).tooltip("hide");
currentDisplayedTooltip = null;
}
}, ".task-assignment-timeline-content");
setupZoomable('#task-assignment-timeline-zoom-lock', taskTimeline);
$("span.expand-task-assignment-timeline").click(function() {
$("#task-assignment-timeline").toggleClass('collapsed');
// Switch the class of the arrow from open to closed.
$(this).find('.expand-task-assignment-timeline-arrow').toggleClass('arrow-open');
$(this).find('.expand-task-assignment-timeline-arrow').toggleClass('arrow-closed');
});
}
function setupExecutorEventAction() {
$(".item.box.executor").each(function () {
$(this).hover(
@ -147,7 +214,7 @@ function setupExecutorEventAction() {
}
function setupZoomable(id, timeline) {
$(id + '>input[type="checkbox"]').click(function() {
$(id + ' > input[type="checkbox"]').click(function() {
if (this.checked) {
timeline.setOptions({zoomable: true});
} else {
@ -155,7 +222,7 @@ function setupZoomable(id, timeline) {
}
});
$(id + ">span").click(function() {
$(id + " > span").click(function() {
$(this).parent().find('input:checkbox').trigger('click');
});
}

View file

@ -20,6 +20,7 @@ package org.apache.spark.ui.jobs
import java.util.Date
import javax.servlet.http.HttpServletRequest
import scala.collection.mutable.HashSet
import scala.xml.{Elem, Node, Unparsed}
import org.apache.commons.lang3.StringEscapeUtils
@ -36,6 +37,35 @@ private[ui] class StagePage(parent: StagesTab) extends WebUIPage("stage") {
private val progressListener = parent.progressListener
private val operationGraphListener = parent.operationGraphListener
private val TIMELINE_LEGEND = {
<div class="legend-area">
<svg>
{
val legendPairs = List(("scheduler-delay-proportion", "Scheduler Delay"),
("deserialization-time-proportion", "Task Deserialization Time"),
("shuffle-read-time-proportion", "Shuffle Read Time"),
("executor-runtime-proportion", "Executor Computing Time"),
("shuffle-write-time-proportion", "Shuffle Write Time"),
("serialization-time-proportion", "Result Serialization TIme"),
("getting-result-time-proportion", "Getting Result Time"))
legendPairs.zipWithIndex.map {
case ((classAttr, name), index) =>
<rect x={5 + (index / 3) * 210 + "px"} y={10 + (index % 3) * 15 + "px"}
width="10px" height="10px" class={classAttr}></rect>
<text x={25 + (index / 3) * 210 + "px"}
y={20 + (index % 3) * 15 + "px"}>{name}</text>
}
}
</svg>
</div>
}
// TODO: We should consider increasing the number of this parameter over time
// if we find that it's okay.
private val MAX_TIMELINE_TASKS = parent.conf.getInt("spark.ui.timeline.tasks.maximum", 1000)
def render(request: HttpServletRequest): Seq[Node] = {
progressListener.synchronized {
val parameterId = request.getParameter("id")
@ -196,7 +226,9 @@ private[ui] class StagePage(parent: StagesTab) extends WebUIPage("stage") {
val accumulableHeaders: Seq[String] = Seq("Accumulable", "Value")
def accumulableRow(acc: AccumulableInfo): Elem =
<tr><td>{acc.name}</td><td>{acc.value}</td></tr>
val accumulableTable = UIUtils.listingTable(accumulableHeaders, accumulableRow,
val accumulableTable = UIUtils.listingTable(
accumulableHeaders,
accumulableRow,
accumulables.values.toSeq)
val taskHeadersAndCssClasses: Seq[(String, String)] =
@ -232,10 +264,17 @@ private[ui] class StagePage(parent: StagesTab) extends WebUIPage("stage") {
val unzipped = taskHeadersAndCssClasses.unzip
val currentTime = System.currentTimeMillis()
val taskTable = UIUtils.listingTable(
unzipped._1,
taskRow(hasAccumulators, stageData.hasInput, stageData.hasOutput,
stageData.hasShuffleRead, stageData.hasShuffleWrite, stageData.hasBytesSpilled),
taskRow(
hasAccumulators,
stageData.hasInput,
stageData.hasOutput,
stageData.hasShuffleRead,
stageData.hasShuffleWrite,
stageData.hasBytesSpilled,
currentTime),
tasks,
headerClasses = unzipped._2)
// Excludes tasks which failed and have incomplete metrics
@ -460,25 +499,192 @@ private[ui] class StagePage(parent: StagesTab) extends WebUIPage("stage") {
dagViz ++
maybeExpandDagViz ++
showAdditionalMetrics ++
makeTimeline(stageData.taskData.values.toSeq, currentTime) ++
<h4>Summary Metrics for {numCompleted} Completed Tasks</h4> ++
<div>{summaryTable.getOrElse("No tasks have reported metrics yet.")}</div> ++
<h4>Aggregated Metrics by Executor</h4> ++ executorTable.toNodeSeq ++
maybeAccumulableTable ++
<h4>Tasks</h4> ++ taskTable
UIUtils.headerSparkPage(stageHeader, content, parent, showVisualization = true)
}
}
def makeTimeline(tasks: Seq[TaskUIData], currentTime: Long): Seq[Node] = {
val executorsSet = new HashSet[(String, String)]
var minLaunchTime = Long.MaxValue
var maxFinishTime = Long.MinValue
val executorsArrayStr =
tasks.sortBy(-_.taskInfo.launchTime).take(MAX_TIMELINE_TASKS).map { taskUIData =>
val taskInfo = taskUIData.taskInfo
val executorId = taskInfo.executorId
val host = taskInfo.host
executorsSet += ((executorId, host))
val classNameByStatus = {
if (taskInfo.successful) {
"succeeded"
} else if (taskInfo.failed) {
"failed"
} else if (taskInfo.running) {
"running"
}
}
val launchTime = taskInfo.launchTime
val finishTime = if (!taskInfo.running) taskInfo.finishTime else currentTime
val totalExecutionTime = finishTime - launchTime
minLaunchTime = launchTime.min(minLaunchTime)
maxFinishTime = launchTime.max(maxFinishTime)
def toProportion(time: Long) = (time.toDouble / totalExecutionTime * 100).toLong
val metricsOpt = taskUIData.taskMetrics
val shuffleReadTime =
metricsOpt.flatMap(_.shuffleReadMetrics.map(_.fetchWaitTime)).getOrElse(0L)
val shuffleReadTimeProportion = toProportion(shuffleReadTime)
val shuffleWriteTime =
(metricsOpt.flatMap(_.shuffleWriteMetrics
.map(_.shuffleWriteTime)).getOrElse(0L) / 1e6).toLong
val shuffleWriteTimeProportion = toProportion(shuffleWriteTime)
val executorComputingTime = metricsOpt.map(_.executorRunTime).getOrElse(0L) -
shuffleReadTime - shuffleWriteTime
val executorComputingTimeProportion = toProportion(executorComputingTime)
val serializationTime = metricsOpt.map(_.resultSerializationTime).getOrElse(0L)
val serializationTimeProportion = toProportion(serializationTime)
val deserializationTime = metricsOpt.map(_.executorDeserializeTime).getOrElse(0L)
val deserializationTimeProportion = toProportion(deserializationTime)
val gettingResultTime = getGettingResultTime(taskUIData.taskInfo)
val gettingResultTimeProportion = toProportion(gettingResultTime)
val schedulerDelay = totalExecutionTime -
(executorComputingTime + shuffleReadTime + shuffleWriteTime +
serializationTime + deserializationTime + gettingResultTime)
val schedulerDelayProportion =
(100 - executorComputingTimeProportion - shuffleReadTimeProportion -
shuffleWriteTimeProportion - serializationTimeProportion -
deserializationTimeProportion - gettingResultTimeProportion)
val schedulerDelayProportionPos = 0
val deserializationTimeProportionPos =
schedulerDelayProportionPos + schedulerDelayProportion
val shuffleReadTimeProportionPos =
deserializationTimeProportionPos + deserializationTimeProportion
val executorRuntimeProportionPos =
shuffleReadTimeProportionPos + shuffleReadTimeProportion
val shuffleWriteTimeProportionPos =
executorRuntimeProportionPos + executorComputingTimeProportion
val serializationTimeProportionPos =
shuffleWriteTimeProportionPos + shuffleWriteTimeProportion
val gettingResultTimeProportionPos =
serializationTimeProportionPos + serializationTimeProportion
val index = taskInfo.index
val attempt = taskInfo.attempt
val timelineObject =
s"""
{
'className': 'task task-assignment-timeline-object $classNameByStatus',
'group': '$executorId',
'content': '<div class="task-assignment-timeline-content"' +
'data-toggle="tooltip" data-placement="top"' +
'data-html="true" data-container="body"' +
'data-title="${s"Task " + index + " (attempt " + attempt + ")"}<br>' +
'Status: ${taskInfo.status}<br>' +
'Launch Time: ${UIUtils.formatDate(new Date(launchTime))}' +
'${
if (!taskInfo.running) {
s"""<br>Finish Time: ${UIUtils.formatDate(new Date(finishTime))}"""
} else {
""
}
}' +
'<br>Scheduler Delay: $schedulerDelay ms' +
'<br>Task Deserialization Time: ${UIUtils.formatDuration(deserializationTime)}' +
'<br>Shuffle Read Time: ${UIUtils.formatDuration(shuffleReadTime)}' +
'<br>Executor Computing Time: ${UIUtils.formatDuration(executorComputingTime)}' +
'<br>Shuffle Write Time: ${UIUtils.formatDuration(shuffleWriteTime)}' +
'<br>Result Serialization Time: ${UIUtils.formatDuration(serializationTime)}' +
'<br>Getting Result Time: ${UIUtils.formatDuration(gettingResultTime)}">' +
'<svg class="task-assignment-timeline-duration-bar">' +
'<rect class="scheduler-delay-proportion" ' +
'x="$schedulerDelayProportionPos%" y="0px" height="26px"' +
'width="$schedulerDelayProportion%""></rect>' +
'<rect class="deserialization-time-proportion" '+
'x="$deserializationTimeProportionPos%" y="0px" height="26px"' +
'width="$deserializationTimeProportion%"></rect>' +
'<rect class="shuffle-read-time-proportion" ' +
'x="$shuffleReadTimeProportionPos%" y="0px" height="26px"' +
'width="$shuffleReadTimeProportion%"></rect>' +
'<rect class="executor-runtime-proportion" ' +
'x="$executorRuntimeProportionPos%" y="0px" height="26px"' +
'width="$executorComputingTimeProportion%"></rect>' +
'<rect class="shuffle-write-time-proportion" ' +
'x="$shuffleWriteTimeProportionPos%" y="0px" height="26px"' +
'width="$shuffleWriteTimeProportion%"></rect>' +
'<rect class="serialization-time-proportion" ' +
'x="$serializationTimeProportionPos%" y="0px" height="26px"' +
'width="$serializationTimeProportion%"></rect>' +
'<rect class="getting-result-time-proportion" ' +
'x="$gettingResultTimeProportionPos%" y="0px" height="26px"' +
'width="$gettingResultTimeProportion%"></rect></svg>',
'start': new Date($launchTime),
'end': new Date($finishTime)
}
"""
timelineObject
}.mkString("[", ",", "]")
val groupArrayStr = executorsSet.map {
case (executorId, host) =>
s"""
{
'id': '$executorId',
'content': '$executorId / $host',
}
"""
}.mkString("[", ",", "]")
val maxZoom = maxFinishTime - minLaunchTime
<span class="expand-task-assignment-timeline">
<span class="expand-task-assignment-timeline-arrow arrow-closed"></span>
<a>Event Timeline</a>
</span> ++
<div id="task-assignment-timeline" class="collapsed">
{
if (MAX_TIMELINE_TASKS < tasks.size) {
<strong>
This stage has more than the maximum number of tasks that can be shown in the
visualization! Only the most recent {MAX_TIMELINE_TASKS} tasks
(of {tasks.size} total) are shown.
</strong>
} else {
Seq.empty
}
}
<div class="control-panel">
<div id="task-assignment-timeline-zoom-lock">
<input type="checkbox"></input>
<span>Enable zooming</span>
</div>
</div>
{TIMELINE_LEGEND}
</div> ++
<script type="text/javascript">
{Unparsed(s"drawTaskAssignmentTimeline(" +
s"$groupArrayStr, $executorsArrayStr, $minLaunchTime, $maxZoom)")}
</script>
}
def taskRow(
hasAccumulators: Boolean,
hasInput: Boolean,
hasOutput: Boolean,
hasShuffleRead: Boolean,
hasShuffleWrite: Boolean,
hasBytesSpilled: Boolean)(taskData: TaskUIData): Seq[Node] = {
hasBytesSpilled: Boolean,
currentTime: Long)(taskData: TaskUIData): Seq[Node] = {
taskData match { case TaskUIData(info, metrics, errorMessage) =>
val duration = if (info.status == "RUNNING") info.timeRunning(System.currentTimeMillis())
val duration = if (info.status == "RUNNING") info.timeRunning(currentTime)
else metrics.map(_.executorRunTime).getOrElse(1L)
val formatDuration = if (info.status == "RUNNING") UIUtils.formatDuration(duration)
else metrics.map(m => UIUtils.formatDuration(m.executorRunTime)).getOrElse("")
@ -542,7 +748,7 @@ private[ui] class StagePage(parent: StagesTab) extends WebUIPage("stage") {
val diskBytesSpilledSortable = maybeDiskBytesSpilled.map(_.toString).getOrElse("")
val diskBytesSpilledReadable = maybeDiskBytesSpilled.map(Utils.bytesToString).getOrElse("")
<tr>
<tr id={"task-" + info.index + "-" + info.attempt}>
<td>{info.index}</td>
<td>{info.taskId}</td>
<td sorttable_customkey={info.attempt.toString}>{

View file

@ -25,6 +25,7 @@ import org.apache.spark.ui.{SparkUI, SparkUITab}
/** Web UI showing progress status of all stages in the given SparkContext. */
private[ui] class StagesTab(parent: SparkUI) extends SparkUITab(parent, "stages") {
val sc = parent.sc
val conf = parent.conf
val killEnabled = parent.killEnabled
val progressListener = parent.jobProgressListener
val operationGraphListener = parent.operationGraphListener