From 8b26c69ce7f9077775a3c7bbabb1c47ee6a51a23 Mon Sep 17 00:00:00 2001
From: Yuanjian Li
Date: Sat, 22 Aug 2020 21:32:23 +0900
Subject: [PATCH] [SPARK-31792][SS][DOC][FOLLOW-UP] Rephrase the description for some operations

### What changes were proposed in this pull request?
Rephrase the description for some operations to make it clearer.

### Why are the changes needed?
Add more detail to the document.

### Does this PR introduce _any_ user-facing change?
No, document only.

### How was this patch tested?
Document only.

Closes #29269 from xuanyuanking/SPARK-31792-follow.

Authored-by: Yuanjian Li
Signed-off-by: Jungtaek Lim (HeartSaVioR)
---
 docs/web-ui.md                                          | 10 +++++-----
 .../sql/execution/streaming/MicroBatchExecution.scala   |  3 +--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/docs/web-ui.md b/docs/web-ui.md
index 69c9da6428..465f526ee5 100644
--- a/docs/web-ui.md
+++ b/docs/web-ui.md
@@ -426,11 +426,11 @@ queries. Currently, it contains the following metrics.
 * **Batch Duration.** The process duration of each batch.
 * **Operation Duration.** The amount of time taken to perform various operations in milliseconds.
 The tracked operations are listed as follows.
-  * addBatch: Adds result data of the current batch to the sink.
-  * getBatch: Gets a new batch of data to process.
-  * latestOffset: Gets the latest offsets for sources.
-  * queryPlanning: Generates the execution plan.
-  * walCommit: Writes the offsets to the metadata log.
+  * addBatch: Time taken to read the micro-batch's input data from the sources, process it, and write the batch's output to the sink. This should take the bulk of the micro-batch's time.
+  * getBatch: Time taken to prepare the logical query to read the input of the current micro-batch from the sources.
+  * latestOffset & getOffset: Time taken to query the maximum available offsets for the sources.
+  * queryPlanning: Time taken to generate the execution plan.
+  * walCommit: Time taken to write the offsets to the metadata log.
 
 As an early-release version, the statistics page is still under development and will be improved in
 future releases.

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
index e022bfb683..e0731db1f3 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MicroBatchExecution.scala
@@ -566,8 +566,7 @@ class MicroBatchExecution(
     val nextBatch =
       new Dataset(lastExecution, RowEncoder(lastExecution.analyzed.schema))
 
-    val batchSinkProgress: Option[StreamWriterCommitProgress] =
-      reportTimeTaken("addBatch") {
+    val batchSinkProgress: Option[StreamWriterCommitProgress] = reportTimeTaken("addBatch") {
       SQLExecution.withNewExecutionId(lastExecution) {
         sink match {
           case s: Sink => s.addBatch(currentBatchId, nextBatch)
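The operation durations documented above are visible not only on the Structured Streaming statistics page of the web UI; they are also exposed programmatically through `StreamingQueryProgress.durationMs`. Below is a minimal sketch (not part of this patch) of reading them from a running query, assuming a local `SparkSession` and the built-in `rate` source; the object name and options are illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example, not part of the patch: prints the per-operation
// timings that the web UI's "Operation Duration" chart is built from.
object DurationMsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("durationMs-sketch")
      .master("local[2]")
      .getOrCreate()

    // A trivial streaming query over the built-in rate source,
    // writing to the console sink.
    val query = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "10")
      .load()
      .writeStream
      .format("console")
      .start()

    // Let a few micro-batches run.
    query.awaitTermination(10000)

    // durationMs maps each tracked operation (addBatch, getBatch,
    // latestOffset, queryPlanning, walCommit, plus the overall
    // triggerExecution) to its duration in milliseconds for the
    // most recent micro-batch. It may be null before the first batch.
    Option(query.lastProgress).foreach { progress =>
      progress.durationMs.forEach((op, ms) => println(s"$op: $ms ms"))
    }

    query.stop()
    spark.stop()
  }
}
```

For a healthy query, the `addBatch` entry should account for the bulk of `triggerExecution`, mirroring the new wording this patch adds to web-ui.md.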