[SPARK-31792][SS][DOCS] Introduce the structured streaming UI in the Web UI doc

### What changes were proposed in this pull request?
This PR adds an introduction to the Structured Streaming UI to the Web UI doc.

![image](https://user-images.githubusercontent.com/1452518/82642209-92b99380-9bdb-11ea-9a0d-cbb26040b0ef.png)

### Why are the changes needed?
The Structured Streaming web UI introduced earlier was not yet covered in the Web UI documentation.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
N.A.

Closes #28609 from xccui/ss-ui-doc.

Authored-by: Xingcan Cui <xccui@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
Commit 8ba2b47737 (parent 7e4f5bbd8a), 2020-05-26 14:27:42 +09:00
2 changed files with 28 additions and 0 deletions

Binary image file added: webui-structured-streaming-detail.png (175 KiB)

@ -407,6 +407,34 @@ Here is the list of SQL metrics:
</table>
## Structured Streaming Tab
When running Structured Streaming jobs in micro-batch mode, a Structured Streaming tab will be
available on the Web UI. The overview page displays some brief statistics for running and completed
queries. Also, you can check the latest exception of a failed query. For detailed statistics, please
click a "run id" in the tables.
<p style="text-align: center;">
<img src="img/webui-structured-streaming-detail.png" title="Structured Streaming Query Statistics" alt="Structured Streaming Query Statistics">
</p>
The statistics page displays some useful metrics for insight into the status of your streaming
queries. Currently, it contains the following metrics.
* **Input Rate.** The aggregate (across all sources) rate of data arriving.
* **Process Rate.** The aggregate (across all sources) rate at which Spark is processing data.
* **Input Rows.** The aggregate (across all sources) number of records processed in a trigger.
* **Batch Duration.** The processing duration of each batch.
* **Operation Duration.** The amount of time taken to perform various operations in milliseconds.
The tracked operations are listed as follows.
    * addBatch: Adds the result data of the current batch to the sink.
    * getBatch: Gets a new batch of data to process.
    * latestOffset: Gets the latest offsets for sources.
    * queryPlanning: Generates the execution plan.
    * walCommit: Writes the offsets to the metadata log.
As an early-release version, the statistics page is still under development and will be improved in
future releases.
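
The charts on this page are rendered from the query's progress events, so the same per-trigger values can also be read programmatically via `StreamingQueryProgress`. A minimal Scala sketch, assuming `query` is a running `StreamingQuery` such as the one started in the earlier example:

```scala
// The run id printed here is the one shown in the UI tables.
println(s"Run id: ${query.runId}")

val progress = query.lastProgress   // latest progress; may be null right after start()
if (progress != null) {
  println(s"Input rate (rows/s):   ${progress.inputRowsPerSecond}")     // Input Rate
  println(s"Process rate (rows/s): ${progress.processedRowsPerSecond}") // Process Rate
  println(s"Input rows:            ${progress.numInputRows}")           // Input Rows
  // Operation Duration entries, keyed by operation name
  // (addBatch, getBatch, latestOffset, queryPlanning, walCommit, ...), in milliseconds.
  progress.durationMs.forEach((op: String, ms: java.lang.Long) => println(s"  $op: $ms ms"))
}
```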
## Streaming Tab
The web UI includes a Streaming tab if the application uses Spark Streaming. This tab displays
scheduling delay and processing time for each micro-batch in the data stream, which can be useful