[SPARK-34399][DOCS][FOLLOWUP] Add docs for the new metrics of task/job commit time

### What changes were proposed in this pull request?

This is follow-up of https://github.com/apache/spark/pull/31522.
It adds docs for the new metrics of task/job commit time

### Why are the changes needed?

So that users can understand the metrics better and know that the new metrics are only for file table writes.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Build docs and preview:
![image](https://user-images.githubusercontent.com/1097932/127198210-2ab201d3-5fca-4065-ace6-0b930390380f.png)

Closes #33542 from gengliangwang/addDocForMetrics.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
This commit is contained in:
Gengliang Wang 2021-07-28 13:54:35 +08:00 committed by Wenchen Fan
parent 59e0c25376
commit c9a7ff3f36

View file

@ -404,7 +404,8 @@ Here is the list of SQL metrics:
<tr><td> <code>avg hash probe bucket list iters</code> </td><td> the average bucket list iterations per lookup during aggregation </td><td> HashAggregate </td></tr>
<tr><td> <code>data size of build side</code> </td><td> the size of built hash map </td><td> ShuffledHashJoin </td></tr>
<tr><td> <code>time to build hash map</code> </td><td> the time spent on building hash map </td><td> ShuffledHashJoin </td></tr>
<tr><td> <code>task commit time</code> </td><td> the time spent on committing the output of a task after the writes succeed </td><td> any write operation on a file-based table </td></tr>
<tr><td> <code>job commit time</code> </td><td> the time spent on committing the output of a job after the writes succeed </td><td> any write operation on a file-based table </td></tr>
</table>
## Structured Streaming Tab