[SPARK-36549][SQL] Add taskStatus supports multiple value to monitoring doc

### What changes were proposed in this pull request?
In the stage-related REST API, the `taskStatus` parameter is accepted as a list:
```
@QueryParam("taskStatus") taskStatus: JList[TaskStatus]
```
In a REST request, a list is written by repeating the parameter once per value:
```
taskStatus=SUCCESS&taskStatus=FAILED
```

This is useful but not shown in the doc, and many users don't know how to write list parameters.
So add this feature to the monitoring doc too.
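
As a minimal client-side sketch of the repeated-parameter form described above, the query string can be built with Python's standard `urllib.parse.urlencode`. The helper name `stage_url` and its arguments are hypothetical, chosen only for illustration; the host, application ID, and stage ID are the ones used in the test requests in this PR.

```python
from urllib.parse import urlencode


def stage_url(base_url, app_id, stage_id, task_statuses):
    """Build a stage URL that repeats taskStatus once per requested value.

    Hypothetical helper for illustration; not part of Spark.
    """
    # A list of (key, value) pairs lets the same key appear several times.
    params = [("details", "true")]
    params += [("taskStatus", status) for status in task_statuses]
    return (
        f"{base_url}/api/v1/applications/{app_id}"
        f"/stages/{stage_id}?{urlencode(params)}"
    )


url = stage_url(
    "http://localhost:4040", "local-1629432414554", 0, ["FAILED", "SUCCESS"]
)
print(url)
```

Passing the parameters as a list of pairs rather than a dict is what allows `taskStatus` to occur more than once; a dict would keep only one value per key.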

### Why are the changes needed?
Make the doc clearer.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
REST request:
```
http://localhost:4040/api/v1/applications/local-1629432414554/stages/0?details=true&taskStatus=FAILED
```
Tasks in the response:
```
"tasks" : {
    "0" : {
      "taskId" : 0,
      "index" : 0,
      "attempt" : 0,
      "launchTime" : "2021-08-20T04:06:55.515GMT",
      "duration" : 273,
      "executorId" : "driver",
      "host" : "host",
      "status" : "FAILED",
      "taskLocality" : "PROCESS_LOCAL",
      "speculative" : false,
      "accumulatorUpdates" : [ ],
      "errorMessage" : "java.lang.RuntimeException\n\tat org.apache.spark.ui.UISuite.$anonfun$new$8(UISuite.scala:95)\n\tat scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)\n\tat scala.collection.Iterator.foreach(Iterator.scala:943)\n\tat scala.collection.Iterator.foreach$(Iterator.scala:943)\n\tat org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1003)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1003)\n\tat org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2254)\n\tat org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)\n\tat org.apache.spark.scheduler.Task.run(Task.scala:136)\n\tat org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507)\n\tat org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)\n\tat org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n",
      "taskMetrics" : {
        "executorDeserializeTime" : 0,
        "executorDeserializeCpuTime" : 0,
        "executorRunTime" : 206,
        "executorCpuTime" : 0,
        "resultSize" : 0,
        "jvmGcTime" : 0,
        "resultSerializationTime" : 0,
        "memoryBytesSpilled" : 0,
        "diskBytesSpilled" : 0,
        "peakExecutionMemory" : 0,
        "inputMetrics" : {
          "bytesRead" : 0,
          "recordsRead" : 0
        },
        "outputMetrics" : {
          "bytesWritten" : 0,
          "recordsWritten" : 0
        },
        "shuffleReadMetrics" : {
          "remoteBlocksFetched" : 0,
          "localBlocksFetched" : 0,
          "fetchWaitTime" : 0,
          "remoteBytesRead" : 0,
          "remoteBytesReadToDisk" : 0,
          "localBytesRead" : 0,
          "recordsRead" : 0
        },
        "shuffleWriteMetrics" : {
          "bytesWritten" : 0,
          "writeTime" : 0,
          "recordsWritten" : 0
        }
      },
      "executorLogs" : { },
      "schedulerDelay" : 67,
      "gettingResultTime" : 0
    }
  },
```

REST request:
```
http://localhost:4040/api/v1/applications/local-1629432414554/stages/0?details=true&taskStatus=FAILED&taskStatus=SUCCESS
```
Tasks in the response:
```
"tasks" : {
    "1" : {
      "taskId" : 1,
      "index" : 1,
      "attempt" : 0,
      "launchTime" : "2021-08-20T04:06:55.786GMT",
      "duration" : 16,
      "executorId" : "driver",
      "host" : "host",
      "status" : "SUCCESS",
      "taskLocality" : "PROCESS_LOCAL",
      "speculative" : false,
      "accumulatorUpdates" : [ ],
      "taskMetrics" : {
        "executorDeserializeTime" : 2,
        "executorDeserializeCpuTime" : 2638000,
        "executorRunTime" : 2,
        "executorCpuTime" : 1993000,
        "resultSize" : 837,
        "jvmGcTime" : 0,
        "resultSerializationTime" : 0,
        "memoryBytesSpilled" : 0,
        "diskBytesSpilled" : 0,
        "peakExecutionMemory" : 0,
        "inputMetrics" : {
          "bytesRead" : 0,
          "recordsRead" : 0
        },
        "outputMetrics" : {
          "bytesWritten" : 0,
          "recordsWritten" : 0
        },
        "shuffleReadMetrics" : {
          "remoteBlocksFetched" : 0,
          "localBlocksFetched" : 0,
          "fetchWaitTime" : 0,
          "remoteBytesRead" : 0,
          "remoteBytesReadToDisk" : 0,
          "localBytesRead" : 0,
          "recordsRead" : 0
        },
        "shuffleWriteMetrics" : {
          "bytesWritten" : 0,
          "writeTime" : 0,
          "recordsWritten" : 0
        }
      },
      "executorLogs" : { },
      "schedulerDelay" : 12,
      "gettingResultTime" : 0
    },
    "0" : {
      "taskId" : 0,
      "index" : 0,
      "attempt" : 0,
      "launchTime" : "2021-08-20T04:06:55.515GMT",
      "duration" : 273,
      "executorId" : "driver",
      "host" : "host",
      "status" : "FAILED",
      "taskLocality" : "PROCESS_LOCAL",
      "speculative" : false,
      "accumulatorUpdates" : [ ],
      "errorMessage" : "java.lang.RuntimeException\n\tat org.apache.spark.ui.UISuite.$anonfun$new$8(UISuite.scala:95)\n\tat scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)\n\tat scala.collection.Iterator.foreach(Iterator.scala:943)\n\tat scala.collection.Iterator.foreach$(Iterator.scala:943)\n\tat org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1003)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1003)\n\tat org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2254)\n\tat org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)\n\tat org.apache.spark.scheduler.Task.run(Task.scala:136)\n\tat org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507)\n\tat org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)\n\tat org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n",
      "taskMetrics" : {
        "executorDeserializeTime" : 0,
        "executorDeserializeCpuTime" : 0,
        "executorRunTime" : 206,
        "executorCpuTime" : 0,
        "resultSize" : 0,
        "jvmGcTime" : 0,
        "resultSerializationTime" : 0,
        "memoryBytesSpilled" : 0,
        "diskBytesSpilled" : 0,
        "peakExecutionMemory" : 0,
        "inputMetrics" : {
          "bytesRead" : 0,
          "recordsRead" : 0
        },
        "outputMetrics" : {
          "bytesWritten" : 0,
          "recordsWritten" : 0
        },
        "shuffleReadMetrics" : {
          "remoteBlocksFetched" : 0,
          "localBlocksFetched" : 0,
          "fetchWaitTime" : 0,
          "remoteBytesRead" : 0,
          "remoteBytesReadToDisk" : 0,
          "localBytesRead" : 0,
          "recordsRead" : 0
        },
        "shuffleWriteMetrics" : {
          "bytesWritten" : 0,
          "writeTime" : 0,
          "recordsWritten" : 0
        }
      },
      "executorLogs" : { },
      "schedulerDelay" : 67,
      "gettingResultTime" : 0
    }
  },
```
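
The two responses above suggest "match any of the given statuses" semantics. The sketch below reproduces that behavior on the client side, purely as an illustration. It is not Spark's server-side implementation. The function name `tasks_by_status` is hypothetical, and the sample is a heavily trimmed version of the `tasks` fragment shown above (only `taskId` and `status` are kept).

```python
import json


def tasks_by_status(stage_fragment, statuses):
    """Return the tasks whose status is any of the requested values.

    Hypothetical helper; mimics the filtering seen in the responses above.
    """
    wanted = set(statuses)
    tasks = stage_fragment.get("tasks", {})
    return {tid: task for tid, task in tasks.items() if task["status"] in wanted}


# Trimmed sample based on the test output above; real responses carry more fields.
sample = json.loads(
    '{"tasks": {"1": {"taskId": 1, "status": "SUCCESS"},'
    ' "0": {"taskId": 0, "status": "FAILED"}}}'
)

failed_only = tasks_by_status(sample, ["FAILED"])
failed_or_success = tasks_by_status(sample, ["FAILED", "SUCCESS"])
print(sorted(failed_only), sorted(failed_or_success))
```

With a single status only task `0` is kept, and with both statuses tasks `0` and `1` are kept, matching the two test requests.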

Closes #33793 from AngersZhuuuu/SPARK-36549.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Angerszhuuuu 2021-08-22 09:45:21 +09:00 committed by Hyukjin Kwon
parent f918c123a0
commit 5740d5641d


@@ -477,7 +477,7 @@ can be identified by their `[attempt-id]`. In the API listed below, when running
 A list of all stages for a given application.
 <br><code>?status=[active|complete|pending|failed]</code> list only stages in the given state.
 <br><code>?details=true</code> lists all stages with the task data.
-<br><code>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]</code> lists stages only those tasks with the specified task status. Query parameter taskStatus takes effect only when <code>details=true</code>.
+<br><code>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]</code> lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when <code>details=true</code>. This also supports multiple <code>taskStatus</code> such as <code>?details=true&taskStatus=SUCCESS&taskStatus=FAILED</code> which will return all tasks matching any of specified task status.
 <br><code>?withSummaries=true</code> lists stages with task metrics distribution and executor metrics distribution.
 <br><code>?quantiles=0.0,0.25,0.5,0.75,1.0</code> summarize the metrics with the given quantiles. Query parameter quantiles takes effect only when <code>withSummaries=true</code>. Default value is <code>0.0,0.25,0.5,0.75,1.0</code>.
 </td>
@@ -487,7 +487,7 @@ can be identified by their `[attempt-id]`. In the API listed below, when running
 <td>
 A list of all attempts for the given stage.
 <br><code>?details=true</code> lists all attempts with the task data for the given stage.
-<br><code>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]</code> lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when <code>details=true</code>.
+<br><code>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]</code> lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when <code>details=true</code>. This also supports multiple <code>taskStatus</code> such as <code>?details=true&taskStatus=SUCCESS&taskStatus=FAILED</code> which will return all tasks matching any of specified task status.
 <br><code>?withSummaries=true</code> lists task metrics distribution and executor metrics distribution of each attempt.
 <br><code>?quantiles=0.0,0.25,0.5,0.75,1.0</code> summarize the metrics with the given quantiles. Query parameter quantiles takes effect only when <code>withSummaries=true</code>. Default value is <code>0.0,0.25,0.5,0.75,1.0</code>.
 <br>Example:
@@ -502,7 +502,7 @@ can be identified by their `[attempt-id]`. In the API listed below, when running
 <td>
 Details for the given stage attempt.
 <br><code>?details=true</code> lists all task data for the given stage attempt.
-<br><code>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]</code> lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when <code>details=true</code>.
+<br><code>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING]</code> lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when <code>details=true</code>. This also supports multiple <code>taskStatus</code> such as <code>?details=true&taskStatus=SUCCESS&taskStatus=FAILED</code> which will return all tasks matching any of specified task status.
 <br><code>?withSummaries=true</code> lists task metrics distribution and executor metrics distribution for the given stage attempt.
 <br><code>?quantiles=0.0,0.25,0.5,0.75,1.0</code> summarize the metrics with the given quantiles. Query parameter quantiles takes effect only when <code>withSummaries=true</code>. Default value is <code>0.0,0.25,0.5,0.75,1.0</code>.
 <br>Example: