From 5740d5641d7878ad3b90000714cf113a1f6d2fd7 Mon Sep 17 00:00:00 2001
From: Angerszhuuuu
Date: Sun, 22 Aug 2021 09:45:21 +0900
Subject: [PATCH] [SPARK-36549][SQL] Document that taskStatus supports multiple values in the monitoring doc

### What changes were proposed in this pull request?
In the stage-related REST APIs, the `taskStatus` parameter is accepted as a list:
```
QueryParam("taskStatus") taskStatus: JList[TaskStatus]
```
In a REST request, a list parameter is written by repeating it once per value:
```
taskStatus=SUCCESS&taskStatus=FAILED
```
This is useful but not shown in the doc, and many users don't know how to write list parameters. So add this feature to the monitoring doc too.

### Why are the changes needed?
Make the doc clearer.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
With the REST request
```
http://localhost:4040/api/v1/applications/local-1629432414554/stages/0?details=true&taskStatus=FAILED
```
the `tasks` field of the response contains only the failed task:
```
"tasks" : {
  "0" : {
    "taskId" : 0,
    "index" : 0,
    "attempt" : 0,
    "launchTime" : "2021-08-20T04:06:55.515GMT",
    "duration" : 273,
    "executorId" : "driver",
    "host" : "host",
    "status" : "FAILED",
    "taskLocality" : "PROCESS_LOCAL",
    "speculative" : false,
    "accumulatorUpdates" : [ ],
    "errorMessage" : "java.lang.RuntimeException\n\tat org.apache.spark.ui.UISuite.$anonfun$new$8(UISuite.scala:95)\n\tat scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)\n\tat scala.collection.Iterator.foreach(Iterator.scala:943)\n\tat scala.collection.Iterator.foreach$(Iterator.scala:943)\n\tat org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1003)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1003)\n\tat org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2254)\n\tat org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)\n\tat org.apache.spark.scheduler.Task.run(Task.scala:136)\n\tat org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507)\n\tat org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)\n\tat org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n",
    "taskMetrics" : {
      "executorDeserializeTime" : 0,
      "executorDeserializeCpuTime" : 0,
      "executorRunTime" : 206,
      "executorCpuTime" : 0,
      "resultSize" : 0,
      "jvmGcTime" : 0,
      "resultSerializationTime" : 0,
      "memoryBytesSpilled" : 0,
      "diskBytesSpilled" : 0,
      "peakExecutionMemory" : 0,
      "inputMetrics" : { "bytesRead" : 0, "recordsRead" : 0 },
      "outputMetrics" : { "bytesWritten" : 0, "recordsWritten" : 0 },
      "shuffleReadMetrics" : { "remoteBlocksFetched" : 0, "localBlocksFetched" : 0, "fetchWaitTime" : 0, "remoteBytesRead" : 0, "remoteBytesReadToDisk" : 0, "localBytesRead" : 0, "recordsRead" : 0 },
      "shuffleWriteMetrics" : { "bytesWritten" : 0, "writeTime" : 0, "recordsWritten" : 0 }
    },
    "executorLogs" : { },
    "schedulerDelay" : 67,
    "gettingResultTime" : 0
  }
},
```
With the REST request
```
http://localhost:4040/api/v1/applications/local-1629432414554/stages/0?details=true&taskStatus=FAILED&taskStatus=SUCCESS
```
the `tasks` field of the response contains both the succeeded and the failed task:
```
"tasks" : {
  "1" : {
    "taskId" : 1,
    "index" : 1,
    "attempt" : 0,
    "launchTime" : "2021-08-20T04:06:55.786GMT",
    "duration" : 16,
    "executorId" : "driver",
    "host" : "host",
    "status" : "SUCCESS",
    "taskLocality" : "PROCESS_LOCAL",
    "speculative" : false,
    "accumulatorUpdates" : [ ],
    "taskMetrics" : {
      "executorDeserializeTime" : 2,
      "executorDeserializeCpuTime" : 2638000,
      "executorRunTime" : 2,
      "executorCpuTime" : 1993000,
      "resultSize" : 837,
      "jvmGcTime" : 0,
      "resultSerializationTime" : 0,
      "memoryBytesSpilled" : 0,
      "diskBytesSpilled" : 0,
      "peakExecutionMemory" : 0,
      "inputMetrics" : { "bytesRead" : 0, "recordsRead" : 0 },
      "outputMetrics" : { "bytesWritten" : 0, "recordsWritten" : 0 },
      "shuffleReadMetrics" : { "remoteBlocksFetched" : 0, "localBlocksFetched" : 0, "fetchWaitTime" : 0, "remoteBytesRead" : 0, "remoteBytesReadToDisk" : 0, "localBytesRead" : 0, "recordsRead" : 0 },
      "shuffleWriteMetrics" : { "bytesWritten" : 0, "writeTime" : 0, "recordsWritten" : 0 }
    },
    "executorLogs" : { },
    "schedulerDelay" : 12,
    "gettingResultTime" : 0
  },
  "0" : {
    "taskId" : 0,
    "index" : 0,
    "attempt" : 0,
    "launchTime" : "2021-08-20T04:06:55.515GMT",
    "duration" : 273,
    "executorId" : "driver",
    "host" : "host",
    "status" : "FAILED",
    "taskLocality" : "PROCESS_LOCAL",
    "speculative" : false,
    "accumulatorUpdates" : [ ],
    "errorMessage" : "java.lang.RuntimeException\n\tat org.apache.spark.ui.UISuite.$anonfun$new$8(UISuite.scala:95)\n\tat scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)\n\tat scala.collection.Iterator.foreach(Iterator.scala:943)\n\tat scala.collection.Iterator.foreach$(Iterator.scala:943)\n\tat org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1003)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1003)\n\tat org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2254)\n\tat org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)\n\tat org.apache.spark.scheduler.Task.run(Task.scala:136)\n\tat org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507)\n\tat org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)\n\tat org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n",
    "taskMetrics" : {
      "executorDeserializeTime" : 0,
      "executorDeserializeCpuTime" : 0,
      "executorRunTime" : 206,
      "executorCpuTime" : 0,
      "resultSize" : 0,
      "jvmGcTime" : 0,
      "resultSerializationTime" : 0,
      "memoryBytesSpilled" : 0,
      "diskBytesSpilled" : 0,
      "peakExecutionMemory" : 0,
      "inputMetrics" : { "bytesRead" : 0, "recordsRead" : 0 },
      "outputMetrics" : { "bytesWritten" : 0, "recordsWritten" : 0 },
      "shuffleReadMetrics" : { "remoteBlocksFetched" : 0, "localBlocksFetched" : 0, "fetchWaitTime" : 0, "remoteBytesRead" : 0, "remoteBytesReadToDisk" : 0, "localBytesRead" : 0, "recordsRead" : 0 },
      "shuffleWriteMetrics" : { "bytesWritten" : 0, "writeTime" : 0, "recordsWritten" : 0 }
    },
    "executorLogs" : { },
    "schedulerDelay" : 67,
    "gettingResultTime" : 0
  }
},
```
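For reference, a minimal Scala sketch of issuing the multi-status request above programmatically. This is an illustration only, not part of the test run: the port and application id are the ones from that run and would need to be replaced with your own.

```scala
import scala.io.Source

// Fetch stage 0 of the application, keeping only tasks that either
// succeeded or failed: the taskStatus parameter is repeated once per value.
object MultiTaskStatusExample {
  def main(args: Array[String]): Unit = {
    val url = "http://localhost:4040/api/v1/applications/local-1629432414554" +
      "/stages/0?details=true&taskStatus=SUCCESS&taskStatus=FAILED"
    val source = Source.fromURL(url) // assumes the application UI is reachable
    try println(source.mkString) finally source.close()
  }
}
```

The same repeated-parameter syntax works from curl or any other HTTP client.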
"speculative" : false, "accumulatorUpdates" : [ ], "taskMetrics" : { "executorDeserializeTime" : 2, "executorDeserializeCpuTime" : 2638000, "executorRunTime" : 2, "executorCpuTime" : 1993000, "resultSize" : 837, "jvmGcTime" : 0, "resultSerializationTime" : 0, "memoryBytesSpilled" : 0, "diskBytesSpilled" : 0, "peakExecutionMemory" : 0, "inputMetrics" : { "bytesRead" : 0, "recordsRead" : 0 }, "outputMetrics" : { "bytesWritten" : 0, "recordsWritten" : 0 }, "shuffleReadMetrics" : { "remoteBlocksFetched" : 0, "localBlocksFetched" : 0, "fetchWaitTime" : 0, "remoteBytesRead" : 0, "remoteBytesReadToDisk" : 0, "localBytesRead" : 0, "recordsRead" : 0 }, "shuffleWriteMetrics" : { "bytesWritten" : 0, "writeTime" : 0, "recordsWritten" : 0 } }, "executorLogs" : { }, "schedulerDelay" : 12, "gettingResultTime" : 0 }, "0" : { "taskId" : 0, "index" : 0, "attempt" : 0, "launchTime" : "2021-08-20T04:06:55.515GMT", "duration" : 273, "executorId" : "driver", "host" : "host", "status" : "FAILED", "taskLocality" : "PROCESS_LOCAL", "speculative" : false, "accumulatorUpdates" : [ ], "errorMessage" : "java.lang.RuntimeException\n\tat org.apache.spark.ui.UISuite.$anonfun$new$8(UISuite.scala:95)\n\tat scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)\n\tat scala.collection.Iterator.foreach(Iterator.scala:943)\n\tat scala.collection.Iterator.foreach$(Iterator.scala:943)\n\tat org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1003)\n\tat org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1003)\n\tat org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2254)\n\tat org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)\n\tat org.apache.spark.scheduler.Task.run(Task.scala:136)\n\tat org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:507)\n\tat org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1468)\n\tat org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:510)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n", "taskMetrics" : { "executorDeserializeTime" : 0, "executorDeserializeCpuTime" : 0, "executorRunTime" : 206, "executorCpuTime" : 0, "resultSize" : 0, "jvmGcTime" : 0, "resultSerializationTime" : 0, "memoryBytesSpilled" : 0, "diskBytesSpilled" : 0, "peakExecutionMemory" : 0, "inputMetrics" : { "bytesRead" : 0, "recordsRead" : 0 }, "outputMetrics" : { "bytesWritten" : 0, "recordsWritten" : 0 }, "shuffleReadMetrics" : { "remoteBlocksFetched" : 0, "localBlocksFetched" : 0, "fetchWaitTime" : 0, "remoteBytesRead" : 0, "remoteBytesReadToDisk" : 0, "localBytesRead" : 0, "recordsRead" : 0 }, "shuffleWriteMetrics" : { "bytesWritten" : 0, "writeTime" : 0, "recordsWritten" : 0 } }, "executorLogs" : { }, "schedulerDelay" : 67, "gettingResultTime" : 0 } }, ``` Closes #33793 from AngersZhuuuu/SPARK-36549. Authored-by: Angerszhuuuu Signed-off-by: Hyukjin Kwon --- docs/monitoring.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index b30c8e2110..e54ac5414b 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -477,7 +477,7 @@ can be identified by their `[attempt-id]`. In the API listed below, when running A list of all stages for a given application.
 <br>?status=[active|complete|pending|failed] list only stages in the given state.
 <br>?details=true lists all stages with the task data.
-<br>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING] lists stages only those tasks with the specified task status. Query parameter taskStatus takes effect only when details=true.
+<br>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING] lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when details=true. Multiple taskStatus values are also supported, e.g. ?details=true&taskStatus=SUCCESS&taskStatus=FAILED, which returns all tasks matching any of the specified statuses.
 <br>?withSummaries=true lists stages with task metrics distribution and executor metrics distribution.
 <br>?quantiles=0.0,0.25,0.5,0.75,1.0 summarize the metrics with the given quantiles. Query parameter quantiles takes effect only when withSummaries=true. Default value is 0.0,0.25,0.5,0.75,1.0.
@@ -487,7 +487,7 @@ can be identified by their `[attempt-id]`. In the API listed below, when running
 A list of all attempts for the given stage.
 <br>?details=true lists all attempts with the task data for the given stage.
-<br>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING] lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when details=true.
+<br>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING] lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when details=true. Multiple taskStatus values are also supported, e.g. ?details=true&taskStatus=SUCCESS&taskStatus=FAILED, which returns all tasks matching any of the specified statuses.
 <br>?withSummaries=true lists task metrics distribution and executor metrics distribution of each attempt.
 <br>?quantiles=0.0,0.25,0.5,0.75,1.0 summarize the metrics with the given quantiles. Query parameter quantiles takes effect only when withSummaries=true. Default value is 0.0,0.25,0.5,0.75,1.0.
 <br>Example:
@@ -502,7 +502,7 @@ can be identified by their `[attempt-id]`. In the API listed below, when running
 Details for the given stage attempt.
 <br>?details=true lists all task data for the given stage attempt.
-<br>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING] lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when details=true.
+<br>?taskStatus=[RUNNING|SUCCESS|FAILED|KILLED|PENDING] lists only those tasks with the specified task status. Query parameter taskStatus takes effect only when details=true. Multiple taskStatus values are also supported, e.g. ?details=true&taskStatus=SUCCESS&taskStatus=FAILED, which returns all tasks matching any of the specified statuses.
 <br>?withSummaries=true lists task metrics distribution and executor metrics distribution for the given stage attempt.
 <br>?quantiles=0.0,0.25,0.5,0.75,1.0 summarize the metrics with the given quantiles. Query parameter quantiles takes effect only when withSummaries=true. Default value is 0.0,0.25,0.5,0.75,1.0.
 <br>Example:
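The taskStatus list syntax can also be combined with the other stage query parameters documented above. As a sketch, reusing the application id from the test run in this patch (quantiles takes effect here because withSummaries=true is set):

```
http://localhost:4040/api/v1/applications/local-1629432414554/stages/0?details=true&withSummaries=true&quantiles=0.25,0.5,0.75&taskStatus=SUCCESS&taskStatus=FAILED
```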