spark-instrumented-optimizer/resource-managers/yarn
Angerszhuuuu 4d9e577694 [SPARK-36650][YARN] ApplicationMaster shutdown hook should catch timeout exception
### What changes were proposed in this pull request?
In yarn-cluster mode, the ApplicationMaster's shutdown hook runs after SparkContext is stopped.
If the hook hits a timeout, an exception is thrown from it and the program exits with code 1, even though the job actually succeeded. The shutdown hook should catch this exception.
```
21/09/02 12:36:55 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(FutureTask.java:205)
	at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
	at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
21/09/02 12:36:55 ERROR Utils: Uncaught exception in thread shutdown-hook-0
java.io.InterruptedIOException: Call interrupted
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1569)
	at org.apache.hadoop.ipc.Client.call(Client.java:1521)
	at org.apache.hadoop.ipc.Client.call(Client.java:1418)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:251)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:130)
	at com.sun.proxy.$Proxy21.finishApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:92)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy22.finishApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:479)
	at org.apache.spark.deploy.yarn.YarnRMClient.unregister(YarnRMClient.scala:90)
	at org.apache.spark.deploy.yarn.ApplicationMaster.unregister(ApplicationMaster.scala:384)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl$1.apply$mcV$sp(ApplicationMaster.scala:313)
	at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1992)
	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
	at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
	at scala.util.Try$.apply(Try.scala:192)
	at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
	at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

```
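
For illustration only, here is a minimal Java sketch of the general idea of guarding a shutdown hook body so that a failure inside it is logged rather than propagated. It is not the code of this patch: the class `GuardedShutdownHook`, the helper `guarded`, and the simulated `unregister` action are hypothetical names, and the real hook lives in Spark's Scala `ApplicationMaster`.

```java
import java.io.InterruptedIOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; not taken from the Spark code base.
final class GuardedShutdownHook {

    // Wraps a hook body so that no exception escapes it.
    // The last failure is recorded in `lastFailure` so callers can inspect it.
    static Runnable guarded(Runnable body, AtomicReference<Throwable> lastFailure) {
        return () -> {
            try {
                body.run();
            } catch (Throwable t) {
                // Log and swallow: a failed cleanup should not change the exit status.
                lastFailure.set(t);
                System.err.println("Ignoring exception in shutdown hook: " + t);
            }
        };
    }

    public static void main(String[] args) {
        AtomicReference<Throwable> failure = new AtomicReference<>();

        // Simulated unregister call that fails the way the log above shows
        // (assumption: the interrupted RPC surfaces as an InterruptedIOException).
        Runnable unregister = () -> {
            throw new UncheckedIOException(new InterruptedIOException("Call interrupted"));
        };

        // Running the guarded body directly shows that nothing propagates.
        guarded(unregister, failure).run();
        System.out.println("Hook finished; failure recorded: " + (failure.get() != null));
    }
}
```

In a real program the guarded body would be registered with `Runtime.getRuntime().addShutdownHook(new Thread(...))` or a framework's shutdown hook manager. Catching `Throwable` is a deliberately broad choice for this sketch; a narrower catch may be preferable where the expected failure types are known.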

### Why are the changes needed?
To return the correct exit code, so that a successful job is not reported as failed.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Not needed.

Closes #33897 from AngersZhuuuu/SPARK-36650.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-09-03 11:42:53 +09:00
src [SPARK-36650][YARN] ApplicationMaster shutdown hook should catch timeout exception 2021-09-03 11:42:53 +09:00
pom.xml [SPARK-36067][BUILD][TEST][YARN] YarnClusterSuite fails due to NoClassDefFoundError unless hadoop-3.2 profile is activated explicitly 2021-07-09 15:18:52 +09:00