[SPARK-14180][CORE] Fix a deadlock in CoarseGrainedExecutorBackend Shutdown
## What changes were proposed in this pull request?

Call `executor.stop` in a new thread to eliminate deadlock.

## How was this patch tested?

Existing unit tests

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #12012 from zsxwing/SPARK-14180.
parent 328c71161b
commit 34c0638ee6
@@ -113,9 +113,15 @@ private[spark] class CoarseGrainedExecutorBackend(
 
     case Shutdown =>
       stopping.set(true)
-      executor.stop()
-      stop()
-      rpcEnv.shutdown()
+      new Thread("CoarseGrainedExecutorBackend-stop-executor") {
+        override def run(): Unit = {
+          // executor.stop() will call `SparkEnv.stop()` which waits until RpcEnv stops totally.
+          // However, if `executor.stop()` runs in some thread of RpcEnv, RpcEnv won't be able to
+          // stop until `executor.stop()` returns, which becomes a dead-lock (See SPARK-14180).
+          // Therefore, we put this line in a new thread.
+          executor.stop()
+        }
+      }.start()
   }
 
   override def onDisconnected(remoteAddress: RpcAddress): Unit = {
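
The deadlock described in the code comment can be reproduced in miniature without Spark. The sketch below is an illustration under stated assumptions, not the Spark implementation: a single-threaded `ExecutorService` stands in for the RpcEnv thread, and the hypothetical helper `stopAndAwait` stands in for `executor.stop()` (shut down, then wait for termination). The names `ShutdownDeadlockSketch`, `stopInline`, and `stopInNewThread` are invented for this example, and a timeout is used so the inline case returns instead of hanging forever.

```scala
import java.util.concurrent.{Callable, CountDownLatch, ExecutorService, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

object ShutdownDeadlockSketch {
  // Hypothetical stand-in for `executor.stop()`: shut the pool down and
  // wait for it to terminate. Returns true if it terminated in time.
  def stopAndAwait(pool: ExecutorService, timeoutMs: Long): Boolean = {
    pool.shutdown()
    pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)
  }

  // Old behaviour: the stop call runs on the pool's own thread, so the pool
  // cannot terminate until the very task that is waiting for it returns.
  def stopInline(timeoutMs: Long): Boolean = {
    val pool = Executors.newSingleThreadExecutor()
    val result = pool.submit(new Callable[Boolean] {
      override def call(): Boolean = stopAndAwait(pool, timeoutMs)
    })
    result.get()
  }

  // Patched behaviour: the pool's task only starts a new thread and returns,
  // so the pool is free to terminate while the new thread waits for it.
  def stopInNewThread(timeoutMs: Long): Boolean = {
    val pool = Executors.newSingleThreadExecutor()
    val done = new CountDownLatch(1)
    val terminated = new AtomicBoolean(false)
    pool.execute(new Runnable {
      override def run(): Unit = {
        new Thread("stop-sketch") {
          override def run(): Unit = {
            terminated.set(stopAndAwait(pool, timeoutMs))
            done.countDown()
          }
        }.start()
      }
    })
    done.await()
    terminated.get()
  }

  def main(args: Array[String]): Unit = {
    println(s"inline stop terminated: ${stopInline(200)}")
    println(s"new-thread stop terminated: ${stopInNewThread(2000)}")
  }
}
```

In this sketch the inline variant should time out, because the waiting task is the one thing keeping the pool alive. The new-thread variant should terminate promptly, which mirrors why the patch moves `executor.stop()` off the RpcEnv thread.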