[SPARK-14180][CORE] Fix a deadlock in CoarseGrainedExecutorBackend Shutdown

## What changes were proposed in this pull request?

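Call `executor.stop()` in a new thread instead of on the RpcEnv thread that handles `Shutdown`. `executor.stop()` calls `SparkEnv.stop()`, which waits until RpcEnv has stopped completely, and RpcEnv cannot stop while one of its own threads is still blocked inside that call, so the two wait on each other forever.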

## How was this patch tested?

Existing unit tests

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #12012 from zsxwing/SPARK-14180.
Shixiong Zhu 2016-03-28 16:23:29 -07:00 committed by Andrew Or
parent 328c71161b
commit 34c0638ee6

@@ -113,9 +113,15 @@ private[spark] class CoarseGrainedExecutorBackend(
     case Shutdown =>
       stopping.set(true)
-      executor.stop()
-      stop()
-      rpcEnv.shutdown()
+      new Thread("CoarseGrainedExecutorBackend-stop-executor") {
+        override def run(): Unit = {
+          // executor.stop() will call `SparkEnv.stop()` which waits until RpcEnv stops totally.
+          // However, if `executor.stop()` runs in some thread of RpcEnv, RpcEnv won't be able to
+          // stop until `executor.stop()` returns, which becomes a dead-lock (See SPARK-14180).
+          // Therefore, we put this line in a new thread.
+          executor.stop()
+        }
+      }.start()
   }
   override def onDisconnected(remoteAddress: RpcAddress): Unit = {
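
The deadlock described in the comment above can be reproduced in miniature without Spark. The following is a minimal sketch, not Spark code: a single-threaded `ExecutorService` stands in for an RpcEnv dispatcher thread, and `StopOffLoopSketch`, `stopFromLoopThread`, and `stopFromHelperThread` are hypothetical names chosen for illustration. The first method waits for the loop to terminate from inside the loop's own thread (bounded by a timeout so the sketch returns instead of hanging); the second hands the wait to a separate thread, which is the shape of the fix in this patch.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical stand-in for the RpcEnv situation; none of these names exist in Spark.
object StopOffLoopSketch {

  // Shuts the loop down and waits for it FROM its own worker thread.
  // Returns whether the loop terminated before the timeout.
  def stopFromLoopThread(): Boolean = {
    val loop = Executors.newSingleThreadExecutor()
    val terminated = new AtomicBoolean(true)
    val done = new CountDownLatch(1)
    loop.execute(new Runnable {
      override def run(): Unit = {
        loop.shutdown()
        // The loop cannot terminate while this task is still running on its only
        // thread, so this wait can only time out. Without the timeout it would hang.
        terminated.set(loop.awaitTermination(200, TimeUnit.MILLISECONDS))
        done.countDown()
      }
    })
    done.await()
    terminated.get()
  }

  // Same shutdown, but the blocking wait runs on a separate helper thread,
  // so the loop's worker thread is free to finish its task and exit.
  def stopFromHelperThread(): Boolean = {
    val loop = Executors.newSingleThreadExecutor()
    val terminated = new AtomicBoolean(false)
    val done = new CountDownLatch(1)
    loop.execute(new Runnable {
      override def run(): Unit = {
        new Thread("stop-helper") {
          override def run(): Unit = {
            loop.shutdown()
            terminated.set(loop.awaitTermination(5, TimeUnit.SECONDS))
            done.countDown()
          }
        }.start()
        // The handler returns immediately, releasing the loop thread.
      }
    })
    done.await()
    terminated.get()
  }
}
```

Under these assumptions, `stopFromLoopThread()` reports that the loop did not terminate while its own thread was waiting on it, and `stopFromHelperThread()` reports a clean termination. The real patch differs in detail: it moves only `executor.stop()` to the new thread and relies on Spark's own shutdown path for the rest.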