[SPARK-16379][CORE][MESOS] Spark on mesos is broken due to race condition in Logging
## What changes were proposed in this pull request?
The commit 044971eca0
introduced a lazy val to simplify code in Logging. Simple enough, though one side effect is that accessing log now means grabbing the instance's lock. This in turn turned up a form of deadlock in the Mesos code. It was arguably a bit of a problem in how this code is structured, but, in any event the safest thing to do seems to be to revert the commit, and that's 90% of the change here; it's just not worth the risk of similar more subtle issues.
What I didn't revert here was the removal of this odd override of log in the Mesos code. In retrospect it might have been put in place at some stage as a defense against this type of problem. After all the Logging code still involved a lock at initialization before the change in question.
Even after the revert, it doesn't seem like it does anything, given how Logging works now, so I left it removed. However, I also removed the particular log message that ended up playing a part in this problem anyway, maybe being paranoid, to make sure this type of problem can't happen even with how the current locking works in logging initialization.
## How was this patch tested?
Jenkins tests
Author: Sean Owen <sowen@cloudera.com>
Closes #14069 from srowen/SPARK-16379.
This commit is contained in:
parent
040f6f9f46
commit
a8f89df3b3
|
@ -32,10 +32,7 @@ private[spark] trait Logging {
|
|||
|
||||
// Make the log field transient so that objects with Logging can
|
||||
// be serialized and used on another machine
|
||||
@transient lazy val log: Logger = {
|
||||
initializeLogIfNecessary(false)
|
||||
LoggerFactory.getLogger(logName)
|
||||
}
|
||||
@transient private var log_ : Logger = null
|
||||
|
||||
// Method to get the logger name for this object
|
||||
protected def logName = {
|
||||
|
@ -43,6 +40,15 @@ private[spark] trait Logging {
|
|||
this.getClass.getName.stripSuffix("$")
|
||||
}
|
||||
|
||||
// Method to get or create the logger for this object
|
||||
protected def log: Logger = {
|
||||
if (log_ == null) {
|
||||
initializeLogIfNecessary(false)
|
||||
log_ = LoggerFactory.getLogger(logName)
|
||||
}
|
||||
log_
|
||||
}
|
||||
|
||||
// Log methods that take only a String
|
||||
protected def logInfo(msg: => String) {
|
||||
if (log.isInfoEnabled) log.info(msg)
|
||||
|
|
|
@ -244,7 +244,6 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
|
|||
d: org.apache.mesos.SchedulerDriver, frameworkId: FrameworkID, masterInfo: MasterInfo) {
|
||||
appId = frameworkId.getValue
|
||||
mesosExternalShuffleClient.foreach(_.init(appId))
|
||||
logInfo("Registered as framework ID " + appId)
|
||||
markRegistered()
|
||||
}
|
||||
|
||||
|
|
Loading…
Reference in a new issue