[SPARK-31970][CORE] Make MDC configuration step be consistent between setLocalProperty and log4j.properties
### What changes were proposed in this pull request?

This PR proposes to use "mdc.XXX" as the consistent key for both `sc.setLocalProperty` and `log4j.properties` when setting up configurations for MDC.

### Why are the changes needed?

It is inconsistent that we use "mdc.XXX" as the key to set an MDC value via `sc.setLocalProperty`, while we use the bare "XXX" as the key to reference that value in the `log4j.properties` pattern. It also puts an extra burden on the user.

### Does this PR introduce _any_ user-facing change?

No, as the MDC feature was added in version 3.1, which has not been released yet.

### How was this patch tested?

Tested manually.

Closes #28801 from Ngone51/consistent-mdc.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Commit 54e702c0dd (parent 1e40bccf44)
```diff
@@ -323,10 +323,7 @@ private[spark] class Executor(
     val threadName = s"Executor task launch worker for task $taskId"
     val taskName = taskDescription.name
     val mdcProperties = taskDescription.properties.asScala
-      .filter(_._1.startsWith("mdc.")).map { item =>
-        val key = item._1.substring(4)
-        (key, item._2)
-      }.toSeq
+      .filter(_._1.startsWith("mdc.")).toSeq

     /** If specified, this task has been killed and this option contains the reason. */
     @volatile private var reasonIfKilled: Option[String] = None
```
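For context, the hunk above replaces prefix-stripping with keeping the full key. A minimal runnable sketch of the two behaviors, using a plain `java.util.Properties` as a stand-in for `taskDescription.properties` (the `MdcKeysExample` object and its property values are hypothetical, for illustration only):

```scala
import java.util.Properties
import scala.collection.JavaConverters._

object MdcKeysExample {
  // Pre-commit behavior: strip the "mdc." prefix, so log4j patterns had to use %X{XXX}.
  def strippedKeys(props: Properties): Seq[(String, String)] =
    props.asScala.filter(_._1.startsWith("mdc.")).map { item =>
      (item._1.substring(4), item._2)
    }.toSeq

  // Post-commit behavior: keep the full "mdc.XXX" key, matching %X{mdc.XXX} in log4j.
  def fullKeys(props: Properties): Seq[(String, String)] =
    props.asScala.filter(_._1.startsWith("mdc.")).toSeq

  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.setProperty("mdc.appId", "app-42")        // picked up for MDC
    props.setProperty("spark.job.description", "x") // ignored: no "mdc." prefix
    println(strippedKeys(props)) // keys like "appId"
    println(fullKeys(props))     // keys like "mdc.appId"
  }
}
```

With the new behavior, the key a user passes to `setLocalProperty` is exactly the key that appears in the MDC, so no mental translation is needed between the two configuration sites.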
```diff
@@ -705,7 +702,7 @@ private[spark] class Executor(
     MDC.clear()
     mdc.foreach { case (key, value) => MDC.put(key, value) }
     // avoid overriding the takName by the user
-    MDC.put("taskName", taskName)
+    MDC.put("mdc.taskName", taskName)
   }

   /**
```
```diff
@@ -2955,11 +2955,11 @@ Spark uses [log4j](http://logging.apache.org/log4j/) for logging. You can config
 `log4j.properties` file in the `conf` directory. One way to start is to copy the existing
 `log4j.properties.template` located there.
 
-By default, Spark adds 1 record to the MDC (Mapped Diagnostic Context): `taskName`, which shows something
-like `task 1.0 in stage 0.0`. You can add `%X{taskName}` to your patternLayout in
+By default, Spark adds 1 record to the MDC (Mapped Diagnostic Context): `mdc.taskName`, which shows something
+like `task 1.0 in stage 0.0`. You can add `%X{mdc.taskName}` to your patternLayout in
 order to print it in the logs.
-Moreover, you can use `spark.sparkContext.setLocalProperty("mdc." + name, "value")` to add user specific data into MDC.
-The key in MDC will be the string after the `mdc.` prefix.
+Moreover, you can use `spark.sparkContext.setLocalProperty(s"mdc.$name", "value")` to add user specific data into MDC.
+The key in MDC will be the string of "mdc.$name".
 
 # Overriding configuration directory
```
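Putting the two sides together: after this change the same `mdc.`-prefixed string is used both in `sc.setLocalProperty` and in the log4j pattern. A hypothetical `conf/log4j.properties` fragment illustrating the pairing (the `mdc.username` key and appender name are made up for this example; only `mdc.taskName` is supplied by Spark):

```properties
# %X{...} pulls values out of the MDC. "mdc.taskName" comes from Spark;
# "mdc.username" would be set by the user via
# sc.setLocalProperty("mdc.username", "alice").
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %X{mdc.taskName} %X{mdc.username} %c: %m%n
```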