[SPARK-29247][SQL] Redact sensitive information when constructing HiveClientImpl.state
### What changes were proposed in this pull request?

HiveClientImpl may log sensitive information such as URLs, secrets, and tokens:

```scala
logDebug(
  s"""
     |Applying Hadoop/Hive/Spark and extra properties to Hive Conf:
     |$k=${if (k.toLowerCase(Locale.ROOT).contains("password")) "xxx" else v}
   """.stripMargin)
```

This patch redacts those values using `SQLConf.get.redactOptions`. A new overload is added so that redaction also works when key/value pairs are processed one at a time.

### Why are the changes needed?

To redact sensitive information when constructing `HiveClientImpl`.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manual test: run `/sbin/start-thriftserver.sh`. The log then shows:

```
19/09/28 08:27:02 main DEBUG HiveClientImpl: Applying Hadoop/Hive/Spark and extra properties to Hive Conf:
hive.druid.metadata.password=*********(redacted)
```

Closes #25954 from AngersZhuuuu/SPARK-29247.

Authored-by: angerszhu <angers.zhu@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
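The description above relies on `SQLConf.get.redactOptions` to mask values whose keys match Spark's redaction regex. As a rough, standalone sketch (not Spark's actual implementation; the default pattern is assumed here to be something like `(?i)secret|password|token`), the idea is:

```scala
// Hypothetical standalone sketch of redactOptions-style masking for a
// key/value map. RedactionSketch and its pattern are illustrative names,
// not Spark APIs.
object RedactionSketch {
  // Assumed redaction pattern; Spark's real one comes from spark.redaction.regex.
  private val redactionPattern = "(?i)secret|password|token".r
  val RedactedText = "*********(redacted)"

  // Replace the value of every entry whose KEY matches the pattern.
  def redactOptions(options: Map[String, String]): Map[String, String] =
    options.map { case (k, v) =>
      if (redactionPattern.findFirstIn(k).isDefined) k -> RedactedText
      else k -> v
    }
}
```

Matching on the key (rather than the value) is what lets a single map-level call cover every sensitive property at once, instead of repeating an ad-hoc check at each log site.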
This commit is contained in: parent 31700116d2 · commit 1d4b2f010b
```diff
@@ -59,6 +59,7 @@ import org.apache.spark.sql.execution.command.DDLUtils
 import org.apache.spark.sql.hive.HiveExternalCatalog.{DATASOURCE_SCHEMA, DATASOURCE_SCHEMA_NUMPARTS, DATASOURCE_SCHEMA_PART_PREFIX}
 import org.apache.spark.sql.hive.HiveUtils
 import org.apache.spark.sql.hive.client.HiveClientImpl._
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 import org.apache.spark.util.{CircularBuffer, Utils}
@@ -176,14 +177,15 @@ private[hive] class HiveClientImpl(
     // has hive-site.xml. So, HiveConf will use that to override its default values.
     // 2: we set all spark confs to this hiveConf.
     // 3: we set all entries in config to this hiveConf.
-    (hadoopConf.iterator().asScala.map(kv => kv.getKey -> kv.getValue)
-      ++ sparkConf.getAll.toMap ++ extraConfig).foreach { case (k, v) =>
+    val confMap = (hadoopConf.iterator().asScala.map(kv => kv.getKey -> kv.getValue) ++
+      sparkConf.getAll.toMap ++ extraConfig).toMap
+    confMap.foreach { case (k, v) => hiveConf.set(k, v) }
+    SQLConf.get.redactOptions(confMap).foreach { case (k, v) =>
       logDebug(
         s"""
            |Applying Hadoop/Hive/Spark and extra properties to Hive Conf:
-           |$k=${if (k.toLowerCase(Locale.ROOT).contains("password")) "xxx" else v}
+           |$k=$v
          """.stripMargin)
-      hiveConf.set(k, v)
     }
     // Disable CBO because we removed the Calcite dependency.
     hiveConf.setBoolean("hive.cbo.enable", false)
```
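The removed line checked each key for the literal substring `"password"`, while the new code redacts the whole map at once. A short sketch (with hypothetical key names) of why the old per-key check was too narrow:

```scala
import java.util.Locale

// The pre-patch masking logic, as it appeared on the removed diff line:
// only keys containing the literal "password" were masked.
def oldMask(k: String, v: String): String =
  if (k.toLowerCase(Locale.ROOT).contains("password")) "xxx" else v

// A password-like key is masked...
println(oldMask("hive.druid.metadata.password", "hunter2"))
// ...but a secret- or token-like key (hypothetical name) leaks verbatim.
println(oldMask("fs.s3a.secret.key", "AKIA-example"))
```

Delegating to `SQLConf.get.redactOptions` instead keeps the masking rule in one configurable place rather than scattered across log sites.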