[SPARK-36400][SPARK-36398][SQL][WEBUI] Make ThriftServer recognize spark.sql.redaction.string.regex

### What changes were proposed in this pull request?

This PR fixes an issue where ThriftServer does not honor `spark.sql.redaction.string.regex`.
As a result, sensitive information included in queries can be exposed in the web UI, as the screenshots below show.
![thrift-password1](https://user-images.githubusercontent.com/4736016/129440772-46379cc5-987b-41ac-adce-aaf2139f6955.png)
![thrift-password2](https://user-images.githubusercontent.com/4736016/129440775-fd328c0f-d128-4a20-82b0-46c331b9fd64.png)

### Why are the changes needed?

Bug fix.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Ran ThriftServer, connected to it, and executed `CREATE TABLE mytbl2(a int) OPTIONS(url="jdbc:mysql//example.com:3306", driver="com.mysql.jdbc.Driver", dbtable="test_tbl", user="test_usr", password="abcde");` with `spark.sql.redaction.string.regex=((?i)(?<=password=))(".*")|('.*')`.
Then confirmed in the UI that the password is redacted.

![thrift-hide-password1](https://user-images.githubusercontent.com/4736016/129440863-cabea247-d51f-41a4-80ac-6c64141e1fb7.png)
![thrift-hide-password2](https://user-images.githubusercontent.com/4736016/129440874-96cd0f0c-720b-4010-968a-cffbc85d2be5.png)
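
For readers who want to check the pattern without starting ThriftServer, the following is a minimal standalone sketch of what the redaction in the test above should do to the statement text. It is not Spark's implementation: `RedactionSketch`, its `redact` helper, and the `ReplacementText` value are hypothetical stand-ins that only assume the real helper takes an optional regex and replaces every match with a fixed placeholder.

```scala
import scala.util.matching.Regex

object RedactionSketch {
  // Assumed placeholder text; the string Spark actually uses may differ.
  val ReplacementText = "*********(redacted)"

  // The pattern and statement from the manual test described above.
  val pattern: Regex = """((?i)(?<=password=))(".*")|('.*')""".r
  val statement: String =
    """CREATE TABLE mytbl2(a int) OPTIONS(url="jdbc:mysql//example.com:3306", driver="com.mysql.jdbc.Driver", dbtable="test_tbl", user="test_usr", password="abcde");"""

  // Hypothetical stand-in for the redact helper used in this change:
  // with no pattern configured, the text is returned unchanged.
  def redact(regex: Option[Regex], text: String): String =
    regex match {
      case Some(r) if text != null && text.nonEmpty =>
        // quoteReplacement keeps the placeholder literal even if it
        // ever contained '$' or '\' characters.
        r.replaceAllIn(text, Regex.quoteReplacement(ReplacementText))
      case _ => text
    }

  def main(args: Array[String]): Unit = {
    println(redact(Some(pattern), statement)) // password value replaced
    println(redact(None, statement))          // unchanged
  }
}
```

Note that `(".*")` is greedy, so in this sketch the match runs from the quote after `password=` to the last double quote in the statement. That is harmless here because `password` is the final option.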

Closes #33743 from sarutak/thrift-redact.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Kousuke Saruta 2021-08-18 13:31:22 +09:00
parent 7fb8ea319e
commit b914ff7d54


@@ -186,10 +186,11 @@ private[hive] class SparkExecuteStatementOperation(
   override def runInternal(): Unit = {
     setState(OperationState.PENDING)
     logInfo(s"Submitting query '$statement' with $statementId")
+    val redactedStatement = SparkUtils.redact(sqlContext.conf.stringRedactionPattern, statement)
     HiveThriftServer2.eventManager.onStatementStart(
       statementId,
       parentSession.getSessionHandle.getSessionId.toString,
-      statement,
+      redactedStatement,
       statementId,
       parentSession.getUsername)
     setHasResultSet(true) // avoid no resultset for async run