### What changes were proposed in this pull request?
Run sql in spark thrift server, each session 's thrift server about method will be called in one thread, but when running query statement, we have two mode:
1. sync
2. async
5a482e7209/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala (L205-L238)
In sync mode, we just submit query in current session's corresponding thread and wait Spark to running query and return result, and the query method will always wait for query return.
In async mode, in SparkExecuteStatementOperation, we will submit query in a backend thread pool, and update operation state, after submitted to backend thread poll, ExecuteStatement method will return a OperationHandle to client side, and client side will request operation status continuously. after backend thread running sql and return , it will update corresponding operation status, when client got operation status is final status, it will got error or start fetching result of this operation.
When we use pyhive connect to SparkThriftServer, it will run statement in sync mode.
When we query data of hive table , it will check serde class in HiveTableScanExec#addColumnMetadataToConf
5a482e7209/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala (L123)
```
public Class<? extends Deserializer> getDeserializerClass() {
try {
return Class.forName(this.getSerdeClassName(), true, Utilities.getSessionSpecifiedClassLoader());
} catch (ClassNotFoundException var2) {
throw new RuntimeException(var2);
}
}
public static ClassLoader getSessionSpecifiedClassLoader() {
SessionState state = SessionState.get();
if (state != null && state.getConf() != null) {
ClassLoader sessionCL = state.getConf().getClassLoader();
if (sessionCL != null) {
if (LOG.isTraceEnabled()) {
LOG.trace("Use session specified class loader");
}
return sessionCL;
} else {
if (LOG.isDebugEnabled()) {
LOG.debug("Session specified class loader not found, use thread based class loader");
}
return JavaUtils.getClassLoader();
}
} else {
if (LOG.isDebugEnabled()) {
LOG.debug("Hive Conf not found or Session not initiated, use thread based class loader instead");
}
return JavaUtils.getClassLoader();
}
}
```
Since we run statement in sync mode, it will use HiveSession's SessionState, and use it's conf's classLoader. then error happened.
```
Current operation state RUNNING_STATE,
java.lang.RuntimeException: java.lang.ClassNotFoundException:
xxx.xxx.xxxJsonSerDe
at org.apache.hadoop.hive.ql.plan.TableDesc.getDeserializerClass(TableDesc.java:74)
at org.apache.spark.sql.hive.execution.HiveTableScanExec.addColumnMetadataToConf(HiveTableScanExec.scala:123)
at org.apache.spark.sql.hive.execution.HiveTableScanExec.hadoopConf$lzycompute(HiveTableScanExec.scala:101)
at org.apache.spark.sql.hive.execution.HiveTableScanExec.hadoopConf(HiveTableScanExec.scala:98)
at org.apache.spark.sql.hive.execution.HiveTableScanExec.org$apache$spark$sql$hive$execution$HiveTableScanExec$$hadoopReader$lzycompute(HiveTableScanExec.scala:110)
at org.apache.spark.sql.hive.execution.HiveTableScanExec.org$apache$spark$sql$hive$execution$HiveTableScanExec$$hadoopReader(HiveTableScanExec.scala:105)
at org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$11.apply(HiveTableScanExec.scala:192)
at org.apache.spark.sql.hive.execution.HiveTableScanExec$$anonfun$11.apply(HiveTableScanExec.scala:192)
```
We should reset it when we start run sql in sync mode.
### Why are the changes needed?
Fix bug
### Does this PR introduce any user-facing change?
NO
### How was this patch tested?
UT
Closes#26141 from AngersZhuuuu/add_jar_in_sync_mode.
Lead-authored-by: angerszhu <angers.zhu@gmail.com>
Co-authored-by: AngersZhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
This module provides support for executing relational queries expressed in either SQL or the DataFrame/Dataset API.
Spark SQL is broken up into four subprojects:
Catalyst (sql/catalyst) - An implementation-agnostic framework for manipulating trees of relational operators and expressions.
Execution (sql/core) - A query planner / execution engine for translating Catalyst's logical query plans into Spark RDDs. This component also includes a new public interface, SQLContext, that allows users to execute SQL or LINQ statements against existing RDDs and Parquet files.
Hive Support (sql/hive) - Includes extensions that allow users to write queries using a subset of HiveQL and access data from a Hive Metastore using Hive SerDes. There are also wrappers that allow users to run queries that include Hive UDFs, UDAFs, and UDTFs.
HiveServer and CLI support (sql/hive-thriftserver) - Includes support for the SQL CLI (bin/spark-sql) and a HiveServer2 (for JDBC/ODBC) compatible server.
Running ./sql/create-docs.sh generates SQL documentation for built-in functions under sql/site, and SQL configuration documentation that gets included as part of configuration.md in the main docs directory.