### What changes were proposed in this pull request?
This PR fixes an issue that `ADD JAR` command can't add jar files which contain whitespaces in the path though `ADD FILE` and `ADD ARCHIVE` work with such files.
If we have `/some/path/test file.jar` and execute the following command:
```
ADD JAR "/some/path/test file.jar";
```
The following exception is thrown.
```
21/04/05 10:40:38 ERROR SparkSQLDriver: Failed in [add jar "/some/path/test file.jar"]
java.lang.IllegalArgumentException: Illegal character in path at index 9: /some/path/test file.jar
at java.net.URI.create(URI.java:852)
at org.apache.spark.sql.hive.HiveSessionResourceLoader.addJar(HiveSessionStateBuilder.scala:129)
at org.apache.spark.sql.execution.command.AddJarCommand.run(resources.scala:34)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
```
This is because `HiveSessionStateBuilder` and `SessionStateBuilder` don't check whether the form of the path is URI or plain path and it always regards the path as URI form.
Whitespces should be encoded to `%20` so `/some/path/test file.jar` is rejected.
We can resolve this part by checking whether the given path is URI form or not.
Unfortunatelly, if we fix this part, another problem occurs.
When we execute `ADD JAR` command, Hive's `ADD JAR` command is executed in `HiveClientImpl.addJar` and `AddResourceProcessor.run` is transitively invoked.
In `AddResourceProcessor.run`, the command line is just split by `
s+` and the path is also split into `/some/path/test` and `file.jar` and passed to `ss.add_resources`.
f1e8713703/ql/src/java/org/apache/hadoop/hive/ql/processors/AddResourceProcessor.java (L56-L75)
So, the command still fails.
Even if we convert the form of the path to URI like `file:/some/path/test%20file.jar` and execute the following command:
```
ADD JAR "file:/some/path/test%20file";
```
The following exception is thrown.
```
21/04/05 10:40:53 ERROR SessionState: file:/some/path/test%20file.jar does not exist
java.lang.IllegalArgumentException: file:/some/path/test%20file.jar does not exist
at org.apache.hadoop.hive.ql.session.SessionState.validateFiles(SessionState.java:1168)
at org.apache.hadoop.hive.ql.session.SessionState$ResourceType.preHook(SessionState.java:1289)
at org.apache.hadoop.hive.ql.session.SessionState$ResourceType$1.preHook(SessionState.java:1278)
at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1378)
at org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1336)
at org.apache.hadoop.hive.ql.processors.AddResourceProcessor.run(AddResourceProcessor.java:74)
```
The reason is `Utilities.realFile` invoked in `SessionState.validateFiles` returns `null` as the result of `fs.exists(path)` is `false`.
f1e8713703/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java (L1052-L1064)
`fs.exists` checks the existence of the given path by comparing the string representation of Hadoop's `Path`.
The string representation of `Path` is similar to URI but it's actually different.
`Path` doesn't encode the given path.
For example, the URI form of `/some/path/jar file.jar` is `file:/some/path/jar%20file.jar` but the `Path` form of it is `file:/some/path/jar file.jar`. So `fs.exists` returns false.
So the solution I come up with is removing Hive's `ADD JAR` from `HiveClientimpl.addJar`.
I think Hive's `ADD JAR` was used to add jar files to the class loader for metadata and isolate the class loader from the one for execution.
https://github.com/apache/spark/pull/6758/files#diff-cdb07de713c84779a5308f65be47964af865e15f00eb9897ccf8a74908d581bbR94-R103
But, as of SPARK-10810 and SPARK-10902 (#8909) are resolved, the class loaders for metadata and execution seem to be isolated with different way.
https://github.com/apache/spark/pull/8909/files#diff-8ef7cabf145d3fe7081da799fa415189d9708892ed76d4d13dd20fa27021d149R635-R641
In the current implementation, such class loaders seem to be isolated by `SharedState.jarClassLoader` and `IsolatedClientLoader.classLoader`.
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala#L173-L188https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L956-L967
So I wonder we can remove Hive's `ADD JAR` from `HiveClientImpl.addJar`.
### Why are the changes needed?
This is a bug.
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes#32052 from sarutak/add-jar-whitespace.
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>