spark-instrumented-optimizer/sql/hive-thriftserver/v2.3
Ali Smesseim d40ecfa3f7
[SPARK-31387][SQL] Handle unknown operation/session ID in HiveThriftServer2Listener
### What changes were proposed in this pull request?

This is a recreation of #28155, which was reverted because it caused test failures.

The update methods in HiveThriftServer2Listener now check whether the given operation/session ID actually exists in `executionList` or `sessionList`, respectively. This prevents a NullPointerException when the operation or session ID is unknown. Instead, a warning is written to the log.
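
The guard described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual listener code: `ListenerSketch`, `SessionInfo`, `onSessionClosed`, and the `warnings` list (standing in for a logger) are hypothetical names; only `sessionList` is taken from the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the listener; not Spark's actual implementation.
class ListenerSketch {
    // Hypothetical per-session record.
    static final class SessionInfo {
        long finishTimestamp = 0L;
    }

    final Map<String, SessionInfo> sessionList = new ConcurrentHashMap<>();
    // Stands in for a logger so the sketch has no external dependencies.
    final List<String> warnings = new ArrayList<>();

    void onSessionClosed(String sessionId, long finishTime) {
        SessionInfo info = sessionList.get(sessionId);
        if (info == null) {
            // Unknown ID: record a warning instead of dereferencing null.
            warnings.add("onSessionClosed called with unknown session id: " + sessionId);
            return;
        }
        info.finishTimestamp = finishTime;
    }
}
```

With this shape, an update for an ID that was never registered leaves the existing entries untouched and only produces a log line.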

To improve robustness, we also make the following changes in HiveSessionImpl.close():

- Catch any exception thrown by `operationManager.closeOperation`. If it throws for any reason, the remaining operations are still closed.
- Handle not being able to access the scratch directory. When closing, all `.pipeout` files are removed from the scratch directory, which previously resulted in an NPE if the directory did not exist.
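
Both changes can be illustrated with a short sketch. It is written under assumptions and does not reproduce `HiveSessionImpl`: `SessionCloseSketch`, `Operation`, `closeAll`, and `removePipeoutFiles` are hypothetical names, and warnings are collected in a list rather than logged. The second method relies on the documented behaviour that `File.listFiles` returns `null` when the path is not a directory or cannot be read.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two robustness changes; not the actual HiveSessionImpl code.
class SessionCloseSketch {
    // Hypothetical stand-in for an operation that may fail while closing.
    interface Operation {
        void close() throws Exception;
    }

    // Closes every operation; a failure in one does not stop the loop.
    static List<String> closeAll(List<Operation> operations) {
        List<String> warnings = new ArrayList<>();
        for (Operation op : operations) {
            try {
                op.close();
            } catch (Exception e) {
                warnings.add("Failed to close operation: " + e.getMessage());
            }
        }
        return warnings;
    }

    // Deletes *.pipeout files and returns how many were removed.
    static int removePipeoutFiles(File scratchDir) {
        File[] files = scratchDir.listFiles((dir, name) -> name.endsWith(".pipeout"));
        if (files == null) {
            // Directory is missing or unreadable: nothing to clean up.
            return 0;
        }
        int deleted = 0;
        for (File f : files) {
            if (f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Catching per operation, rather than around the whole loop, is what keeps one failing operation from leaving the others open.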

### Why are the changes needed?

The listener's update methods would throw an exception if the operation or session ID is unknown. In Spark 2, where the listener is called directly, this changes the caller's control flow. In Spark 3, the exception is caught by the ListenerBus but results in an uninformative NullPointerException.

In HiveSessionImpl.close(), if an exception is thrown when closing an operation, none of the following operations are closed.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit tests

Closes #28544 from alismess-db/hive-thriftserver-listener-update-safer-2.

Authored-by: Ali Smesseim <ali.smesseim@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-05-20 10:30:17 -07:00
if [SPARK-29981][BUILD][FOLLOWUP] Change hive.version.short 2019-11-23 12:50:50 -08:00
src [SPARK-31387][SQL] Handle unknown operation/session ID in HiveThriftServer2Listener 2020-05-20 10:30:17 -07:00