[SPARK-35935][SQL] Prevent failure of MSCK REPAIR TABLE on table refreshing

### What changes were proposed in this pull request?
In this PR, I propose to catch all non-fatal exceptions coming from `refreshTable()` at the final stage of table repair, and to log an error message instead of failing with an exception.
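The pattern described above can be sketched outside of Spark as follows. This is a minimal illustration, not the code from the patch: `RefreshGuardSketch`, `repairStep`, and the `refresh` and `log` parameters are hypothetical stand-ins for the repair command, `spark.catalog.refreshTable`, and `logError`. Only `scala.util.control.NonFatal` is the real extractor used by the change.

```scala
import scala.util.control.NonFatal
import scala.collection.mutable.ArrayBuffer

object RefreshGuardSketch {
  // Hypothetical stand-in for the last step of the repair command:
  // run the refresh, log any non-fatal failure, and still return normally.
  def repairStep(refresh: () => Unit, log: String => Unit): Seq[String] = {
    try {
      refresh()
    } catch {
      // NonFatal matches ordinary exceptions, but not VirtualMachineError,
      // InterruptedException, LinkageError, or ControlThrowable.
      case NonFatal(e) =>
        log(s"Cannot refresh the table: ${e.getMessage}")
    }
    Seq.empty[String]
  }

  def main(args: Array[String]): Unit = {
    val logged = ArrayBuffer.empty[String]
    val sink: String => Unit = msg => { logged += msg; () }

    // An ordinary exception is logged and the step completes.
    val result = repairStep(() => throw new IllegalStateException("stale cache"), sink)
    println(s"result is empty: ${result.isEmpty}, messages logged: ${logged.size}")

    // A fatal throwable is not swallowed and propagates to the caller.
    val propagated =
      try {
        repairStep(() => throw new InterruptedException("stop"), sink)
        false
      } catch {
        case _: InterruptedException => true
      }
    println(s"fatal exception propagated: $propagated")
  }
}
```

Matching on `NonFatal` rather than on `Throwable` keeps failures such as out-of-memory errors and thread interruption visible to the caller, while ordinary refresh failures only produce a log entry.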

### Why are the changes needed?
1. The uncaught exceptions from table refreshing might be considered a regression compared to previous Spark versions. Table refreshing was introduced by https://github.com/apache/spark/pull/31066.
2. This should improve the user experience with Spark SQL, for instance when `MSCK REPAIR TABLE` is run as part of a chain of SQL commands, where catching exceptions is difficult or even impossible.

### Does this PR introduce _any_ user-facing change?
Yes. Before the changes, the `MSCK REPAIR TABLE` command could fail with the exception described in SPARK-35935. After the changes, the same command logs an error message and completes successfully.

### How was this patch tested?
By existing test suites.

Closes #33137 from MaxGekk/msck-repair-catch-except.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Max Gekk 2021-06-30 09:44:52 +03:00
parent 76682268d7
commit d28ca9cc98

@@ -675,7 +675,15 @@ case class RepairTableCommand(
     // This is always the case for Hive format tables, but is not true for Datasource tables created
     // before Spark 2.1 unless they are converted via `msck repair table`.
     spark.sessionState.catalog.alterTable(table.copy(tracksPartitionsInCatalog = true))
-    spark.catalog.refreshTable(tableIdentWithDB)
+    try {
+      spark.catalog.refreshTable(tableIdentWithDB)
+    } catch {
+      case NonFatal(e) =>
+        logError(s"Cannot refresh the table '$tableIdentWithDB'. A query of the table " +
+          "might return wrong result if the table was cached. To avoid such issue, you should " +
+          "uncache the table manually via the UNCACHE TABLE command after table recovering will " +
+          "complete fully.", e)
+    }
     logInfo(s"Recovered all partitions: added ($addedAmount), dropped ($droppedAmount).")
     Seq.empty[Row]
   }