spark-instrumented-optimizer/docs/sql-ref-syntax-aux-cache-refresh.md
ulysses 184074de22 [SPARK-31999][SQL] Add REFRESH FUNCTION command
### What changes were proposed in this pull request?

In Hive mode, permanent functions are shared with Hive metastore so that functions may be modified by other Hive client. With in long-lived spark scene, it's hard to update the change of function.

Here are 2 reasons:
* Spark cache the function in memory using `FunctionRegistry`.
* User may not know the location or classname of udf when using `replace function`.

Note that we use v2 command code path to add new command.

### Why are the changes needed?

Give a easy way to make spark function registry sync with Hive metastore.
Then we can call
```
refresh function functionName
```

### Does this PR introduce _any_ user-facing change?

Yes, new command.

### How was this patch tested?

New UT.

Closes #28840 from ulysses-you/SPARK-31999.

Authored-by: ulysses <youxiduo@weidian.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2020-07-22 19:05:50 +00:00

58 lines
1.8 KiB
Markdown

---
layout: global
title: REFRESH
displayTitle: REFRESH
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---
### Description
`REFRESH` is used to invalidate and refresh all the cached data (and the associated metadata) for
all Datasets that contains the given data source path. Path matching is by prefix, i.e. "/" would
invalidate everything that is cached.
### Syntax
```sql
REFRESH resource_path
```
### Parameters
* **resource_path**
The path of the resource that is to be refreshed.
### Examples
```sql
-- The Path is resolved using the datasource's File Index.
CREATE TABLE test(ID INT) using parquet;
INSERT INTO test SELECT 1000;
CACHE TABLE test;
INSERT INTO test SELECT 100;
REFRESH "hdfs://path/to/table";
```
### Related Statements
* [CACHE TABLE](sql-ref-syntax-aux-cache-cache-table.html)
* [CLEAR CACHE](sql-ref-syntax-aux-cache-clear-cache.html)
* [UNCACHE TABLE](sql-ref-syntax-aux-cache-uncache-table.html)
* [REFRESH TABLE](sql-ref-syntax-aux-cache-refresh-table.html)
* [REFRESH FUNCTION](sql-ref-syntax-aux-cache-refresh-function.html)