[SPARK-34401][SQL][DOCS] Update docs about altering cached tables/views

### What changes were proposed in this pull request?
Update public docs of SQL commands about altering cached tables/views. For instance:
<img width="869" alt="Screenshot 2021-02-08 at 15 11 48" src="https://user-images.githubusercontent.com/1580697/107217940-fd3b8980-6a1f-11eb-98b9-9b2e3fe7f4ef.png">

### Why are the changes needed?
To inform users how these commands behave when altering cached tables or views.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
By running the command below and manually checking the docs:
```
$ SKIP_API=1 SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 jekyll serve --watch
```

Closes #31524 from MaxGekk/doc-cmd-caching.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Max Gekk 2021-02-22 04:32:09 +00:00 committed by Wenchen Fan
parent 03f4cf5845
commit 6ea4b5fda7
7 changed files with 21 additions and 1 deletions

@ -27,6 +27,10 @@ license: |
`ALTER TABLE RENAME TO` statement changes the table name of an existing table in the database. The table rename command cannot be used to move a table between databases, only to rename a table within the same database.
If the table is cached, these commands clear the table's cached data. The cache will be lazily filled the next time the table is accessed. Additionally:
* the table rename command uncaches all of the table's dependents, such as views that refer to the table. The dependents should be cached again explicitly.
* the partition rename command clears the caches of all of the table's dependents while keeping them marked as cached, so their caches will be lazily filled the next time they are accessed.
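
As a rough illustration of the table-level behavior described above, the following sketch assumes a hypothetical partitioned table named `events`; the table name, columns, and data are made up for the example, and the comments only restate what the text above says rather than verified output.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE events (id INT, day STRING) USING parquet PARTITIONED BY (day);
INSERT INTO events VALUES (1, '2021-02-01');

-- CACHE TABLE is eager by default, so the cache is filled here.
CACHE TABLE events;

-- Partition rename: per the description above, the cached data is cleared.
ALTER TABLE events PARTITION (day = '2021-02-01')
  RENAME TO PARTITION (day = '2021-02-02');

-- The next access is expected to fill the cache again lazily.
SELECT * FROM events;

-- Table rename: the cached data is cleared again, and the cache is
-- expected to be filled lazily on the next access under the new name.
ALTER TABLE events RENAME TO events_renamed;
SELECT * FROM events_renamed;
```

Any cached views that refer to the table would additionally need an explicit `CACHE TABLE` after the table rename, as noted in the list above.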
#### Syntax
```sql
@ -103,6 +107,8 @@ ALTER TABLE table_identifier { ALTER | CHANGE } [ COLUMN ] col_spec alterColumnA
The `ALTER TABLE ADD` statement adds a partition to a partitioned table.
If the table is cached, the command clears the cached data of the table and of all dependents that refer to it. The cache will be lazily filled the next time the table or its dependents are accessed.
##### Syntax
```sql
@ -128,6 +134,8 @@ ALTER TABLE table_identifier ADD [IF NOT EXISTS]
The `ALTER TABLE DROP` statement drops a partition of the table.
If the table is cached, the command clears the cached data of the table and of all dependents that refer to it. The cache will be lazily filled the next time the table or its dependents are accessed.
##### Syntax
```sql
@ -187,6 +195,8 @@ ALTER TABLE table_identifier [ partition_spec ] SET SERDE serde_class_name
`ALTER TABLE SET` command can also be used for changing the file location and file format for
existing tables.
If the table is cached, the `ALTER TABLE .. SET LOCATION` command clears the cached data of the table and of all dependents that refer to it. The cache will be lazily filled the next time the table or its dependents are accessed.
##### Syntax
```sql

@ -28,6 +28,8 @@ the name of a view to a different name, set and unset the metadata of the view b
Renames the existing view. If the new view name already exists in the source database, a `TableAlreadyExistsException` is thrown. This operation
does not support moving the views across databases.
If the view is cached, the command clears the cached data of the view and of all dependents that refer to it. The view's cache will be lazily filled the next time the view is accessed. The command leaves the view's dependents uncached.
#### Syntax
```sql
ALTER VIEW view_identifier RENAME TO view_identifier

@ -26,6 +26,8 @@ if the table is not `EXTERNAL` table. If the table is not present it throws an e
In case of an external table, only the associated metadata information is removed from the metastore database.
If the table is cached, the command uncaches the table and all its dependents.
### Syntax
```sql

@ -23,6 +23,8 @@ license: |
`MSCK REPAIR TABLE` recovers all the partitions in the directory of a table and updates the Hive metastore. When creating a table using the `PARTITIONED BY` clause, partitions are generated and registered in the Hive metastore. However, if the partitioned table is created from existing data, partitions are not registered automatically in the Hive metastore. The user needs to run `MSCK REPAIR TABLE` to register the partitions. `MSCK REPAIR TABLE` on a non-existent table or a table without partitions throws an exception. Another way to recover partitions is to use `ALTER TABLE RECOVER PARTITIONS`.
If the table is cached, the command clears the cached data of the table and of all dependents that refer to it. The cache will be lazily filled the next time the table or its dependents are accessed.
### Syntax
```sql

@ -25,6 +25,8 @@ The `TRUNCATE TABLE` statement removes all the rows from a table or partition(s)
or an external/temporary table. In order to truncate multiple partitions at once, the user can specify the partitions
in `partition_spec`. If no `partition_spec` is specified it will remove all partitions in the table.
If the table is cached, the command clears the cached data of the table and of all dependents that refer to it. The cache will be lazily filled the next time the table or its dependents are accessed.
### Syntax
```sql

@ -23,6 +23,8 @@ license: |
`LOAD DATA` statement loads the data into a Hive serde table from the user specified directory or file. If a directory is specified then all the files from the directory are loaded. If a file is specified then only the single file is loaded. Additionally the `LOAD DATA` statement takes an optional partition specification. When a partition is specified, the data files (when input source is a directory) or the single file (when input source is a file) are loaded into the partition of the target table.
If the table is cached, the command clears the cached data of the table and of all dependents that refer to it. The cache will be lazily filled the next time the table or its dependents are accessed.
### Syntax
```sql

@ -552,7 +552,7 @@ class CatalogImpl(sparkSession: SparkSession) extends Catalog {
// Re-caches the logical plan of the relation.
// Note this is a no-op for the relation itself if it's not cached, but will clear all
// caches referencing this relation. If this relation is cached as an InMemoryRelation,
// this will clear the relation cache and caches of all its dependants.
// this will clear the relation cache and caches of all its dependents.
relation match {
case SubqueryAlias(_, relationPlan) =>
sparkSession.sharedState.cacheManager.recacheByPlan(sparkSession, relationPlan)