[SPARK-28788][DOC][SQL] Document ANALYZE TABLE statement in SQL Reference
### What changes were proposed in this pull request?
Document the ANALYZE TABLE statement in the SQL Reference.

### Why are the changes needed?
To complete the SQL reference.

### Does this PR introduce any user-facing change?
Yes.

***Before***:
There was no documentation for this statement.

***After***:
![image](https://user-images.githubusercontent.com/13592258/64046883-f8339480-cb21-11e9-85da-6617d5c96412.png)
![image](https://user-images.githubusercontent.com/13592258/64209526-9a6eb780-ce55-11e9-9004-53c5c5d24567.png)
![image](https://user-images.githubusercontent.com/13592258/64209542-a2c6f280-ce55-11e9-8624-e7349204ec8e.png)

### How was this patch tested?
Tested using jekyll build --serve

Closes #25524 from huaxingao/spark-28788.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
parent 89800931aa
commit 56f2887dc8
@@ -19,4 +19,78 @@ license: |
 limitations under the License.
 ---
 
-**This page is under construction**
+### Description
+
+The `ANALYZE TABLE` statement collects statistics about the table to be used by the query optimizer to find a better query execution plan.
+
+### Syntax
+{% highlight sql %}
+ANALYZE TABLE table_name [ PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] ) ]
+COMPUTE STATISTICS [ NOSCAN | FOR COLUMNS col [ , ... ] | FOR ALL COLUMNS ]
+
+{% endhighlight %}
+
+### Parameters
+<dl>
+<dt><code><em>table_name</em></code></dt>
+<dd>The name of an existing table.</dd>
+</dl>
+
+<dl>
+<dt><code><em>PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] )</em></code></dt>
+<dd>Specifies one or more partition column and value pairs. The partition value is optional.</dd>
+</dl>
+
+<dl>
+<dt><code><em>[ NOSCAN | FOR COLUMNS col [ , ... ] | FOR ALL COLUMNS ]</em></code></dt>
+<dd>
+<ul>
+<li> If no analyze option is specified, <code>ANALYZE TABLE</code> collects the table's number of rows and size in bytes. </li>
+<li> <b>NOSCAN</b>
+<br> Collect only the table's size in bytes ( which does not require scanning the entire table ). </li>
+<li> <b>FOR COLUMNS col [ , ... ] <code> | </code> FOR ALL COLUMNS</b>
+<br> Collect column statistics for each column specified, or alternatively for every column, as well as table statistics.
+</li>
+</ul>
+</dd>
+</dl>
+
+### Examples
+{% highlight sql %}
+ANALYZE TABLE students COMPUTE STATISTICS NOSCAN;
+
+DESC EXTENDED students;
+......
+Statistics 2820 bytes
+......
+
+ANALYZE TABLE students COMPUTE STATISTICS;
+
+DESC EXTENDED students;
+......
+Statistics 2820 bytes, 3 rows
+......
+
+ANALYZE TABLE students PARTITION (student_id = 111111) COMPUTE STATISTICS;
+
+DESC EXTENDED students PARTITION (student_id = 111111);
+......
+Partition Statistics 919 bytes, 1 rows
+......
+
+ANALYZE TABLE students COMPUTE STATISTICS FOR COLUMNS name;
+
+DESC EXTENDED students name;
+=default tbl=students
+col_name name
+data_type string
+comment NULL
+min NULL
+max NULL
+num_nulls 0
+distinct_count 3
+avg_col_len 11
+max_col_len 13
+histogram NULL
+
+{% endhighlight %}
@@ -1,7 +1,7 @@
 ---
 layout: global
-title: Reference
-displayTitle: Reference
+title: ANALYZE
+displayTitle: ANALYZE
 license: |
 Licensed to the Apache Software Foundation (ASF) under one or more
 contributor license agreements. See the NOTICE file distributed with
@@ -19,7 +19,4 @@ license: |
 limitations under the License.
 ---
 
-Spark SQL is a Apache Spark's module for working with structured data.
-This guide is a reference for Structured Query Language (SQL) for Apache
-Spark. This document describes the SQL constructs supported by Spark in detail
-along with usage examples when applicable.
+* [ANALYZE TABLE statement](sql-ref-syntax-aux-analyze-table.html)
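
The examples in the new page cover `NOSCAN`, the default option, a fully specified partition, and `FOR COLUMNS`, but not `FOR ALL COLUMNS` or a partition spec without a value, both of which the documented syntax allows. The following is a minimal sketch of those remaining forms, not part of the patch. It assumes the same `students` table, partitioned by `student_id`, and shows no output because none was captured in the source.

```sql
-- Assumption: `students` is the table from the documented examples,
-- partitioned by `student_id`.

-- Collect table statistics plus column statistics for every column.
ANALYZE TABLE students COMPUTE STATISTICS FOR ALL COLUMNS;

-- Partition value omitted. The syntax marks the value as optional;
-- Spark is understood to analyze every partition matching the spec.
ANALYZE TABLE students PARTITION (student_id) COMPUTE STATISTICS;

-- Combine a partition spec with NOSCAN to collect only that
-- partition's size in bytes, without scanning its rows.
ANALYZE TABLE students PARTITION (student_id = 111111) COMPUTE STATISTICS NOSCAN;

-- Inspect the collected column statistics, as in the documented examples.
DESC EXTENDED students name;
```

Running the `DESC EXTENDED` statements from the documented examples afterwards should show which statistics each form populated.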