[SPARK-35159][SQL][DOCS] Extract hive format doc
### What changes were proposed in this pull request?

Extract the common documentation about the Hive row format into a separate page that `sql-ref-syntax-ddl-create-table-hiveformat.md` and `sql-ref-syntax-qry-select-transform.md` can refer to.

![image](https://user-images.githubusercontent.com/46485123/115802193-04641800-a411-11eb-827d-d92544881842.png)

### Why are the changes needed?

Improve the documentation.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Not needed.

Closes #32264 from AngersZhuuuu/SPARK-35159.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
parent 7582dc86bc
commit 20d68dc2f4
@@ -39,14 +39,6 @@ CREATE [ EXTERNAL ] TABLE [ IF NOT EXISTS ] table_identifier
     [ LOCATION path ]
     [ TBLPROPERTIES ( key1=val1, key2=val2, ... ) ]
     [ AS select_statement ]
-
-row_format:
-    : SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]
-    | DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ]
-        [ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ]
-        [ MAP KEYS TERMINATED BY map_key_terminated_char ]
-        [ LINES TERMINATED BY row_terminated_char ]
-        [ NULL DEFINED AS null_char ]
 ```

 Note that, the clauses between the columns definition clause and the AS SELECT clause can come in
@@ -85,47 +77,7 @@ as any order. For example, you can write COMMENT table_comment after TBLPROPERTI

 * **row_format**

-    Use the `SERDE` clause to specify a custom SerDe for one table. Otherwise, use the `DELIMITED` clause to use the native SerDe and specify the delimiter, escape character, null character and so on.
-
-* **SERDE**
-
-    Specifies a custom SerDe for one table.
-
-* **serde_class**
-
-    Specifies a fully-qualified class name of a custom SerDe.
-
-* **SERDEPROPERTIES**
-
-    A list of key-value pairs that is used to tag the SerDe definition.
-
-* **DELIMITED**
-
-    The `DELIMITED` clause can be used to specify the native SerDe and state the delimiter, escape character, null character and so on.
-
-* **FIELDS TERMINATED BY**
-
-    Used to define a column separator.
-
-* **COLLECTION ITEMS TERMINATED BY**
-
-    Used to define a collection item separator.
-
-* **MAP KEYS TERMINATED BY**
-
-    Used to define a map key separator.
-
-* **LINES TERMINATED BY**
-
-    Used to define a row separator.
-
-* **NULL DEFINED AS**
-
-    Used to define the specific value for NULL.
-
-* **ESCAPED BY**
-
-    Used for escape mechanism.
+    Specifies the row format for input and output. See [HIVE FORMAT](sql-ref-syntax-hive-format.html) for more syntax details.

 * **STORED AS**

73
docs/sql-ref-syntax-hive-format.md
Normal file
@@ -0,0 +1,73 @@
+---
+layout: global
+title: Hive Row Format
+displayTitle: Hive Row Format
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements. See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License. You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+Spark supports a Hive row format in `CREATE TABLE` and `TRANSFORM` clause to specify serde or text delimiter.
+There are two ways to define a row format in `row_format` of `CREATE TABLE` and `TRANSFORM` clauses.
+1. `SERDE` clause to specify a custom SerDe class.
+2. `DELIMITED` clause to specify a delimiter, an escape character, a null character, and so on for the native SerDe.
+
+### Syntax
+
+```sql
+row_format:
+    SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]
+    | DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ]
+        [ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ]
+        [ MAP KEYS TERMINATED BY map_key_terminated_char ]
+        [ LINES TERMINATED BY row_terminated_char ]
+        [ NULL DEFINED AS null_char ]
+```
+
+### Parameters
+
+* **SERDE serde_class**
+
+    Specifies a fully-qualified class name of custom SerDe.
+
+* **SERDEPROPERTIES**
+
+    A list of key-value pairs that is used to tag the SerDe definition.
+
+* **FIELDS TERMINATED BY**
+
+    Used to define a column separator.
+
+* **COLLECTION ITEMS TERMINATED BY**
+
+    Used to define a collection item separator.
+
+* **MAP KEYS TERMINATED BY**
+
+    Used to define a map key separator.
+
+* **LINES TERMINATED BY**
+
+    Used to define a row separator.
+
+* **NULL DEFINED AS**
+
+    Used to define the specific value for NULL.
+
+* **ESCAPED BY**
+
+    Used for escape mechanism.
|
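For readers of the new page, the two `row_format` variants described above can be sketched as follows. This is only an illustrative sketch, not part of the change: the table names (`family`, `person`), the column names, and the delimiter choices are hypothetical, both statements assume a Spark session with Hive support enabled, and the `TRANSFORM` query assumes a `cat` executable is available where the script runs.

```sql
-- DELIMITED form: native SerDe with explicit delimiters.
-- `family` and its columns are hypothetical names used for illustration.
CREATE TABLE family (
    name STRING,
    friends ARRAY<STRING>,
    children MAP<STRING, INT>
)
ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ',' ESCAPED BY '\\'
    COLLECTION ITEMS TERMINATED BY '_'
    MAP KEYS TERMINATED BY ':'
    LINES TERMINATED BY '\n'
    NULL DEFINED AS 'foonull'
STORED AS TEXTFILE;

-- SERDE form: a SerDe class plus SERDEPROPERTIES, used in a TRANSFORM query.
-- `person` (zip_code, name, age) is a hypothetical source table.
SELECT TRANSFORM (zip_code, name, age)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    WITH SERDEPROPERTIES ('field.delim' = '\t')
    USING 'cat' AS (a STRING, b STRING, c STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    WITH SERDEPROPERTIES ('field.delim' = '\t')
FROM person;
```

Under the delimiters chosen in the first statement, a text line such as `Alice,Bob_Carol,Dan:3_Eve:5` would be expected to read as name `Alice`, friends `[Bob, Carol]`, and children `{Dan: 3, Eve: 5}`, since map entries are separated by the collection item delimiter and keys from values by the map key delimiter.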
|
@@ -33,14 +33,6 @@ SELECT TRANSFORM ( expression [ , ... ] )
     USING command_or_script [ AS ( [ col_name [ col_type ] ] [ , ... ] ) ]
     [ ROW FORMAT row_format ]
     [ RECORDREADER record_reader_class ]
-
-row_format:
-    SERDE serde_class [ WITH SERDEPROPERTIES (k1=v1, k2=v2, ... ) ]
-    | DELIMITED [ FIELDS TERMINATED BY fields_terminated_char [ ESCAPED BY escaped_char ] ]
-        [ COLLECTION ITEMS TERMINATED BY collection_items_terminated_char ]
-        [ MAP KEYS TERMINATED BY map_key_terminated_char ]
-        [ LINES TERMINATED BY row_terminated_char ]
-        [ NULL DEFINED AS null_char ]
 ```

 ### Parameters
@@ -51,43 +43,7 @@ row_format:

 * **row_format**

-    Otherwise, uses the `DELIMITED` clause to specify the native SerDe and state the delimiter, escape character, null character and so on.
-
-* **SERDE**
-
-    Specifies a custom SerDe for one table.
-
-* **serde_class**
-
-    Specifies a fully-qualified class name of a custom SerDe.
-
-* **DELIMITED**
-
-    The `DELIMITED` clause can be used to specify the native SerDe and state the delimiter, escape character, null character and so on.
-
-* **FIELDS TERMINATED BY**
-
-    Used to define a column separator.
-
-* **COLLECTION ITEMS TERMINATED BY**
-
-    Used to define a collection item separator.
-
-* **MAP KEYS TERMINATED BY**
-
-    Used to define a map key separator.
-
-* **LINES TERMINATED BY**
-
-    Used to define a row separator.
-
-* **NULL DEFINED AS**
-
-    Used to define the specific value for NULL.
-
-* **ESCAPED BY**
-
-    Used for escape mechanism.
+    Specifies the row format for input and output. See [HIVE FORMAT](sql-ref-syntax-hive-format.html) for more syntax details.

 * **RECORDWRITER**
