368a2c341d
### What changes were proposed in this pull request?
followup of a3dd8dacee
via suggestion https://github.com/apache/spark/pull/30888#discussion_r547822642
### Why are the changes needed?
doc improvement
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
passing GA doc
Closes #30909 from yaooqinn/SPARK-33877-F.
Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
215 lines
8.1 KiB
Markdown
215 lines
8.1 KiB
Markdown
---
|
|
layout: global
|
|
title: INSERT OVERWRITE
|
|
displayTitle: INSERT OVERWRITE
|
|
license: |
|
|
Licensed to the Apache Software Foundation (ASF) under one or more
|
|
contributor license agreements. See the NOTICE file distributed with
|
|
this work for additional information regarding copyright ownership.
|
|
The ASF licenses this file to You under the Apache License, Version 2.0
|
|
(the "License"); you may not use this file except in compliance with
|
|
the License. You may obtain a copy of the License at
|
|
|
|
http://www.apache.org/licenses/LICENSE-2.0
|
|
|
|
Unless required by applicable law or agreed to in writing, software
|
|
distributed under the License is distributed on an "AS IS" BASIS,
|
|
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
See the License for the specific language governing permissions and
|
|
limitations under the License.
|
|
---
|
|
|
|
### Description
|
|
|
|
The `INSERT OVERWRITE` statement overwrites the existing data in the table using the new values. The inserted rows can be specified by value expressions or result from a query.
|
|
|
|
### Syntax
|
|
|
|
```sql
|
|
INSERT OVERWRITE [ TABLE ] table_identifier [ partition_spec [ IF NOT EXISTS ] ] [ ( column_list ) ]
|
|
{ VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] | query }
|
|
```
|
|
|
|
### Parameters
|
|
|
|
* **table_identifier**
|
|
|
|
Specifies a table name, which may be optionally qualified with a database name.
|
|
|
|
**Syntax:** `[ database_name. ] table_name`
|
|
|
|
* **partition_spec**
|
|
|
|
An optional parameter that specifies a comma-separated list of key and value pairs
|
|
for partitions.
|
|
|
|
**Syntax:** `PARTITION ( partition_col_name [ = partition_col_val ] [ , ... ] )`
|
|
|
|
* **column_list**
|
|
|
|
An optional parameter that specifies a comma-separated list of columns belonging to the `table_identifier` table. Spark will reorder the columns of the input query to match the table schema according to the specified column list.
|
|
|
|
**Note**
|
|
|
|
The current behaviour has some limitations:
|
|
- All specified columns should exist in the table and not be duplicated from each other. It includes all columns except the static partition columns.
|
|
- The size of the column list should be exactly the size of the data from `VALUES` clause or query.
|
|
|
|
* **VALUES ( { value `|` NULL } [ , ... ] ) [ , ( ... ) ]**
|
|
|
|
Specifies the values to be inserted. Either an explicitly specified value or a NULL can be inserted.
|
|
A comma must be used to separate each value in the clause. More than one set of values can be specified to insert multiple rows.
|
|
|
|
* **query**
|
|
|
|
A query that produces the rows to be inserted. It can be in one of following formats:
|
|
* a `SELECT` statement
|
|
* a `TABLE` statement
|
|
* a `FROM` statement
|
|
|
|
### Examples
|
|
|
|
#### Insert Using a VALUES Clause
|
|
|
|
```sql
|
|
-- Assuming the students table has already been created and populated.
|
|
SELECT * FROM students;
|
|
+-------------+--------------------------+----------+
|
|
| name| address|student_id|
|
|
+-------------+--------------------------+----------+
|
|
| Amy Smith| 123 Park Ave, San Jose| 111111|
|
|
| Bob Brown| 456 Taylor St, Cupertino| 222222|
|
|
|Cathy Johnson| 789 Race Ave, Palo Alto| 333333|
|
|
|Dora Williams|134 Forest Ave, Menlo Park| 444444|
|
|
|Fleur Laurent| 345 Copper St, London| 777777|
|
|
|Gordon Martin| 779 Lake Ave, Oxford| 888888|
|
|
| Helen Davis| 469 Mission St, San Diego| 999999|
|
|
| Jason Wang| 908 Bird St, Saratoga| 121212|
|
|
+-------------+--------------------------+----------+
|
|
|
|
INSERT OVERWRITE students VALUES
|
|
('Ashua Hill', '456 Erica Ct, Cupertino', 111111),
|
|
('Brian Reed', '723 Kern Ave, Palo Alto', 222222);
|
|
|
|
SELECT * FROM students;
|
|
+----------+-----------------------+----------+
|
|
| name| address|student_id|
|
|
+----------+-----------------------+----------+
|
|
|Ashua Hill|456 Erica Ct, Cupertino| 111111|
|
|
|Brian Reed|723 Kern Ave, Palo Alto| 222222|
|
|
+----------+-----------------------+----------+
|
|
```
|
|
|
|
#### Insert Using a SELECT Statement
|
|
|
|
```sql
|
|
-- Assuming the persons table has already been created and populated.
|
|
SELECT * FROM persons;
|
|
+-------------+--------------------------+---------+
|
|
| name| address| ssn|
|
|
+-------------+--------------------------+---------+
|
|
|Dora Williams|134 Forest Ave, Menlo Park|123456789|
|
|
+-------------+--------------------------+---------+
|
|
| Eddie Davis| 245 Market St, Milpitas|345678901|
|
|
+-------------+--------------------------+---------+
|
|
|
|
INSERT OVERWRITE students PARTITION (student_id = 222222)
|
|
SELECT name, address FROM persons WHERE name = "Dora Williams";
|
|
|
|
SELECT * FROM students;
|
|
+-------------+--------------------------+----------+
|
|
| name| address|student_id|
|
|
+-------------+--------------------------+----------+
|
|
| Ashua Hill| 456 Erica Ct, Cupertino| 111111|
|
|
+-------------+--------------------------+----------+
|
|
|Dora Williams|134 Forest Ave, Menlo Park| 222222|
|
|
+-------------+--------------------------+----------+
|
|
```
|
|
|
|
#### Insert Using a TABLE Statement
|
|
|
|
```sql
|
|
-- Assuming the visiting_students table has already been created and populated.
|
|
SELECT * FROM visiting_students;
|
|
+-------------+---------------------+----------+
|
|
| name| address|student_id|
|
|
+-------------+---------------------+----------+
|
|
|Fleur Laurent|345 Copper St, London| 777777|
|
|
+-------------+---------------------+----------+
|
|
|Gordon Martin| 779 Lake Ave, Oxford| 888888|
|
|
+-------------+---------------------+----------+
|
|
|
|
INSERT OVERWRITE students TABLE visiting_students;
|
|
|
|
SELECT * FROM students;
|
|
+-------------+---------------------+----------+
|
|
| name| address|student_id|
|
|
+-------------+---------------------+----------+
|
|
|Fleur Laurent|345 Copper St, London| 777777|
|
|
+-------------+---------------------+----------+
|
|
|Gordon Martin| 779 Lake Ave, Oxford| 888888|
|
|
+-------------+---------------------+----------+
|
|
```
|
|
|
|
#### Insert Using a FROM Statement
|
|
|
|
```sql
|
|
-- Assuming the applicants table has already been created and populated.
|
|
SELECT * FROM applicants;
|
|
+-----------+--------------------------+----------+---------+
|
|
| name| address|student_id|qualified|
|
|
+-----------+--------------------------+----------+---------+
|
|
|Helen Davis| 469 Mission St, San Diego| 999999| true|
|
|
+-----------+--------------------------+----------+---------+
|
|
| Ivy King|367 Leigh Ave, Santa Clara| 101010| false|
|
|
+-----------+--------------------------+----------+---------+
|
|
| Jason Wang| 908 Bird St, Saratoga| 121212| true|
|
|
+-----------+--------------------------+----------+---------+
|
|
|
|
INSERT OVERWRITE students
|
|
FROM applicants SELECT name, address, id applicants WHERE qualified = true;
|
|
|
|
SELECT * FROM students;
|
|
+-----------+-------------------------+----------+
|
|
| name| address|student_id|
|
|
+-----------+-------------------------+----------+
|
|
|Helen Davis|469 Mission St, San Diego| 999999|
|
|
+-----------+-------------------------+----------+
|
|
| Jason Wang| 908 Bird St, Saratoga| 121212|
|
|
+-----------+-------------------------+----------+
|
|
```
|
|
|
|
#### Insert with a column list
|
|
|
|
```sql
|
|
INSERT OVERWRITE students (address, name, student_id) VALUES
|
|
('Hangzhou, China', 'Kent Yao', 11215016);
|
|
|
|
SELECT * FROM students WHERE name = 'Kent Yao';
|
|
+---------+----------------------+----------+
|
|
| name| address|student_id|
|
|
+---------+----------------------+----------+
|
|
|Kent Yao | Hangzhou, China| 11215016|
|
|
+---------+----------------------+----------+
|
|
```
|
|
|
|
#### Insert with both a partition spec and a column list
|
|
|
|
```sql
|
|
INSERT OVERWRITE students PARTITION (student_id = 11215016) (address, name) VALUES
|
|
('Hangzhou, China', 'Kent Yao Jr.');
|
|
|
|
SELECT * FROM students WHERE student_id = 11215016;
|
|
+------------+----------------------+----------+
|
|
| name| address|student_id|
|
|
+------------+----------------------+----------+
|
|
|Kent Yao Jr.| Hangzhou, China| 11215016|
|
|
+------------+----------------------+----------+
|
|
```
|
|
|
|
### Related Statements
|
|
|
|
* [INSERT INTO statement](sql-ref-syntax-dml-insert-into.html)
|
|
* [INSERT OVERWRITE DIRECTORY statement](sql-ref-syntax-dml-insert-overwrite-directory.html)
|
|
* [INSERT OVERWRITE DIRECTORY with Hive format statement](sql-ref-syntax-dml-insert-overwrite-directory-hive.html)
|