[SPARK-26215][SQL] Define reserved/non-reserved keywords based on the ANSI SQL standard
## What changes were proposed in this pull request?
This pr targeted to define reserved/non-reserved keywords for Spark SQL based on the ANSI SQL standards and the other database-like systems (e.g., PostgreSQL). We assume that they basically follow the ANSI SQL-2011 standard, but it is slightly different between each other. Therefore, this pr documented all the keywords in `docs/sql-reserved-and-non-reserved-key-words.md`.
NOTE: This pr only added a small set of keywords as reserved ones and these keywords are reserved in all the ANSI SQL standards (SQL-92, SQL-99, SQL-2003, SQL-2008, SQL-2011, and SQL-2016) and PostgreSQL. This is because there is room to discuss which keyword should be reserved or not, .e.g., interval units (day, hour, minute, second, ...) are reserved in the ANSI SQL standards though, they are not reserved in PostgreSQL. Therefore, we need more researches about the other database-like systems (e.g., Oracle Databases, DB2, SQL server) in follow-up activities.
References:
- The reserved/non-reserved SQL keywords in the ANSI SQL standards: https://developer.mimer.com/wp-content/uploads/2018/05/Standard-SQL-Reserved-Words-Summary.pdf
- SQL Key Words in PostgreSQL: https://www.postgresql.org/docs/current/sql-keywords-appendix.html
## How was this patch tested?
Added tests in `TableIdentifierParserSuite`.
Closes #23259 from maropu/SPARK-26215-WIP.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
2019-02-22 18:38:47 -05:00
---
layout: global
title: ANSI Compliance
displayTitle: ANSI Compliance
license: |
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
---

Since Spark 3.0, Spark SQL provides two experimental options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy` (see the table below for details).

When `spark.sql.ansi.enabled` is set to `true`, Spark SQL follows the standard in basic behaviours (e.g., arithmetic operations, type conversion, SQL functions and SQL parsing).
Moreover, Spark SQL has an independent option to control implicit casting behaviours when inserting rows into a table.
The casting behaviours are defined as store assignment rules in the standard.

When `spark.sql.storeAssignmentPolicy` is set to `ANSI`, Spark SQL complies with the ANSI store assignment rules. This is a separate configuration because its default value is `ANSI`, while the configuration `spark.sql.ansi.enabled` is disabled by default.

<table class="table">
<tr><th>Property Name</th><th>Default</th><th>Meaning</th><th>Since Version</th></tr>
<tr>
  <td><code>spark.sql.ansi.enabled</code></td>
  <td>false</td>
  <td>
    (Experimental) When true, Spark tries to conform to the ANSI SQL specification:
    1. Spark will throw a runtime exception if an overflow occurs in any operation on an integral or decimal field.
    2. Spark will forbid using the reserved keywords of ANSI SQL as identifiers in the SQL parser.
  </td>
  <td>3.0.0</td>
</tr>
<tr>
  <td><code>spark.sql.storeAssignmentPolicy</code></td>
  <td>ANSI</td>
  <td>
    (Experimental) When inserting a value into a column with a different data type, Spark will perform type coercion.
    Currently, we support 3 policies for the type coercion rules: ANSI, legacy and strict. With the ANSI policy,
    Spark performs the type coercion as per ANSI SQL. In practice, the behavior is mostly the same as PostgreSQL.
    It disallows certain unreasonable type conversions such as converting string to int or double to boolean.
    With the legacy policy, Spark allows the type coercion as long as it is a valid Cast, which is very loose,
    e.g. converting string to int or double to boolean is allowed.
    It is also the only behavior in Spark 2.x and it is compatible with Hive.
    With the strict policy, Spark doesn't allow any possible precision loss or data truncation in type coercion,
    e.g. converting double to int or decimal to double is not allowed.
  </td>
  <td>3.0.0</td>
</tr>
</table>

The following subsections present behaviour changes in arithmetic operations, type conversions, and SQL parsing when the ANSI mode is enabled.
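
To try the examples in these subsections, both options can be changed for the current session with the `SET` command. The following is a minimal sketch that assumes an interactive Spark 3.0 SQL session (for example, the `spark-sql` shell); the values shown are only illustrative.

{% highlight sql %}
-- Turn on the ANSI mode for arithmetic, casting, SQL functions and parsing
SET spark.sql.ansi.enabled=true;

-- Choose the store assignment policy independently (ANSI, LEGACY or STRICT)
SET spark.sql.storeAssignmentPolicy=ANSI;
{% endhighlight %}

Because the two options are independent, either one can be changed without touching the other.
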
### Arithmetic Operations
In Spark SQL, arithmetic operations performed on numeric types (with the exception of decimal) are not checked for overflows by default.
This means that if an operation overflows, the result is the same as that of the corresponding operation in a Java/Scala program (e.g., if the sum of two integers exceeds the maximum representable value, the result is a negative number).
On the other hand, Spark SQL returns null for decimal overflows.
When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in numeric and interval arithmetic operations, Spark SQL throws an arithmetic exception at runtime.
{% highlight sql %}
-- `spark.sql.ansi.enabled=true`
SELECT 2147483647 + 1;
java.lang.ArithmeticException: integer overflow
-- `spark.sql.ansi.enabled=false`
SELECT 2147483647 + 1;
+----------------+
|(2147483647 + 1)|
+----------------+
|     -2147483648|
+----------------+
{% endhighlight %}
### Type Conversion
Spark SQL has three kinds of type conversions: explicit casting, type coercion, and store assignment casting.
When `spark.sql.ansi.enabled` is set to `true`, explicit casting by `CAST` syntax throws a runtime exception for illegal cast patterns defined in the standard, e.g. casts from a string to an integer.
On the other hand, `INSERT INTO` syntax throws an analysis exception when the ANSI store assignment policy is enabled via `spark.sql.storeAssignmentPolicy=ANSI`.
Currently, the ANSI mode affects explicit casting and assignment casting only.
In future releases, the behaviour of type coercion might change along with the other two type conversion rules.
{% highlight sql %}
-- Examples of explicit casting
-- `spark.sql.ansi.enabled=true`
SELECT CAST('a' AS INT);
java.lang.NumberFormatException: invalid input syntax for type numeric: a
SELECT CAST(2147483648L AS INT);
java.lang.ArithmeticException: Casting 2147483648 to int causes overflow
-- `spark.sql.ansi.enabled=false` (This is a default behaviour)
SELECT CAST('a' AS INT);
+--------------+
|CAST(a AS INT)|
+--------------+
|          null|
+--------------+

SELECT CAST(2147483648L AS INT);
+-----------------------+
|CAST(2147483648 AS INT)|
+-----------------------+
|            -2147483648|
+-----------------------+

-- Examples of store assignment rules
CREATE TABLE t (v INT);
-- `spark.sql.storeAssignmentPolicy=ANSI`
INSERT INTO t VALUES ('1');
org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table '`default`.`t`':
- Cannot safely cast 'v': StringType to IntegerType;
-- `spark.sql.storeAssignmentPolicy=LEGACY` (This is a legacy behaviour until Spark 2.x)
INSERT INTO t VALUES ('1');
SELECT * FROM t;
+---+
|  v|
+---+
|  1|
+---+
{% endhighlight %}
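
The strict policy listed in the configuration table above can be exercised in the same way. This is a hedged sketch that reuses the table `t (v INT)` from the examples above; the exact exception message is not shown because it is not given in this document.

{% highlight sql %}
-- `spark.sql.storeAssignmentPolicy=STRICT`
-- Writing a DOUBLE into an INT column may lose precision, so, per the
-- policy description, this insert is expected to be rejected.
INSERT INTO t VALUES (CAST(1.5 AS DOUBLE));
{% endhighlight %}
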
### SQL Functions
The behavior of some SQL functions can be different under ANSI mode (`spark.sql.ansi.enabled=true`).
- `size`: This function returns null for null input under ANSI mode.
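
A short illustration of the `size` behaviour described above; this is a sketch only, and the typed `NULL` literal is just one convenient way to produce a null array input.

{% highlight sql %}
-- `spark.sql.ansi.enabled=true`
-- Per the description above, the result is expected to be null.
SELECT size(CAST(NULL AS ARRAY<INT>));
{% endhighlight %}
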
### SQL Keywords
When `spark.sql.ansi.enabled` is true, Spark SQL will use the ANSI mode parser.
In this mode, Spark SQL has two kinds of keywords:
* Reserved keywords: Keywords that are reserved and can't be used as identifiers for table, view, column, function, alias, etc.
* Non-reserved keywords: Keywords that have a special meaning only in particular contexts and can be used as identifiers in other contexts. For example, `EXPLAIN SELECT ...` is a command, but `EXPLAIN` can be used as an identifier in other places.
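
The difference between the two kinds can be sketched with a pair of statements. The table names `kw_demo1` and `kw_demo2` are hypothetical, and the classification of `CREATE` as reserved in ANSI mode is taken from the keyword table in this section; the exact parser error is not reproduced here.

{% highlight sql %}
-- `spark.sql.ansi.enabled=true`
-- EXPLAIN is non-reserved, so it is expected to be accepted as a column name.
CREATE TABLE kw_demo1 (explain INT);

-- CREATE is listed as reserved in ANSI mode, so the parser is expected
-- to reject it as a column name.
CREATE TABLE kw_demo2 (create INT);
{% endhighlight %}
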
When the ANSI mode is disabled, Spark SQL has two kinds of keywords:
* Non-reserved keywords: Same definition as when the ANSI mode is enabled.
* Strict-non-reserved keywords: A strict version of non-reserved keywords, which cannot be used as a table alias.

By default, `spark.sql.ansi.enabled` is false.

Below is a list of all the keywords in Spark SQL.

<table class="table">
  <tr><th rowspan="2" style="vertical-align: middle;"><b>Keyword</b></th><th colspan="2"><b>Spark SQL</b></th><th rowspan="2" style="vertical-align: middle;"><b>SQL-2011</b></th></tr>
  <tr><th><b>ANSI mode</b></th><th><b>default mode</b></th></tr>
  <tr><td>ADD</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>AFTER</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>ALL</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>ALTER</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>ANALYZE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>AND</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>ANTI</td><td>reserved</td><td>strict-non-reserved</td><td>non-reserved</td></tr>
  <tr><td>ANY</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>ARCHIVE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>ARRAY</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>AS</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>ASC</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>AT</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>AUTHORIZATION</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>BETWEEN</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>BOTH</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>BUCKET</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>BUCKETS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>BY</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CACHE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CASCADE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CASE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CAST</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CHANGE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CHECK</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CLEAR</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CLUSTER</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CLUSTERED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CODEGEN</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>COLLATE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>COLLECTION</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>COLUMN</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>COLUMNS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>COMMENT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>COMMIT</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>COMPACT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>COMPACTIONS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>COMPUTE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CONCATENATE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CONSTRAINT</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>COST</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>CREATE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CROSS</td><td>reserved</td><td>strict-non-reserved</td><td>reserved</td></tr>
  <tr><td>CUBE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CURRENT</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CURRENT_DATE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CURRENT_TIME</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CURRENT_TIMESTAMP</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>CURRENT_USER</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>DATA</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DATABASE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DATABASES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DAY</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>DBPROPERTIES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DEFINED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DELETE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>DELIMITED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DESC</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DESCRIBE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>DFS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DIRECTORIES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DIRECTORY</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DISTINCT</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>DISTRIBUTE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DIV</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
  <tr><td>DROP</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>ELSE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
  <tr><td>END</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
[SPARK-28083][SQL] Support LIKE ... ESCAPE syntax
## What changes were proposed in this pull request?
The syntax 'LIKE predicate: ESCAPE clause' is a ANSI SQL.
For example:
```
select 'abcSpark_13sd' LIKE '%Spark\\_%'; //true
select 'abcSpark_13sd' LIKE '%Spark/_%'; //false
select 'abcSpark_13sd' LIKE '%Spark"_%'; //false
select 'abcSpark_13sd' LIKE '%Spark/_%' ESCAPE '/'; //true
select 'abcSpark_13sd' LIKE '%Spark"_%' ESCAPE '"'; //true
select 'abcSpark%13sd' LIKE '%Spark\\%%'; //true
select 'abcSpark%13sd' LIKE '%Spark/%%'; //false
select 'abcSpark%13sd' LIKE '%Spark"%%'; //false
select 'abcSpark%13sd' LIKE '%Spark/%%' ESCAPE '/'; //true
select 'abcSpark%13sd' LIKE '%Spark"%%' ESCAPE '"'; //true
select 'abcSpark\\13sd' LIKE '%Spark\\\\_%'; //true
select 'abcSpark/13sd' LIKE '%Spark//_%'; //false
select 'abcSpark"13sd' LIKE '%Spark""_%'; //false
select 'abcSpark/13sd' LIKE '%Spark//_%' ESCAPE '/'; //true
select 'abcSpark"13sd' LIKE '%Spark""_%' ESCAPE '"'; //true
```
But Spark SQL only supports 'LIKE predicate'.
Note: If the input string or pattern string is null, then the result is null too.
There are some mainstream database support the syntax.
**PostgreSQL:**
https://www.postgresql.org/docs/11/functions-matching.html
**Vertica:**
https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/SQLReferenceManual/LanguageElements/Predicates/LIKE-predicate.htm?zoom_highlight=like%20escape
**MySQL:**
https://dev.mysql.com/doc/refman/5.6/en/string-comparison-functions.html
**Oracle:**
https://docs.oracle.com/en/database/oracle/oracle-database/19/jjdbc/JDBC-reference-information.html#GUID-5D371A5B-D7F6-42EB-8C0D-D317F3C53708
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Pattern-matching-Conditions.html#GUID-0779657B-06A8-441F-90C5-044B47862A0A
## How was this patch tested?
Exists UT and new UT.
This PR merged to my production environment and runs above sql:
```
spark-sql> select 'abcSpark_13sd' LIKE '%Spark\\_%';
true
Time taken: 0.119 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark/_%';
false
Time taken: 0.103 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark"_%';
false
Time taken: 0.096 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark/_%' ESCAPE '/';
true
Time taken: 0.096 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark_13sd' LIKE '%Spark"_%' ESCAPE '"';
true
Time taken: 0.092 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark\\%%';
true
Time taken: 0.109 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark/%%';
false
Time taken: 0.1 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark"%%';
false
Time taken: 0.081 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark/%%' ESCAPE '/';
true
Time taken: 0.095 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark%13sd' LIKE '%Spark"%%' ESCAPE '"';
true
Time taken: 0.113 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark\\13sd' LIKE '%Spark\\\\_%';
true
Time taken: 0.078 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark/13sd' LIKE '%Spark//_%';
false
Time taken: 0.067 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark"13sd' LIKE '%Spark""_%';
false
Time taken: 0.084 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark/13sd' LIKE '%Spark//_%' ESCAPE '/';
true
Time taken: 0.091 seconds, Fetched 1 row(s)
spark-sql> select 'abcSpark"13sd' LIKE '%Spark""_%' ESCAPE '"';
true
Time taken: 0.091 seconds, Fetched 1 row(s)
```
I create a table and its schema is:
```
spark-sql> desc formatted gja_test;
key string NULL
value string NULL
other string NULL
# Detailed Table Information
Database test
Table gja_test
Owner test
Created Time Wed Apr 10 11:06:15 CST 2019
Last Access Thu Jan 01 08:00:00 CST 1970
Created By Spark 2.4.1-SNAPSHOT
Type MANAGED
Provider hive
Table Properties [transient_lastDdlTime=1563443838]
Statistics 26 bytes
Location hdfs://namenode.xxx:9000/home/test/hive/warehouse/test.db/gja_test
Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat org.apache.hadoop.mapred.TextInputFormat
OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Storage Properties [field.delim= , serialization.format= ]
Partition Provider Catalog
Time taken: 0.642 seconds, Fetched 21 row(s)
```
Table `gja_test` contains three rows of data:
```
spark-sql> select * from gja_test;
a A ao
b B bo
"__ """__ "
Time taken: 0.665 seconds, Fetched 3 row(s)
```
Finally, I tested this feature:
```
spark-sql> select * from gja_test where key like value escape '"';
"__ """__ "
Time taken: 0.687 seconds, Fetched 1 row(s)
```
Closes #25001 from beliefer/ansi-sql-like.
Lead-authored-by: gengjiaan <gengjiaan@360.cn>
Co-authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
2019-12-06 03:07:38 -05:00
< tr > < td > ESCAPE< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > ESCAPED< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > EXCEPT< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > EXCHANGE< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > EXISTS< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > EXPLAIN< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > EXPORT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > EXTENDED< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > EXTERNAL< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > EXTRACT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > FALSE< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > FETCH< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > FIELDS< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
[SPARK-27986][SQL] Support ANSI SQL filter clause for aggregate expression
### What changes were proposed in this pull request?
The filter clause for aggregate expressions is part of `ANSI SQL`:
```
<aggregate function> ::=
COUNT <left paren> <asterisk> <right paren> [ <filter clause> ]
| <general set function> [ <filter clause> ]
| <binary set function> [ <filter clause> ]
| <ordered set function> [ <filter clause> ]
| <array aggregate function> [ <filter clause> ]
| <row pattern count function> [ <filter clause> ]
```
Some mainstream databases support this syntax.
**PostgreSQL:**
https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES
For example:
```
SELECT
year,
count(*) FILTER (WHERE gdp_per_capita >= 40000)
FROM
countries
GROUP BY
year
```
```
SELECT
year,
code,
gdp_per_capita,
count(*)
FILTER (WHERE gdp_per_capita >= 40000)
OVER (PARTITION BY year)
FROM
countries
```
**jOOQ:**
https://blog.jooq.org/2014/12/30/the-awesome-postgresql-9-4-sql2003-filter-clause-for-aggregate-functions/
**Notice:**
1. This PR only supports the FILTER predicate without codegen. maropu will create another PR, related to SPARK-30027, to support codegen.
2. This PR only supports the FILTER predicate without DISTINCT. I will create another PR, related to SPARK-30276, to support it.
3. This PR does not support a FILTER predicate that references the outer query. I created ticket SPARK-30219 to support it.
4. This PR does not support a FILTER predicate that uses IN/EXISTS predicate subqueries. I created ticket SPARK-30220 to support it.
5. Spark SQL does not support SQL with nested aggregates. I created ticket SPARK-30182 to support it.
Here are some results of this PR in my production environment.
```
spark-sql> desc gja_test_partition;
key string NULL
value string NULL
other string NULL
col2 int NULL
# Partition Information
# col_name data_type comment
col2 int NULL
Time taken: 0.79 s
```
```
spark-sql> select * from gja_test_partition;
a A ao 1
b B bo 1
c C co 1
d D do 1
e E eo 2
g G go 2
h H ho 2
j J jo 2
f F fo 3
k K ko 3
l L lo 4
i I io 4
Time taken: 1.75 s
```
```
spark-sql> select count(key), sum(col2) from gja_test_partition;
12 26
Time taken: 1.848 s
```
```
spark-sql> select count(key) filter (where col2 > 1) from gja_test_partition;
8
Time taken: 2.926 s
```
```
spark-sql> select sum(col2) filter (where col2 > 2) from gja_test_partition;
14
Time taken: 2.087 s
```
```
spark-sql> select count(key) filter (where col2 > 1), sum(col2) filter (where col2 > 2) from gja_test_partition;
8 14
Time taken: 2.847 s
```
```
spark-sql> select count(key), count(key) filter (where col2 > 1), sum(col2), sum(col2) filter (where col2 > 2) from gja_test_partition;
12 8 26 14
Time taken: 1.787 s
```
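The results above follow from applying each `FILTER` predicate only to the aggregate it is attached to. A minimal Python sketch of that reading, using the rows listed for `gja_test_partition`; the helpers `count_filter` and `sum_filter` are hypothetical names for illustration and say nothing about how Spark evaluates the clause internally:

```python
from typing import Callable, Optional

Row = dict
Predicate = Optional[Callable[[Row], bool]]

# Rows of gja_test_partition as listed above: key, value, other, col2.
KEYS = "abcdeghjfkli"
COL2 = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4]
rows = [
    {"key": k, "value": k.upper(), "other": k + "o", "col2": c}
    for k, c in zip(KEYS, COL2)
]


def count_filter(data, column: str, where: Predicate = None) -> int:
    """COUNT(column) FILTER (WHERE ...): count non-NULL values in passing rows."""
    return sum(
        1
        for r in data
        if (where is None or where(r)) and r[column] is not None
    )


def sum_filter(data, column: str, where: Predicate = None):
    """SUM(column) FILTER (WHERE ...): None (NULL) when no row contributes."""
    values = [
        r[column]
        for r in data
        if (where is None or where(r)) and r[column] is not None
    ]
    return sum(values) if values else None


# Each aggregate sees its own filtered subset of the same input rows.
print(
    count_filter(rows, "key"),                          # 12
    count_filter(rows, "key", lambda r: r["col2"] > 1),  # 8
    sum_filter(rows, "col2"),                           # 26
    sum_filter(rows, "col2", lambda r: r["col2"] > 2),   # 14
)
```

Because the predicate belongs to a single aggregate, filtered and unfiltered aggregates can appear side by side in one `SELECT`, as in the last query above.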
```
spark-sql> desc student;
id int NULL
name string NULL
sex string NULL
class_id int NULL
Time taken: 0.206 s
```
```
spark-sql> select * from student;
1 张三 man 1
2 李四 man 1
3 王五 man 2
4 赵六 man 2
5 钱小花 woman 1
6 赵九红 woman 2
7 郭丽丽 woman 2
Time taken: 0.786 s
```
```
spark-sql> select class_id, count(id), sum(id) from student group by class_id;
1 3 8
2 4 20
Time taken: 18.783 s
```
```
spark-sql> select class_id, count(id) filter (where sex = 'man'), sum(id) filter (where sex = 'woman') from student group by class_id;
1 2 5
2 2 13
Time taken: 3.887 s
```
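For the grouped query, the `FILTER` predicates restrict what each aggregate consumes inside a group; they do not remove groups. A minimal Python sketch over the `student` rows listed above (the function name `filtered_aggregates_by_class` is hypothetical, and this only models the expected result, not Spark's execution):

```python
from collections import defaultdict

# Rows of the student table as listed above: id, name, sex, class_id.
students = [
    (1, "张三", "man", 1),
    (2, "李四", "man", 1),
    (3, "王五", "man", 2),
    (4, "赵六", "man", 2),
    (5, "钱小花", "woman", 1),
    (6, "赵九红", "woman", 2),
    (7, "郭丽丽", "woman", 2),
]


def filtered_aggregates_by_class(rows):
    """Model: count(id) filter (sex = 'man'), sum(id) filter (sex = 'woman')."""
    man_count = defaultdict(int)  # COUNT yields 0 when nothing passes
    woman_id_sum = {}             # SUM stays absent (NULL) when nothing passes
    class_ids = set()
    for sid, _name, sex, class_id in rows:
        class_ids.add(class_id)   # every group is kept, whatever the filters
        if sid is None:
            continue              # NULL ids contribute to neither aggregate
        if sex == "man":
            man_count[class_id] += 1
        if sex == "woman":
            woman_id_sum[class_id] = woman_id_sum.get(class_id, 0) + sid
    return {c: (man_count[c], woman_id_sum.get(c)) for c in sorted(class_ids)}


print(filtered_aggregates_by_class(students))  # {1: (2, 5), 2: (2, 13)}
```

The two aggregates use different predicates over the same groups, which a single `WHERE` clause on the query could not express.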
### Why are the changes needed?
Add new SQL feature.
### Does this PR introduce any user-facing change?
'No'.
### How was this patch tested?
Existing UTs and new UTs.
Closes #26656 from beliefer/support-aggregate-clause.
Lead-authored-by: gengjiaan <gengjiaan@360.cn>
Co-authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-26 04:41:50 -05:00
< tr > < td > FILTER< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > FILEFORMAT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > FIRST< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > FOLLOWING< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > FOR< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > FOREIGN< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > FORMAT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > FORMATTED< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > FROM< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > FULL< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > FUNCTION< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > FUNCTIONS< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > GLOBAL< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > GRANT< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > GROUP< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > GROUPING< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > HAVING< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-03-13 21:45:29 -04:00
< tr > < td > HOUR< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > IF< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > IGNORE< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > IMPORT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > IN< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > INDEX< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > INDEXES< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > INNER< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > INPATH< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > INPUTFORMAT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > INSERT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > INTERSECT< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > INTERVAL< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > INTO< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > IS< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > ITEMS< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > JOIN< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > KEYS< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LAST< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LATERAL< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > LAZY< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LEADING< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > LEFT< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > LIKE< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > LIMIT< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LINES< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LIST< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LOAD< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LOCAL< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > LOCATION< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LOCK< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LOCKS< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > LOGICAL< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > MACRO< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > MAP< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-11-08 22:45:24 -05:00
< tr > < td > MATCHED< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > MERGE< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-03-19 08:18:40 -04:00
< tr > < td > MINUS< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-03-13 21:45:29 -04:00
< tr > < td > MINUTE< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > MONTH< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > MSCK< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-10-02 09:55:21 -04:00
< tr > < td > NAMESPACE< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-09-10 09:23:57 -04:00
< tr > < td > NAMESPACES< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > NATURAL< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
< tr > < td > NO< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > NOT< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > NULL< / td > < td > reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
< tr > < td > NULLS< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > non-reserved< / td > < / tr >
< tr > < td > OF< / td > < td > non-reserved< / td > < td > non-reserved< / td > < td > reserved< / td > < / tr >
2019-03-18 02:19:52 -04:00
< tr > < td > ON< / td > < td > reserved< / td > < td > strict-non-reserved< / td > < td > reserved< / td > < / tr >
2019-02-22 18:38:47 -05:00
<tr><td>ONLY</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>OPTION</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>OPTIONS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>OR</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>ORDER</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>OUT</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>OUTER</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>OUTPUTFORMAT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>OVER</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>OVERLAPS</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>OVERLAY</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>OVERWRITE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PARTITION</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>PARTITIONED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PARTITIONS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PERCENT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PIVOT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PLACING</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>POSITION</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>PRECEDING</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PRIMARY</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>PRINCIPALS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PROPERTIES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>PURGE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>QUERY</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>RANGE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>RECORDREADER</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>RECORDWRITER</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>RECOVER</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>REDUCE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>REFERENCES</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>REFRESH</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>RENAME</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>REPAIR</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>REPLACE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>RESET</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>RESTRICT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>REVOKE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>RIGHT</td><td>reserved</td><td>strict-non-reserved</td><td>reserved</td></tr>
<tr><td>RLIKE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>ROLE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>ROLES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>ROLLBACK</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>ROLLUP</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>ROW</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>ROWS</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>SCHEMA</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SECOND</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>SELECT</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>SEMI</td><td>reserved</td><td>strict-non-reserved</td><td>non-reserved</td></tr>
<tr><td>SEPARATED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SERDE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SERDEPROPERTIES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SESSION_USER</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>SET</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>SETS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SHOW</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SKEWED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SOME</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>SORT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SORTED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>START</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>STATISTICS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>STORED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>STRATIFY</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>STRUCT</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SUBSTR</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>SUBSTRING</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TABLE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>TABLES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TABLESAMPLE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>TBLPROPERTIES</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TEMPORARY</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TERMINATED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>THEN</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>TO</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>TOUCH</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TRAILING</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>TRANSACTION</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TRANSACTIONS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TRANSFORM</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
[SPARK-28109][SQL] Fix TRIM(type trimStr FROM str) returns incorrect value
## What changes were proposed in this pull request?
[SPARK-28093](https://issues.apache.org/jira/browse/SPARK-28093) fixed `TRIM/LTRIM/RTRIM('str', 'trimStr')` returning an incorrect value, but that fix introduced a new bug: `TRIM(type trimStr FROM str)` returns an incorrect value. This PR fixes that issue.
## How was this patch tested?
Unit tests and manual tests.
Before this PR:
```sql
spark-sql> SELECT trim('yxTomxx', 'xyz'), trim(BOTH 'xyz' FROM 'yxTomxx');
Tom z
spark-sql> SELECT trim('xxxbarxxx', 'x'), trim(BOTH 'x' FROM 'xxxbarxxx');
bar
spark-sql> SELECT ltrim('zzzytest', 'xyz'), trim(LEADING 'xyz' FROM 'zzzytest');
test xyz
spark-sql> SELECT ltrim('zzzytestxyz', 'xyz'), trim(LEADING 'xyz' FROM 'zzzytestxyz');
testxyz
spark-sql> SELECT ltrim('xyxXxyLAST WORD', 'xy'), trim(LEADING 'xy' FROM 'xyxXxyLAST WORD');
XxyLAST WORD
spark-sql> SELECT rtrim('testxxzx', 'xyz'), trim(TRAILING 'xyz' FROM 'testxxzx');
test xy
spark-sql> SELECT rtrim('xyztestxxzx', 'xyz'), trim(TRAILING 'xyz' FROM 'xyztestxxzx');
xyztest
spark-sql> SELECT rtrim('TURNERyxXxy', 'xy'), trim(TRAILING 'xy' FROM 'TURNERyxXxy');
TURNERyxX
```
After this PR:
```sql
spark-sql> SELECT trim('yxTomxx', 'xyz'), trim(BOTH 'xyz' FROM 'yxTomxx');
Tom Tom
spark-sql> SELECT trim('xxxbarxxx', 'x'), trim(BOTH 'x' FROM 'xxxbarxxx');
bar bar
spark-sql> SELECT ltrim('zzzytest', 'xyz'), trim(LEADING 'xyz' FROM 'zzzytest');
test test
spark-sql> SELECT ltrim('zzzytestxyz', 'xyz'), trim(LEADING 'xyz' FROM 'zzzytestxyz');
testxyz testxyz
spark-sql> SELECT ltrim('xyxXxyLAST WORD', 'xy'), trim(LEADING 'xy' FROM 'xyxXxyLAST WORD');
XxyLAST WORD XxyLAST WORD
spark-sql> SELECT rtrim('testxxzx', 'xyz'), trim(TRAILING 'xyz' FROM 'testxxzx');
test test
spark-sql> SELECT rtrim('xyztestxxzx', 'xyz'), trim(TRAILING 'xyz' FROM 'xyztestxxzx');
xyztest xyztest
spark-sql> SELECT rtrim('TURNERyxXxy', 'xy'), trim(TRAILING 'xy' FROM 'TURNERyxXxy');
TURNERyxX TURNERyxX
```
And PostgreSQL:
```sql
postgres=# SELECT trim('yxTomxx', 'xyz'), trim(BOTH 'xyz' FROM 'yxTomxx');
btrim | btrim
-------+-------
Tom | Tom
(1 row)
postgres=# SELECT trim('xxxbarxxx', 'x'), trim(BOTH 'x' FROM 'xxxbarxxx');
btrim | btrim
-------+-------
bar | bar
(1 row)
postgres=# SELECT ltrim('zzzytest', 'xyz'), trim(LEADING 'xyz' FROM 'zzzytest');
ltrim | ltrim
-------+-------
test | test
(1 row)
postgres=# SELECT ltrim('zzzytestxyz', 'xyz'), trim(LEADING 'xyz' FROM 'zzzytestxyz');
ltrim | ltrim
---------+---------
testxyz | testxyz
(1 row)
postgres=# SELECT ltrim('xyxXxyLAST WORD', 'xy'), trim(LEADING 'xy' FROM 'xyxXxyLAST WORD');
ltrim | ltrim
--------------+--------------
XxyLAST WORD | XxyLAST WORD
(1 row)
postgres=# SELECT rtrim('testxxzx', 'xyz'), trim(TRAILING 'xyz' FROM 'testxxzx');
rtrim | rtrim
-------+-------
test | test
(1 row)
postgres=# SELECT rtrim('xyztestxxzx', 'xyz'), trim(TRAILING 'xyz' FROM 'xyztestxxzx');
rtrim | rtrim
---------+---------
xyztest | xyztest
(1 row)
postgres=# SELECT rtrim('TURNERyxXxy', 'xy'), trim(TRAILING 'xy' FROM 'TURNERyxXxy');
rtrim | rtrim
-----------+-----------
TURNERyxX | TURNERyxX
(1 row)
```
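As a cross-check on the expected values above, the character-set trimming semantics can be reproduced in a minimal Python sketch. It assumes that `trimStr` is treated as a set of characters to strip rather than as a literal substring, which is what the PostgreSQL results suggest. The helper names `trim_both`, `trim_leading`, and `trim_trailing` are hypothetical and are not part of Spark or PostgreSQL.

```python
def trim_both(s: str, trim_chars: str) -> str:
    """Mimic TRIM(BOTH trim_chars FROM s): strip any char in trim_chars from both ends."""
    return s.strip(trim_chars)


def trim_leading(s: str, trim_chars: str) -> str:
    """Mimic TRIM(LEADING trim_chars FROM s): strip only from the start."""
    return s.lstrip(trim_chars)


def trim_trailing(s: str, trim_chars: str) -> str:
    """Mimic TRIM(TRAILING trim_chars FROM s): strip only from the end."""
    return s.rstrip(trim_chars)


if __name__ == "__main__":
    # Inputs taken from the manual tests above.
    print(trim_both("yxTomxx", "xyz"))
    print(trim_leading("zzzytestxyz", "xyz"))
    print(trim_trailing("TURNERyxXxy", "xy"))
```

Python's `str.strip`, `str.lstrip`, and `str.rstrip` take a set of characters, so they serve only as a reference model for the expected values here, not as a description of how Spark implements `TRIM`.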
Closes #24911 from wangyum/SPARK-28109.
Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-06-19 15:47:18 -04:00
<tr><td>TRIM</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>TRUE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>TRUNCATE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>UNARCHIVE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>UNBOUNDED</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>UNCACHE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>UNION</td><td>reserved</td><td>strict-non-reserved</td><td>reserved</td></tr>
<tr><td>UNIQUE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>UNKNOWN</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>UNLOCK</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>UNSET</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>UPDATE</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>USE</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>USER</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>USING</td><td>reserved</td><td>strict-non-reserved</td><td>reserved</td></tr>
<tr><td>VALUES</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>VIEW</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>VIEWS</td><td>non-reserved</td><td>non-reserved</td><td>non-reserved</td></tr>
<tr><td>WHEN</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>WHERE</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>WINDOW</td><td>non-reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>WITH</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
<tr><td>YEAR</td><td>reserved</td><td>non-reserved</td><td>reserved</td></tr>
</table>