[SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

### What changes were proposed in this pull request?

This PR adds the SQL standard command `SET TIME ZONE` to set the current default time zone displacement for the current SQL session. It is equivalent to the existing `set spark.sql.session.timeZone=xxx`.

In summary, this PR adds the following syntax:

```
SET TIME ZONE LOCAL;
SET TIME ZONE 'valid time zone';  -- zone offset or region
SET TIME ZONE INTERVAL XXXX; -- XXXX must be within [-18, +18] hours; note that this range is wider than ANSI's [-14, +14]
```
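The interval form has to pass a range and precision check before it becomes a zone offset. The following is a minimal sketch of that check outside Spark, assuming a hypothetical `Interval` case class as a stand-in for Spark's `CalendarInterval` (only the `months`, `days` and `microseconds` fields are modeled). The object and helper names are illustrative and are not part of this PR.

```scala
import java.time.ZoneOffset

object SetTimeZoneSketch {
  // Hypothetical stand-in for Spark's CalendarInterval; not the real class.
  final case class Interval(months: Int, days: Int, microseconds: Long)

  private val MicrosPerSecond = 1000000L
  private val MicrosPerHour = 3600L * MicrosPerSecond

  // Returns Right(offsetId) for a valid displacement, Left(message) otherwise.
  def toOffsetId(i: Interval): Either[String, String] = {
    val invalid =
      i.months != 0 || i.days != 0 ||                   // only a time-of-day part is allowed
        math.abs(i.microseconds) > 18 * MicrosPerHour || // outside [-18, +18] hours
        i.microseconds % MicrosPerSecond != 0            // finer than second precision
    if (invalid) {
      Left("The interval value must be in the range of [-18, +18] hours with second precision")
    } else {
      val seconds = (i.microseconds / MicrosPerSecond).toInt
      Right(ZoneOffset.ofTotalSeconds(seconds).toString)
    }
  }
}
```

Under these assumptions, `INTERVAL 10 HOURS` corresponds to `Interval(0, 0, 36000000000L)` and yields `+10:00`, while `INTERVAL 3 DAYS` is rejected because its `days` field is non-zero.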

### Why are the changes needed?

ANSI compliance, and to give pure SQL users a way to set the session time zone.

### Does this PR introduce _any_ user-facing change?

Yes, this PR adds new syntax.

### How was this patch tested?

Added unit tests and locally verified the reference doc.

![image](https://user-images.githubusercontent.com/8326978/87510244-c8dc3680-c6a5-11ea-954c-b098be84afee.png)

Closes #29064 from yaooqinn/SPARK-32272.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Kent Yao 2020-07-16 13:01:53 +00:00 committed by Wenchen Fan
parent db47c6e340
commit bdeb626c5a
11 changed files with 308 additions and 6 deletions


@ -249,6 +249,8 @@
url: sql-ref-syntax-aux-conf-mgmt-set.html
- text: RESET
url: sql-ref-syntax-aux-conf-mgmt-reset.html
- text: SET TIME ZONE
url: sql-ref-syntax-aux-conf-mgmt-set-timezone.html
- text: RESOURCE MANAGEMENT
url: sql-ref-syntax-aux-resource-mgmt.html
subitems:


@ -355,6 +355,7 @@ Below is a list of all the keywords in Spark SQL.
|TEMPORARY|non-reserved|non-reserved|non-reserved|
|TERMINATED|non-reserved|non-reserved|non-reserved|
|THEN|reserved|non-reserved|reserved|
|TIME|reserved|non-reserved|reserved|
|TO|reserved|non-reserved|reserved|
|TOUCH|non-reserved|non-reserved|non-reserved|
|TRAILING|reserved|non-reserved|reserved|
@ -385,3 +386,4 @@ Below is a list of all the keywords in Spark SQL.
|WINDOW|non-reserved|non-reserved|reserved|
|WITH|reserved|non-reserved|reserved|
|YEAR|reserved|non-reserved|reserved|
|ZONE|non-reserved|non-reserved|non-reserved|


@ -0,0 +1,67 @@
---
layout: global
title: SET TIME ZONE
displayTitle: SET TIME ZONE
license: |
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
---
### Description
The SET TIME ZONE command sets the time zone of the current session.
### Syntax
```sql
SET TIME ZONE LOCAL
SET TIME ZONE 'timezone_value'
SET TIME ZONE INTERVAL interval_literal
```
### Parameters
* **LOCAL**
Set the time zone to the one specified in the Java `user.timezone` property, or to the environment variable `TZ` if `user.timezone` is undefined, or to the system time zone if both are undefined.
* **timezone_value**
The ID of the session-local time zone, given either as a region-based zone ID or as a zone offset. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. Zone offsets must be in the format '`(+|-)HH`', '`(+|-)HH:mm`' or '`(+|-)HH:mm:ss`', e.g. '-08', '+01:00' or '-13:33:33'. 'UTC' and 'Z' are also supported as aliases of '+00:00'. Other short names are not recommended because they can be ambiguous.
* **interval_literal**
The [interval literal](sql-ref-literals.html#interval-literal) represents the difference between the session time zone and UTC. It must be in the range of [-18, 18] hours with at most second precision, e.g. `INTERVAL 2 HOURS 30 MINUTES` or `INTERVAL '15:40:32' HOUR TO SECOND`.
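As a rough way to check which `timezone_value` strings are likely to resolve, the sketch below tries them against `java.time.ZoneId` outside Spark. It assumes that Spark's validation behaves like the `ZoneId.of(_, ZoneId.SHORT_IDS)` call named in its error message; Spark may normalize the string first, so treat the result as an approximation. The `TimeZoneCheck` object is hypothetical.

```scala
import java.time.ZoneId
import scala.util.Try

object TimeZoneCheck {
  // True when java.time can resolve the ID as a region, an offset, or a short ID.
  def resolves(id: String): Boolean =
    Try(ZoneId.of(id, ZoneId.SHORT_IDS)).isSuccess
}
```

For example, `TimeZoneCheck.resolves("America/Los_Angeles")` and `TimeZoneCheck.resolves("-13:33:33")` both succeed, whereas `TimeZoneCheck.resolves("invalid/zone")` does not.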
### Examples
```sql
-- Set time zone to the system default.
SET TIME ZONE LOCAL;
-- Set time zone to a region-based zone ID.
SET TIME ZONE 'America/Los_Angeles';
-- Set time zone to a zone offset.
SET TIME ZONE '+08:00';
-- Set time zone with intervals.
SET TIME ZONE INTERVAL 1 HOUR 30 MINUTES;
SET TIME ZONE INTERVAL '08:30:00' HOUR TO SECOND;
```
### Related Statements
* [SET](sql-ref-syntax-aux-conf-mgmt-set.html)


@ -21,3 +21,4 @@ license: |
* [SET](sql-ref-syntax-aux-conf-mgmt-set.html)
* [RESET](sql-ref-syntax-aux-conf-mgmt-reset.html)
* [SET TIME ZONE](sql-ref-syntax-aux-conf-mgmt-set-timezone.html)


@ -240,6 +240,9 @@ statement
| MSCK REPAIR TABLE multipartIdentifier #repairTable
| op=(ADD | LIST) identifier (STRING | .*?) #manageResource
| SET ROLE .*? #failNativeCommand
| SET TIME ZONE interval #setTimeZone
| SET TIME ZONE timezone=(STRING | LOCAL) #setTimeZone
| SET TIME ZONE .*? #setTimeZone
| SET .*? #setConfiguration
| RESET #resetConfiguration
| unsupportedHiveNativeCommands .*? #failNativeCommand
@ -1190,6 +1193,7 @@ ansiNonReserved
| VIEW
| VIEWS
| WINDOW
| ZONE
//--ANSI-NON-RESERVED-END
;
@ -1431,6 +1435,7 @@ nonReserved
| TEMPORARY
| TERMINATED
| THEN
| TIME
| TO
| TOUCH
| TRAILING
@ -1459,6 +1464,7 @@ nonReserved
| WINDOW
| WITH
| YEAR
| ZONE
;
// NOTE: If you add a new token in the list below, you should update the list of keywords
@ -1691,6 +1697,7 @@ TBLPROPERTIES: 'TBLPROPERTIES';
TEMPORARY: 'TEMPORARY' | 'TEMP';
TERMINATED: 'TERMINATED';
THEN: 'THEN';
TIME: 'TIME';
TO: 'TO';
TOUCH: 'TOUCH';
TRAILING: 'TRAILING';
@ -1721,6 +1728,7 @@ WHERE: 'WHERE';
WINDOW: 'WINDOW';
WITH: 'WITH';
YEAR: 'YEAR';
ZONE: 'ZONE';
//--SPARK-KEYWORD-LIST-END
//============================
// End of the keywords list


@ -2090,6 +2090,13 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
* - from-to unit, for instance: interval '1-2' year to month.
*/
override def visitInterval(ctx: IntervalContext): Literal = withOrigin(ctx) {
Literal(parseIntervalLiteral(ctx), CalendarIntervalType)
}
/**
* Create a [[CalendarInterval]] object
*/
protected def parseIntervalLiteral(ctx: IntervalContext): CalendarInterval = withOrigin(ctx) {
if (ctx.errorCapturingMultiUnitsInterval != null) {
val innerCtx = ctx.errorCapturingMultiUnitsInterval
if (innerCtx.unitToUnitInterval != null) {
@ -2097,7 +2104,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
"Can only have a single from-to unit in the interval literal syntax",
innerCtx.unitToUnitInterval)
}
Literal(visitMultiUnitsInterval(innerCtx.multiUnitsInterval), CalendarIntervalType)
visitMultiUnitsInterval(innerCtx.multiUnitsInterval)
} else if (ctx.errorCapturingUnitToUnitInterval != null) {
val innerCtx = ctx.errorCapturingUnitToUnitInterval
if (innerCtx.error1 != null || innerCtx.error2 != null) {
@ -2106,7 +2113,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
"Can only have a single from-to unit in the interval literal syntax",
errorCtx)
}
Literal(visitUnitToUnitInterval(innerCtx.body), CalendarIntervalType)
visitUnitToUnitInterval(innerCtx.body)
} else {
throw new ParseException("at least one time unit should be given for interval literal", ctx)
}


@ -1723,9 +1723,9 @@ object SQLConf {
val SESSION_LOCAL_TIMEZONE = buildConf("spark.sql.session.timeZone")
.doc("The ID of session local timezone in the format of either region-based zone IDs or " +
"zone offsets. Region IDs must have the form 'area/city', such as 'America/Los_Angeles'. " +
"Zone offsets must be in the format '(+|-)HH:mm', for example '-08:00' or '+01:00'. " +
"Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other short names are not " +
"recommended to use because they can be ambiguous.")
"Zone offsets must be in the format '(+|-)HH', '(+|-)HH:mm' or '(+|-)HH:mm:ss', e.g '-08', " +
"'+01:00' or '-13:33:33'. Also 'UTC' and 'Z' are supported as aliases of '+00:00'. Other " +
"short names are not recommended to use because they can be ambiguous.")
.version("2.2.0")
.stringConf
.checkValue(isValidTimezone, s"Cannot resolve the given timezone with" +


@ -17,7 +17,8 @@
package org.apache.spark.sql.execution
import java.util.Locale
import java.time.ZoneOffset
import java.util.{Locale, TimeZone}
import javax.ws.rs.core.UriBuilder
import scala.collection.JavaConverters._
@ -32,6 +33,7 @@ import org.apache.spark.sql.catalyst.expressions.Expression
import org.apache.spark.sql.catalyst.parser._
import org.apache.spark.sql.catalyst.parser.SqlBaseParser._
import org.apache.spark.sql.catalyst.plans.logical._
import org.apache.spark.sql.catalyst.util.DateTimeConstants
import org.apache.spark.sql.execution.command._
import org.apache.spark.sql.execution.datasources._
import org.apache.spark.sql.internal.{HiveSerDe, SQLConf, VariableSubstitution}
@ -90,6 +92,41 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder(conf) {
ResetCommand
}
/**
* Create a [[SetCommand]] logical plan to set [[SQLConf.SESSION_LOCAL_TIMEZONE]]
* Example SQL :
* {{{
* SET TIME ZONE LOCAL;
* SET TIME ZONE 'Asia/Shanghai';
* SET TIME ZONE INTERVAL 10 HOURS;
* }}}
*/
override def visitSetTimeZone(ctx: SetTimeZoneContext): LogicalPlan = withOrigin(ctx) {
val key = SQLConf.SESSION_LOCAL_TIMEZONE.key
if (ctx.interval != null) {
val interval = parseIntervalLiteral(ctx.interval)
if (interval.months != 0 || interval.days != 0 ||
math.abs(interval.microseconds) > 18 * DateTimeConstants.MICROS_PER_HOUR ||
interval.microseconds % DateTimeConstants.MICROS_PER_SECOND != 0) {
throw new ParseException("The interval value must be in the range of [-18, +18] hours" +
" with second precision",
ctx.interval())
} else {
val seconds = (interval.microseconds / DateTimeConstants.MICROS_PER_SECOND).toInt
SetCommand(Some(key -> Some(ZoneOffset.ofTotalSeconds(seconds).toString)))
}
} else if (ctx.timezone != null) {
ctx.timezone.getType match {
case SqlBaseParser.LOCAL =>
SetCommand(Some(key -> Some(TimeZone.getDefault.getID)))
case _ =>
SetCommand(Some(key -> Some(string(ctx.STRING))))
}
} else {
throw new ParseException("Invalid time zone displacement value", ctx)
}
}
/**
* Create a [[RefreshResource]] logical plan.
*/


@ -0,0 +1,15 @@
-- valid time zones
SET TIME ZONE 'Asia/Hong_Kong';
SET TIME ZONE 'GMT+1';
SET TIME ZONE INTERVAL 10 HOURS;
SET TIME ZONE INTERVAL '15:40:32' HOUR TO SECOND;
SET TIME ZONE LOCAL;
-- invalid time zone
SET TIME ZONE;
SET TIME ZONE 'invalid/zone';
SET TIME ZONE INTERVAL 3 DAYS;
SET TIME ZONE INTERVAL 24 HOURS;
SET TIME ZONE INTERVAL '19:40:32' HOUR TO SECOND;
SET TIME ZONE INTERVAL 10 HOURS 'GMT+1';
SET TIME ZONE INTERVAL 10 HOURS 1 MILLISECOND;


@ -0,0 +1,135 @@
-- Automatically generated by SQLQueryTestSuite
-- Number of queries: 12
-- !query
SET TIME ZONE 'Asia/Hong_Kong'
-- !query schema
struct<key:string,value:string>
-- !query output
spark.sql.session.timeZone Asia/Hong_Kong
-- !query
SET TIME ZONE 'GMT+1'
-- !query schema
struct<key:string,value:string>
-- !query output
spark.sql.session.timeZone GMT+1
-- !query
SET TIME ZONE INTERVAL 10 HOURS
-- !query schema
struct<key:string,value:string>
-- !query output
spark.sql.session.timeZone +10:00
-- !query
SET TIME ZONE INTERVAL '15:40:32' HOUR TO SECOND
-- !query schema
struct<key:string,value:string>
-- !query output
spark.sql.session.timeZone +15:40:32
-- !query
SET TIME ZONE LOCAL
-- !query schema
struct<key:string,value:string>
-- !query output
spark.sql.session.timeZone America/Los_Angeles
-- !query
SET TIME ZONE
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException
Invalid time zone displacement value(line 1, pos 0)
== SQL ==
SET TIME ZONE
^^^
-- !query
SET TIME ZONE 'invalid/zone'
-- !query schema
struct<>
-- !query output
java.lang.IllegalArgumentException
Cannot resolve the given timezone with ZoneId.of(_, ZoneId.SHORT_IDS)
-- !query
SET TIME ZONE INTERVAL 3 DAYS
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException
The interval value must be in the range of [-18, +18] hours with second precision(line 1, pos 14)
== SQL ==
SET TIME ZONE INTERVAL 3 DAYS
--------------^^^
-- !query
SET TIME ZONE INTERVAL 24 HOURS
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException
The interval value must be in the range of [-18, +18] hours with second precision(line 1, pos 14)
== SQL ==
SET TIME ZONE INTERVAL 24 HOURS
--------------^^^
-- !query
SET TIME ZONE INTERVAL '19:40:32' HOUR TO SECOND
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException
The interval value must be in the range of [-18, +18] hours with second precision(line 1, pos 14)
== SQL ==
SET TIME ZONE INTERVAL '19:40:32' HOUR TO SECOND
--------------^^^
-- !query
SET TIME ZONE INTERVAL 10 HOURS 'GMT+1'
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException
Invalid time zone displacement value(line 1, pos 0)
== SQL ==
SET TIME ZONE INTERVAL 10 HOURS 'GMT+1'
^^^
-- !query
SET TIME ZONE INTERVAL 10 HOURS 1 MILLISECOND
-- !query schema
struct<>
-- !query output
org.apache.spark.sql.catalyst.parser.ParseException
The interval value must be in the range of [-18, +18] hours with second precision(line 1, pos 14)
== SQL ==
SET TIME ZONE INTERVAL 10 HOURS 1 MILLISECOND
--------------^^^


@ -17,12 +17,15 @@
package org.apache.spark.sql.internal
import java.util.TimeZone
import scala.language.reflectiveCalls
import org.apache.hadoop.fs.Path
import org.apache.log4j.Level
import org.apache.spark.sql._
import org.apache.spark.sql.catalyst.parser.ParseException
import org.apache.spark.sql.catalyst.util.DateTimeTestUtils.MIT
import org.apache.spark.sql.internal.StaticSQLConf._
import org.apache.spark.sql.test.{SharedSparkSession, TestSQLContext}
@ -383,4 +386,29 @@ class SQLConfSuite extends QueryTest with SharedSparkSession {
}
assert(e.getMessage === "Cannot resolve the given timezone with ZoneId.of(_, ZoneId.SHORT_IDS)")
}
test("set time zone") {
TimeZone.getAvailableIDs().foreach { zid =>
sql(s"set time zone '$zid'")
assert(spark.conf.get(SQLConf.SESSION_LOCAL_TIMEZONE) === zid)
}
sql("set time zone local")
assert(spark.conf.get(SQLConf.SESSION_LOCAL_TIMEZONE) === TimeZone.getDefault.getID)
val e1 = intercept[IllegalArgumentException](sql("set time zone 'invalid'"))
assert(e1.getMessage === "Cannot resolve the given timezone with" +
" ZoneId.of(_, ZoneId.SHORT_IDS)")
(-18 to 18).map(v => (v, s"interval '$v' hours")).foreach { case (i, interval) =>
sql(s"set time zone $interval")
val zone = spark.conf.get(SQLConf.SESSION_LOCAL_TIMEZONE)
if (i == 0) {
assert(zone === "Z")
} else {
assert(zone === String.format("%+03d:00", new Integer(i)))
}
}
val e2 = intercept[ParseException](sql("set time zone interval 19 hours"))
assert(e2.getMessage contains "The interval value must be in the range of [-18, +18] hours")
}
}
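
The strings asserted in the `set time zone` test for whole-hour intervals can be reproduced without a Spark session. This is a small sketch, assuming only `java.time.ZoneOffset`, that compares the test's expected format with what `ZoneOffset` renders. The `WholeHourOffsets` object is hypothetical and is not part of the test suite.

```scala
import java.time.ZoneOffset

object WholeHourOffsets {
  // Expected value as asserted in the test: "Z" for zero,
  // otherwise a signed, zero-padded hour followed by ":00".
  def expected(hours: Int): String =
    if (hours == 0) "Z" else "%+03d:00".format(hours)

  // What java.time renders for the same whole-hour displacement.
  def rendered(hours: Int): String = ZoneOffset.ofHours(hours).toString

  // Checks every whole hour in the supported range [-18, 18].
  def allAgree: Boolean = (-18 to 18).forall(h => expected(h) == rendered(h))
}
```

The zero displacement is the one special case: `ZoneOffset` renders it as `Z` rather than `+00:00`, which is why the test branches on `i == 0`.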