[SPARK-28471][SQL] Replace `yyyy` by `uuuu` in date-timestamp patterns without era
## What changes were proposed in this pull request?

In the PR, I propose to use `uuuu` for years instead of `yyyy` in date/timestamp patterns without the era pattern `G` (https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html). **Parsing/formatting of positive years (current era) will be the same.** The difference is in the formatting of negative years, which belong to the previous era, BC (Before Christ).

I replaced the `yyyy` pattern by `uuuu` everywhere except:
1. Test, Suite & Benchmark. Existing tests must work as is.
2. `SimpleDateFormat`, because it doesn't support the `uuuu` pattern.
3. Comments and examples (except comments related to already replaced patterns).

Before the changes, the year `100` of the common era and the year `-99` of the BC era were both shown as `100`. After the changes, negative years are formatted with the `-` sign.

Before:
```Scala
scala> Seq(java.time.LocalDate.of(-99, 1, 1)).toDF().show
+----------+
|     value|
+----------+
|0100-01-01|
+----------+
```

After:
```Scala
scala> Seq(java.time.LocalDate.of(-99, 1, 1)).toDF().show
+-----------+
|      value|
+-----------+
|-0099-01-01|
+-----------+
```

## How was this patch tested?

By existing test suites, and by tests for negative years added to `DateFormatterSuite` and `TimestampFormatterSuite`.

Closes #25230 from MaxGekk/year-pattern-uuuu.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
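
The difference between the two pattern letters described in the message above can be reproduced with plain `java.time`, without Spark. The following is a minimal sketch, not code from this patch: the class name `YearPatternDemo` and its helper methods are made up for illustration. It assumes only the documented `DateTimeFormatter` behavior, where `y` is the year-of-era and `u` is the signed proleptic year.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Illustrative only: this class is hypothetical and not part of the Spark change.
class YearPatternDemo {
    // 'y' is year-of-era: always positive, so BC and CE years can look identical
    // when the pattern has no era letter 'G'.
    static final DateTimeFormatter YEAR_OF_ERA =
        DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.US);

    // 'u' is the proleptic year: signed, so years before 0001 keep a '-' sign.
    static final DateTimeFormatter PROLEPTIC_YEAR =
        DateTimeFormatter.ofPattern("uuuu-MM-dd", Locale.US);

    public static String withYearOfEra(LocalDate date) {
        return YEAR_OF_ERA.format(date);
    }

    public static String withProlepticYear(LocalDate date) {
        return PROLEPTIC_YEAR.format(date);
    }

    public static void main(String[] args) {
        LocalDate bc = LocalDate.of(-99, 1, 1);  // proleptic year -99, i.e. 100 BC
        LocalDate ce = LocalDate.of(100, 1, 1);  // year 100 of the common era

        System.out.println(withYearOfEra(bc));      // 0100-01-01
        System.out.println(withYearOfEra(ce));      // 0100-01-01 (same text as above)
        System.out.println(withProlepticYear(bc));  // -0099-01-01
        System.out.println(withProlepticYear(ce));  // 0100-01-01
    }
}
```

For positive years both formatters print the same text, which matches the statement that only negative years are affected.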
parent a428f40669
commit a5a5da78cf
@@ -2741,7 +2741,7 @@ setMethod("format_string", signature(format = "character", x = "Column"),
 #' head(tmp)}
 #' @note from_unixtime since 1.5.0
 setMethod("from_unixtime", signature(x = "Column"),
-function(x, format = "yyyy-MM-dd HH:mm:ss") {
+function(x, format = "uuuu-MM-dd HH:mm:ss") {
 jc <- callJStatic("org.apache.spark.sql.functions",
 "from_unixtime",
 x@jc, format)
@@ -3029,7 +3029,7 @@ setMethod("unix_timestamp", signature(x = "Column", format = "missing"),
 #' @aliases unix_timestamp,Column,character-method
 #' @note unix_timestamp(Column, character) since 1.5.0
 setMethod("unix_timestamp", signature(x = "Column", format = "character"),
-function(x, format = "yyyy-MM-dd HH:mm:ss") {
+function(x, format = "uuuu-MM-dd HH:mm:ss") {
 jc <- callJStatic("org.apache.spark.sql.functions", "unix_timestamp", x@jc, format)
 column(jc)
 })
@@ -1247,7 +1247,7 @@ def last_day(date):
 
 @ignore_unicode_prefix
 @since(1.5)
-def from_unixtime(timestamp, format="yyyy-MM-dd HH:mm:ss"):
+def from_unixtime(timestamp, format="uuuu-MM-dd HH:mm:ss"):
 """
 Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string
 representing the timestamp of that moment in the current system time zone in the given
@@ -1264,9 +1264,9 @@ def from_unixtime(timestamp, format="yyyy-MM-dd HH:mm:ss"):
 
 
 @since(1.5)
-def unix_timestamp(timestamp=None, format='yyyy-MM-dd HH:mm:ss'):
+def unix_timestamp(timestamp=None, format='uuuu-MM-dd HH:mm:ss'):
 """
-Convert time string with given pattern ('yyyy-MM-dd HH:mm:ss', by default)
+Convert time string with given pattern ('uuuu-MM-dd HH:mm:ss', by default)
 to Unix time stamp (in seconds), using the default timezone and the default
 locale, return null if fail.
 
@@ -222,12 +222,12 @@ class DataFrameReader(OptionUtils):
 :param dateFormat: sets the string that indicates a date format. Custom date formats
 follow the formats at ``java.time.format.DateTimeFormatter``. This
 applies to date type. If None is set, it uses the
-default value, ``yyyy-MM-dd``.
+default value, ``uuuu-MM-dd``.
 :param timestampFormat: sets the string that indicates a timestamp format.
 Custom date formats follow the formats at
 ``java.time.format.DateTimeFormatter``.
 This applies to timestamp type. If None is set, it uses the
-default value, ``yyyy-MM-dd'T'HH:mm:ss.SSSXXX``.
+default value, ``uuuu-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param multiLine: parse one record, which may span multiple lines, per file. If None is
 set, it uses the default value, ``false``.
 :param allowUnquotedControlChars: allows JSON Strings to contain unquoted control
@@ -404,12 +404,12 @@ class DataFrameReader(OptionUtils):
 :param dateFormat: sets the string that indicates a date format. Custom date formats
 follow the formats at ``java.time.format.DateTimeFormatter``. This
 applies to date type. If None is set, it uses the
-default value, ``yyyy-MM-dd``.
+default value, ``uuuu-MM-dd``.
 :param timestampFormat: sets the string that indicates a timestamp format.
 Custom date formats follow the formats at
 ``java.time.format.DateTimeFormatter``.
 This applies to timestamp type. If None is set, it uses the
-default value, ``yyyy-MM-dd'T'HH:mm:ss.SSSXXX``.
+default value, ``uuuu-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param maxColumns: defines a hard limit of how many columns a record can have. If None is
 set, it uses the default value, ``20480``.
 :param maxCharsPerColumn: defines the maximum number of characters allowed for any given
@@ -806,12 +806,12 @@ class DataFrameWriter(OptionUtils):
 :param dateFormat: sets the string that indicates a date format. Custom date formats
 follow the formats at ``java.time.format.DateTimeFormatter``. This
 applies to date type. If None is set, it uses the
-default value, ``yyyy-MM-dd``.
+default value, ``uuuu-MM-dd``.
 :param timestampFormat: sets the string that indicates a timestamp format.
 Custom date formats follow the formats at
 ``java.time.format.DateTimeFormatter``.
 This applies to timestamp type. If None is set, it uses the
-default value, ``yyyy-MM-dd'T'HH:mm:ss.SSSXXX``.
+default value, ``uuuu-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param encoding: specifies encoding (charset) of saved json files. If None is set,
 the default UTF-8 charset will be used.
 :param lineSep: defines the line separator that should be used for writing. If None is
@@ -909,12 +909,12 @@ class DataFrameWriter(OptionUtils):
 :param dateFormat: sets the string that indicates a date format. Custom date formats
 follow the formats at ``java.time.format.DateTimeFormatter``. This
 applies to date type. If None is set, it uses the
-default value, ``yyyy-MM-dd``.
+default value, ``uuuu-MM-dd``.
 :param timestampFormat: sets the string that indicates a timestamp format.
 Custom date formats follow the formats at
 ``java.time.format.DateTimeFormatter``.
 This applies to timestamp type. If None is set, it uses the
-default value, ``yyyy-MM-dd'T'HH:mm:ss.SSSXXX``.
+default value, ``uuuu-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param ignoreLeadingWhiteSpace: a flag indicating whether or not leading whitespaces from
 values being written should be skipped. If None is set, it
 uses the default value, ``true``.
@@ -464,12 +464,12 @@ class DataStreamReader(OptionUtils):
 :param dateFormat: sets the string that indicates a date format. Custom date formats
 follow the formats at ``java.time.format.DateTimeFormatter``. This
 applies to date type. If None is set, it uses the
-default value, ``yyyy-MM-dd``.
+default value, ``uuuu-MM-dd``.
 :param timestampFormat: sets the string that indicates a timestamp format.
 Custom date formats follow the formats at
 ``java.time.format.DateTimeFormatter``.
 This applies to timestamp type. If None is set, it uses the
-default value, ``yyyy-MM-dd'T'HH:mm:ss.SSSXXX``.
+default value, ``uuuu-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param multiLine: parse one record, which may span multiple lines, per file. If None is
 set, it uses the default value, ``false``.
 :param allowUnquotedControlChars: allows JSON Strings to contain unquoted control
@@ -640,12 +640,12 @@ class DataStreamReader(OptionUtils):
 :param dateFormat: sets the string that indicates a date format. Custom date formats
 follow the formats at ``java.time.format.DateTimeFormatter``. This
 applies to date type. If None is set, it uses the
-default value, ``yyyy-MM-dd``.
+default value, ``uuuu-MM-dd``.
 :param timestampFormat: sets the string that indicates a timestamp format.
 Custom date formats follow the formats at
 ``java.time.format.DateTimeFormatter``.
 This applies to timestamp type. If None is set, it uses the
-default value, ``yyyy-MM-dd'T'HH:mm:ss.SSSXXX``.
+default value, ``uuuu-MM-dd'T'HH:mm:ss.SSSXXX``.
 :param maxColumns: defines a hard limit of how many columns a record can have. If None is
 set, it uses the default value, ``20480``.
 :param maxCharsPerColumn: defines the maximum number of characters allowed for any given
@@ -478,7 +478,7 @@ object CatalogColumnStat extends Logging {
 val VERSION = 2
 
 private def getTimestampFormatter(): TimestampFormatter = {
-TimestampFormatter(format = "yyyy-MM-dd HH:mm:ss.SSSSSS", zoneId = ZoneOffset.UTC)
+TimestampFormatter(format = "uuuu-MM-dd HH:mm:ss.SSSSSS", zoneId = ZoneOffset.UTC)
 }
 
 /**
@@ -146,10 +146,10 @@ class CSVOptions(
 // A language tag in IETF BCP 47 format
 val locale: Locale = parameters.get("locale").map(Locale.forLanguageTag).getOrElse(Locale.US)
 
-val dateFormat: String = parameters.getOrElse("dateFormat", "yyyy-MM-dd")
+val dateFormat: String = parameters.getOrElse("dateFormat", "uuuu-MM-dd")
 
 val timestampFormat: String =
-parameters.getOrElse("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSSXXX")
+parameters.getOrElse("timestampFormat", "uuuu-MM-dd'T'HH:mm:ss.SSSXXX")
 
 val multiLine = parameters.get("multiLine").map(_.toBoolean).getOrElse(false)
 
@@ -579,7 +579,7 @@ case class ToUnixTimestamp(
 copy(timeZoneId = Option(timeZoneId))
 
 def this(time: Expression) = {
-this(time, Literal("yyyy-MM-dd HH:mm:ss"))
+this(time, Literal("uuuu-MM-dd HH:mm:ss"))
 }
 
 override def prettyName: String = "to_unix_timestamp"
@@ -616,7 +616,7 @@ case class UnixTimestamp(timeExp: Expression, format: Expression, timeZoneId: Op
 copy(timeZoneId = Option(timeZoneId))
 
 def this(time: Expression) = {
-this(time, Literal("yyyy-MM-dd HH:mm:ss"))
+this(time, Literal("uuuu-MM-dd HH:mm:ss"))
 }
 
 def this() = {
@@ -786,7 +786,7 @@ case class FromUnixTime(sec: Expression, format: Expression, timeZoneId: Option[
 override def prettyName: String = "from_unixtime"
 
 def this(unix: Expression) = {
-this(unix, Literal("yyyy-MM-dd HH:mm:ss"))
+this(unix, Literal("uuuu-MM-dd HH:mm:ss"))
 }
 
 override def dataType: DataType = StringType
@@ -82,10 +82,10 @@ private[sql] class JSONOptions(
 val zoneId: ZoneId = DateTimeUtils.getZoneId(
 parameters.getOrElse(DateTimeUtils.TIMEZONE_OPTION, defaultTimeZoneId))
 
-val dateFormat: String = parameters.getOrElse("dateFormat", "yyyy-MM-dd")
+val dateFormat: String = parameters.getOrElse("dateFormat", "uuuu-MM-dd")
 
 val timestampFormat: String =
-parameters.getOrElse("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss.SSSXXX")
+parameters.getOrElse("timestampFormat", "uuuu-MM-dd'T'HH:mm:ss.SSSXXX")
 
 val multiLine = parameters.get("multiLine").map(_.toBoolean).getOrElse(false)
 
@@ -43,7 +43,7 @@ class Iso8601DateFormatter(
 }
 
 object DateFormatter {
-val defaultPattern: String = "yyyy-MM-dd"
+val defaultPattern: String = "uuuu-MM-dd"
 val defaultLocale: Locale = Locale.US
 
 def apply(format: String, locale: Locale): DateFormatter = {
@@ -82,7 +82,7 @@ class FractionTimestampFormatter(zoneId: ZoneId)
 }
 
 object TimestampFormatter {
-val defaultPattern: String = "yyyy-MM-dd HH:mm:ss"
+val defaultPattern: String = "uuuu-MM-dd HH:mm:ss"
 val defaultLocale: Locale = Locale.US
 
 def apply(format: String, zoneId: ZoneId, locale: Locale): TimestampFormatter = {
@@ -95,4 +95,9 @@ class DateFormatterSuite extends SparkFunSuite with SQLHelper {
 val daysSinceEpoch = formatter.parse("2018 Dec")
 assert(daysSinceEpoch === LocalDate.of(2018, 12, 1).toEpochDay)
 }
+
+test("formatting negative years with default pattern") {
+val epochDays = LocalDate.of(-99, 1, 1).toEpochDay.toInt
+assert(DateFormatter().format(epochDays) === "-0099-01-01")
+}
 }
@@ -123,4 +123,12 @@ class TimestampFormatterSuite extends SparkFunSuite with SQLHelper {
 assert(formatter.format(900000) === "1970-01-01 00:00:00.9")
 assert(formatter.format(1000000) === "1970-01-01 00:00:01")
 }
+
+test("formatting negative years with default pattern") {
+val instant = LocalDateTime.of(-99, 1, 1, 0, 0, 0)
+.atZone(ZoneOffset.UTC)
+.toInstant
+val micros = DateTimeUtils.instantToMicros(instant)
+assert(TimestampFormatter(ZoneOffset.UTC).format(micros) === "-0099-01-01 00:00:00")
+}
 }
@@ -395,10 +395,10 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
 * <li>`columnNameOfCorruptRecord` (default is the value specified in
 * `spark.sql.columnNameOfCorruptRecord`): allows renaming the new field having malformed string
 * created by `PERMISSIVE` mode. This overrides `spark.sql.columnNameOfCorruptRecord`.</li>
-* <li>`dateFormat` (default `yyyy-MM-dd`): sets the string that indicates a date format.
+* <li>`dateFormat` (default `uuuu-MM-dd`): sets the string that indicates a date format.
 * Custom date formats follow the formats at `java.time.format.DateTimeFormatter`.
 * This applies to date type.</li>
-* <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
+* <li>`timestampFormat` (default `uuuu-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
 * indicates a timestamp format. Custom date formats follow the formats at
 * `java.time.format.DateTimeFormatter`. This applies to timestamp type.</li>
 * <li>`multiLine` (default `false`): parse one record, which may span multiple lines,
@@ -615,10 +615,10 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging {
 * value.</li>
 * <li>`negativeInf` (default `-Inf`): sets the string representation of a negative infinity
 * value.</li>
-* <li>`dateFormat` (default `yyyy-MM-dd`): sets the string that indicates a date format.
+* <li>`dateFormat` (default `uuuu-MM-dd`): sets the string that indicates a date format.
 * Custom date formats follow the formats at `java.time.format.DateTimeFormatter`.
 * This applies to date type.</li>
-* <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
+* <li>`timestampFormat` (default `uuuu-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
 * indicates a timestamp format. Custom date formats follow the formats at
 * `java.time.format.DateTimeFormatter`. This applies to timestamp type.</li>
 * <li>`maxColumns` (default `20480`): defines a hard limit of how many columns
@@ -568,10 +568,10 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
 * <li>`compression` (default `null`): compression codec to use when saving to file. This can be
 * one of the known case-insensitive shorten names (`none`, `bzip2`, `gzip`, `lz4`,
 * `snappy` and `deflate`). </li>
-* <li>`dateFormat` (default `yyyy-MM-dd`): sets the string that indicates a date format.
+* <li>`dateFormat` (default `uuuu-MM-dd`): sets the string that indicates a date format.
 * Custom date formats follow the formats at `java.time.format.DateTimeFormatter`.
 * This applies to date type.</li>
-* <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
+* <li>`timestampFormat` (default `uuuu-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
 * indicates a timestamp format. Custom date formats follow the formats at
 * `java.time.format.DateTimeFormatter`. This applies to timestamp type.</li>
 * <li>`encoding` (by default it is not set): specifies encoding (charset) of saved json
@@ -687,10 +687,10 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
 * <li>`compression` (default `null`): compression codec to use when saving to file. This can be
 * one of the known case-insensitive shorten names (`none`, `bzip2`, `gzip`, `lz4`,
 * `snappy` and `deflate`). </li>
-* <li>`dateFormat` (default `yyyy-MM-dd`): sets the string that indicates a date format.
+* <li>`dateFormat` (default `uuuu-MM-dd`): sets the string that indicates a date format.
 * Custom date formats follow the formats at `java.time.format.DateTimeFormatter`.
 * This applies to date type.</li>
-* <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
+* <li>`timestampFormat` (default `uuuu-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
 * indicates a timestamp format. Custom date formats follow the formats at
 * `java.time.format.DateTimeFormatter`. This applies to timestamp type.</li>
 * <li>`ignoreLeadingWhiteSpace` (default `true`): a flag indicating whether or not leading
@@ -60,7 +60,7 @@ object PartitionSpec {
 
 object PartitioningUtils {
 
-val timestampPartitionPattern = "yyyy-MM-dd HH:mm:ss[.S]"
+val timestampPartitionPattern = "uuuu-MM-dd HH:mm:ss[.S]"
 
 private[datasources] case class PartitionValues(columnNames: Seq[String], literals: Seq[Literal])
 {
@@ -2830,7 +2830,7 @@ object functions {
 /**
 * Converts the number of seconds from unix epoch (1970-01-01 00:00:00 UTC) to a string
 * representing the timestamp of that moment in the current system time zone in the
-* yyyy-MM-dd HH:mm:ss format.
+* uuuu-MM-dd HH:mm:ss format.
 *
 * @param ut A number of a type that is castable to a long, such as string or integer. Can be
 * negative for timestamps before the unix epoch
@@ -2839,7 +2839,7 @@ object functions {
 * @since 1.5.0
 */
 def from_unixtime(ut: Column): Column = withExpr {
-FromUnixTime(ut.expr, Literal("yyyy-MM-dd HH:mm:ss"))
+FromUnixTime(ut.expr, Literal("uuuu-MM-dd HH:mm:ss"))
 }
 
 /**
@@ -2871,21 +2871,21 @@ object functions {
 * @since 1.5.0
 */
 def unix_timestamp(): Column = withExpr {
-UnixTimestamp(CurrentTimestamp(), Literal("yyyy-MM-dd HH:mm:ss"))
+UnixTimestamp(CurrentTimestamp(), Literal("uuuu-MM-dd HH:mm:ss"))
 }
 
 /**
-* Converts time string in format yyyy-MM-dd HH:mm:ss to Unix timestamp (in seconds),
+* Converts time string in format uuuu-MM-dd HH:mm:ss to Unix timestamp (in seconds),
 * using the default timezone and the default locale.
 *
 * @param s A date, timestamp or string. If a string, the data must be in the
-* `yyyy-MM-dd HH:mm:ss` format
+* `uuuu-MM-dd HH:mm:ss` format
 * @return A long, or null if the input was a string not of the correct format
 * @group datetime_funcs
 * @since 1.5.0
 */
 def unix_timestamp(s: Column): Column = withExpr {
-UnixTimestamp(s.expr, Literal("yyyy-MM-dd HH:mm:ss"))
+UnixTimestamp(s.expr, Literal("uuuu-MM-dd HH:mm:ss"))
 }
 
 /**
@@ -2894,7 +2894,7 @@ object functions {
 * See [[java.time.format.DateTimeFormatter]] for valid date and time format patterns
 *
 * @param s A date, timestamp or string. If a string, the data must be in a format that can be
-* cast to a date, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
+* cast to a date, such as `uuuu-MM-dd` or `uuuu-MM-dd HH:mm:ss.SSSS`
 * @param p A date time pattern detailing the format of `s` when `s` is a string
 * @return A long, or null if `s` was a string that could not be cast to a date or `p` was
 * an invalid format
@@ -2907,7 +2907,7 @@ object functions {
 * Converts to a timestamp by casting rules to `TimestampType`.
 *
 * @param s A date, timestamp or string. If a string, the data must be in a format that can be
-* cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
+* cast to a timestamp, such as `uuuu-MM-dd` or `uuuu-MM-dd HH:mm:ss.SSSS`
 * @return A timestamp, or null if the input was a string that could not be cast to a timestamp
 * @group datetime_funcs
 * @since 2.2.0
@@ -2922,7 +2922,7 @@ object functions {
 * See [[java.time.format.DateTimeFormatter]] for valid date and time format patterns
 *
 * @param s A date, timestamp or string. If a string, the data must be in a format that can be
-* cast to a timestamp, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
+* cast to a timestamp, such as `uuuu-MM-dd` or `uuuu-MM-dd HH:mm:ss.SSSS`
 * @param fmt A date time pattern detailing the format of `s` when `s` is a string
 * @return A timestamp, or null if `s` was a string that could not be cast to a timestamp or
 * `fmt` was an invalid format
@@ -2947,7 +2947,7 @@ object functions {
 * See [[java.time.format.DateTimeFormatter]] for valid date and time format patterns
 *
 * @param e A date, timestamp or string. If a string, the data must be in a format that can be
-* cast to a date, such as `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss.SSSS`
+* cast to a date, such as `uuuu-MM-dd` or `uuuu-MM-dd HH:mm:ss.SSSS`
 * @param fmt A date time pattern detailing the format of `e` when `e`is a string
 * @return A date, or null if `e` was a string that could not be cast to a date or `fmt` was an
 * invalid format
@@ -263,10 +263,10 @@ final class DataStreamReader private[sql](sparkSession: SparkSession) extends Lo
 * <li>`columnNameOfCorruptRecord` (default is the value specified in
 * `spark.sql.columnNameOfCorruptRecord`): allows renaming the new field having malformed string
 * created by `PERMISSIVE` mode. This overrides `spark.sql.columnNameOfCorruptRecord`.</li>
-* <li>`dateFormat` (default `yyyy-MM-dd`): sets the string that indicates a date format.
+* <li>`dateFormat` (default `uuuu-MM-dd`): sets the string that indicates a date format.
 * Custom date formats follow the formats at `java.time.format.DateTimeFormatter`.
 * This applies to date type.</li>
-* <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
+* <li>`timestampFormat` (default `uuuu-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
 * indicates a timestamp format. Custom date formats follow the formats at
 * `java.time.format.DateTimeFormatter`. This applies to timestamp type.</li>
 * <li>`multiLine` (default `false`): parse one record, which may span multiple lines,
@@ -324,10 +324,10 @@ final class DataStreamReader private[sql](sparkSession: SparkSession) extends Lo
 * value.</li>
 * <li>`negativeInf` (default `-Inf`): sets the string representation of a negative infinity
 * value.</li>
-* <li>`dateFormat` (default `yyyy-MM-dd`): sets the string that indicates a date format.
+* <li>`dateFormat` (default `uuuu-MM-dd`): sets the string that indicates a date format.
 * Custom date formats follow the formats at `java.time.format.DateTimeFormatter`.
 * This applies to date type.</li>
-* <li>`timestampFormat` (default `yyyy-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
+* <li>`timestampFormat` (default `uuuu-MM-dd'T'HH:mm:ss.SSSXXX`): sets the string that
 * indicates a timestamp format. Custom date formats follow the formats at
 * `java.time.format.DateTimeFormatter`. This applies to timestamp type.</li>
 * <li>`maxColumns` (default `20480`): defines a hard limit of how many columns
@@ -520,7 +520,7 @@ select make_date(-44, 3, 15)
 -- !query 48 schema
 struct<make_date(-44, 3, 15):date>
 -- !query 48 output
-0045-03-15
+-0044-03-15
 
 
 -- !query 49
@@ -144,7 +144,7 @@ NULL
 -- !query 17
 select to_unix_timestamp(a) from t
 -- !query 17 schema
-struct<to_unix_timestamp(a, yyyy-MM-dd HH:mm:ss):bigint>
+struct<to_unix_timestamp(a, uuuu-MM-dd HH:mm:ss):bigint>
 -- !query 17 output
 NULL
 
@@ -160,7 +160,7 @@ NULL
 -- !query 19
 select unix_timestamp(a) from t
 -- !query 19 schema
-struct<unix_timestamp(a, yyyy-MM-dd HH:mm:ss):bigint>
+struct<unix_timestamp(a, uuuu-MM-dd HH:mm:ss):bigint>
 -- !query 19 output
 NULL
 
@@ -176,7 +176,7 @@ NULL
 -- !query 21
 select from_unixtime(a) from t
 -- !query 21 schema
-struct<from_unixtime(CAST(a AS BIGINT), yyyy-MM-dd HH:mm:ss):string>
+struct<from_unixtime(CAST(a AS BIGINT), uuuu-MM-dd HH:mm:ss):string>
 -- !query 21 output
 NULL
 