[SPARK-36429][SQL] JacksonParser should throw exception when data type unsupported
### What changes were proposed in this pull request?
Currently, when `spark.sql.timestampType=TIMESTAMP_NTZ` is set, `from_json` and `from_csv` behave differently.
```
-- !query
select from_json('{"t":"26/October/2015"}', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
-- !query schema
struct<from_json({"t":"26/October/2015"}):struct<t:timestamp_ntz>>
-- !query output
{"t":null}
```
```
-- !query
select from_csv('26/October/2015', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
-- !query schema
struct<>
-- !query output
java.lang.Exception
Unsupported type: timestamp_ntz
```
We should make `from_json` throw an exception too.
This PR addresses the discussion at https://github.com/apache/spark/pull/33640#discussion_r682862523

### Why are the changes needed?
Make the behavior of `from_json` more reasonable and consistent with `from_csv`.

### Does this PR introduce _any_ user-facing change?
Yes. `from_json` now throws an exception when `spark.sql.timestampType=TIMESTAMP_NTZ` is set.

### How was this patch tested?
Tests updated.

Closes #33684 from beliefer/SPARK-36429-new.

Authored-by: gengjiaan <gengjiaan@360.cn>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
parent 89d8a4eacf
commit 186815be1c
@@ -330,12 +330,13 @@ class JacksonParser(
     case udt: UserDefinedType[_] =>
       makeConverter(udt.sqlType)

-    case _ =>
-      (parser: JsonParser) =>
-        // Here, we pass empty `PartialFunction` so that this case can be
-        // handled as a failed conversion. It will throw an exception as
-        // long as the value is not null.
-        parseJsonToken[AnyRef](parser, dataType)(PartialFunction.empty[JsonToken, AnyRef])
+    case _: NullType =>
+      (parser: JsonParser) => parseJsonToken[java.lang.Long](parser, dataType) {
+        case _ => null
+      }
+
+    // We don't actually hit this exception though, we keep it for understandability
+    case _ => throw QueryExecutionErrors.unsupportedTypeError(dataType)
   }

   /**
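The hunk above moves the failure from parse time to converter construction time. The following is a minimal, self-contained sketch of that idea, not Spark's actual code: `DataType`, `Converter`, `makeConverterBefore`, and `makeConverterAfter` are hypothetical stand-ins, and a JSON token is modelled as an `Option[String]` where `None` stands for a JSON null.

```scala
// Simplified, hypothetical model of the converter dispatch; these are not Spark's real classes.
sealed abstract class DataType(val typeName: String)
case object StringType extends DataType("string")
case object NullType extends DataType("void")
case object TimestampNTZType extends DataType("timestamp_ntz")

object ConverterSketch {
  // A JSON token is modelled as Option[String]; None stands for a JSON null.
  type Converter = Option[String] => Any

  // Before the change: the catch-all still returns a converter, and that
  // converter only fails once it is handed a non-null value.
  def makeConverterBefore(dataType: DataType): Converter = dataType match {
    case StringType => (token: Option[String]) => token.orNull
    case _ =>
      (token: Option[String]) => token match {
        case None => null
        case Some(_) =>
          throw new RuntimeException(
            s"Failed to parse a value for data type ${dataType.typeName}")
      }
  }

  // After the change: NullType gets its own converter that always yields null,
  // and any other unhandled type is rejected while the converter is being built.
  def makeConverterAfter(dataType: DataType): Converter = dataType match {
    case StringType => (token: Option[String]) => token.orNull
    case NullType => (_: Option[String]) => null
    case _ =>
      throw new UnsupportedOperationException(s"Unsupported type: ${dataType.typeName}")
  }
}
```

In this model, `makeConverterBefore(TimestampNTZType)` succeeds and fails only per value, which is presumably why a caller that tolerates per-record failures ended up showing `{"t":null}`. `makeConverterAfter(TimestampNTZType)` fails immediately, matching the `Unsupported type: timestamp_ntz` output in the updated test results.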
@@ -661,9 +661,10 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn
 -- !query
 select from_json('{"t":"26/October/2015"}', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
 -- !query schema
-struct<from_json({"t":"26/October/2015"}):struct<t:timestamp_ntz>>
+struct<>
 -- !query output
-{"t":null}
+java.lang.Exception
+Unsupported type: timestamp_ntz


 -- !query
@@ -642,9 +642,10 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn
 -- !query
 select from_json('{"t":"26/October/2015"}', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
 -- !query schema
-struct<from_json({"t":"26/October/2015"}):struct<t:timestamp_ntz>>
+struct<>
 -- !query output
-{"t":null}
+java.lang.Exception
+Unsupported type: timestamp_ntz


 -- !query