[SPARK-36429][SQL] JacksonParser should throw exception when data type unsupported

### What changes were proposed in this pull request?
Currently, with `spark.sql.timestampType=TIMESTAMP_NTZ` set, `from_json` and `from_csv` behave differently: `from_json` silently returns null, while `from_csv` throws an exception.
```
-- !query
select from_json('{"t":"26/October/2015"}', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
-- !query schema
struct<from_json({"t":"26/October/2015"}):struct<t:timestamp_ntz>>
-- !query output
{"t":null}
```

```
-- !query
select from_csv('26/October/2015', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
-- !query schema
struct<>
-- !query output
java.lang.Exception
Unsupported type: timestamp_ntz
```

We should make `from_json` throw an exception too.
This PR addresses the discussion below:
https://github.com/apache/spark/pull/33640#discussion_r682862523
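
The difference between the two fallback styles can be illustrated with a small, self-contained model. This is only a sketch and not Spark's actual code: `DataType`, `TimestampNTZType`, `ConverterSketch`, `makeConverterOld`, `makeConverterNew` and `parsePermissive` are hypothetical stand-ins, and the permissive wrapper is an assumption about how a per-record failure ends up as a null result.

```scala
import scala.util.Try

// Hypothetical stand-ins for data types; NOT the real org.apache.spark.sql.types classes.
sealed trait DataType
case object StringType extends DataType
case object NullType extends DataType
case object TimestampNTZType extends DataType // plays the role of the unsupported type

object ConverterSketch {
  type Converter = String => Any

  // Old shape: the catch-all builds a converter that fails only when it
  // is applied to a non-null value, so the failure is deferred to parse time.
  def makeConverterOld(dt: DataType): Converter = dt match {
    case StringType => (v: String) => v
    case _ => (v: String) =>
      if (v == null) null
      else throw new RuntimeException(s"Failed to parse a value for data type $dt")
  }

  // New shape: NullType gets an explicit converter, and every other
  // unhandled type is rejected while the converter is being built.
  def makeConverterNew(dt: DataType): Converter = dt match {
    case StringType => (v: String) => v
    case NullType => (_: String) => null
    case other => throw new UnsupportedOperationException(s"Unsupported type: $other")
  }

  // Assumed permissive caller: a per-record failure is turned into null.
  def parsePermissive(convert: Converter, value: String): Any =
    Try(convert(value)).getOrElse(null)
}
```

In this model, the old fallback lets a permissive caller swallow the parse-time failure and return null, whereas the new fallback fails as soon as a converter is requested for the unsupported type, before any record is parsed.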

### Why are the changes needed?
Make `from_json` consistent with `from_csv`: fail with an exception on an unsupported data type instead of silently returning null.

### Does this PR introduce _any_ user-facing change?
Yes.
`from_json` now throws an exception for the unsupported `timestamp_ntz` type when `spark.sql.timestampType=TIMESTAMP_NTZ` is set.

### How was this patch tested?
Tests updated.

Closes #33684 from beliefer/SPARK-36429-new.

Authored-by: gengjiaan <gengjiaan@360.cn>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
gengjiaan 2021-08-10 22:52:20 +08:00 committed by Wenchen Fan
parent 89d8a4eacf
commit 186815be1c
3 changed files with 13 additions and 10 deletions

@@ -330,12 +330,13 @@ class JacksonParser(
     case udt: UserDefinedType[_] =>
       makeConverter(udt.sqlType)
-    case _ =>
-      (parser: JsonParser) =>
-        // Here, we pass empty `PartialFunction` so that this case can be
-        // handled as a failed conversion. It will throw an exception as
-        // long as the value is not null.
-        parseJsonToken[AnyRef](parser, dataType)(PartialFunction.empty[JsonToken, AnyRef])
+    case _: NullType =>
+      (parser: JsonParser) => parseJsonToken[java.lang.Long](parser, dataType) {
+        case _ => null
+      }
+    // We don't actually hit this exception though, we keep it for understandability
+    case _ => throw QueryExecutionErrors.unsupportedTypeError(dataType)
   }
   /**

@@ -661,9 +661,10 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn
 -- !query
 select from_json('{"t":"26/October/2015"}', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
 -- !query schema
-struct<from_json({"t":"26/October/2015"}):struct<t:timestamp_ntz>>
+struct<>
 -- !query output
-{"t":null}
+java.lang.Exception
+Unsupported type: timestamp_ntz
 -- !query

@@ -642,9 +642,10 @@ You may get a different result due to the upgrading of Spark 3.0: Fail to recogn
 -- !query
 select from_json('{"t":"26/October/2015"}', 't Timestamp', map('timestampFormat', 'dd/MMMMM/yyyy'))
 -- !query schema
-struct<from_json({"t":"26/October/2015"}):struct<t:timestamp_ntz>>
+struct<>
 -- !query output
-{"t":null}
+java.lang.Exception
+Unsupported type: timestamp_ntz
 -- !query