343e0bb3ad
### What changes were proposed in this pull request?

In the PR, I propose to improve the error message from `from_json`/`from_csv` by combining errors from all schema parsers:

- DataType.fromJson (except CSV)
- CatalystSqlParser.parseDataType
- CatalystSqlParser.parseTableSchema

Before the changes, `from_json` did not show the error message from the first parser in the chain, which could mislead users.

### Why are the changes needed?

Currently, `from_json` outputs only the error message from the fallback schema parser, which can confuse end-users. For example:

```scala
val invalidJsonSchema = """{"fields": [{"a":123}], "type": "struct"}"""
df.select(from_json($"json", invalidJsonSchema, Map.empty[String, String])).show()
```

The JSON schema has an issue in `{"a":123}`, but the error message doesn't point it out:

```
mismatched input '{' expecting {'ADD', 'AFTER', ...}(line 1, pos 0)

== SQL ==
{"fields": [{"a":123}], "type": "struct"}
^^^

org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input '{' expecting {'ADD', 'AFTER', ... }(line 1, pos 0)

== SQL ==
{"fields": [{"a":123}], "type": "struct"}
^^^
```

### Does this PR introduce _any_ user-facing change?

Yes, after the changes the example above produces:

```
Cannot parse the schema in JSON format: Failed to convert the JSON string '{"a":123}' to a field.
Failed fallback parsing: Cannot parse the data type:
mismatched input '{' expecting {'ADD', 'AFTER', ...}(line 1, pos 0)

== SQL ==
{"fields": [{"a":123}], "type": "struct"}
^^^

Failed fallback parsing:
mismatched input '{' expecting {'ADD', 'AFTER', ...}(line 1, pos 0)

== SQL ==
{"fields": [{"a":123}], "type": "struct"}
^^^
```

### How was this patch tested?

- By existing test suites like `JsonFunctionsSuite` and `JsonExpressionsSuite`.
- Added a new test to `JsonFunctionsSuite`.
- Re-generated results for `json-functions.sql`.

Closes #30183 from MaxGekk/fromDDL-error-msg.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
- benchmarks
- src
- pom.xml
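
The error-combining behaviour described in the commit message can be illustrated with a minimal, Spark-free sketch. It is not the code from the patch: `parseWithFallback`, `parseSchema`, and the three toy parsers are hypothetical names introduced here, standing in for `DataType.fromJson`, `CatalystSqlParser.parseDataType`, and `CatalystSqlParser.parseTableSchema`. The sketch assumes only that each parser signals failure by throwing, and that the combined message should keep the first parser's error and append each fallback's error after a `Failed fallback parsing:` marker.

```scala
import scala.util.control.NonFatal

object FallbackParsing {
  // Try `parser`; if it throws, try `fallback`. If both throw, raise a single
  // exception whose message contains BOTH errors, first parser's error first.
  def parseWithFallback[A](
      input: String,
      parser: String => A,
      errorPrefix: String,
      fallback: String => A): A =
    try parser(input) catch {
      case NonFatal(e1) =>
        try fallback(input) catch {
          case NonFatal(e2) =>
            throw new IllegalArgumentException(
              s"$errorPrefix${e1.getMessage}\nFailed fallback parsing: ${e2.getMessage}",
              e1)
        }
    }

  private def fail(msg: String): Nothing = throw new IllegalArgumentException(msg)

  // Hypothetical toy parsers (assumptions for illustration only).
  def jsonToy(s: String): String =
    if (s.startsWith("\"")) "json-type" else fail(s"not a JSON type name: $s")

  def dataTypeToy(s: String): String =
    if (s == "int") "data-type" else fail(s"unknown data type: $s")

  def tableSchemaToy(s: String): String =
    if (s.contains(" ")) "table-schema" else fail(s"not a column list: $s")

  // Chain the three parsers by nesting the two-parser helper, so that a total
  // failure reports one message per parser, in the order they were tried.
  def parseSchema(s: String): String =
    parseWithFallback[String](
      s,
      jsonToy,
      "Cannot parse the schema in JSON format: ",
      in => parseWithFallback[String](
        in, dataTypeToy, "Cannot parse the data type: ", tableSchemaToy))
}
```

Calling `FallbackParsing.parseSchema("a int")` succeeds through the last fallback, while an input that no toy parser accepts, such as `{`, raises one exception whose message carries all three errors. Nesting the two-parser helper keeps each level simple, and the first failure is preserved as the exception's cause.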