[SPARK-34665][SQL][DOCS] Revise the type coercion section of ANSI Compliance

### What changes were proposed in this pull request?

1. Fix the table of valid type coercion combinations: casting Binary to String should be allowed, and casting Binary to Numeric types should be disallowed.
2. Summarize all the `CAST`s that can cause runtime exceptions.
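
As a rough illustration of the two items above, the revised table in the diff can be modeled as a plain lookup. This is only a sketch: `TYPES`, `ROWS`, `cast_status`, `can_cast`, and `may_raise` are hypothetical names made up for this example and are not Spark APIs. `R` stands for the red **Y** entries, that is, casts that are allowed but can raise at runtime.

```python
# Hypothetical model of the revised cast table; none of these names are Spark APIs.
TYPES = ["Numeric", "String", "Date", "Timestamp", "Interval",
         "Boolean", "Binary", "Array", "Map", "Struct"]

# One string per source type, one character per target type (in TYPES order):
#   "Y" = allowed, "R" = allowed but may raise at runtime (red Y), "N" = disallowed.
ROWS = {
    "Numeric":   "RYNNNYNNNN",
    "String":    "RYRRRRYNNN",
    "Date":      "NYYYNNNNNN",
    "Timestamp": "NYYYNNNNNN",
    "Interval":  "NYNNYNNNNN",
    "Boolean":   "YYNNNYNNNN",
    "Binary":    "NYNNNNYNNN",  # corrected row: Binary -> String allowed, Binary -> Numeric not
    "Array":     "NNNNNNNRNN",
    "Map":       "NNNNNNNNRN",
    "Struct":    "NNNNNNNNNR",
}


def cast_status(source, target):
    """Return the table entry ("Y", "R", or "N") for CAST(source AS target)."""
    return ROWS[source][TYPES.index(target)]


def can_cast(source, target):
    """True if the table marks the combination as valid."""
    return cast_status(source, target) != "N"


def may_raise(source, target):
    """True if the combination is valid but marked as possibly raising at runtime."""
    return cast_status(source, target) == "R"
```

For example, `can_cast("Binary", "String")` is true and `can_cast("Binary", "Numeric")` is false under the corrected row, while `may_raise("String", "Date")` reflects the parse-failure case listed in the summary.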

### Why are the changes needed?

Fix a mistake in the docs.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Run `jekyll serve` and preview:

![image](https://user-images.githubusercontent.com/1097932/110334374-8fab5a80-7fd7-11eb-86e7-c519cfa41b99.png)

Closes #31781 from gengliangwang/reviseAnsiDoc2.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
Gengliang Wang 2021-03-09 13:19:14 +09:00 committed by Takeshi Yamamuro
parent f340857757
commit ee756fd695


@@ -72,16 +72,23 @@ The type conversion of Spark ANSI mode follows the syntax rules of section 6.13
 | Source\Target | Numeric | String | Date | Timestamp | Interval | Boolean | Binary | Array | Map | Struct |
 |-----------|---------|--------|------|-----------|----------|---------|--------|-------|-----|--------|
-| Numeric | Y | Y | N | N | N | Y | N | N | N | N |
-| String | Y | Y | Y | Y | Y | Y | Y | N | N | N |
+| Numeric | <span style="color:red">**Y**</span> | Y | N | N | N | Y | N | N | N | N |
+| String | <span style="color:red">**Y**</span> | Y | <span style="color:red">**Y**</span> | <span style="color:red">**Y**</span> | <span style="color:red">**Y**</span> | <span style="color:red">**Y**</span> | Y | N | N | N |
 | Date | N | Y | Y | Y | N | N | N | N | N | N |
 | Timestamp | N | Y | Y | Y | N | N | N | N | N | N |
 | Interval | N | Y | N | N | Y | N | N | N | N | N |
 | Boolean | Y | Y | N | N | N | Y | N | N | N | N |
-| Binary | Y | N | N | N | N | N | Y | N | N | N |
-| Array | N | N | N | N | N | N | N | Y | N | N |
-| Map | N | N | N | N | N | N | N | N | Y | N |
-| Struct | N | N | N | N | N | N | N | N | N | Y |
+| Binary | N | Y | N | N | N | N | Y | N | N | N |
+| Array | N | N | N | N | N | N | N | <span style="color:red">**Y**</span> | N | N |
+| Map | N | N | N | N | N | N | N | N | <span style="color:red">**Y**</span> | N |
+| Struct | N | N | N | N | N | N | N | N | N | <span style="color:red">**Y**</span> |
+In the table above, all the `CAST`s that can cause runtime exceptions are marked as red <span style="color:red">**Y**</span>:
+* CAST(Numeric AS Numeric): raise an overflow exception if the value is out of the target data type's range.
+* CAST(String AS (Numeric/Date/Timestamp/Interval/Boolean)): raise a runtime exception if the value can't be parsed as the target data type.
+* CAST(Array AS Array): raise an exception if there is any on the conversion of the elements.
+* CAST(Map AS Map): raise an exception if there is any on the conversion of the keys and the values.
+* CAST(Struct AS Struct): raise an exception if there is any on the conversion of the struct fields.
 Currently, the ANSI mode affects explicit casting and assignment casting only.
 In future releases, the behaviour of type coercion might change along with the other two type conversion rules.
@@ -163,9 +170,6 @@ The behavior of some SQL functions can be different under ANSI mode (`spark.sql.
 The behavior of some SQL operators can be different under ANSI mode (`spark.sql.ansi.enabled=true`).
 - `array_col[index]`: This operator throws `ArrayIndexOutOfBoundsException` if using invalid indices.
 - `map_col[key]`: This operator throws `NoSuchElementException` if key does not exist in map.
-- `CAST(string_col AS TIMESTAMP)`: This operator should fail with an exception if the input string can't be parsed.
-- `CAST(string_col AS DATE)`: This operator should fail with an exception if the input string can't be parsed.
-- `CAST(string_col AS BOOLEAN)`: This operator should fail with an exception if the input string can't be parsed.
 ### SQL Keywords