spark-instrumented-optimizer/python/pyspark/sql
Liang-Chi Hsieh f92d276653 [SPARK-25811][PYSPARK] Raise a proper error when unsafe cast is detected by PyArrow
## What changes were proposed in this pull request?

Since 0.11.0, PyArrow can raise an error on an unsafe cast ([PR](https://github.com/apache/arrow/pull/2504)). We should use this to raise a proper error for pandas UDF users when such a cast is detected.

Added a SQL config `spark.sql.execution.pandas.arrowSafeTypeConversion` that can be used to disable the Arrow safe type check.
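
To illustrate what an "unsafe cast" means here, the following is a minimal plain-Python sketch of the idea. It does not call PyArrow or Spark; the function name `safe_cast_to_int` and its `safe` flag are hypothetical and only mimic the kind of check described above, where a float-to-integer conversion that would lose information is rejected instead of silently truncated.

```python
def safe_cast_to_int(values, safe=True):
    """Cast a list of floats to ints.

    Hypothetical helper for illustration only. With safe=True, a value
    whose fractional part would be dropped raises an error; with
    safe=False, the value is silently truncated.
    """
    result = []
    for v in values:
        as_int = int(v)  # truncates toward zero
        if safe and as_int != v:
            raise ValueError(
                "Float value {} would be truncated when cast to int".format(v)
            )
        result.append(as_int)
    return result


# Lossless values pass in either mode.
print(safe_cast_to_int([1.0, 2.0]))

# A lossy value is accepted only when the check is disabled.
print(safe_cast_to_int([1.5], safe=False))

try:
    safe_cast_to_int([1.5])
except ValueError as exc:
    print("rejected:", exc)
```

In Spark itself, a user who wants to turn the check off would presumably set the config named above, for example with `spark.conf.set("spark.sql.execution.pandas.arrowSafeTypeConversion", "false")`; the exact behavior is defined by the patch, not by this sketch.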

## How was this patch tested?

Added a test and tested manually.

Closes #22807 from viirya/SPARK-25811.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2019-01-22 14:54:41 +08:00
tests [SPARK-25811][PYSPARK] Raise a proper error when unsafe cast is detected by PyArrow 2019-01-22 14:54:41 +08:00
__init__.py [SPARK-22369][PYTHON][DOCS] Exposes catalog API documentation in PySpark 2017-11-02 15:22:52 +01:00
catalog.py [SPARK-24665][PYSPARK][FOLLOWUP] Use SQLConf in PySpark to manage all sql configs 2018-08-17 10:18:08 +08:00
column.py [SPARK-23847][PYTHON][SQL] Add asc_nulls_first, asc_nulls_last to PySpark 2018-04-08 12:09:06 +08:00
conf.py [SPARK-23698][PYTHON] Resolve undefined names in Python 3 2018-08-22 10:06:59 -07:00
context.py [SPARK-26640][CORE][ML][SQL][STREAMING][PYSPARK] Code cleanup from lgtm.com analysis 2019-01-17 19:40:39 -06:00
dataframe.py [SPARK-25274][PYTHON][SQL] In toPandas with Arrow send un-ordered record batches to improve performance 2018-12-06 10:07:28 -08:00
functions.py [SPARK-26451][SQL] Change lead/lag argument name from count to offset 2018-12-28 00:02:41 +08:00
group.py [SPARK-24722][SQL] pivot() with Column type argument 2018-08-04 14:17:32 +08:00
readwriter.py [SPARK-26178][SPARK-26243][SQL][FOLLOWUP] Replacing SimpleDateFormat by DateTimeFormatter in comments 2018-12-24 10:47:47 +08:00
session.py [SPARK-25811][PYSPARK] Raise a proper error when unsafe cast is detected by PyArrow 2019-01-22 14:54:41 +08:00
streaming.py [SPARK-26178][SPARK-26243][SQL][FOLLOWUP] Replacing SimpleDateFormat by DateTimeFormatter in comments 2018-12-24 10:47:47 +08:00
types.py [SPARK-26645][PYTHON] Support decimals with negative scale when parsing datatype 2019-01-20 17:43:50 +08:00
udf.py [SPARK-25601][PYTHON] Register Grouped aggregate UDF Vectorized UDFs for SQL Statement 2018-10-04 09:36:23 +08:00
utils.py [SPARK-24721][SQL] Exclude Python UDFs filters in FileSourceStrategy 2018-08-28 10:57:13 +08:00
window.py [SPARK-25842][SQL] Deprecate rangeBetween APIs introduced in SPARK-21608 2018-10-26 13:17:24 +08:00