spark-instrumented-optimizer/python/pyspark/sql/pandas
Jalpan Randeri 339b0ecadb [SPARK-25351][SQL][PYTHON] Handle Pandas category type when converting from Python with Arrow
Handle the Pandas category type when converting from Python with Arrow enabled. A category column is converted to the type of its category elements, which matches the behavior with Arrow disabled.
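
To illustrate the conversion described above, here is a minimal pandas-only sketch. It is not the code from this patch: `decategorize` is a hypothetical helper name, and the sketch assumes the conversion amounts to casting a categorical column to the dtype of its categories before the data is handed to Arrow.

```python
import pandas as pd


def decategorize(series: pd.Series) -> pd.Series:
    """Hypothetical helper: swap a category dtype for the dtype of its categories."""
    if isinstance(series.dtype, pd.CategoricalDtype):
        # Cast to the element type of the categories (e.g. int64 for integer categories).
        return series.astype(series.dtype.categories.dtype)
    # Non-categorical columns are returned unchanged.
    return series


codes = pd.Series([1, 2, 1], dtype="category")
plain = decategorize(codes)
print(plain.dtype)  # dtype of the category elements, no longer "category"
```

With a cast like this, a DataFrame built from the categorical column would carry the element type of the categories, which is what the description says also happens when Arrow is disabled.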

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
New unit tests were added for `createDataFrame` and scalar `pandas_udf`.

Closes #26585 from jalpan-randeri/feature-pyarrow-dictionary-type.

Authored-by: Jalpan Randeri <randerij@amazon.com>
Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
2020-05-27 17:27:29 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [SPARK-30434][PYTHON][SQL] Move pandas related functionalities into 'pandas' sub-package | 2020-01-09 10:22:50 +09:00 |
| `conversion.py` | [SPARK-31441] Support duplicated column names for toPandas with arrow execution | 2020-04-14 14:08:56 +09:00 |
| `functions.py` | [SPARK-30722][DOCS][FOLLOW-UP] Explicitly mention the same entire input/output length restriction of Series Iterator UDF | 2020-04-09 16:46:27 +09:00 |
| `group_ops.py` | [SPARK-30722][PYTHON][DOCS] Update documentation for Pandas UDF with Python type hints | 2020-02-12 10:49:46 +09:00 |
| `map_ops.py` | [SPARK-30722][PYTHON][DOCS] Update documentation for Pandas UDF with Python type hints | 2020-02-12 10:49:46 +09:00 |
| `serializers.py` | [SPARK-25351][SQL][PYTHON] Handle Pandas category type when converting from Python with Arrow | 2020-05-27 17:27:29 -07:00 |
| `typehints.py` | [SPARK-28264][PYTHON][SQL] Support type hints in pandas UDF and rename/move inconsistent pandas UDF types | 2020-01-22 15:32:58 +09:00 |
| `types.py` | [SPARK-25351][SQL][PYTHON] Handle Pandas category type when converting from Python with Arrow | 2020-05-27 17:27:29 -07:00 |
| `utils.py` | [SPARK-30434][PYTHON][SQL] Move pandas related functionalities into 'pandas' sub-package | 2020-01-09 10:22:50 +09:00 |