spark-instrumented-optimizer/python/pyspark/pandas/tests
Xinrong Meng 6ca56b01dc [SPARK-35614][PYTHON] Make the conversion to pandas data-type-based for ExtensionDtypes
### What changes were proposed in this pull request?

We propose to
- introduce Ops classes for ExtensionDtypes: `IntegralExtensionOps`, `FractionalExtensionOps`, `StringExtensionOps`
- make the "conversion to pandas" data-type-based for ExtensionDtypes

Non-goal: the same arithmetic operation on ExtensionDtypes can produce different result dtypes in pandas and in pandas API on Spark. That should be adjusted in a separate PR if needed.

### Why are the changes needed?

The conversion to pandas currently includes logic that checks for ExtensionDtypes and branches on the result.
That makes the code hard to change or maintain.

Since `DataTypeOps` is already defined, we can dispatch the type-specific conversion logic to the `ExtensionOps` classes.
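
As a rough illustration of what data-type-based dispatch means here, the sketch below moves per-dtype handling out of the caller and into one Ops class per dtype family. It is a simplified stand-in, not the code in this PR: the `prepare` method, the `_OPS_BY_DTYPE` registry, and the `ops_for` and `convert_column` helpers are hypothetical names, and plain Python lists stand in for pandas data.

```python
from typing import Dict, List, Type


class DataTypeOps:
    """Hypothetical base class: one subclass per family of dtypes."""

    def prepare(self, values: List[object]) -> List[object]:
        # Default behavior: pass values through unchanged.
        return list(values)


class IntegralExtensionOps(DataTypeOps):
    def prepare(self, values: List[object]) -> List[object]:
        # Nullable integers: keep missing values as None, coerce the rest to int.
        return [None if v is None else int(v) for v in values]


class FractionalExtensionOps(DataTypeOps):
    def prepare(self, values: List[object]) -> List[object]:
        # Nullable floats: keep missing values as None, coerce the rest to float.
        return [None if v is None else float(v) for v in values]


class StringExtensionOps(DataTypeOps):
    def prepare(self, values: List[object]) -> List[object]:
        # Nullable strings: keep missing values as None, coerce the rest to str.
        return [None if v is None else str(v) for v in values]


# Hypothetical registry keyed by dtype name; the real lookup may differ.
_OPS_BY_DTYPE: Dict[str, Type[DataTypeOps]] = {
    "Int64": IntegralExtensionOps,
    "Float64": FractionalExtensionOps,
    "string": StringExtensionOps,
}


def ops_for(dtype_name: str) -> DataTypeOps:
    """Pick the Ops object for a dtype; unknown dtypes use the base class."""
    return _OPS_BY_DTYPE.get(dtype_name, DataTypeOps)()


def convert_column(dtype_name: str, values: List[object]) -> List[object]:
    # The caller no longer inspects the dtype itself; it only delegates.
    return ops_for(dtype_name).prepare(values)
```

With this shape, supporting another dtype means adding a class and a registry entry rather than another branch in the conversion routine.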

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit tests.

Closes #32910 from xinrong-databricks/datatypeops_pd_ext.

Authored-by: Xinrong Meng <xinrong.meng@databricks.com>
Signed-off-by: Takuya UESHIN <ueshin@databricks.com>
2021-06-21 13:19:55 -07:00
data_type_ops [SPARK-35614][PYTHON] Make the conversion to pandas data-type-based for ExtensionDtypes 2021-06-21 13:19:55 -07:00
indexes [SPARK-35683][PYTHON] Fix Index.difference to avoid collect 'other' to driver side 2021-06-15 14:18:54 +09:00
plot [SPARK-35738][PYTHON] Support 'y' properly in DataFrame with non-numeric columns with plots 2021-06-12 14:36:46 +09:00
__init__.py [SPARK-34886][PYTHON] Port/integrate Koalas DataFrame unit test into PySpark 2021-04-09 15:48:13 +09:00
test_categorical.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_config.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_csv.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_dataframe.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_dataframe_conversion.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_dataframe_spark_io.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_default_index.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_expanding.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_extension.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_frame_spark.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_groupby.py [SPARK-35705][PYTHON] Adjust pandas-on-spark test_groupby_multiindex_columns test for different pandas versions 2021-06-10 10:36:19 +09:00
test_indexing.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_indexops_spark.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_internal.py [SPARK-35343][PYTHON] Make the conversion from/to pandas data-type-based for non-ExtensionDtypes 2021-06-07 13:12:12 -07:00
test_namespace.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_numpy_compat.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_ops_on_diff_frames.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_ops_on_diff_frames_groupby.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_ops_on_diff_frames_groupby_expanding.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_ops_on_diff_frames_groupby_rolling.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_repr.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_reshape.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_rolling.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_series.py [SPARK-35499][PYTHON] Apply black to pandas API on Spark codes 2021-06-06 17:30:07 -07:00
test_series_conversion.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_series_datetime.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_series_string.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_sql.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_stats.py [SPARK-35510][PYTHON] Fix and reenable test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true 2021-05-28 17:35:01 +09:00
test_typedef.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_utils.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00
test_window.py [SPARK-35364][PYTHON] Renaming the existing Koalas related codes 2021-05-20 15:08:30 -07:00