spark-instrumented-optimizer/python/pyspark
Cedric-Magnan e15daa31b3 [SPARK-36370][PYTHON] _builtin_table directly imported from pandas instead of being redefined
### What changes were proposed in this pull request?
Suggesting to refactor the way the _builtin_table is defined in the `python/pyspark/pandas/groupby.py` module.
Pandas has recently refactored the way we import the _builtin_table and is now part of the pandas.core.common module instead of being an attribute of the pandas.core.base.SelectionMixin class.

### Why are the changes needed?
This change is not fully needed but the current implementation redefines this table within pyspark, so any changes of this table from the pandas library would need to be updated in the pyspark repository as well.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Ran the following command successfully :
```sh
python/run-tests --testnames 'pyspark.pandas.tests.test_groupby'
```
Tests passed in 327 seconds

Closes #33687 from Cedric-Magnan/_builtin_table_from_pandas.

Authored-by: Cedric-Magnan <cedric.magnan@artefact.com>
Signed-off-by: Takuya UESHIN <ueshin@databricks.com>
(cherry picked from commit 964dfe254f)
Signed-off-by: Takuya UESHIN <ueshin@databricks.com>
2021-08-17 10:47:01 -07:00
..
cloudpickle [SPARK-33983][PYTHON] Update cloudpickle to v1.6.0 2021-01-04 10:36:31 -08:00
ml [SPARK-36225][PYTHON][DOCS] Use DataFrame in python docstrings 2021-07-24 16:58:19 +09:00
mllib [SPARK-36092][INFRA][BUILD][PYTHON] Migrate to GitHub Actions with Codecov from Jenkins 2021-08-01 21:38:39 +09:00
pandas [SPARK-36370][PYTHON] _builtin_table directly imported from pandas instead of being redefined 2021-08-17 10:47:01 -07:00
resource [SPARK-32320][PYSPARK] Remove mutable default arguments 2020-12-08 09:35:36 +08:00
sql [SPARK-36465][SS] Dynamic gap duration in session window 2021-08-16 11:06:16 +09:00
streaming [SPARK-36092][INFRA][BUILD][PYTHON] Migrate to GitHub Actions with Codecov from Jenkins 2021-08-01 21:38:39 +09:00
testing [SPARK-35599][PYTHON] Adjust check_exact parameter for older pd.testing 2021-06-07 11:12:49 +09:00
tests [SPARK-36092][INFRA][BUILD][PYTHON] Migrate to GitHub Actions with Codecov from Jenkins 2021-08-01 21:38:39 +09:00
__init__.py [SPARK-35303][PYTHON] Enable pinned thread mode by default 2021-06-18 12:02:29 +09:00
__init__.pyi [SPARK-35303][PYTHON] Enable pinned thread mode by default 2021-06-18 12:02:29 +09:00
_globals.py [SPARK-23328][PYTHON] Disallow default value None in na.replace/replace when 'to_replace' is not a dictionary 2018-02-09 14:21:10 +08:00
_typing.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
accumulators.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
accumulators.pyi [SPARK-33002][PYTHON] Remove non-API annotations 2020-10-07 19:53:59 +09:00
broadcast.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
broadcast.pyi [SPARK-33457][PYTHON] Adjust mypy configuration 2020-11-25 09:27:04 +09:00
conf.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
conf.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
context.py [SPARK-35938][PYTHON] Add deprecation warning for Python 3.6 2021-07-01 09:32:25 +09:00
context.pyi [SPARK-33457][PYTHON] Adjust mypy configuration 2020-11-25 09:27:04 +09:00
daemon.py [SPARK-26175][PYTHON] Redirect the standard input of the forked child to devnull in daemon 2019-07-31 09:10:24 +09:00
files.py [SPARK-28206][PYTHON] Remove the legacy Epydoc in PySpark API documentation 2019-07-05 10:08:22 -07:00
files.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
find_spark_home.py [SPARK-32017][PYTHON][FOLLOW-UP] Rename HADOOP_VERSION to PYSPARK_HADOOP_VERSION in pip installation option 2021-01-05 17:21:32 +09:00
install.py [SPARK-33254][PYTHON][DOCS] Migration to NumPy documentation style in Core (pyspark.*, pyspark.resource.*, etc.) 2020-11-16 10:21:50 +09:00
java_gateway.py [SPARK-35303][PYTHON] Enable pinned thread mode by default 2021-06-18 12:02:29 +09:00
join.py [SPARK-14202] [PYTHON] Use generator expression instead of list comp in python_full_outer_jo… 2016-03-28 14:51:36 -07:00
profiler.py [SPARK-33254][PYTHON][DOCS] Migration to NumPy documentation style in Core (pyspark.*, pyspark.resource.*, etc.) 2020-11-16 10:21:50 +09:00
profiler.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
py.typed [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
rdd.py [SPARK-35512][PYTHON] Fix OverflowError(cannot convert float infinity to integer) in partitionBy function 2021-06-09 10:57:27 +09:00
rdd.pyi [SPARK-35986][PYSPARK] Fix type hint for RDD.histogram's buckets 2021-07-04 10:24:55 +09:00
rddsampler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
resultiterable.py [SPARK-32138] Drop Python 2.7, 3.4 and 3.5 2020-07-14 11:22:44 +09:00
resultiterable.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
serializers.py [SPARK-35303][PYTHON] Enable pinned thread mode by default 2021-06-18 12:02:29 +09:00
shell.py [SPARK-33363] Add prompt information related to the current task when pyspark/sparkR starts 2020-11-10 11:12:19 +09:00
shuffle.py [SPARK-35303][PYTHON] Enable pinned thread mode by default 2021-06-18 12:02:29 +09:00
statcounter.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
statcounter.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
status.py
status.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
storagelevel.py [SPARK-31448][PYTHON] Fix storage level used in persist() in dataframe.py 2020-09-15 08:41:22 -05:00
storagelevel.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
taskcontext.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
taskcontext.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
traceback_utils.py
util.py [SPARK-35946][PYTHON] Respect Py4J server in InheritableThread API 2021-06-29 22:18:54 -07:00
util.pyi [SPARK-35303][PYTHON] Enable pinned thread mode by default 2021-06-18 12:02:29 +09:00
version.py [SPARK-33662][BUILD] Setting version to 3.2.0-SNAPSHOT 2020-12-04 14:10:42 -08:00
version.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
worker.py [SPARK-36062][PYTHON] Try to capture faulthanlder when a Python worker crashes 2021-07-09 11:31:00 +09:00