.. _api.groupby:

=======
GroupBy
=======
.. currentmodule:: pyspark.pandas

GroupBy objects are returned by groupby calls: :func:`DataFrame.groupby`, :func:`Series.groupby`, etc.

.. currentmodule:: pyspark.pandas.groupby


Indexing, iteration
-------------------
.. autosummary::
   :toctree: api/

   GroupBy.get_group

Function application
--------------------
.. autosummary::
   :toctree: api/

   GroupBy.apply
   GroupBy.transform

The following methods are available only for `DataFrameGroupBy` objects.

.. autosummary::
   :toctree: api/

   DataFrameGroupBy.agg
   DataFrameGroupBy.aggregate

Computations / Descriptive Stats
--------------------------------
.. autosummary::
   :toctree: api/

   GroupBy.all
   GroupBy.any
   GroupBy.count
   GroupBy.cumcount
   GroupBy.cummax
   GroupBy.cummin
   GroupBy.cumprod
   GroupBy.cumsum
   GroupBy.filter
   GroupBy.first
   GroupBy.last
   GroupBy.max
   GroupBy.mean
   GroupBy.median
   GroupBy.min
   GroupBy.rank
   GroupBy.std
   GroupBy.sum
   GroupBy.var
   GroupBy.nunique
   GroupBy.size
   GroupBy.diff
   GroupBy.idxmax
   GroupBy.idxmin
   GroupBy.fillna
   GroupBy.bfill
   GroupBy.ffill
   GroupBy.head
   GroupBy.backfill
   GroupBy.shift
   GroupBy.tail

The following methods are available only for `DataFrameGroupBy` objects.

.. autosummary::
   :toctree: api/

   DataFrameGroupBy.describe

The following methods are available only for `SeriesGroupBy` objects.

.. autosummary::
   :toctree: api/

   SeriesGroupBy.nsmallest
   SeriesGroupBy.nlargest
   SeriesGroupBy.value_counts
   SeriesGroupBy.unique