3d158f9c91
### What changes were proposed in this pull request?

This PR proposes to port Koalas documentation to PySpark documentation as its initial step. It ports the documentation almost as is, except for these differences:

- Renamed import from `databricks.koalas` to `pyspark.pandas`.
- Renamed `to_koalas` -> `to_pandas_on_spark`
- Renamed `(Series|DataFrame).koalas` -> `(Series|DataFrame).pandas_on_spark`
- Added a `ps_` prefix in the RST file names of Koalas documentation

Other than that,

- Excluded `python/docs/build/html` in linter
- Fixed GA dependency installation

### Why are the changes needed?

To document pandas APIs on Spark.

### Does this PR introduce _any_ user-facing change?

Yes, it adds new documentation.

### How was this patch tested?

Manually built the docs and checked the output.

Closes #32726 from HyukjinKwon/SPARK-35587.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
89 lines
1.5 KiB
ReStructuredText
.. _api.groupby:

=======
GroupBy
=======
.. currentmodule:: pyspark.pandas

GroupBy objects are returned by groupby calls: :func:`DataFrame.groupby`, :func:`Series.groupby`, etc.
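
As a rough illustration of the sentence above, the sketch below obtains both kinds of GroupBy objects. It assumes a working PySpark installation (3.2 or later) with a local Spark session available; the column names ``key`` and ``value`` and the sample data are made up for this example.

.. code-block:: python

    import pyspark.pandas as ps

    # Hypothetical sample data; not taken from the Spark documentation.
    psdf = ps.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 3]})

    # DataFrame.groupby returns a DataFrameGroupBy object.
    df_grouped = psdf.groupby("key")

    # Selecting a single column from it yields a SeriesGroupBy object.
    ser_grouped = df_grouped["value"]

    # Aggregations run per group; sort_index() makes the row order deterministic.
    totals = df_grouped.sum().sort_index()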

.. currentmodule:: pyspark.pandas.groupby

Indexing, iteration
-------------------
.. autosummary::
   :toctree: api/

   GroupBy.get_group

Function application
--------------------
.. autosummary::
   :toctree: api/

   GroupBy.apply
   GroupBy.transform

The following methods are available only for `DataFrameGroupBy` objects.

.. autosummary::
   :toctree: api/

   DataFrameGroupBy.agg
   DataFrameGroupBy.aggregate

Computations / Descriptive Stats
--------------------------------
.. autosummary::
   :toctree: api/

   GroupBy.all
   GroupBy.any
   GroupBy.count
   GroupBy.cumcount
   GroupBy.cummax
   GroupBy.cummin
   GroupBy.cumprod
   GroupBy.cumsum
   GroupBy.filter
   GroupBy.first
   GroupBy.last
   GroupBy.max
   GroupBy.mean
   GroupBy.median
   GroupBy.min
   GroupBy.rank
   GroupBy.std
   GroupBy.sum
   GroupBy.var
   GroupBy.nunique
   GroupBy.size
   GroupBy.diff
   GroupBy.idxmax
   GroupBy.idxmin
   GroupBy.fillna
   GroupBy.bfill
   GroupBy.ffill
   GroupBy.head
   GroupBy.backfill
   GroupBy.shift
   GroupBy.tail

The following methods are available only for `DataFrameGroupBy` objects.

.. autosummary::
   :toctree: api/

   DataFrameGroupBy.describe

The following methods are available only for `SeriesGroupBy` objects.

.. autosummary::
   :toctree: api/

   SeriesGroupBy.nsmallest
   SeriesGroupBy.nlargest
   SeriesGroupBy.value_counts
   SeriesGroupBy.unique