spark-instrumented-optimizer/python/docs/source/reference/pyspark.pandas/frame.rst
Hyukjin Kwon 921abc51cf [SPARK-35636][PYTHON][DOCS][FOLLOW-UP] Restructure reference API files according to the layout
### What changes were proposed in this pull request?

This PR proposes to restructure API files according to the layout, see https://github.com/apache/spark/pull/32799. Now the pandas APIs on Spark are under a separate directory which is same level as other modules such as Spark SQL.

```bash
tree reference
```

**Before:**

```
reference
├── index.rst
├── ps_extensions.rst
├── ps_frame.rst
├── ps_general_functions.rst
├── ps_groupby.rst
├── ps_indexing.rst
├── ps_io.rst
├── ps_ml.rst
├── ps_series.rst
├── ps_window.rst
├── pyspark.ml.rst
├── pyspark.mllib.rst
├── pyspark.pandas.rst
├── pyspark.resource.rst
├── pyspark.rst
├── pyspark.sql.rst
├── pyspark.ss.rst
└── pyspark.streaming.rst
```

**After:**

```
reference
├── index.rst
├── pyspark.ml.rst
├── pyspark.mllib.rst
├── pyspark.pandas
│   ├── extensions.rst
│   ├── frame.rst
│   ├── general_functions.rst
│   ├── groupby.rst
│   ├── index.rst
│   ├── indexing.rst
│   ├── io.rst
│   ├── ml.rst
│   ├── series.rst
│   └── window.rst
├── pyspark.resource.rst
├── pyspark.rst
├── pyspark.sql.rst
├── pyspark.ss.rst
└── pyspark.streaming.rst
```

### Why are the changes needed?

To make the directory structure easier to follow.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually built and tested the docs.

Closes #32812 from HyukjinKwon/SPARK-35646-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-06-08 19:01:56 +09:00

328 lines
6.2 KiB
ReStructuredText

.. _api.dataframe:
=========
DataFrame
=========
.. currentmodule:: pyspark.pandas
Constructor
-----------
.. autosummary::
:toctree: api/
DataFrame
Attributes and underlying data
------------------------------
.. autosummary::
:toctree: api/
DataFrame.index
DataFrame.columns
DataFrame.empty
.. autosummary::
:toctree: api/
DataFrame.dtypes
DataFrame.shape
DataFrame.axes
DataFrame.ndim
DataFrame.size
DataFrame.select_dtypes
DataFrame.values
Conversion
----------
.. autosummary::
:toctree: api/
DataFrame.copy
DataFrame.isna
DataFrame.astype
DataFrame.isnull
DataFrame.notna
DataFrame.notnull
DataFrame.pad
DataFrame.bool
Indexing, iteration
-------------------
.. autosummary::
:toctree: api/
DataFrame.at
DataFrame.iat
DataFrame.head
DataFrame.idxmax
DataFrame.idxmin
DataFrame.loc
DataFrame.iloc
DataFrame.items
DataFrame.iteritems
DataFrame.iterrows
DataFrame.itertuples
DataFrame.keys
DataFrame.pop
DataFrame.tail
DataFrame.xs
DataFrame.get
DataFrame.where
DataFrame.mask
DataFrame.query
Binary operator functions
-------------------------
.. autosummary::
:toctree: api/
DataFrame.add
DataFrame.radd
DataFrame.div
DataFrame.rdiv
DataFrame.truediv
DataFrame.rtruediv
DataFrame.mul
DataFrame.rmul
DataFrame.sub
DataFrame.rsub
DataFrame.pow
DataFrame.rpow
DataFrame.mod
DataFrame.rmod
DataFrame.floordiv
DataFrame.rfloordiv
DataFrame.lt
DataFrame.gt
DataFrame.le
DataFrame.ge
DataFrame.ne
DataFrame.eq
DataFrame.dot
Function application, GroupBy & Window
--------------------------------------
.. autosummary::
:toctree: api/
DataFrame.apply
DataFrame.applymap
DataFrame.pipe
DataFrame.agg
DataFrame.aggregate
DataFrame.groupby
DataFrame.rolling
DataFrame.expanding
DataFrame.transform
.. _api.dataframe.stats:
Computations / Descriptive Stats
--------------------------------
.. autosummary::
:toctree: api/
DataFrame.abs
DataFrame.all
DataFrame.any
DataFrame.clip
DataFrame.corr
DataFrame.count
DataFrame.describe
DataFrame.kurt
DataFrame.kurtosis
DataFrame.mad
DataFrame.max
DataFrame.mean
DataFrame.min
DataFrame.median
DataFrame.pct_change
DataFrame.prod
DataFrame.product
DataFrame.quantile
DataFrame.nunique
DataFrame.sem
DataFrame.skew
DataFrame.sum
DataFrame.std
DataFrame.var
DataFrame.cummin
DataFrame.cummax
DataFrame.cumsum
DataFrame.cumprod
DataFrame.round
DataFrame.diff
DataFrame.eval
Reindexing / Selection / Label manipulation
-------------------------------------------
.. autosummary::
:toctree: api/
DataFrame.add_prefix
DataFrame.add_suffix
DataFrame.align
DataFrame.at_time
DataFrame.between_time
DataFrame.drop
DataFrame.droplevel
DataFrame.drop_duplicates
DataFrame.duplicated
DataFrame.equals
DataFrame.filter
DataFrame.first
DataFrame.head
DataFrame.last
DataFrame.rename
DataFrame.rename_axis
DataFrame.reset_index
DataFrame.set_index
DataFrame.swapaxes
DataFrame.swaplevel
DataFrame.take
DataFrame.isin
DataFrame.sample
DataFrame.truncate
.. _api.dataframe.missing:
Missing data handling
---------------------
.. autosummary::
:toctree: api/
DataFrame.backfill
DataFrame.dropna
DataFrame.fillna
DataFrame.replace
DataFrame.bfill
DataFrame.ffill
Reshaping, sorting, transposing
-------------------------------
.. autosummary::
:toctree: api/
DataFrame.pivot_table
DataFrame.pivot
DataFrame.sort_index
DataFrame.sort_values
DataFrame.nlargest
DataFrame.nsmallest
DataFrame.stack
DataFrame.unstack
DataFrame.melt
DataFrame.explode
DataFrame.squeeze
DataFrame.T
DataFrame.transpose
DataFrame.reindex
DataFrame.reindex_like
DataFrame.rank
Combining / joining / merging
-----------------------------
.. autosummary::
:toctree: api/
DataFrame.append
DataFrame.assign
DataFrame.merge
DataFrame.join
DataFrame.update
DataFrame.insert
Time series-related
-------------------
.. autosummary::
:toctree: api/
DataFrame.shift
DataFrame.first_valid_index
DataFrame.last_valid_index
Serialization / IO / Conversion
-------------------------------
.. autosummary::
:toctree: api/
DataFrame.from_records
DataFrame.info
DataFrame.to_table
DataFrame.to_delta
DataFrame.to_parquet
DataFrame.to_spark_io
DataFrame.to_csv
DataFrame.to_pandas
DataFrame.to_html
DataFrame.to_numpy
DataFrame.to_pandas_on_spark
DataFrame.to_spark
DataFrame.to_string
DataFrame.to_json
DataFrame.to_dict
DataFrame.to_excel
DataFrame.to_clipboard
DataFrame.to_markdown
DataFrame.to_records
DataFrame.to_latex
DataFrame.style
Spark-related
-------------
``DataFrame.spark`` provides features that does not exist in pandas but
in Spark. These can be accessed by ``DataFrame.spark.<function/property>``.
.. autosummary::
:toctree: api/
DataFrame.spark.schema
DataFrame.spark.print_schema
DataFrame.spark.frame
DataFrame.spark.cache
DataFrame.spark.persist
DataFrame.spark.hint
DataFrame.spark.to_table
DataFrame.spark.to_spark_io
DataFrame.spark.explain
DataFrame.spark.apply
DataFrame.spark.repartition
DataFrame.spark.coalesce
DataFrame.spark.checkpoint
DataFrame.spark.local_checkpoint
.. _api.dataframe.plot:
Plotting
--------
``DataFrame.plot`` is both a callable method and a namespace attribute for
specific plotting methods of the form ``DataFrame.plot.<kind>``.
.. autosummary::
:toctree: api/
DataFrame.plot
DataFrame.plot.area
DataFrame.plot.barh
DataFrame.plot.bar
DataFrame.plot.hist
DataFrame.plot.line
DataFrame.plot.pie
DataFrame.plot.scatter
DataFrame.plot.density
DataFrame.hist
DataFrame.kde
Koalas-specific
---------------
``DataFrame.pandas_on_spark`` provides Koalas-specific features that exists only in Koalas.
These can be accessed by ``DataFrame.pandas_on_spark.<function/property>``.
.. autosummary::
:toctree: api/
DataFrame.pandas_on_spark.attach_id_column
DataFrame.pandas_on_spark.apply_batch
DataFrame.pandas_on_spark.transform_batch