[SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation
### What changes were proposed in this pull request?

This PR proposes to:

- Add a link to the PySpark quick start under "Programming Guides" in the main Spark docs
- Rename `ML` / `MLlib` to `MLlib (DataFrame-based)` / `MLlib (RDD-based)` in the API reference page
- Mention other user guides as well, such as the [ML](http://spark.apache.org/docs/latest/ml-guide.html) and [SQL](http://spark.apache.org/docs/latest/sql-programming-guide.html) guides.
- Mention other migration guides as well, because PySpark can be affected by them.

### Why are the changes needed?

For better documentation.

### Does this PR introduce _any_ user-facing change?

It fixes user-facing docs. However, these docs have not been released yet.

### How was this patch tested?

Manually tested by running:

```bash
cd docs
SKIP_SCALADOC=1 SKIP_RDOC=1 SKIP_SQLDOC=1 jekyll serve --watch
```

Closes #31082 from HyukjinKwon/SPARK-34041.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
This commit is contained in:
parent
7b06acc28b
commit
aa388cf3d0
|
@ -84,6 +84,7 @@
|
|||
<a class="dropdown-item" href="ml-guide.html">MLlib (Machine Learning)</a>
|
||||
<a class="dropdown-item" href="graphx-programming-guide.html">GraphX (Graph Processing)</a>
|
||||
<a class="dropdown-item" href="sparkr.html">SparkR (R on Spark)</a>
|
||||
<a class="dropdown-item" href="api/python/getting_started/index.html">PySpark (Python on Spark)</a>
|
||||
</div>
|
||||
</li>
|
||||
|
||||
|
|
|
@ -113,6 +113,8 @@ options for deployment:
|
|||
* [Spark Streaming](streaming-programming-guide.html): processing data streams using DStreams (old API)
|
||||
* [MLlib](ml-guide.html): applying machine learning algorithms
|
||||
* [GraphX](graphx-programming-guide.html): processing graphs
|
||||
* [SparkR](sparkr.html): processing data with Spark in R
|
||||
* [PySpark](api/python/getting_started/index.html): processing data with Spark in Python
|
||||
|
||||
**API Docs:**
|
||||
|
||||
|
|
|
@ -21,6 +21,9 @@ Getting Started
|
|||
===============
|
||||
|
||||
This page summarizes the basic steps required to setup and get started with PySpark.
|
||||
There are more guides shared with other languages such as
|
||||
`Quick Start <http://spark.apache.org/docs/latest/quick-start.html>`_ in Programming Guides
|
||||
at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
|
|
@ -21,8 +21,6 @@ Migration Guide
|
|||
===============
|
||||
|
||||
This page describes the migration guide specific to PySpark.
|
||||
Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
|
||||
Please also refer other migration guides such as `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
@ -33,3 +31,13 @@ Please also refer other migration guides such as `Migration Guide: SQL, Datasets
|
|||
pyspark_2.2_to_2.3
|
||||
pyspark_1.4_to_1.5
|
||||
pyspark_1.0_1.2_to_1.3
|
||||
|
||||
|
||||
Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
|
||||
Please also refer other migration guides:
|
||||
|
||||
- `Migration Guide: Spark Core <http://spark.apache.org/docs/latest/core-migration-guide.html>`_
|
||||
- `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_
|
||||
- `Migration Guide: Structured Streaming <http://spark.apache.org/docs/latest/ss-migration-guide.html>`_
|
||||
- `Migration Guide: MLlib (Machine Learning) <http://spark.apache.org/docs/latest/ml-migration-guide.html>`_
|
||||
|
||||
|
|
|
@ -16,11 +16,11 @@
|
|||
under the License.
|
||||
|
||||
|
||||
ML
|
||||
==
|
||||
MLlib (DataFrame-based)
|
||||
=======================
|
||||
|
||||
ML Pipeline APIs
|
||||
----------------
|
||||
Pipeline APIs
|
||||
-------------
|
||||
|
||||
.. currentmodule:: pyspark.ml
|
||||
|
||||
|
@ -188,8 +188,8 @@ Clustering
|
|||
PowerIterationClustering
|
||||
|
||||
|
||||
ML Functions
|
||||
----------------------------
|
||||
Functions
|
||||
---------
|
||||
|
||||
.. currentmodule:: pyspark.ml.functions
|
||||
|
||||
|
|
|
@ -16,8 +16,8 @@
|
|||
under the License.
|
||||
|
||||
|
||||
MLlib
|
||||
=====
|
||||
MLlib (RDD-based)
|
||||
=================
|
||||
|
||||
Classification
|
||||
--------------
|
||||
|
|
|
@ -20,9 +20,21 @@
|
|||
User Guide
|
||||
==========
|
||||
|
||||
This page is the guide for PySpark users which contains PySpark specific topics.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
arrow_pandas
|
||||
python_packaging
|
||||
|
||||
|
||||
There are more guides shared with other languages in Programming Guides
|
||||
at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
|
||||
|
||||
- `RDD Programming Guide <http://spark.apache.org/docs/latest/rdd-programming-guide.html>`_
|
||||
- `Spark SQL, DataFrames and Datasets Guide <http://spark.apache.org/docs/latest/sql-programming-guide.html>`_
|
||||
- `Structured Streaming Programming Guide <http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>`_
|
||||
- `Spark Streaming Programming Guide <http://spark.apache.org/docs/latest/streaming-programming-guide.html>`_
|
||||
- `Machine Learning Library (MLlib) Guide <http://spark.apache.org/docs/latest/ml-guide.html>`_
|
||||
|
||||
|
|