[SPARK-34041][PYTHON][DOCS] Miscellaneous cleanup for new PySpark documentation
### What changes were proposed in this pull request?

This PR proposes to:

- Add a link to the PySpark quick start under "Programming Guides" in the Spark main docs
- Rename `ML` / `MLlib` to `MLlib (DataFrame-based)` / `MLlib (RDD-based)` in the API reference page
- Mention other user guides as well, such as the [ML](http://spark.apache.org/docs/latest/ml-guide.html) and [SQL](http://spark.apache.org/docs/latest/sql-programming-guide.html) guides.
- Mention other migration guides as well, because PySpark can be affected by them.

### Why are the changes needed?

For better documentation.

### Does this PR introduce _any_ user-facing change?

It fixes user-facing docs. However, these docs have not been released yet.

### How was this patch tested?

Manually tested by running:

```bash
cd docs
SKIP_SCALADOC=1 SKIP_RDOC=1 SKIP_SQLDOC=1 jekyll serve --watch
```

Closes #31082 from HyukjinKwon/SPARK-34041.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
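The manual check described in the commit message could be complemented by listing the external links in the reStructuredText sources, so that each one can be opened and verified. The following is only a sketch and not part of this commit: `extract_rst_links` and the `RST_LINK` pattern are hypothetical helpers, and the pattern only covers the inline `` `label <url>`_ `` form used in this change.

```python
import re

# Matches inline reStructuredText external links of the form `label <url>`_
# (assumption: only this inline form is of interest here).
RST_LINK = re.compile(r"`([^`<]+?)\s*<([^>]+)>`_")


def extract_rst_links(text):
    """Return (label, url) pairs for every inline external link in ``text``."""
    return [(label.strip(), url.strip()) for label, url in RST_LINK.findall(text)]


# Sample text copied from the migration guide lines added in this change.
sample = (
    "Please also refer other migration guides:\n"
    "\n"
    "- `Migration Guide: Spark Core "
    "<http://spark.apache.org/docs/latest/core-migration-guide.html>`_\n"
    "- `Migration Guide: SQL, Datasets and DataFrame "
    "<http://spark.apache.org/docs/latest/sql-migration-guide.html>`_\n"
)

links = extract_rst_links(sample)
for label, url in links:
    print(f"{label} -> {url}")
```

The resulting pairs can then be checked by hand or passed to any HTTP client of choice; no network access is performed by the sketch itself.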
parent 7b06acc28b
commit aa388cf3d0
@@ -84,6 +84,7 @@
 <a class="dropdown-item" href="ml-guide.html">MLlib (Machine Learning)</a>
 <a class="dropdown-item" href="graphx-programming-guide.html">GraphX (Graph Processing)</a>
 <a class="dropdown-item" href="sparkr.html">SparkR (R on Spark)</a>
+<a class="dropdown-item" href="api/python/getting_started/index.html">PySpark (Python on Spark)</a>
 </div>
 </li>
 
@@ -113,6 +113,8 @@ options for deployment:
 * [Spark Streaming](streaming-programming-guide.html): processing data streams using DStreams (old API)
 * [MLlib](ml-guide.html): applying machine learning algorithms
 * [GraphX](graphx-programming-guide.html): processing graphs
+* [SparkR](sparkr.html): processing data with Spark in R
+* [PySpark](api/python/getting_started/index.html): processing data with Spark in Python
 
 **API Docs:**
 
@@ -21,6 +21,9 @@ Getting Started
 ===============
 
 This page summarizes the basic steps required to setup and get started with PySpark.
+There are more guides shared with other languages such as
+`Quick Start <http://spark.apache.org/docs/latest/quick-start.html>`_ in Programming Guides
+at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
 
 .. toctree::
 :maxdepth: 2
@@ -21,8 +21,6 @@ Migration Guide
 ===============
 
 This page describes the migration guide specific to PySpark.
-Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
-Please also refer other migration guides such as `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_.
 
 .. toctree::
 :maxdepth: 2
@@ -33,3 +31,13 @@ Please also refer other migration guides such as `Migration Guide: SQL, Datasets
 pyspark_2.2_to_2.3
 pyspark_1.4_to_1.5
 pyspark_1.0_1.2_to_1.3
+
+
+Many items of other migration guides can also be applied when migrating PySpark to higher versions because PySpark internally shares other components.
+Please also refer other migration guides:
+
+- `Migration Guide: Spark Core <http://spark.apache.org/docs/latest/core-migration-guide.html>`_
+- `Migration Guide: SQL, Datasets and DataFrame <http://spark.apache.org/docs/latest/sql-migration-guide.html>`_
+- `Migration Guide: Structured Streaming <http://spark.apache.org/docs/latest/ss-migration-guide.html>`_
+- `Migration Guide: MLlib (Machine Learning) <http://spark.apache.org/docs/latest/ml-migration-guide.html>`_
+
@@ -16,11 +16,11 @@
 under the License.
 
 
-ML
-==
+MLlib (DataFrame-based)
+=======================
 
-ML Pipeline APIs
-----------------
+Pipeline APIs
+-------------
 
 .. currentmodule:: pyspark.ml
 
@@ -188,8 +188,8 @@ Clustering
 PowerIterationClustering
 
 
-ML Functions
-----------------------------
+Functions
+---------
 
 .. currentmodule:: pyspark.ml.functions
 
@@ -16,8 +16,8 @@
 under the License.
 
 
-MLlib
-=====
+MLlib (RDD-based)
+=================
 
 Classification
 --------------
@@ -20,9 +20,21 @@
 User Guide
 ==========
 
+This page is the guide for PySpark users which contains PySpark specific topics.
+
 .. toctree::
 :maxdepth: 2
 
 arrow_pandas
 python_packaging
 
+
+There are more guides shared with other languages in Programming Guides
+at `the Spark documentation <http://spark.apache.org/docs/latest/index.html#where-to-go-from-here>`_.
+
+- `RDD Programming Guide <http://spark.apache.org/docs/latest/rdd-programming-guide.html>`_
+- `Spark SQL, DataFrames and Datasets Guide <http://spark.apache.org/docs/latest/sql-programming-guide.html>`_
+- `Structured Streaming Programming Guide <http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>`_
+- `Spark Streaming Programming Guide <http://spark.apache.org/docs/latest/streaming-programming-guide.html>`_
+- `Machine Learning Library (MLlib) Guide <http://spark.apache.org/docs/latest/ml-guide.html>`_
+