[MINOR][DOCS] Use ASCII characters when possible in PySpark documentation

### What changes were proposed in this pull request?

This PR replaces non-ASCII characters with ASCII characters, where possible, in the PySpark documentation.

### Why are the changes needed?

To avoid unnecessarily using non-ASCII characters, which can lead to issues such as https://github.com/apache/spark/pull/32047 or https://github.com/apache/spark/pull/22782.

### Does this PR introduce _any_ user-facing change?

Virtually no.

### How was this patch tested?

The non-ASCII characters were found via (on macOS):

```bash
# In Spark root directory
cd python
pcregrep --color='auto' -n "[\x80-\xFF]" `git ls-files .`
```
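
If `pcregrep` is not available, a roughly equivalent check can be sketched in plain Python (a minimal sketch assuming it is run from the Spark root directory; this is not the command used for this patch):

```python
# Minimal sketch: print lines containing non-ASCII bytes in files tracked by git
# under python/. Roughly equivalent in spirit to the pcregrep command above.
import subprocess

tracked = subprocess.run(
    ["git", "ls-files", "python"], capture_output=True, text=True, check=True
).stdout.splitlines()

for path in tracked:
    with open(path, "rb") as f:
        for lineno, raw in enumerate(f, start=1):
            if any(b > 0x7F for b in raw):
                text = raw.decode("utf-8", errors="replace").rstrip()
                print(f"{path}:{lineno}: {text}")
```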

Closes #32048 from HyukjinKwon/minor-fix.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Commit 2ca76a57be (parent 571acc87fe), authored by HyukjinKwon on 2021-04-04 09:49:36 +03:00 and committed by Max Gekk.
5 changed files with 6 additions and 6 deletions.


```diff
@@ -42,7 +42,7 @@ SQL query engine.
 Running on top of Spark, the streaming feature in Apache Spark enables powerful
 interactive and analytical applications across both streaming and historical data,
-while inheriting Spark’s ease of use and fault tolerance characteristics.
+while inheriting Spark's ease of use and fault tolerance characteristics.
 **MLlib**
```


```diff
@@ -22,7 +22,7 @@ Upgrading from PySpark 2.4 to 3.0
 * In Spark 3.0, PySpark requires a pandas version of 0.23.2 or higher to use pandas related functionality, such as ``toPandas``, ``createDataFrame`` from pandas DataFrame, and so on.
-* In Spark 3.0, PySpark requires a PyArrow version of 0.12.1 or higher to use PyArrow related functionality, such as ``pandas_udf``, ``toPandas`` and ``createDataFrame`` with “spark.sql.execution.arrow.enabled=true”, etc.
+* In Spark 3.0, PySpark requires a PyArrow version of 0.12.1 or higher to use PyArrow related functionality, such as ``pandas_udf``, ``toPandas`` and ``createDataFrame`` with "spark.sql.execution.arrow.enabled=true", etc.
 * In PySpark, when creating a ``SparkSession`` with ``SparkSession.builder.getOrCreate()``, if there is an existing ``SparkContext``, the builder was trying to update the ``SparkConf`` of the existing ``SparkContext`` with configurations specified to the builder, but the ``SparkContext`` is shared by all ``SparkSession`` s, so we should not update them. In 3.0, the builder comes to not update the configurations. This is the same behavior as Java/Scala API in 2.3 and above. If you want to update them, you need to update them prior to creating a ``SparkSession``.
```
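
As an aside on the last migration entry above (an illustrative sketch, not part of this diff; the master URL, app name, and configuration values are arbitrary), configurations should be set on the builder before the first `SparkSession` is created:

```python
from pyspark.sql import SparkSession

# Set configurations up front: since Spark 3.0 the builder no longer updates the
# SparkConf of an already-running SparkContext.
spark = (
    SparkSession.builder
    .master("local[2]")                 # arbitrary master URL for illustration
    .appName("builder-config-example")  # arbitrary application name
    .config("spark.ui.showConsoleProgress", "false")
    .getOrCreate()
)

# A later getOrCreate() returns the existing session; configurations passed here
# do not modify the SparkConf of the SparkContext created above.
same_session = SparkSession.builder.config("spark.executor.memory", "2g").getOrCreate()
assert same_session is spark
```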


```diff
@@ -107,7 +107,7 @@ In the case of a ``spark-submit`` script, you can use it as follows:
 Note that ``PYSPARK_DRIVER_PYTHON`` above should not be set for cluster modes in YARN or Kubernetes.
-If you’re on a regular Python shell or notebook, you can try it as shown below:
+If you're on a regular Python shell or notebook, you can try it as shown below:
 .. code-block:: python
```
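
For context on the environment variable touched in this hunk (an illustrative sketch only, not part of this diff; the interpreter path and script name are hypothetical), `PYSPARK_DRIVER_PYTHON` selects the Python interpreter used for the driver when `spark-submit` runs in client mode:

```python
# Hypothetical example: launch spark-submit with PYSPARK_DRIVER_PYTHON set for the
# child process. The interpreter path and application script are placeholders.
import os
import subprocess

env = dict(os.environ, PYSPARK_DRIVER_PYTHON="/usr/bin/python3")
subprocess.run(["spark-submit", "my_app.py"], env=env, check=True)
```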


```diff
@@ -161,11 +161,11 @@ class FPGrowth(JavaEstimator, _FPGrowthParams, JavaMLWritable, JavaMLReadable):
 .. [1] Haoyuan Li, Yi Wang, Dong Zhang, Ming Zhang, and Edward Y. Chang. 2008.
     Pfp: parallel fp-growth for query recommendation.
     In Proceedings of the 2008 ACM conference on Recommender systems (RecSys '08).
-    Association for Computing Machinery, New York, NY, USA, 107–114.
+    Association for Computing Machinery, New York, NY, USA, 107-114.
     DOI: https://doi.org/10.1145/1454008.1454027
 .. [2] Jiawei Han, Jian Pei, and Yiwen Yin. 2000.
     Mining frequent patterns without candidate generation.
-    SIGMOD Rec. 29, 2 (June 2000), 1–12.
+    SIGMOD Rec. 29, 2 (June 2000), 1-12.
     DOI: https://doi.org/10.1145/335191.335372
```


```diff
@@ -143,7 +143,7 @@ class BisectingKMeans(object):
 -----
 See the original paper [1]_
-.. [1] Steinbach, M. et al. “A Comparison of Document Clustering Techniques.” (2000).
+.. [1] Steinbach, M. et al. "A Comparison of Document Clustering Techniques." (2000).
     KDD Workshop on Text Mining, 2000
     http://glaros.dtc.umn.edu/gkhome/fetch/papers/docclusterKDDTMW00.pdf
 """
```