[SPARK-9705] [DOC] fix docs about Python version

cc JoshRosen

Author: Davies Liu <davies@databricks.com>

Closes #8245 from davies/python_doc.
commit de3223872a
parent 1ff0580eda
Date: 2015-08-18 22:11:27 -07:00
Committed by: Reynold Xin
2 changed files with 15 additions and 3 deletions

@@ -1561,7 +1561,11 @@ The following variables can be set in `spark-env.sh`:
 </tr>
 <tr>
 <td><code>PYSPARK_PYTHON</code></td>
-<td>Python binary executable to use for PySpark.</td>
+<td>Python binary executable to use for PySpark in both driver and workers (default is <code>python</code>).</td>
+</tr>
+<tr>
+<td><code>PYSPARK_DRIVER_PYTHON</code></td>
+<td>Python binary executable to use for PySpark in the driver only (default is <code>PYSPARK_PYTHON</code>).</td>
 </tr>
 <tr>
 <td><code>SPARK_LOCAL_IP</code></td>
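
Taken together, the two variables let you pin one interpreter for the whole application while overriding only the driver. A minimal sketch of how they might be combined in `conf/spark-env.sh` (the `python3.4` interpreter and the use of IPython for the driver are illustrative assumptions, not part of this commit):

{% highlight bash %}
# All workers and the driver use this interpreter unless overridden.
export PYSPARK_PYTHON=python3.4
# Override the driver only, e.g. for an interactive IPython shell;
# workers keep using PYSPARK_PYTHON. Assumes IPython is installed
# and runs on the same minor Python version as the workers.
export PYSPARK_DRIVER_PYTHON=ipython
{% endhighlight %}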

@@ -85,8 +85,8 @@ import org.apache.spark.SparkConf
 
 <div data-lang="python" markdown="1">
 
-Spark {{site.SPARK_VERSION}} works with Python 2.6 or higher (but not Python 3). It uses the standard CPython interpreter,
-so C libraries like NumPy can be used.
+Spark {{site.SPARK_VERSION}} works with Python 2.6+ or Python 3.4+. It can use the standard CPython interpreter,
+so C libraries like NumPy can be used. It also works with PyPy 2.3+.
 
 To run Spark applications in Python, use the `bin/spark-submit` script located in the Spark directory.
 This script will load Spark's Java/Scala libraries and allow you to submit applications to a cluster.
@@ -104,6 +104,14 @@ Finally, you need to import some Spark classes into your program. Add the following line:
 from pyspark import SparkContext, SparkConf
 {% endhighlight %}
 
+PySpark requires the same minor version of Python in both driver and workers. It uses the default Python version in PATH;
+you can specify which version of Python you want to use by setting `PYSPARK_PYTHON`, for example:
+
+{% highlight bash %}
+$ PYSPARK_PYTHON=python3.4 bin/pyspark
+$ PYSPARK_PYTHON=/opt/pypy-2.5/bin/pypy bin/spark-submit examples/src/main/python/pi.py
+{% endhighlight %}
+
 </div>
 
 </div>
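
Because `PYSPARK_DRIVER_PYTHON` defaults to `PYSPARK_PYTHON`, the two can also be combined on a single invocation. A hedged sketch building on the examples above (the IPython driver is an assumption for illustration; the same-minor-version requirement still applies):

{% highlight bash %}
# Workers run python3.4; only the driver is swapped for IPython,
# which must itself run on a matching Python 3.4 interpreter.
$ PYSPARK_PYTHON=python3.4 PYSPARK_DRIVER_PYTHON=ipython bin/pyspark
{% endhighlight %}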