[MINOR][DOCS] Mention other Python dependency tools in documentation

### What changes were proposed in this pull request?

Self-contained documentation change: mentions other Python dependency management tools such as Conda and pip, and links to the Python Package Management user guide.

### Why are the changes needed?

To give users more information about the Python dependency management options available in PySpark.

### Does this PR introduce _any_ user-facing change?
Yes, documentation change.

### How was this patch tested?
Manually built the docs and checked the results:
<img width="918" alt="Screen Shot 2021-09-29 at 10 11 56 AM" src="https://user-images.githubusercontent.com/6477701/135186536-2f271378-d06b-4c6b-a4be-691ce395db9f.png">
<img width="976" alt="Screen Shot 2021-09-29 at 10 12 22 AM" src="https://user-images.githubusercontent.com/6477701/135186541-0f4c5615-bc49-48e2-affd-dc2f5c0334bf.png">
<img width="920" alt="Screen Shot 2021-09-29 at 10 12 42 AM" src="https://user-images.githubusercontent.com/6477701/135186551-0b613096-7c86-4562-b345-ddd60208367b.png">

Closes #34134 from HyukjinKwon/minor-docs-py-deps.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Commit 13c2b711e4 (parent d2bb359338) by Hyukjin Kwon, 2021-09-29 14:45:58 +09:00
3 changed files with 8 additions and 5 deletions


@@ -450,6 +450,8 @@ Lines with a: 46, Lines with b: 23
 </div>
 </div>
+Other dependency management tools such as Conda and pip can also be used for custom classes or third-party libraries. See also [Python Package Management](api/python/user_guide/python_packaging.html).
 # Where to Go from Here
 Congratulations on running your first Spark application!
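For reference, a minimal sketch of the Conda-based workflow described in the linked Python Package Management guide (the environment name, packages, and `app.py` below are illustrative placeholders, not part of this commit):

```bash
# Pack a Conda environment and ship it to executors with --archives.
# Exact support varies by cluster manager and Spark version; see the linked guide.
conda create -y -n pyspark_conda_env -c conda-forge pyarrow pandas conda-pack
conda activate pyspark_conda_env
conda pack -f -o pyspark_conda_env.tar.gz

export PYSPARK_DRIVER_PYTHON=python            # client mode only
export PYSPARK_PYTHON=./environment/bin/python
./bin/spark-submit --archives pyspark_conda_env.tar.gz#environment app.py
```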


@@ -241,12 +241,12 @@ For a complete list of options, run `spark-shell --help`. Behind the scenes,
 In the PySpark shell, a special interpreter-aware SparkContext is already created for you, in the
 variable called `sc`. Making your own SparkContext will not work. You can set which master the
 context connects to using the `--master` argument, and you can add Python .zip, .egg or .py files
-to the runtime path by passing a comma-separated list to `--py-files`. You can also add dependencies
+to the runtime path by passing a comma-separated list to `--py-files`. For third-party Python dependencies,
+see [Python Package Management](api/python/user_guide/python_packaging.html). You can also add dependencies
 (e.g. Spark Packages) to your shell session by supplying a comma-separated list of Maven coordinates
 to the `--packages` argument. Any additional repositories where dependencies might exist (e.g. Sonatype)
-can be passed to the `--repositories` argument. Any Python dependencies a Spark package has (listed in
-the requirements.txt of that package) must be manually installed using `pip` when necessary.
-For example, to run `bin/pyspark` on exactly four cores, use:
+can be passed to the `--repositories` argument. For example, to run
+`bin/pyspark` on exactly four cores, use:
 {% highlight bash %}
 $ ./bin/pyspark --master local[4]

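As an illustration of the flags discussed in this hunk, a hypothetical shell invocation (the zip file, Maven coordinate, and repository URL below are placeholders, not real artifacts from this commit):

```bash
# Start the PySpark shell on four local cores, shipping local Python code with
# --py-files and pulling a Spark package from an additional Maven repository.
./bin/pyspark --master local[4] \
  --py-files deps.zip \
  --packages com.example:example-spark-package:1.0.0 \
  --repositories https://repo.example.com/releases
```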

@@ -35,7 +35,8 @@ script as shown here while passing your jar.
 For Python, you can use the `--py-files` argument of `spark-submit` to add `.py`, `.zip` or `.egg`
 files to be distributed with your application. If you depend on multiple Python files we recommend
-packaging them into a `.zip` or `.egg`.
+packaging them into a `.zip` or `.egg`. For third-party Python dependencies,
+see [Python Package Management](api/python/user_guide/python_packaging.html).
 # Launching Applications with spark-submit
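For reference, a hypothetical `spark-submit` invocation matching the paragraph above (`deps.zip` and `app.py` are placeholder names):

```bash
# Submit a Python application, distributing bundled local dependencies to executors.
./bin/spark-submit --master local[4] --py-files deps.zip app.py
```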