spark-instrumented-optimizer/dev/create-release
HyukjinKwon b54103016a [SPARK-32204][SPARK-32182][DOCS] Add a quickstart page with Binder integration in PySpark documentation
### What changes were proposed in this pull request?

This PR proposes to:
- add a notebook with a Binder integration which allows users to try PySpark in a live notebook. Please [try this here](https://mybinder.org/v2/gh/HyukjinKwon/spark/SPARK-32204?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart.ipynb).
- reuse this notebook as a quickstart guide in PySpark documentation.

Note that Binder turns a Git repo into a collection of interactive notebooks. It works based on Docker image. Once somebody builds, other people can reuse the image against a specific commit.
Therefore, if we run Binder with the images based on released tags in Spark, virtually all users can instantly launch the Jupyter notebooks.

<br/>

I made a simple demo to make it easier to review. Please see:
- [Main page](https://hyukjin-spark.readthedocs.io/en/stable/). Note that the link ("Live Notebook") in the main page wouldn't work since this PR is not merged yet.
- [Quickstart page](https://hyukjin-spark.readthedocs.io/en/stable/getting_started/quickstart.html)

<br/>

When reviewing the notebook file itself, please give my direct feedback which I will appreciate and address.
Another way might be:
- open [here](https://mybinder.org/v2/gh/HyukjinKwon/spark/SPARK-32204?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart.ipynb).
- edit / change / update the notebook. Please feel free to change as whatever you want. I can apply as are or slightly update more when I apply to this PR.
- download it as a `.ipynb` file:
    ![Screen Shot 2020-08-20 at 10 12 19 PM](https://user-images.githubusercontent.com/6477701/90774311-3e38c800-e332-11ea-8476-699a653984db.png)
- upload the `.ipynb` file here in a GitHub comment. Then, I will push a commit with that file with crediting correctly, of course.
- alternatively, push a commit into this PR right away if that's easier for you (if you're a committer).

References:
- https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html
- https://databricks.com/jp/blog/2020/03/31/10-minutes-from-pandas-to-koalas-on-apache-spark.html - my own blog post .. :-) and https://koalas.readthedocs.io/en/latest/getting_started/10min.html

### Why are the changes needed?

To improve PySpark's usability. The current quickstart for Python users are very friendly.

### Does this PR introduce _any_ user-facing change?

Yes, it will add a documentation page, and expose a live notebook to PySpark users.

### How was this patch tested?

Manually tested, and GitHub Actions builds will test.

Closes #29491 from HyukjinKwon/SPARK-32204.

Lead-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-08-26 12:23:24 +09:00
..
spark-rm [SPARK-32204][SPARK-32182][DOCS] Add a quickstart page with Binder integration in PySpark documentation 2020-08-26 12:23:24 +09:00
announce.tmpl [SPARK-28951][INFRA] Add release announce template 2019-09-02 14:55:05 -07:00
do-release-docker.sh [SPARK-31889][BUILD] Docker release script does not allocate enough memory to reliably publish 2020-06-01 15:49:17 -07:00
do-release.sh Revert "[SPARK-31860][BUILD] only push release tags on succes" 2020-06-12 17:50:43 +08:00
generate-contributors.py [SPARK-29802][BUILD] Use python3 in build scripts 2020-07-19 11:02:37 +09:00
known_translations [MINOR] update dev/create-release/known_translations 2020-06-29 11:53:56 +00:00
release-build.sh [SPARK-32058][BUILD] Use Apache Hadoop 3.2.0 dependency by default 2020-06-26 19:43:29 -07:00
release-tag.sh [SPARK-32556][INFRA] Fix release script to have urlencoded passwords where required 2020-08-07 03:31:45 -05:00
release-util.sh [SPARK-32556][INFRA] Fix release script to have urlencoded passwords where required 2020-08-07 03:31:45 -05:00
releaseutils.py [SPARK-32319][PYSPARK] Disallow the use of unused imports 2020-08-08 08:51:57 -07:00
translate-contributors.py [SPARK-29802][BUILD] Use python3 in build scripts 2020-07-19 11:02:37 +09:00
vote.tmpl [MINOR][DOCS] Tighten up some key links to the project and download pages to use HTTPS 2019-05-21 10:56:42 -07:00