Commit graph

9 commits

Author SHA1 Message Date
Kousuke Saruta 513b6f5af2 [SPARK-33079][TESTS] Replace the existing Maven job for Scala 2.13 in Github Actions with SBT job
### What changes were proposed in this pull request?

SPARK-32926 added a build test to GitHub Action for Scala 2.13 but it's only with Maven.
As SPARK-32873 reported, some compilation error happens only with SBT so I think we need to add another build test to GitHub Action for SBT.
Unfortunately, we don't have abundant resources for GitHub Actions so instead of just adding the new SBT job, let's replace the existing Maven job with the new SBT job for Scala 2.13.

### Why are the changes needed?

To ensure build test passes even with SBT for Scala 2.13.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

GitHub Actions' job.

Closes #29958 from sarutak/add-sbt-job-for-scala-2.13.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-10-15 20:51:20 +09:00
Dongjoon Hyun e85ed8a14c [SPARK-33156][INFRA] Upgrade GithubAction image from 18.04 to 20.04
### What changes were proposed in this pull request?

This PR aims to upgrade `Github Action` runner image from `Ubuntu 18.04 (LTS)` to `Ubuntu 20.04 (LTS)`.

### Why are the changes needed?

`ubuntu-latest` in `GitHub Action` is still `Ubuntu 18.04 (LTS)`.
- https://github.com/actions/virtual-environments#available-environments

This upgrade will help Apache Spark 3.1+ preparation for vote and release on the latest OS.

This is tested here.
- https://github.com/dongjoon-hyun/spark/pull/36

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the `Github Action` in this PR.

Closes #30050 from dongjoon-hyun/ubuntu_20.04.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-10-15 02:24:49 -07:00
HyukjinKwon b205be5ff6 [SPARK-33051][INFRA][R] Uses setup-r to install R in GitHub Actions build
### What changes were proposed in this pull request?

At SPARK-32493, the R installation was switched to manual installation because setup-r was broken. This seems fixed in the upstream so we should better switch it back.

### Why are the changes needed?

To avoid maintaining the installation steps by ourselve.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions build in this PR should test it.

Closes #29931 from HyukjinKwon/recover-r-build.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-10-02 15:12:33 +09:00
Dongjoon Hyun a8442c2826 [SPARK-32926][TESTS] Add Scala 2.13 build test in GitHub Action
### What changes were proposed in this pull request?

The PR aims to add Scala 2.13 build test coverage into GitHub Action for Apache Spark 3.1.0.

### Why are the changes needed?

The branch is ready for Scala 2.13 and this will prevent any regression.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

Pass the GitHub Action.

Closes #29793 from dongjoon-hyun/SPARK-32926.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-09-17 14:01:52 -07:00
HyukjinKwon b07e7429a6 [SPARK-32695][INFRA] Explicitly cache and hash 'build' directly in GitHub Actions
### What changes were proposed in this pull request?

This PR proposes to explicitly cache and hash the files/directories under 'build' for SBT and Zinc at GitHub Actions. Otherwise, it can end up with overwriting `build` directory. See also https://github.com/apache/spark/pull/29286#issuecomment-679368436

Previously, other files like `build/mvn` and `build/sbt` are also cached and overwritten. So, when you have some changes there, they are ignored.

### Why are the changes needed?

To make GitHub Actions build stable.

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

The builds in this PR test it out.

Closes #29536 from HyukjinKwon/SPARK-32695.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-08-26 12:25:59 +09:00
HyukjinKwon b54103016a [SPARK-32204][SPARK-32182][DOCS] Add a quickstart page with Binder integration in PySpark documentation
### What changes were proposed in this pull request?

This PR proposes to:
- add a notebook with a Binder integration which allows users to try PySpark in a live notebook. Please [try this here](https://mybinder.org/v2/gh/HyukjinKwon/spark/SPARK-32204?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart.ipynb).
- reuse this notebook as a quickstart guide in PySpark documentation.

Note that Binder turns a Git repo into a collection of interactive notebooks. It works based on Docker image. Once somebody builds, other people can reuse the image against a specific commit.
Therefore, if we run Binder with the images based on released tags in Spark, virtually all users can instantly launch the Jupyter notebooks.

<br/>

I made a simple demo to make it easier to review. Please see:
- [Main page](https://hyukjin-spark.readthedocs.io/en/stable/). Note that the link ("Live Notebook") in the main page wouldn't work since this PR is not merged yet.
- [Quickstart page](https://hyukjin-spark.readthedocs.io/en/stable/getting_started/quickstart.html)

<br/>

When reviewing the notebook file itself, please give my direct feedback which I will appreciate and address.
Another way might be:
- open [here](https://mybinder.org/v2/gh/HyukjinKwon/spark/SPARK-32204?filepath=python%2Fdocs%2Fsource%2Fgetting_started%2Fquickstart.ipynb).
- edit / change / update the notebook. Please feel free to change as whatever you want. I can apply as are or slightly update more when I apply to this PR.
- download it as a `.ipynb` file:
    ![Screen Shot 2020-08-20 at 10 12 19 PM](https://user-images.githubusercontent.com/6477701/90774311-3e38c800-e332-11ea-8476-699a653984db.png)
- upload the `.ipynb` file here in a GitHub comment. Then, I will push a commit with that file with crediting correctly, of course.
- alternatively, push a commit into this PR right away if that's easier for you (if you're a committer).

References:
- https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html
- https://databricks.com/jp/blog/2020/03/31/10-minutes-from-pandas-to-koalas-on-apache-spark.html - my own blog post .. :-) and https://koalas.readthedocs.io/en/latest/getting_started/10min.html

### Why are the changes needed?

To improve PySpark's usability. The current quickstart for Python users are very friendly.

### Does this PR introduce _any_ user-facing change?

Yes, it will add a documentation page, and expose a live notebook to PySpark users.

### How was this patch tested?

Manually tested, and GitHub Actions builds will test.

Closes #29491 from HyukjinKwon/SPARK-32204.

Lead-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Fokko Driesprong <fokko@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-08-26 12:23:24 +09:00
Takeshi Yamamuro 6dd37cbaac [SPARK-32682][INFRA] Use workflow_dispatch to enable manual test triggers
### What changes were proposed in this pull request?

This PR proposes to add a `workflow_dispatch` entry in the GitHub Action script (`build_and_test.yml`). This update can enable developers to run the Spark tests for a specific branch on their own local repository, so I think it might help to check if al the tests can pass before opening a new PR.

<img width="944" alt="Screen Shot 2020-08-21 at 16 28 41" src="https://user-images.githubusercontent.com/692303/90866249-96250c80-e3ce-11ea-8496-3dd6683e92ea.png">

### Why are the changes needed?

To reduce the pressure of GitHub Actions on the Spark repository.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually checked.

Closes #29504 from maropu/DispatchTest.

Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
2020-08-21 21:23:41 +09:00
HyukjinKwon bfd8c34154 [SPARK-32645][INFRA] Upload unit-tests.log as an artifact
### What changes were proposed in this pull request?

This PR proposes to upload `target/unit-tests.log` into the artifact so it will be able to download here:
![Screen Shot 2020-08-18 at 2 23 18 PM](https://user-images.githubusercontent.com/6477701/90474095-789e3b80-e15f-11ea-87f8-e7da3df3c03e.png)

### Why are the changes needed?

Jenkins has this feature. It should be best to have the same dev functionalities with it.
Also, note that this was pointed out https://github.com/apache/spark/pull/29225#discussion_r471485011.

### Does this PR introduce _any_ user-facing change?

No, dev-only

### How was this patch tested?

https://github.com/apache/spark/actions/runs/213000777 should demonstrate it

Closes #29454 from HyukjinKwon/SPARK-32645.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-08-19 12:28:36 +09:00
HyukjinKwon d0dfe4986b [MINOR][INFRA] Rename master.yml to build_and_test.yml
### What changes were proposed in this pull request?

This PR renames `master.yml` to `build_and_test.yml` to indicate this is the workflow that builds and runs the tests.

### Why are the changes needed?

Just for readability. `master.yml` looks like the name of the branch (to me).

### Does this PR introduce _any_ user-facing change?

No, dev-only.

### How was this patch tested?

GitHub Actions build in this PR will test it out.

Closes #29459 from HyukjinKwon/minor-rename.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
2020-08-18 18:18:47 +08:00
Renamed from .github/workflows/master.yml (Browse further)