[SPARK-36345][INFRA] Update PySpark GitHubAction docker image to 20210730

### What changes were proposed in this pull request?

This PR upgrades the PySpark GitHub Action job to the latest Docker image, `20210730`, which additionally includes `sklearn` and `mlflow`.
- 5ca94453d1

```
$ docker run -it --rm dongjoon/apache-spark-github-action-image:20210730 python3.9 -m pip list | grep mlflow
mlflow                    1.19.0

$ docker run -it --rm dongjoon/apache-spark-github-action-image:20210730 python3.9 -m pip list | grep sklearn
sklearn                   0.0
```
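
The two checks above could also be combined into a single pass over the `pip list` output. The following is a minimal sketch, not part of this PR: `require_packages` is a hypothetical helper name, and it assumes the listing has the package name in the first column, as in the output shown above.

```shell
# Hypothetical helper (not part of this PR): reads `pip list` output on stdin
# and returns non-zero if any of the named packages is absent.
require_packages() {
  local listing pkg missing=0
  listing="$(cat)"
  for pkg in "$@"; do
    # Compare against the first column only, case-insensitively.
    if ! printf '%s\n' "$listing" |
        awk -v p="$pkg" 'tolower($1) == tolower(p) { found = 1 } END { exit !found }'; then
      echo "missing: $pkg"
      missing=1
    fi
  done
  return "$missing"
}

# Example invocation against the image (requires Docker, so it is left commented out):
# docker run --rm dongjoon/apache-spark-github-action-image:20210730 \
#   python3.9 -m pip list | require_packages mlflow sklearn
```

Matching on the first column rather than using a plain `grep` avoids false positives from packages whose names merely contain the search string.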

### Why are the changes needed?

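This saves installation time: `mlflow` and `sklearn` are now preinstalled in the image, so the test job no longer has to install them on every run.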

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the GitHub Action PySpark jobs.

Closes #33595 from dongjoon-hyun/SPARK-36345.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Dongjoon Hyun 2021-07-31 07:20:17 +09:00 committed by Hyukjin Kwon
parent 0f538402fb
commit 0e65ed5fb9

@@ -186,7 +186,7 @@ jobs:
     name: "Build modules: ${{ matrix.modules }}"
     runs-on: ubuntu-20.04
     container:
-      image: dongjoon/apache-spark-github-action-image:20210602
+      image: dongjoon/apache-spark-github-action-image:20210730
     strategy:
       fail-fast: false
       matrix:
@@ -252,8 +252,6 @@ jobs:
     # Run the tests.
     - name: Run tests
       run: |
-        # TODO(SPARK-36345): Install mlflow>=1.0 and sklearn in Python 3.9 of the base image
-        python3.9 -m pip install 'mlflow>=1.0' sklearn
         export PATH=$PATH:$HOME/miniconda/bin
         ./dev/run-tests --parallelism 1 --modules "$MODULES_TO_TEST"
     - name: Upload test results to report