310cd8eef1
This PR proposes to migrate the coverage report from Jenkins to GitHub Actions by setting up a daily cron job.

For some background, PySpark code coverage is currently reported in this specific Jenkins job: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/. Because of a security issue between the [Codecov service](https://app.codecov.io/gh/) and the Jenkins machines, we had to work around it by manually hosting a coverage site via GitHub Pages (see https://spark-test.github.io/pyspark-coverage-site/), published from the spark-test account, which is shared with only a subset of PMC members. Since the build now runs via GitHub Actions, we can leverage the [Codecov plugin](https://github.com/codecov/codecov-action) and remove that workaround.

This introduces virtually no user-facing change. The coverage site (UI) might change, but the information it holds should be virtually the same.

I manually tested:
- Scheduled run: https://github.com/HyukjinKwon/spark/actions/runs/1082261484
- Coverage report:73f0291a7d/python/pyspark
- Run against a PR: https://github.com/HyukjinKwon/spark/actions/runs/1082367175

Closes #33591 from HyukjinKwon/SPARK-36092.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit c0d1860f25)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
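For illustration, a scheduled coverage workflow of the kind described above could be sketched roughly as follows. This is a minimal, hypothetical sketch, not the actual workflow file from this PR: the workflow name, cron time, test command, and coverage file path are all assumptions.

```yaml
# Hypothetical sketch of a daily scheduled coverage workflow.
# Names, times, and paths are illustrative, not taken from this PR.
name: Coverage report

on:
  schedule:
    # Daily cron job (time is illustrative).
    - cron: '0 0 * * *'

jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run PySpark tests with coverage
        # Placeholder for the actual test-with-coverage invocation.
        run: ./run-tests-with-coverage
      - name: Upload coverage to Codecov
        uses: codecov/codecov-action@v2
        with:
          files: ./coverage.xml
```

With a `schedule` trigger like this, GitHub Actions runs the job on the cron expression given, and the Codecov action uploads the generated report without requiring the manually hosted GitHub Pages site.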
42 lines
489 B
Plaintext
# PySpark dependencies (required)
py4j

# PySpark dependencies (optional)
numpy
pyarrow
pandas
scipy
plotly
mlflow>=1.0
sklearn
matplotlib<3.3.0

# PySpark test dependencies
xmlrunner

# PySpark test dependencies (optional)
coverage

# Linter
mypy
flake8

# Documentation (SQL)
mkdocs

# Documentation (Python)
pydata_sphinx_theme
ipython
nbsphinx
numpydoc
jinja2<3.0.0
sphinx<3.1.0
sphinx-plotly-directive

# Development scripts
jira
PyGithub

# pandas API on Spark Code formatter.
black