c0d1860f25
### What changes were proposed in this pull request?
This PR proposes to migrate the coverage report from Jenkins to GitHub Actions by setting up a daily cron job.
### Why are the changes needed?
For some background, PySpark code coverage is currently reported by this specific Jenkins job: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/
Because of a security issue between the [Codecov service](https://app.codecov.io/gh/) and the Jenkins machines, we had to work around it by manually hosting a coverage site via GitHub Pages; see also https://spark-test.github.io/pyspark-coverage-site/, hosted by the spark-test account (which is shared with only a subset of PMC members).
Since we now run the build via GitHub Actions, we can leverage the [Codecov plugin](https://github.com/codecov/codecov-action) and remove the workaround we used.
### Does this PR introduce _any_ user-facing change?
Virtually no. The coverage site (UI) might change, but the information it holds should be virtually the same.
### How was this patch tested?
I manually tested:
- Scheduled run: https://github.com/HyukjinKwon/spark/actions/runs/1082261484
- Coverage report: 73f0291a7d/python/pyspark
- Run against a PR: https://github.com/HyukjinKwon/spark/actions/runs/1082367175
Closes #33591 from HyukjinKwon/SPARK-36092.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
49 lines
1.8 KiB
Python
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import os
import imp
import platform


# This is a hack to always refer the main code rather than built zip.
main_code_dir = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
daemon = imp.load_source("daemon", "%s/pyspark/daemon.py" % main_code_dir)

if "COVERAGE_PROCESS_START" in os.environ:
    # PyPy with coverage makes the tests flaky, and CPython is enough for coverage report.
    if "pypy" not in platform.python_implementation().lower():
        worker = imp.load_source("worker", "%s/pyspark/worker.py" % main_code_dir)

        def _cov_wrapped(*args, **kwargs):
            import coverage
            cov = coverage.coverage(
                config_file=os.environ["COVERAGE_PROCESS_START"])
            cov.start()
            try:
                worker.main(*args, **kwargs)
            finally:
                cov.stop()
                cov.save()
        daemon.worker_main = _cov_wrapped
else:
    raise RuntimeError("COVERAGE_PROCESS_START environment variable is not set, exiting.")


if __name__ == '__main__':
    daemon.manager()
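
The file above relies on two ideas: loading a module from an explicit path (so the main source tree is used rather than a built zip), and replacing an entry point with a wrapper that starts a recorder and always stops and saves it in a `finally` block. The following is a minimal, self-contained sketch of that pattern, not the Spark implementation. It assumes a Python version where `imp` is unavailable (the module was removed in Python 3.12) and therefore uses `importlib.util` instead. The names `load_source`, `FakeRecorder`, `wrap_with_recorder`, and `toy_worker` are hypothetical and exist only for illustration; `FakeRecorder` merely logs calls and stands in for a real coverage object.

```python
import importlib.util
import os
import sys
import tempfile


def load_source(name, path):
    """Load a module from an explicit file path (a stand-in for imp.load_source)."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


class FakeRecorder:
    """Hypothetical stand-in for a coverage recorder; it only logs the calls it receives."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")

    def save(self):
        self.events.append("save")


def wrap_with_recorder(func, recorder):
    """Return a wrapper that records around func, even when func raises."""

    def _wrapped(*args, **kwargs):
        recorder.start()
        try:
            return func(*args, **kwargs)
        finally:
            recorder.stop()
            recorder.save()

    return _wrapped


# Write a toy "worker" module to a temporary directory and load it by path.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_worker.py")
    with open(path, "w") as f:
        f.write(
            "def main(x):\n"
            "    if x < 0:\n"
            "        raise ValueError('negative input')\n"
            "    return x * 2\n"
        )
    toy_worker = load_source("toy_worker", path)

    # Monkey-patch the entry point, as the daemon's worker_main is patched above.
    recorder = FakeRecorder()
    toy_worker.main = wrap_with_recorder(toy_worker.main, recorder)

    result = toy_worker.main(21)
    try:
        toy_worker.main(-1)  # the wrapper still stops and saves on failure
    except ValueError:
        pass
```

The `finally` block is the important design choice: a worker that exits with an exception would otherwise lose the data recorded up to that point.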