spark-instrumented-optimizer/python/test_coverage/coverage_daemon.py
Hyukjin Kwon c0d1860f25 [SPARK-36092][INFRA][BUILD][PYTHON] Migrate to GitHub Actions with Codecov from Jenkins
### What changes were proposed in this pull request?

This PR proposes to migrate the coverage report from Jenkins to GitHub Actions by setting up a daily cron job.

### Why are the changes needed?

For background: PySpark code coverage is currently reported by this Jenkins job: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/

Because of a security issue between the [Codecov service](https://app.codecov.io/gh/) and the Jenkins machines, we had to work around it by manually hosting a coverage site via GitHub Pages (https://spark-test.github.io/pyspark-coverage-site/) under the spark-test account, which is shared with only a subset of PMC members.

Since we now run the build via GitHub Actions, we can leverage the [Codecov plugin](https://github.com/codecov/codecov-action) and remove that workaround.

### Does this PR introduce _any_ user-facing change?

Virtually no. The coverage site (UI) might change, but the information it holds should be virtually the same.

### How was this patch tested?

I manually tested:
- Scheduled run: https://github.com/HyukjinKwon/spark/actions/runs/1082261484
- Coverage report: 73f0291a7d/python/pyspark
- Run against a PR: https://github.com/HyukjinKwon/spark/actions/runs/1082367175

Closes #33591 from HyukjinKwon/SPARK-36092.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-08-01 21:37:19 +09:00

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os
import imp
import platform

# This is a hack to always refer to the main source code rather than the built zip.
main_code_dir = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
daemon = imp.load_source("daemon", "%s/pyspark/daemon.py" % main_code_dir)

if "COVERAGE_PROCESS_START" in os.environ:
    # PyPy with coverage makes the tests flaky, and CPython is enough for the coverage report.
    if "pypy" not in platform.python_implementation().lower():
        worker = imp.load_source("worker", "%s/pyspark/worker.py" % main_code_dir)

        def _cov_wrapped(*args, **kwargs):
            # Run the real worker entry point under coverage, saving the
            # per-process data file even if the worker raises.
            import coverage
            cov = coverage.coverage(
                config_file=os.environ["COVERAGE_PROCESS_START"])
            cov.start()
            try:
                worker.main(*args, **kwargs)
            finally:
                cov.stop()
                cov.save()

        # Replace the daemon's worker entry point with the instrumented one.
        daemon.worker_main = _cov_wrapped
else:
    raise RuntimeError("COVERAGE_PROCESS_START environment variable is not set, exiting.")

if __name__ == '__main__':
    daemon.manager()
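
For context, here is a minimal sketch of how a test run might point PySpark at this instrumented daemon. The exact wiring lives in `python/run-tests-with-coverage`; the config paths and the assumption that `python/test_coverage` is on the workers' `PYTHONPATH` (so `coverage_daemon` is importable) are illustrative, not the script's exact contents.

```python
# A minimal sketch, assuming local mode and that python/test_coverage is on
# PYTHONPATH. `spark.python.daemon.module` selects the daemon module PySpark
# launches workers through; `COVERAGE_PROCESS_START` points coverage.py at
# its config file (the path below is an assumed location).
import os

from pyspark.sql import SparkSession

os.environ["COVERAGE_PROCESS_START"] = "python/.coveragerc"  # assumed path

spark = (
    SparkSession.builder
    .config("spark.python.daemon.module", "coverage_daemon")
    .getOrCreate()
)

# Any job that executes Python code in workers is now measured; each worker
# process writes its own coverage data file on exit (cov.save() above), and
# the data files are combined into one report afterwards.
spark.sparkContext.parallelize(range(10)).map(lambda x: x * 2).count()
spark.stop()
```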