310cd8eef1
This PR proposes to migrate the coverage report from Jenkins to GitHub Actions by setting up a daily cron job.

For background, PySpark code coverage is currently reported in this specific Jenkins job: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-sbt-hadoop-2.7/

Because of a security issue between the [Codecov service](https://app.codecov.io/gh/) and the Jenkins machines, we had to work around it by manually hosting a coverage site via GitHub Pages (see also https://spark-test.github.io/pyspark-coverage-site/) under the spark-test account, which is shared with only a subset of PMC members.

Since we now run the build via GitHub Actions, we can leverage the [Codecov plugin](https://github.com/codecov/codecov-action) and remove the workaround.

There is virtually no user-facing change. The coverage site (UI) might change, but the information it holds should be virtually the same.

I manually tested:
- Scheduled run: https://github.com/HyukjinKwon/spark/actions/runs/1082261484
- Coverage report: 73f0291a7d/python/pyspark
- Run against a PR: https://github.com/HyukjinKwon/spark/actions/runs/1082367175

Closes #33591 from HyukjinKwon/SPARK-36092.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit c0d1860f25)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
70 lines
2.6 KiB
Bash
Executable file
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

set -o pipefail
set -e

# This variable indicates which coverage executable to run to combine coverage data
# and generate HTML files, for example, 'coverage3' in Python 3.
COV_EXEC="${COV_EXEC:-coverage}"
# Quote the script path so that directories containing spaces are handled.
FWDIR="$(cd "$(dirname "$0")" && pwd)"
pushd "$FWDIR" > /dev/null

# Ensure that the coverage executable is installed.
if ! hash $COV_EXEC 2>/dev/null; then
  echo "Missing coverage executable in your path, skipping PySpark coverage"
  exit 1
fi

# Set up the directories for coverage results.
export COVERAGE_DIR="$FWDIR/test_coverage"
rm -fr "$COVERAGE_DIR/coverage_data"
rm -fr "$COVERAGE_DIR/htmlcov"
mkdir -p "$COVERAGE_DIR/coverage_data"

# The current directory is added to the Python path so that Python does not pick up
# our built pyspark zip library first.
export PYTHONPATH="$FWDIR:$PYTHONPATH"
# Also, our sitecustomize.py and coverage_daemon.py are included in the path.
export PYTHONPATH="$COVERAGE_DIR:$PYTHONPATH"

# We use the 'spark.python.daemon.module' configuration to insert the coverage-enabled workers.
export SPARK_CONF_DIR="$COVERAGE_DIR/conf"

# This environment variable enables the coverage.
export COVERAGE_PROCESS_START="$FWDIR/.coveragerc"

./run-tests "$@"

# Don't run coverage for the coverage command itself
unset COVERAGE_PROCESS_START

# Coverage could generate empty coverage data files. Remove them to get rid of warnings when combining.
find "$COVERAGE_DIR/coverage_data" -size 0 -print0 | xargs -0 rm -fr
echo "Combining collected coverage data under $COVERAGE_DIR/coverage_data"
$COV_EXEC combine
echo "Creating XML report file at python/coverage.xml"
$COV_EXEC xml --ignore-errors --include "pyspark/*"
echo "Reporting the coverage data at $COVERAGE_DIR/coverage_data/coverage"
$COV_EXEC report --include "pyspark/*"
echo "Generating HTML files for PySpark coverage under $COVERAGE_DIR/htmlcov"
$COV_EXEC html --ignore-errors --include "pyspark/*" --directory "$COVERAGE_DIR/htmlcov"

popd
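
The cleanup step that runs before `combine` can be tried in isolation. The following is a minimal sketch, not part of the script itself: it assumes a throwaway directory created with `mktemp` in place of `$COVERAGE_DIR/coverage_data`, and the file names `empty.data` and `kept.data` are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch directory standing in for $COVERAGE_DIR/coverage_data.
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT

# One zero-byte data file and one non-empty data file (illustrative names).
: > "$demo_dir/empty.data"
printf 'x' > "$demo_dir/kept.data"

# Same idea as the script: find zero-size files and remove them.
# -print0 and -0 keep file names with spaces or newlines intact.
find "$demo_dir" -type f -size 0 -print0 | xargs -0 rm -f

# List what survived the cleanup.
remaining="$(ls "$demo_dir")"
echo "$remaining"
```

Only the non-empty file should remain afterwards. The sketch adds `-type f` so that only regular files are considered, whereas the script above matches any zero-size entry under the data directory.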