902e1342a3
### What changes were proposed in this pull request?
Currently, the Jenkins pip packaging test fails intermittently with the error below:
```
Installing dist into virtual env
Processing ./python/dist/pyspark-3.1.0.dev0.tar.gz
Collecting py4j==0.10.9 (from pyspark==3.1.0.dev0)
  Downloading 6a4fb90cd2/py4j-0.10.9-py2.py3-none-any.whl (198kB)
Installing collected packages: py4j, pyspark
  Found existing installation: py4j 0.10.9
    Uninstalling py4j-0.10.9:
      Successfully uninstalled py4j-0.10.9
  Found existing installation: pyspark 3.1.0.dev0
Exception:
Traceback (most recent call last):
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/cli/base_command.py", line 179, in main
    status = self.run(options, args)
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/commands/install.py", line 393, in run
    use_user_site=options.use_user_site,
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/req/__init__.py", line 50, in install_given_reqs
    auto_confirm=True
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/req/req_install.py", line 816, in uninstall
    uninstalled_pathset = UninstallPathSet.from_dist(dist)
  File "/home/anaconda/envs/py36/lib/python3.6/site-packages/pip/_internal/req/req_uninstall.py", line 505, in from_dist
    '(at %s)' % (link_pointer, dist.project_name, dist.location)
AssertionError: Egg-link /home/jenkins/workspace/SparkPullRequestBuilder3/python does not match installed
```
- https://github.com/apache/spark/pull/29099#issuecomment-658073453 (amp-jenkins-worker-04)
- https://github.com/apache/spark/pull/29090#issuecomment-657819973 (amp-jenkins-worker-03)
It seems that a previous editable-mode installation on the same worker affects later builds of other PRs.
This PR works around the problem by removing the symbolic link left over from the previous editable installation. To my knowledge, this is a common workaround.
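The cleanup idea can be sketched in plain shell. This is only an illustration under stated assumptions, not the change made in this PR (the PR uses a `python3 -c` one-liner against the real site-packages directory). The helper name `remove_stale_egg_link`, the throwaway directory, and the fake workspace path are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): remove a leftover
# <project>.egg-link file from a given site-packages directory, if present.
remove_stale_egg_link() {
  local site_dir="$1"   # directory that may contain the egg-link
  local project="$2"    # project name, e.g. pyspark
  local link="$site_dir/$project.egg-link"
  if [ -f "$link" ]; then
    echo "Removing stale egg-link: $link"
    rm -f "$link"
  else
    echo "No egg-link found at $link"
  fi
}

# Demo against a throwaway directory, so no real installation is touched.
demo_dir="$(mktemp -d)"
# Fake egg-link pointing at a made-up workspace path.
printf '%s\n.\n' "/some/old/workspace/python" > "$demo_dir/pyspark.egg-link"
remove_stale_egg_link "$demo_dir" pyspark
```

Calling the helper a second time on the same directory is harmless: it only reports that no egg-link was found.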
### Why are the changes needed?
To recover the Jenkins build.
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
Jenkins build will test it out.
Closes #29102 from HyukjinKwon/SPARK-32303.
Lead-authored-by: HyukjinKwon <gurwls223@apache.org>
Co-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
139 lines
5.1 KiB
Bash
Executable file
#!/usr/bin/env bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Stop on error
set -e
# Set nullglob for when we are checking existence based on globs
shopt -s nullglob

FWDIR="$(cd "$(dirname "$0")"/..; pwd)"
cd "$FWDIR"

echo "Constructing virtual env for testing"
VIRTUALENV_BASE=$(mktemp -d)

# Clean up the virtual env environment used if we created one.
function delete_virtualenv() {
  echo "Cleaning up temporary directory - $VIRTUALENV_BASE"
  rm -rf "$VIRTUALENV_BASE"
}
trap delete_virtualenv EXIT

PYTHON_EXECS=()
# Some systems don't have pip or virtualenv - in those cases our tests won't work.
if hash virtualenv 2>/dev/null && [ ! -n "$USE_CONDA" ]; then
  echo "virtualenv installed - using. Note if this is a conda virtual env you may wish to set USE_CONDA"
  # test only against python3
  if hash python3 2>/dev/null; then
    PYTHON_EXECS=('python3')
  else
    echo "Python3 not installed on system, skipping pip installability tests"
    exit 0
  fi
elif hash conda 2>/dev/null; then
  echo "Using conda virtual environments"
  PYTHON_EXECS=('3.6')
  USE_CONDA=1
else
  echo "Missing virtualenv & conda, skipping pip installability tests"
  exit 0
fi
if ! hash pip 2>/dev/null; then
  echo "Missing pip, skipping pip installability tests."
  exit 0
fi

# Determine which version of PySpark we are building for archive name
PYSPARK_VERSION=$(python3 -c "exec(open('python/pyspark/version.py').read());print(__version__)")
PYSPARK_DIST="$FWDIR/python/dist/pyspark-$PYSPARK_VERSION.tar.gz"
# The pip install options we use for all the pip commands
PIP_OPTIONS="--user --upgrade --no-cache-dir --force-reinstall "
# Test both regular user and edit/dev install modes.
PIP_COMMANDS=("pip install $PIP_OPTIONS $PYSPARK_DIST"
              "pip install $PIP_OPTIONS -e python/")

for python in "${PYTHON_EXECS[@]}"; do
  for install_command in "${PIP_COMMANDS[@]}"; do
    echo "Testing pip installation with python $python"
    # Create a temp directory for us to work in and save its name to a file for cleanup
    echo "Using $VIRTUALENV_BASE for virtualenv"
    VIRTUALENV_PATH="$VIRTUALENV_BASE"/$python
    rm -rf "$VIRTUALENV_PATH"
    if [ -n "$USE_CONDA" ]; then
      if [ -f "$CONDA_PREFIX/etc/profile.d/conda.sh" ]; then
        # See also https://github.com/conda/conda/issues/7980
        source "$CONDA_PREFIX/etc/profile.d/conda.sh"
      fi
      conda create -y -p "$VIRTUALENV_PATH" python=$python numpy pandas pip setuptools
      conda activate "$VIRTUALENV_PATH" || (echo "Falling back to 'source activate'" && source activate "$VIRTUALENV_PATH")
    else
      mkdir -p "$VIRTUALENV_PATH"
      virtualenv --python=$python "$VIRTUALENV_PATH"
      source "$VIRTUALENV_PATH"/bin/activate
    fi
    # Upgrade pip & friends if using virtual env
    if [ ! -n "$USE_CONDA" ]; then
      pip install --upgrade pip wheel numpy
    fi

    echo "Creating pip installable source dist"
    cd "$FWDIR"/python
    # Delete the egg info file if it exists, this can cache the setup file.
    rm -rf pyspark.egg-info || echo "No existing egg info file, skipping deletion"
    # Also, delete the symbolic link if exists. It can be left over from the previous editable mode installation.
    python3 -c "from distutils.sysconfig import get_python_lib; import os; f = os.path.join(get_python_lib(), 'pyspark.egg-link'); os.unlink(f) if os.path.isfile(f) else 0"
    python3 setup.py sdist


    echo "Installing dist into virtual env"
    cd dist
    # Verify that the dist directory only contains one thing to install
    sdists=(*.tar.gz)
    if [ ${#sdists[@]} -ne 1 ]; then
      echo "Unexpected number of targets found in dist directory - please cleanup existing sdists first."
      exit -1
    fi
    # Do the actual installation
    cd "$FWDIR"
    $install_command

    cd /

    echo "Run basic sanity check on pip installed version with spark-submit"
    export PATH="$(python3 -m site --user-base)/bin:$PATH"
    spark-submit "$FWDIR"/dev/pip-sanity-check.py
    echo "Run basic sanity check with import based"
    python3 "$FWDIR"/dev/pip-sanity-check.py
    echo "Run the tests for context.py"
    python3 "$FWDIR"/python/pyspark/context.py

    cd "$FWDIR"

    # conda / virtualenv environments need to be deactivated differently
    if [ -n "$USE_CONDA" ]; then
      conda deactivate || (echo "Falling back to 'source deactivate'" && source deactivate)
    else
      deactivate
    fi

  done
done

exit 0
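
The sdist check in the script above depends on `shopt -s nullglob`: without it, a glob that matches nothing stays in the array as the literal pattern, so the count would be 1 even for an empty `dist` directory. A minimal sketch of that behaviour follows, using a throwaway directory; the helper name `count_matches` and the file name `pyspark-0.0.0.tar.gz` are made up for illustration.

```shell
#!/usr/bin/env bash
# With nullglob set, a glob that matches nothing expands to zero words
# instead of being left as the literal pattern.
shopt -s nullglob

# Hypothetical helper: print how many paths in a directory match a pattern.
count_matches() {
  local dir="$1" pattern="$2"
  local matches
  # $pattern is intentionally unquoted so that it is glob-expanded.
  matches=("$dir"/$pattern)
  echo "${#matches[@]}"
}

demo_dir="$(mktemp -d)"
# Count in an empty directory first, then again after adding one archive.
empty_count="$(count_matches "$demo_dir" '*.tar.gz')"
touch "$demo_dir/pyspark-0.0.0.tar.gz"
one_count="$(count_matches "$demo_dir" '*.tar.gz')"
echo "empty=$empty_count one=$one_count"   # prints "empty=0 one=1"
```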