spark-instrumented-optimizer/dev/lint-python
“attilapiros” bdcad33d8b [SPARK-34433][DOCS] Lock Jekyll version by Gemfile and Bundler
### What changes were proposed in this pull request?

This PR improves the documentation and release process by pinning the Jekyll version with a Gemfile and Bundler.

Some files and their responsibilities within this PR:
- `docs/.bundle/config` specifies the directory "docs/.local_ruby_bundle" as the destination for installing the Ruby packages, instead of the global location, which requires root access
- `docs/Gemfile` specifies the required Jekyll version and the other top-level gem versions
- `docs/Gemfile.lock` is generated by "bundle install". It contains the exact resolved versions of all gems: the top-level gems and all of their direct and transitive dependencies. When generated, it contains a platform-specific "PLATFORMS" section (in my case it was "universal-darwin-19"). This file must still be under version control, because an error is raised when the locked version of a gem does not match the one specified in `Gemfile` (e.g. if `Gemfile.lock` was generated for Jekyll 4.1.0 and the version in `Gemfile` is updated to 4.2.0, the error is: "The bundle currently has jekyll locked at 4.1.0."). This solution is also officially recommended in [the Bundler documentation](https://bundler.io/rationale.html#checking-your-code-into-version-control). To get rid of the specific platform (like "universal-darwin-19"), first add "ruby" as a platform ([which means this should work on every platform where Ruby runs](https://guides.rubygems.org/what-is-a-gem/)) by running "bundle lock --add-platform ruby"; the specific platform can then be removed with "bundle lock --remove-platform universal-darwin-19".
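
As a rough illustration of the platform cleanup described above, the sketch below lists the entries of the "PLATFORMS" section of a lock file, so one can check that only "ruby" remains after the two `bundle lock` commands. The function name `list_lock_platforms` is hypothetical and not part of this PR; it assumes the usual `Gemfile.lock` layout, where the section header is followed by indented entries and ends at the first blank line.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this PR): print the platforms recorded
# in a Gemfile.lock, one per line.
# Assumes the "PLATFORMS" header is followed by indented entries and that
# the section ends at the first blank line (or at end of file).
list_lock_platforms() {
    local lock_file="$1"
    awk '
        /^PLATFORMS$/      { in_section = 1; next }
        in_section && /^$/ { exit }
        in_section         { sub(/^ +/, ""); print }
    ' "$lock_file"
}

# The cleanup itself, as described above, would be run inside docs/:
#   bundle lock --add-platform ruby
#   bundle lock --remove-platform universal-darwin-19
```

After the cleanup, `list_lock_platforms docs/Gemfile.lock` should print only `ruby`.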

After this, the correct process to update the Jekyll version is the following:
1. update the version in `Gemfile`
2. run "bundle update" which updates the `Gemfile.lock`
3. commit both files
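
A minimal sketch of a sanity check one might run before step 3, assuming gems are pinned in `Gemfile` as `gem "name", "version"` lines (as in this PR) and that `Gemfile.lock` lists resolved gems as `name (version)` lines indented by four spaces. All helper names are hypothetical; Bundler itself performs the authoritative check.

```shell
#!/usr/bin/env bash

# Hypothetical helpers (not part of this PR).
# Note: the gem name is used as a regular expression, so a "." in a name
# matches any character; good enough for a sketch.

# Print the version pinned for a gem in a Gemfile, e.g. gem "jekyll", "4.2.0".
pinned_gem_version() {
    local gem_name="$1" gemfile="$2"
    sed -n "s/^gem \"$gem_name\", \"\([^\"]*\)\".*/\1/p" "$gemfile"
}

# Print the version resolved for a gem in a Gemfile.lock,
# e.g. the line "    jekyll (4.2.0)" in the specs section.
locked_gem_version() {
    local gem_name="$1" lock_file="$2"
    sed -n "s/^    $gem_name (\([^)]*\))\$/\1/p" "$lock_file"
}

# Return 0 when both files agree on the gem version, 1 otherwise.
lock_in_sync() {
    local gem_name="$1" gemfile="$2" lock_file="$3"
    local pinned locked
    pinned="$(pinned_gem_version "$gem_name" "$gemfile")"
    locked="$(locked_gem_version "$gem_name" "$lock_file")"
    [[ -n "$pinned" && "$pinned" == "$locked" ]]
}
```

For example, running `lock_in_sync jekyll Gemfile Gemfile.lock` inside `docs/` and getting a non-zero status would suggest that step 2 was skipped.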

This version update process was tested; for details, please check the testing section.

### Why are the changes needed?

Using different Jekyll versions can generate different output documents.
This PR standardizes the process.

### Does this PR introduce _any_ user-facing change?

No, assuming the release was done via docker by using `do-release-docker.sh`.
In that case, there should be no difference at all, as the same Jekyll version is specified in the Gemfile.

### How was this patch tested?

#### Testing document generation

The doc generation step was triggered via the Docker release:

```
$ ./do-release-docker.sh -d ~/working -n -s docs
...
========================
= Building documentation...
Command: /opt/spark-rm/release-build.sh docs
Log file: docs.log
Skipping publish step.
```

The docs.log contains the following:
```
Building Spark docs
Fetching gem metadata from https://rubygems.org/.........
Using bundler 2.2.9
Fetching rb-fsevent 0.10.4
Fetching forwardable-extended 2.6.0
Fetching public_suffix 4.0.6
Fetching colorator 1.1.0
Fetching eventmachine 1.2.7
Fetching http_parser.rb 0.6.0
Fetching ffi 1.14.2
Fetching concurrent-ruby 1.1.8
Installing colorator 1.1.0
Installing forwardable-extended 2.6.0
Installing rb-fsevent 0.10.4
Installing public_suffix 4.0.6
Installing http_parser.rb 0.6.0 with native extensions
Installing eventmachine 1.2.7 with native extensions
Installing concurrent-ruby 1.1.8
Fetching rexml 3.2.4
Fetching liquid 4.0.3
Installing ffi 1.14.2 with native extensions
Installing rexml 3.2.4
Installing liquid 4.0.3
Fetching mercenary 0.4.0
Installing mercenary 0.4.0
Fetching rouge 3.26.0
Installing rouge 3.26.0
Fetching safe_yaml 1.0.5
Installing safe_yaml 1.0.5
Fetching unicode-display_width 1.7.0
Installing unicode-display_width 1.7.0
Fetching webrick 1.7.0
Installing webrick 1.7.0
Fetching pathutil 0.16.2
Fetching kramdown 2.3.0
Fetching terminal-table 2.0.0
Fetching addressable 2.7.0
Fetching i18n 1.8.9
Installing terminal-table 2.0.0
Installing pathutil 0.16.2
Installing i18n 1.8.9
Installing addressable 2.7.0
Installing kramdown 2.3.0
Fetching kramdown-parser-gfm 1.1.0
Installing kramdown-parser-gfm 1.1.0
Fetching rb-inotify 0.10.1
Fetching sassc 2.4.0
Fetching em-websocket 0.5.2
Installing rb-inotify 0.10.1
Installing em-websocket 0.5.2
Installing sassc 2.4.0 with native extensions
Fetching listen 3.4.1
Installing listen 3.4.1
Fetching jekyll-watch 2.2.1
Installing jekyll-watch 2.2.1
Fetching jekyll-sass-converter 2.1.0
Installing jekyll-sass-converter 2.1.0
Fetching jekyll 4.2.0
Installing jekyll 4.2.0
Fetching jekyll-redirect-from 0.16.0
Installing jekyll-redirect-from 0.16.0
Bundle complete! 4 Gemfile dependencies, 30 gems now installed.
Bundled gems are installed into `./.local_ruby_bundle`
```

#### Testing Jekyll (or other gem) update

First, I reverted Jekyll to 4.1.0 locally:
```
$ rm Gemfile.lock
$ rm -rf .local_ruby_bundle

# edited Gemfile to use version 4.1.0
$ cat Gemfile
source "https://rubygems.org"

gem "jekyll", "4.1.0"
gem "rouge", "3.26.0"
gem "jekyll-redirect-from", "0.16.0"
gem "webrick", "1.7"
$ bundle install
...
```

Testing Jekyll version before the update:

```
$ bundle exec jekyll --version
jekyll 4.1.0
```

Imitating Jekyll update coming from git by reverting my local changes:

```
$ git checkout Gemfile
Updated 1 path from the index
$ cat Gemfile
source "https://rubygems.org"

gem "jekyll", "4.2.0"
gem "rouge", "3.26.0"
gem "jekyll-redirect-from", "0.16.0"
gem "webrick", "1.7"

$ git checkout Gemfile.lock
Updated 1 path from the index
```

Run the install:

```
$ bundle install
...
```

Checking the updated Jekyll version:
```
$ bundle exec jekyll --version
jekyll 4.2.0
```
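
The manual check above could be scripted along these lines. This is only a sketch: `jekyll_version_matches` is a hypothetical helper that reads the version line from standard input, assuming the `jekyll <version>` output format shown above.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of this PR): succeed when the version line
# read from stdin (format assumed: "jekyll <version>") matches the expected one.
jekyll_version_matches() {
    local expected="$1"
    local name actual
    # An empty input leaves both variables empty, so the comparison fails.
    read -r name actual || true
    [[ "$name" == "jekyll" && "$actual" == "$expected" ]]
}

# Intended use inside docs/ (requires Bundler and the installed gems):
#   bundle exec jekyll --version | jekyll_version_matches "4.2.0"
```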

Closes #31559 from attilapiros/pin-jekyll-version.

Lead-authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Co-authored-by: Attila Zsolt Piros <2017933+attilapiros@users.noreply.github.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2021-02-18 12:17:57 +09:00

290 lines
9.8 KiB
Bash
Executable file

#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# define test binaries + versions
FLAKE8_BUILD="flake8"
MINIMUM_FLAKE8="3.5.0"
MYPY_BUILD="mypy"
PYCODESTYLE_BUILD="pycodestyle"
MINIMUM_PYCODESTYLE="2.6.0"
SPHINX_BUILD="sphinx-build"
PYTHON_EXECUTABLE="python3"
function satisfies_min_version {
    local provided_version="$1"
    local expected_version="$2"
    echo "$(
        "$PYTHON_EXECUTABLE" << EOM
from setuptools.extern.packaging import version
print(version.parse('$provided_version') >= version.parse('$expected_version'))
EOM
    )"
}
function compile_python_test {
    local COMPILE_STATUS=
    local COMPILE_REPORT=
    if [[ ! "$1" ]]; then
        echo "No python files found! Something is very wrong -- exiting."
        exit 1;
    fi
    # compileall: https://docs.python.org/3/library/compileall.html
    echo "starting python compilation test..."
    COMPILE_REPORT=$( ("$PYTHON_EXECUTABLE" -B -mcompileall -q -l -x "[/\\\\][.]git" $1) 2>&1)
    COMPILE_STATUS=$?
    if [ $COMPILE_STATUS -ne 0 ]; then
        echo "Python compilation failed with the following errors:"
        echo "$COMPILE_REPORT"
        echo "$COMPILE_STATUS"
        exit "$COMPILE_STATUS"
    else
        echo "python compilation succeeded."
        echo
    fi
}
function pycodestyle_test {
    local PYCODESTYLE_STATUS=
    local PYCODESTYLE_REPORT=
    local RUN_LOCAL_PYCODESTYLE=
    local PYCODESTYLE_VERSION=
    local EXPECTED_PYCODESTYLE=
    local PYCODESTYLE_SCRIPT_PATH="$SPARK_ROOT_DIR/dev/pycodestyle-$MINIMUM_PYCODESTYLE.py"
    local PYCODESTYLE_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/PyCQA/pycodestyle/$MINIMUM_PYCODESTYLE/pycodestyle.py"
    if [[ ! "$1" ]]; then
        echo "No python files found! Something is very wrong -- exiting."
        exit 1;
    fi
    # check for locally installed pycodestyle & version
    RUN_LOCAL_PYCODESTYLE="False"
    if hash "$PYCODESTYLE_BUILD" 2> /dev/null; then
        PYCODESTYLE_VERSION="$($PYCODESTYLE_BUILD --version)"
        EXPECTED_PYCODESTYLE="$(satisfies_min_version $PYCODESTYLE_VERSION $MINIMUM_PYCODESTYLE)"
        if [ "$EXPECTED_PYCODESTYLE" == "True" ]; then
            RUN_LOCAL_PYCODESTYLE="True"
        fi
    fi
    # download the right version or run locally
    if [ $RUN_LOCAL_PYCODESTYLE == "False" ]; then
        # Get pycodestyle at runtime so that we don't rely on it being installed on the build server.
        # See: https://github.com/apache/spark/pull/1744#issuecomment-50982162
        # Updated to the latest official version of pep8. pep8 is formally renamed to pycodestyle.
        echo "downloading pycodestyle from $PYCODESTYLE_SCRIPT_REMOTE_PATH..."
        if [ ! -e "$PYCODESTYLE_SCRIPT_PATH" ]; then
            curl --silent -o "$PYCODESTYLE_SCRIPT_PATH" "$PYCODESTYLE_SCRIPT_REMOTE_PATH"
            local curl_status="$?"
            if [ "$curl_status" -ne 0 ]; then
                echo "Failed to download pycodestyle.py from $PYCODESTYLE_SCRIPT_REMOTE_PATH"
                exit "$curl_status"
            fi
        fi
        echo "starting pycodestyle test..."
        PYCODESTYLE_REPORT=$( ("$PYTHON_EXECUTABLE" "$PYCODESTYLE_SCRIPT_PATH" --config=dev/tox.ini $1) 2>&1)
        PYCODESTYLE_STATUS=$?
    else
        # we have the right version installed, so run locally
        echo "starting pycodestyle test..."
        PYCODESTYLE_REPORT=$( ($PYCODESTYLE_BUILD --config=dev/tox.ini $1) 2>&1)
        PYCODESTYLE_STATUS=$?
    fi
    if [ $PYCODESTYLE_STATUS -ne 0 ]; then
        echo "pycodestyle checks failed:"
        echo "$PYCODESTYLE_REPORT"
        exit "$PYCODESTYLE_STATUS"
    else
        echo "pycodestyle checks passed."
        echo
    fi
}
function mypy_test {
    local MYPY_REPORT=
    local MYPY_STATUS=
    # TODO(SPARK-32797): Install mypy on the Jenkins CI workers
    if ! hash "$MYPY_BUILD" 2> /dev/null; then
        echo "The $MYPY_BUILD command was not found. Skipping for now."
        return
    fi
    echo "starting $MYPY_BUILD test..."
    MYPY_REPORT=$( ($MYPY_BUILD --config-file python/mypy.ini python/pyspark) 2>&1)
    MYPY_STATUS=$?
    if [ "$MYPY_STATUS" -ne 0 ]; then
        echo "mypy checks failed:"
        echo "$MYPY_REPORT"
        echo "$MYPY_STATUS"
        exit "$MYPY_STATUS"
    else
        echo "mypy checks passed."
        echo
    fi
}
function flake8_test {
    local FLAKE8_VERSION=
    local EXPECTED_FLAKE8=
    local FLAKE8_REPORT=
    local FLAKE8_STATUS=
    if ! hash "$FLAKE8_BUILD" 2> /dev/null; then
        echo "The flake8 command was not found."
        echo "flake8 checks failed."
        exit 1
    fi
    _FLAKE8_VERSION=($($FLAKE8_BUILD --version))
    FLAKE8_VERSION="${_FLAKE8_VERSION[0]}"
    EXPECTED_FLAKE8="$(satisfies_min_version $FLAKE8_VERSION $MINIMUM_FLAKE8)"
    if [[ "$EXPECTED_FLAKE8" == "False" ]]; then
        echo "\
The minimum flake8 version needs to be $MINIMUM_FLAKE8. Your current version is $FLAKE8_VERSION
flake8 checks failed."
        exit 1
    fi
    echo "starting $FLAKE8_BUILD test..."
    FLAKE8_REPORT=$( ($FLAKE8_BUILD --append-config dev/tox.ini --count --show-source --statistics .) 2>&1)
    FLAKE8_STATUS=$?
    if [ "$FLAKE8_STATUS" -ne 0 ]; then
        echo "flake8 checks failed:"
        echo "$FLAKE8_REPORT"
        echo "$FLAKE8_STATUS"
        exit "$FLAKE8_STATUS"
    else
        echo "flake8 checks passed."
        echo
    fi
}
function sphinx_test {
    local SPHINX_REPORT=
    local SPHINX_STATUS=
    # Check that the documentation builds acceptably, skip check if sphinx is not installed.
    if ! hash "$SPHINX_BUILD" 2> /dev/null; then
        echo "The $SPHINX_BUILD command was not found. Skipping Sphinx build for now."
        echo
        return
    fi
    PYTHON_HAS_SPHINX=$("$PYTHON_EXECUTABLE" -c 'import importlib.util; print(importlib.util.find_spec("sphinx") is not None)')
    if [[ "$PYTHON_HAS_SPHINX" == "False" ]]; then
        echo "$PYTHON_EXECUTABLE does not have Sphinx installed. Skipping Sphinx build for now."
        echo
        return
    fi
    # TODO(SPARK-32407): Sphinx 3.1+ does not correctly index nested classes.
    # See also https://github.com/sphinx-doc/sphinx/issues/7551.
    PYTHON_HAS_SPHINX_3_0=$("$PYTHON_EXECUTABLE" -c 'from distutils.version import LooseVersion; import sphinx; print(LooseVersion(sphinx.__version__) < LooseVersion("3.1.0"))')
    if [[ "$PYTHON_HAS_SPHINX_3_0" == "False" ]]; then
        echo "$PYTHON_EXECUTABLE has Sphinx 3.1+ installed but it requires lower than 3.1. Skipping Sphinx build for now."
        echo
        return
    fi
    # TODO(SPARK-32391): Install pydata_sphinx_theme in Jenkins machines
    PYTHON_HAS_THEME=$("$PYTHON_EXECUTABLE" -c 'import importlib.util; print(importlib.util.find_spec("pydata_sphinx_theme") is not None)')
    if [[ "$PYTHON_HAS_THEME" == "False" ]]; then
        echo "$PYTHON_EXECUTABLE does not have pydata_sphinx_theme installed. Skipping Sphinx build for now."
        echo
        return
    fi
    # TODO(SPARK-32666): Install nbsphinx in Jenkins machines
    PYTHON_HAS_NBSPHINX=$("$PYTHON_EXECUTABLE" -c 'import importlib.util; print(importlib.util.find_spec("nbsphinx") is not None)')
    if [[ "$PYTHON_HAS_NBSPHINX" == "False" ]]; then
        echo "$PYTHON_EXECUTABLE does not have nbsphinx installed. Skipping Sphinx build for now."
        echo
        return
    fi
    # TODO(SPARK-32666): Install ipython in Jenkins machines
    PYTHON_HAS_IPYTHON=$("$PYTHON_EXECUTABLE" -c 'import importlib.util; print(importlib.util.find_spec("IPython") is not None)')
    if [[ "$PYTHON_HAS_IPYTHON" == "False" ]]; then
        echo "$PYTHON_EXECUTABLE does not have ipython installed. Skipping Sphinx build for now."
        echo
        return
    fi
    # TODO(SPARK-33242): Install numpydoc in Jenkins machines
    PYTHON_HAS_NUMPYDOC=$("$PYTHON_EXECUTABLE" -c 'import importlib.util; print(importlib.util.find_spec("numpydoc") is not None)')
    if [[ "$PYTHON_HAS_NUMPYDOC" == "False" ]]; then
        echo "$PYTHON_EXECUTABLE does not have numpydoc installed. Skipping Sphinx build for now."
        echo
        return
    fi
    echo "starting $SPHINX_BUILD tests..."
    pushd python/docs &> /dev/null
    make clean &> /dev/null
    # Treat warnings as errors so we stop correctly
    SPHINX_REPORT=$( (SPHINXOPTS="-a -W" make html) 2>&1)
    SPHINX_STATUS=$?
    if [ "$SPHINX_STATUS" -ne 0 ]; then
        echo "$SPHINX_BUILD checks failed:"
        echo "$SPHINX_REPORT"
        echo
        echo "re-running make html to print full warning list:"
        make clean &> /dev/null
        SPHINX_REPORT=$( (SPHINXOPTS="-a" make html) 2>&1)
        echo "$SPHINX_REPORT"
        exit "$SPHINX_STATUS"
    else
        echo "$SPHINX_BUILD checks passed."
        echo
    fi
    popd &> /dev/null
}
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
SPARK_ROOT_DIR="$(dirname "${SCRIPT_DIR}")"
pushd "$SPARK_ROOT_DIR" &> /dev/null
# skipping local ruby bundle directory from the search
PYTHON_SOURCE="$(find . -path ./docs/.local_ruby_bundle -prune -false -o -name "*.py")"
compile_python_test "$PYTHON_SOURCE"
pycodestyle_test "$PYTHON_SOURCE"
flake8_test
mypy_test
sphinx_test
echo
echo "all lint-python tests passed!"
popd &> /dev/null