spark-instrumented-optimizer/dev/lint-python
hyukjinkwon 46b2126024
[SPARK-19002][BUILD][PYTHON] Check pep8 against all Python scripts
## What changes were proposed in this pull request?

This PR proposes to check pep8 against all other Python scripts and fix the errors as below:

```bash
./dev/create-release/generate-contributors.py
./dev/create-release/releaseutils.py
./dev/create-release/translate-contributors.py
./dev/lint-python
./python/docs/epytext.py
./examples/src/main/python/mllib/decision_tree_classification_example.py
./examples/src/main/python/mllib/decision_tree_regression_example.py
./examples/src/main/python/mllib/gradient_boosting_classification_example.py
./examples/src/main/python/mllib/gradient_boosting_regression_example.py
./examples/src/main/python/mllib/linear_regression_with_sgd_example.py
./examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
./examples/src/main/python/mllib/naive_bayes_example.py
./examples/src/main/python/mllib/random_forest_classification_example.py
./examples/src/main/python/mllib/random_forest_regression_example.py
./examples/src/main/python/mllib/svm_with_sgd_example.py
./examples/src/main/python/streaming/network_wordjoinsentiments.py
./sql/hive/src/test/resources/data/scripts/cat.py
./sql/hive/src/test/resources/data/scripts/cat_error.py
./sql/hive/src/test/resources/data/scripts/doubleescapedtab.py
./sql/hive/src/test/resources/data/scripts/dumpdata_script.py
./sql/hive/src/test/resources/data/scripts/escapedcarriagereturn.py
./sql/hive/src/test/resources/data/scripts/escapednewline.py
./sql/hive/src/test/resources/data/scripts/escapedtab.py
./sql/hive/src/test/resources/data/scripts/input20_script.py
./sql/hive/src/test/resources/data/scripts/newline.py
```

## How was this patch tested?

- `./python/docs/epytext.py`

  ```bash
  cd ./python/docs $$ make html
  ```

- pep8 check (Python 2.7 / Python 3.3.6)

  ```
  ./dev/lint-python
  ```

- `./dev/merge_spark_pr.py` (Python 2.7 only / Python 3.3.6 not working)

  ```bash
  python -m doctest -v ./dev/merge_spark_pr.py
  ```

- `./dev/create-release/releaseutils.py` `./dev/create-release/generate-contributors.py` `./dev/create-release/translate-contributors.py` (Python 2.7 only / Python 3.3.6 not working)

  ```bash
  python generate-contributors.py
  python translate-contributors.py
  ```

- Examples (Python 2.7 / Python 3.3.6)

  ```bash
  ./bin/spark-submit examples/src/main/python/mllib/decision_tree_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/decision_tree_regression_example.py
  ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_regression_example.p
  ./bin/spark-submit examples/src/main/python/mllib/random_forest_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/random_forest_regression_example.py
  ```

- Examples (Python 2.7 only / Python 3.3.6 not working)
  ```
  ./bin/spark-submit examples/src/main/python/mllib/linear_regression_with_sgd_example.py
  ./bin/spark-submit examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
  ./bin/spark-submit examples/src/main/python/mllib/naive_bayes_example.py
  ./bin/spark-submit examples/src/main/python/mllib/svm_with_sgd_example.py
  ```

- `sql/hive/src/test/resources/data/scripts/*.py` (Python 2.7 / Python 3.3.6 within suggested changes)

  Manually tested only changed ones.

- `./dev/github_jira_sync.py` (Python 2.7 only / Python 3.3.6 not working)

  Manually tested this after disabling actually adding comments and links.

And also via Jenkins tests.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #16405 from HyukjinKwon/minor-pep8.
2017-01-02 15:23:19 +00:00

108 lines
3.9 KiB
Bash
Executable file

#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
SCRIPT_DIR="$( cd "$( dirname "$0" )" && pwd )"
SPARK_ROOT_DIR="$(dirname "$SCRIPT_DIR")"
# Exclude auto-geneated configuration file.
PATHS_TO_CHECK="$( cd "$SPARK_ROOT_DIR" && find . -name "*.py" -not -path "*python/docs/conf.py" )"
PEP8_REPORT_PATH="$SPARK_ROOT_DIR/dev/pep8-report.txt"
PYLINT_REPORT_PATH="$SPARK_ROOT_DIR/dev/pylint-report.txt"
PYLINT_INSTALL_INFO="$SPARK_ROOT_DIR/dev/pylint-info.txt"
SPHINXBUILD=${SPHINXBUILD:=sphinx-build}
SPHINX_REPORT_PATH="$SPARK_ROOT_DIR/dev/sphinx-report.txt"
cd "$SPARK_ROOT_DIR"
# compileall: https://docs.python.org/2/library/compileall.html
python -B -m compileall -q -l $PATHS_TO_CHECK > "$PEP8_REPORT_PATH"
compile_status="${PIPESTATUS[0]}"
# Get pep8 at runtime so that we don't rely on it being installed on the build server.
#+ See: https://github.com/apache/spark/pull/1744#issuecomment-50982162
#+ TODOs:
#+ - Download pep8 from PyPI. It's more "official".
PEP8_VERSION="1.7.0"
PEP8_SCRIPT_PATH="$SPARK_ROOT_DIR/dev/pep8-$PEP8_VERSION.py"
PEP8_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/jcrocholl/pep8/$PEP8_VERSION/pep8.py"
if [ ! -e "$PEP8_SCRIPT_PATH" ]; then
curl --silent -o "$PEP8_SCRIPT_PATH" "$PEP8_SCRIPT_REMOTE_PATH"
curl_status="$?"
if [ "$curl_status" -ne 0 ]; then
echo "Failed to download pep8.py from \"$PEP8_SCRIPT_REMOTE_PATH\"."
exit "$curl_status"
fi
fi
# Easy install pylint in /dev/pylint. To easy_install into a directory, the PYTHONPATH should
# be set to the directory.
# dev/pylint should be appended to the PATH variable as well.
# Jenkins by default installs the pylint3 version, so for now this just checks the code quality
# of python3.
export "PYTHONPATH=$SPARK_ROOT_DIR/dev/pylint"
export "PYLINT_HOME=$PYTHONPATH"
export "PATH=$PYTHONPATH:$PATH"
# There is no need to write this output to a file
#+ first, but we do so so that the check status can
#+ be output before the report, like with the
#+ scalastyle and RAT checks.
python "$PEP8_SCRIPT_PATH" --ignore=E402,E731,E241,W503,E226 --config=dev/tox.ini $PATHS_TO_CHECK >> "$PEP8_REPORT_PATH"
pep8_status="${PIPESTATUS[0]}"
if [ "$compile_status" -eq 0 -a "$pep8_status" -eq 0 ]; then
lint_status=0
else
lint_status=1
fi
if [ "$lint_status" -ne 0 ]; then
echo "PEP8 checks failed."
cat "$PEP8_REPORT_PATH"
rm "$PEP8_REPORT_PATH"
exit "$lint_status"
else
echo "PEP8 checks passed."
rm "$PEP8_REPORT_PATH"
fi
# Check that the documentation builds acceptably, skip check if sphinx is not installed.
if hash "$SPHINXBUILD" 2> /dev/null; then
cd python/docs
make clean
# Treat warnings as errors so we stop correctly
SPHINXOPTS="-a -W" make html &> "$SPHINX_REPORT_PATH" || lint_status=1
if [ "$lint_status" -ne 0 ]; then
echo "pydoc checks failed."
cat "$SPHINX_REPORT_PATH"
echo "re-running make html to print full warning list"
make clean
SPHINXOPTS="-a" make html
rm "$SPHINX_REPORT_PATH"
exit "$lint_status"
else
echo "pydoc checks passed."
rm "$SPHINX_REPORT_PATH"
fi
cd ../..
else
echo >&2 "The $SPHINXBUILD command was not found. Skipping pydoc checks for now"
fi