[SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with dev/run-tests module system
This patch refactors the `python/run-tests` script:
- It's now written in Python instead of Bash.
- The descriptions of the tests to run are now stored in `dev/run-tests`'s modules. This allows the pull request builder to skip Python test suites that were not affected by the pull request's changes. For example, we can now skip the PySpark Streaming test cases when only SQL files are changed.
- `python/run-tests` now supports command-line flags to make it easier to run individual test suites (this addresses SPARK-5482):
```
Usage: run-tests [options]

Options:
  -h, --help            show this help message and exit
  --python-executables=PYTHON_EXECUTABLES
                        A comma-separated list of Python executables to test
                        against (default: python2.6,python3.4,pypy)
  --modules=MODULES     A comma-separated list of Python modules to test
                        (default: pyspark-core,pyspark-ml,pyspark-mllib
                        ,pyspark-sql,pyspark-streaming)
```
- `dev/run-tests` has been split into multiple files: the module definitions and test utility functions are now stored inside of a `dev/sparktestsupport` Python module, allowing them to be re-used from the Python test runner script.
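As a rough illustration of how module definitions let a builder skip unaffected suites, the following minimal sketch maps changed files to modules and then follows reverse dependencies. The `Mod` class, the `modules_to_test` helper, and the sample regexes are hypothetical stand-ins chosen for this example; they are not the actual `dev/sparktestsupport` code.

```python
import re


class Mod(object):
    """Hypothetical, simplified stand-in for a test module definition."""

    def __init__(self, name, dependencies, source_file_regexes):
        self.name = name
        self.source_file_regexes = source_file_regexes
        self.dependent_modules = set()
        # Register this module as a dependent of each direct dependency.
        for dep in dependencies:
            dep.dependent_modules.add(self)

    def contains_file(self, filename):
        # re.match anchors at the start of the string, so each regex acts as a path prefix.
        return any(re.match(p, filename) for p in self.source_file_regexes)


def modules_to_test(changed_files, modules):
    """Return names of modules owning a changed file, plus everything depending on them."""
    pending = [m for m in modules if any(m.contains_file(f) for f in changed_files)]
    selected = set()
    while pending:
        m = pending.pop()
        if m not in selected:
            selected.add(m)
            pending.extend(m.dependent_modules)
    return sorted(m.name for m in selected)


# Sample definitions (illustrative only, not the real module list).
sql = Mod("sql", [], ["sql/"])
streaming = Mod("streaming", [], ["streaming"])
pyspark_sql = Mod("pyspark-sql", [sql], ["python/pyspark/sql"])
pyspark_streaming = Mod("pyspark-streaming", [streaming], ["python/pyspark/streaming"])
ALL = [sql, streaming, pyspark_sql, pyspark_streaming]

print(modules_to_test(["sql/core/src/main/Foo.scala"], ALL))
```

With only a SQL file changed, the sketch selects `sql` and `pyspark-sql` and leaves both streaming modules out, which mirrors the skipping behaviour described in the list above.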
Author: Josh Rosen <joshrosen@databricks.com>
Closes #6967 from JoshRosen/run-tests-python-modules and squashes the following commits:
f578d6d [Josh Rosen] Fix print for Python 2.x
8233d61 [Josh Rosen] Add python/run-tests.py to Python lint checks
34c98d2 [Josh Rosen] Fix universal_newlines for Python 3
8f65ed0 [Josh Rosen] Fix handling of module in python/run-tests
37aff00 [Josh Rosen] Python 3 fix
27a389f [Josh Rosen] Skip MLLib tests for PyPy
c364ccf [Josh Rosen] Use which() to convert PYSPARK_PYTHON to an absolute path before shelling out to run tests
568a3fd [Josh Rosen] Fix hashbang
3b852ae [Josh Rosen] Fall back to PYSPARK_PYTHON when sys.executable is None (fixes a test)
f53db55 [Josh Rosen] Remove python2 flag, since the test runner script also works fine under Python 3
9c80469 [Josh Rosen] Fix passing of PYSPARK_PYTHON
d33e525 [Josh Rosen] Merge remote-tracking branch 'origin/master' into run-tests-python-modules
4f8902c [Josh Rosen] Python lint fixes.
8f3244c [Josh Rosen] Use universal_newlines to fix dev/run-tests doctest failures on Python 3.
f542ac5 [Josh Rosen] Fix lint check for Python 3
fff4d09 [Josh Rosen] Add dev/sparktestsupport to pep8 checks
2efd594 [Josh Rosen] Update dev/run-tests to use new Python test runner flags
b2ab027 [Josh Rosen] Add command-line options for running individual suites in python/run-tests
caeb040 [Josh Rosen] Fixes to PySpark test module definitions
d6a77d3 [Josh Rosen] Fix the tests of dev/run-tests
def2d8a [Josh Rosen] Two minor fixes
aec0b8f [Josh Rosen] Actually get the Kafka stuff to run properly
04015b9 [Josh Rosen] First attempt at getting PySpark Kafka test to work in new runner script
4c97136 [Josh Rosen] PYTHONPATH fixes
dcc9c09 [Josh Rosen] Fix time division
32660fc [Josh Rosen] Initial cut at Python test runner refactoring
311c6a9 [Josh Rosen] Move shell utility functions to own module.
1bdeb87 [Josh Rosen] Move module definitions to separate file.
2015-06-27 23:24:34 -04:00
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import itertools
import re

all_modules = []


class Module(object):
    """
    A module is the basic abstraction in our test runner script. Each module consists of a set of
    source files, a set of test commands, and a set of dependencies on other modules. We use modules
    to define a dependency graph that lets us determine which tests to run based on which files have
    changed.
    """

    def __init__(self, name, dependencies, source_file_regexes, build_profile_flags=(), environ={},
                 sbt_test_goals=(), python_test_goals=(), blacklisted_python_implementations=(),
                 should_run_r_tests=False):
        """
        Define a new module.

        :param name: A short module name, for display in logging and error messages.
        :param dependencies: A set of dependencies for this module. This should only include direct
            dependencies; transitive dependencies are resolved automatically.
        :param source_file_regexes: a set of regexes that match source files belonging to this
            module. These regexes are applied by attempting to match at the beginning of the
            filename strings.
        :param build_profile_flags: A set of profile flags that should be passed to Maven or SBT in
            order to build and test this module (e.g. '-PprofileName').
        :param environ: A dict of environment variables that should be set when files in this
            module are changed.
        :param sbt_test_goals: A set of SBT test goals for testing this module.
        :param python_test_goals: A set of Python test goals for testing this module.
        :param blacklisted_python_implementations: A set of Python implementations that are not
            supported by this module's Python components. The values in this set should match
            strings returned by Python's `platform.python_implementation()`.
        :param should_run_r_tests: If true, changes in this module will trigger all R tests.
        """
        self.name = name
        self.dependencies = dependencies
        self.source_file_prefixes = source_file_regexes
        self.sbt_test_goals = sbt_test_goals
        self.build_profile_flags = build_profile_flags
        self.environ = environ
        self.python_test_goals = python_test_goals
        self.blacklisted_python_implementations = blacklisted_python_implementations
        self.should_run_r_tests = should_run_r_tests

        self.dependent_modules = set()
        for dep in dependencies:
            dep.dependent_modules.add(self)
        all_modules.append(self)

    def contains_file(self, filename):
        return any(re.match(p, filename) for p in self.source_file_prefixes)


sql = Module(
    name="sql",
    dependencies=[],
    source_file_regexes=[
        "sql/(?!hive-thriftserver)",
        "bin/spark-sql",
    ],
    build_profile_flags=[
        "-Phive",
    ],
    sbt_test_goals=[
        "catalyst/test",
        "sql/test",
        "hive/test",
    ]
)


hive_thriftserver = Module(
    name="hive-thriftserver",
    dependencies=[sql],
    source_file_regexes=[
        "sql/hive-thriftserver",
        "sbin/start-thriftserver.sh",
    ],
    build_profile_flags=[
        "-Phive-thriftserver",
    ],
    sbt_test_goals=[
        "hive-thriftserver/test",
    ]
)


graphx = Module(
    name="graphx",
    dependencies=[],
    source_file_regexes=[
        "graphx/",
    ],
    sbt_test_goals=[
        "graphx/test"
    ]
)


streaming = Module(
    name="streaming",
    dependencies=[],
    source_file_regexes=[
        "streaming",
    ],
    sbt_test_goals=[
        "streaming/test",
    ]
)


# Don't set the dependencies because changes in other modules should not trigger Kinesis tests.
# Kinesis tests depend on the external Amazon Kinesis service. We should run these tests only when
# files in streaming_kinesis_asl are changed, so that if Kinesis experiences an outage, we don't
# fail other PRs.
streaming_kinesis_asl = Module(
    name="kinesis-asl",
    dependencies=[],
    source_file_regexes=[
        "extras/kinesis-asl/",
    ],
    build_profile_flags=[
        "-Pkinesis-asl",
    ],
    environ={
        "ENABLE_KINESIS_TESTS": "1"
    },
    sbt_test_goals=[
        "kinesis-asl/test",
    ]
)


streaming_zeromq = Module(
    name="streaming-zeromq",
    dependencies=[streaming],
    source_file_regexes=[
        "external/zeromq",
    ],
    sbt_test_goals=[
        "streaming-zeromq/test",
    ]
)


streaming_twitter = Module(
    name="streaming-twitter",
    dependencies=[streaming],
    source_file_regexes=[
        "external/twitter",
    ],
    sbt_test_goals=[
        "streaming-twitter/test",
    ]
)


streaming_mqtt = Module(
    name="streaming-mqtt",
    dependencies=[streaming],
    source_file_regexes=[
        "external/mqtt",
    ],
    sbt_test_goals=[
        "streaming-mqtt/test",
    ]
)


streaming_kafka = Module(
    name="streaming-kafka",
    dependencies=[streaming],
    source_file_regexes=[
        "external/kafka",
        "external/kafka-assembly",
    ],
    sbt_test_goals=[
        "streaming-kafka/test",
    ]
)


streaming_flume_sink = Module(
    name="streaming-flume-sink",
    dependencies=[streaming],
    source_file_regexes=[
        "external/flume-sink",
    ],
    sbt_test_goals=[
        "streaming-flume-sink/test",
    ]
)


streaming_flume = Module(
    name="streaming-flume",
|
|
|
    dependencies=[streaming],
    source_file_regexes=[
        "external/flume",
    ],
    sbt_test_goals=[
        "streaming-flume/test",
    ]
)


streaming_flume_assembly = Module(
    name="streaming-flume-assembly",
    dependencies=[streaming_flume, streaming_flume_sink],
    source_file_regexes=[
        "external/flume-assembly",
    ]
)


mllib = Module(
    name="mllib",
    dependencies=[streaming, sql],
    source_file_regexes=[
        "data/mllib/",
        "mllib/",
    ],
    sbt_test_goals=[
        "mllib/test",
    ]
)


examples = Module(
    name="examples",
    dependencies=[graphx, mllib, streaming, sql],
    source_file_regexes=[
        "examples/",
    ],
    sbt_test_goals=[
        "examples/test",
    ]
)


pyspark_core = Module(
    name="pyspark-core",
    dependencies=[],
    source_file_regexes=[
        "python/(?!pyspark/(ml|mllib|sql|streaming))"
    ],
    python_test_goals=[
        "pyspark.rdd",
        "pyspark.context",
        "pyspark.conf",
        "pyspark.broadcast",
        "pyspark.accumulators",
        "pyspark.serializers",
        "pyspark.profiler",
        "pyspark.shuffle",
        "pyspark.tests",
    ]
)
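
The `pyspark-core` pattern above relies on a negative lookahead to exclude the sub-packages that have their own modules. The following is a minimal sketch of how that pattern behaves, assuming paths are tested with `re.match` (anchored at the start of the path). The helper `matches_core` and the file name `mlfoo.py` are hypothetical and exist only for illustration.

```python
import re

# Pattern copied from the pyspark-core module definition.
CORE_PATTERN = re.compile(r"python/(?!pyspark/(ml|mllib|sql|streaming))")


def matches_core(path):
    """Return True if `path` matches the pyspark-core pattern.

    Hypothetical helper; it assumes prefix matching via re.match.
    """
    return CORE_PATTERN.match(path) is not None


# Files outside the excluded sub-packages match the pattern.
print(matches_core("python/pyspark/rdd.py"))
# Files under python/pyspark/sql are rejected by the lookahead.
print(matches_core("python/pyspark/sql/types.py"))
# The lookahead has no trailing slash, so any name that merely starts
# with "ml" (a made-up file here) is rejected as well.
print(matches_core("python/pyspark/mlfoo.py"))
```

Because the alternatives are prefixes rather than full directory names, the exclusion is slightly broader than the four sub-packages it names.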


pyspark_sql = Module(
    name="pyspark-sql",
    dependencies=[pyspark_core, sql],
    source_file_regexes=[
        "python/pyspark/sql"
    ],
    python_test_goals=[
        "pyspark.sql.types",
        "pyspark.sql.context",
        "pyspark.sql.column",
        "pyspark.sql.dataframe",
        "pyspark.sql.group",
        "pyspark.sql.functions",
        "pyspark.sql.readwriter",
        "pyspark.sql.window",
        "pyspark.sql.tests",
    ]
)


pyspark_streaming = Module(
    name="pyspark-streaming",
    dependencies=[pyspark_core, streaming, streaming_kafka, streaming_flume_assembly],
    source_file_regexes=[
        "python/pyspark/streaming"
    ],
    python_test_goals=[
        "pyspark.streaming.util",
        "pyspark.streaming.tests",
    ]
)


pyspark_mllib = Module(
    name="pyspark-mllib",
    dependencies=[pyspark_core, pyspark_streaming, pyspark_sql, mllib],
    source_file_regexes=[
        "python/pyspark/mllib"
    ],
    python_test_goals=[
        "pyspark.mllib.classification",
        "pyspark.mllib.clustering",
        "pyspark.mllib.evaluation",
        "pyspark.mllib.feature",
        "pyspark.mllib.fpm",
        "pyspark.mllib.linalg",
        "pyspark.mllib.random",
        "pyspark.mllib.recommendation",
        "pyspark.mllib.regression",
        "pyspark.mllib.stat._statistics",
        "pyspark.mllib.stat.KernelDensity",
        "pyspark.mllib.tree",
        "pyspark.mllib.util",
        "pyspark.mllib.tests",
    ],
    blacklisted_python_implementations=[
        "PyPy"  # Skip these tests under PyPy since they require numpy and it isn't available there
    ]
)
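
The `blacklisted_python_implementations` field above names interpreters under which a module's Python tests should be skipped. As a rough sketch of how a runner might honor it, the hypothetical function `goals_for_implementation` below compares the list against the string returned by the standard library's `platform.python_implementation()`. This is an assumption about one possible approach, not a description of what `python/run-tests` actually does.

```python
import platform


def goals_for_implementation(test_goals, blacklisted, implementation=None):
    """Return the test goals to run, or an empty list if the interpreter is blacklisted.

    Hypothetical helper. `implementation` defaults to the interpreter running
    this code, e.g. "CPython" or "PyPy".
    """
    if implementation is None:
        implementation = platform.python_implementation()
    if implementation in blacklisted:
        return []
    return list(test_goals)


mllib_goals = ["pyspark.mllib.linalg", "pyspark.mllib.tests"]
print(goals_for_implementation(mllib_goals, ["PyPy"], implementation="PyPy"))
print(goals_for_implementation(mllib_goals, ["PyPy"], implementation="CPython"))
```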


pyspark_ml = Module(
    name="pyspark-ml",
    dependencies=[pyspark_core, pyspark_mllib],
    source_file_regexes=[
        "python/pyspark/ml/"
    ],
    python_test_goals=[
        "pyspark.ml.feature",
        "pyspark.ml.classification",
        "pyspark.ml.clustering",
        "pyspark.ml.recommendation",
        "pyspark.ml.regression",
        "pyspark.ml.tuning",
        "pyspark.ml.tests",
        "pyspark.ml.evaluation",
    ],
    blacklisted_python_implementations=[
        "PyPy"  # Skip these tests under PyPy since they require numpy and it isn't available there
    ]
)


sparkr = Module(
    name="sparkr",
    dependencies=[sql, mllib],
    source_file_regexes=[
        "R/",
    ],
    should_run_r_tests=True
)


docs = Module(
    name="docs",
    dependencies=[],
    source_file_regexes=[
        "docs/",
    ]
)


ec2 = Module(
    name="ec2",
    dependencies=[],
    source_file_regexes=[
        "ec2/",
    ]
)


# The root module is a dummy module which is used to run all of the tests.
# No other modules should directly depend on this module.
root = Module(
    name="root",
    dependencies=[],
    source_file_regexes=[],
    # In order to run all of the tests, enable every test profile:
    build_profile_flags=list(set(
        itertools.chain.from_iterable(m.build_profile_flags for m in all_modules))),
    sbt_test_goals=[
        "test",
    ],
    python_test_goals=list(itertools.chain.from_iterable(m.python_test_goals for m in all_modules)),
    should_run_r_tests=True
)
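
The definitions above pair each module with `source_file_regexes` and `dependencies`, which is what lets a build skip suites unaffected by a change. The sketch below illustrates one way such a selection could work: find the modules whose patterns match the changed files, then add everything that depends on them. `SketchModule`, `modules_to_test`, and the simplified module graph (including the `"sql/"` and `"streaming/"` patterns) are hypothetical stand-ins, not the real `Module` class or the logic in `dev/run-tests`.

```python
import re


class SketchModule(object):
    """Hypothetical, simplified stand-in for the Module class used above."""

    def __init__(self, name, dependencies, source_file_regexes):
        self.name = name
        self.dependencies = dependencies
        self.source_file_regexes = source_file_regexes
        self.dependent_modules = set()
        # Register this module as a dependent of each of its dependencies.
        for dep in dependencies:
            dep.dependent_modules.add(self)

    def contains_file(self, filename):
        # Assumes patterns are matched against the start of the path.
        return any(re.match(p, filename) for p in self.source_file_regexes)


def modules_to_test(changed_files, modules):
    """Return sorted names of modules touched by the change plus their transitive dependents."""
    selected = set()
    stack = [m for m in modules if any(m.contains_file(f) for f in changed_files)]
    while stack:
        module = stack.pop()
        if module not in selected:
            selected.add(module)
            stack.extend(module.dependent_modules)
    return sorted(m.name for m in selected)


# Illustrative subset of the module graph; patterns for sql and streaming are assumed.
sql = SketchModule("sql", [], ["sql/"])
streaming = SketchModule("streaming", [], ["streaming/"])
pyspark_core = SketchModule(
    "pyspark-core", [], ["python/(?!pyspark/(ml|mllib|sql|streaming))"])
pyspark_sql = SketchModule("pyspark-sql", [pyspark_core, sql], ["python/pyspark/sql"])
pyspark_streaming = SketchModule(
    "pyspark-streaming", [pyspark_core, streaming], ["python/pyspark/streaming"])
sketch_modules = [sql, streaming, pyspark_core, pyspark_sql, pyspark_streaming]

# A change confined to SQL sources leaves pyspark-streaming out of the result.
print(modules_to_test(["sql/core/src/Foo.scala"], sketch_modules))
# A change to core PySpark code pulls in every module that depends on pyspark-core.
print(modules_to_test(["python/pyspark/rdd.py"], sketch_modules))
```

Storing the reverse edges (`dependent_modules`) when each module is constructed keeps the selection step a simple graph traversal, at the cost of requiring dependencies to be defined before the modules that use them.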