[SPARK-32812][PYTHON][TESTS] Avoid initiating a process during the main process for run-tests.py

### What changes were proposed in this pull request?

In certain environments, seems it fails to run `run-tests.py` script as below:

```
Traceback (most recent call last):
 File "<string>", line 1, in <module>
...

raise RuntimeError('''
RuntimeError:
 An attempt has been made to start a new process before the
 current process has finished its bootstrapping phase.

This probably means that you are not using fork to start your
 child processes and you have forgotten to use the proper idiom
 in the main module:

if __name__ == '__main__':
 freeze_support()
 ...

The "freeze_support()" line can be omitted if the program
 is not going to be frozen to produce an executable.
Traceback (most recent call last):
...
 raise EOFError
EOFError

```

The reason is that `Manager.dict()` launches another process when the main process is initiated.

It works in most environments for an unknown reason but it should be good to avoid such pattern as guided from Python itself.

### Why are the changes needed?

To prevent the test failure for Python.

### Does this PR introduce _any_ user-facing change?

No, it fixes a test script.

### How was this patch tested?

Manually ran the script after fixing.

```
Running PySpark tests. Output is in /.../python/unit-tests.log
Will test against the following Python executables: ['/.../python3', 'python3.8']
Will test the following Python tests: ['pyspark.sql.dataframe']
/.../python3 python_implementation is CPython
/.../python3 version is: Python 3.8.5
python3.8 python_implementation is CPython
python3.8 version is: Python 3.8.5
Starting test(/.../python3): pyspark.sql.dataframe
Starting test(python3.8): pyspark.sql.dataframe
Finished test(/.../python3): pyspark.sql.dataframe (33s)
Finished test(python3.8): pyspark.sql.dataframe (34s)
Tests passed in 34 seconds
```

Closes #29666 from itholic/SPARK-32812.

Authored-by: itholic <haejoon309@naver.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
This commit is contained in:
itholic 2020-09-08 12:22:13 +09:00 committed by HyukjinKwon
parent c336ae39cd
commit c8c082ce38

View file

@ -48,7 +48,7 @@ def print_red(text):
print('\033[31m' + text + '\033[0m')
SKIPPED_TESTS = Manager().dict()
SKIPPED_TESTS = None
LOG_FILE = os.path.join(SPARK_HOME, "python/unit-tests.log")
FAILURE_REPORTING_LOCK = Lock()
LOGGER = logging.getLogger()
@ -138,6 +138,7 @@ def run_individual_python_test(target_dir, test_name, pyspark_python):
skipped_counts = len(skipped_tests)
if skipped_counts > 0:
key = (pyspark_python, test_name)
assert SKIPPED_TESTS is not None
SKIPPED_TESTS[key] = skipped_tests
per_test_output.close()
except:
@ -318,4 +319,5 @@ def main():
if __name__ == "__main__":
SKIPPED_TESTS = Manager().dict()
main()