spark-instrumented-optimizer/python/pyspark/tests
Takuya UESHIN 55111cafd1 [SPARK-36062][PYTHON] Try to capture faulthanlder when a Python worker crashes
### What changes were proposed in this pull request?

Try to capture the error message from the `faulthandler` when the Python worker crashes.

### Why are the changes needed?

Currently, we just see an error message saying `"exited unexpectedly (crashed)"` when the UDFs causes the Python worker to crash by like segmentation fault.
We should take advantage of [`faulthandler`](https://docs.python.org/3/library/faulthandler.html) and try to capture the error message from the `faulthandler`.

### Does this PR introduce _any_ user-facing change?

Yes, when a Spark config `spark.python.worker.faulthandler.enabled` is `true`, the stack trace will be seen in the error message when the Python worker crashes.

```py
>>> def f():
...   import ctypes
...   ctypes.string_at(0)
...
>>> sc.parallelize([1]).map(lambda x: f()).count()
```

```
org.apache.spark.SparkException: Python worker exited unexpectedly (crashed): Fatal Python error: Segmentation fault

Current thread 0x000000010965b5c0 (most recent call first):
  File "/.../ctypes/__init__.py", line 525 in string_at
  File "<stdin>", line 3 in f
  File "<stdin>", line 1 in <lambda>
...
```

### How was this patch tested?

Added some tests, and manually.

Closes #33273 from ueshin/issues/SPARK-36062/faulthandler.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 115b8a180f)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-07-09 11:31:00 +09:00
..
__init__.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_appsubmit.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
test_broadcast.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_conf.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_context.py [SPARK-35303][PYTHON] Enable pinned thread mode by default 2021-06-18 12:02:29 +09:00
test_daemon.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_install_spark.py [SPARK-32714][FOLLOW-UP][PYTHON] Address pyspark.install typing errors 2020-09-27 16:21:23 +09:00
test_join.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_pin_thread.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_profiler.py [SPARK-33613][PYTHON][TESTS] Replace deprecated APIs in pyspark tests 2020-12-01 10:34:40 +09:00
test_rdd.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
test_rddbarrier.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_readwrite.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_serializers.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
test_shuffle.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_taskcontext.py [SPARK-32194][PYTHON] Use proper exception classes instead of plain Exception 2021-05-26 11:54:40 +09:00
test_util.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_worker.py [SPARK-36062][PYTHON] Try to capture faulthanlder when a Python worker crashes 2021-07-09 11:31:00 +09:00