ddbce4edee
### What changes were proposed in this pull request? This PR try to fix a bug in `org.apache.spark.sql.hive.execution.ScriptTransformationExec`. This bug appears in our online cluster. `ScriptTransformationExec` should throw an exception, when user uses a python script which contains parse error. But current implementation may miss this case of failure. ### Why are the changes needed? When user uses a python script which contains a parse error, there will be no output. So `scriptOutputReader.next(scriptOutputWritable) <= 0` matches, then we use `checkFailureAndPropagate()` to check the `proc`. But the `proc` may still be alive and `writerThread.exception` is not defined, `checkFailureAndPropagate` cannot check this case of failure. In the end, the Spark SQL job runs successfully and returns no result. In fact, the SparK SQL job should fails and shows the exception properly. For example, the error python script is blow. ``` python # encoding: utf8 import unknow_module import sys for line in sys.stdin: print line ``` The bug can be reproduced by running the following code in our cluter. ``` spark.range(100*100).toDF("index").createOrReplaceTempView("test") spark.sql("select TRANSFORM(index) USING 'python error_python.py' as new_index from test").collect.foreach(println) ``` ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing UT Closes #27724 from slamke/transformation. Authored-by: sunke.03 <sunke.03@bytedance.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> |
||
---|---|---|
.. | ||
benchmarks | ||
compatibility/src/test/scala/org/apache/spark/sql/hive/execution | ||
src | ||
pom.xml |