spark-instrumented-optimizer

History

HyukjinKwon 7c05f61514 [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark

## What changes were proposed in this pull request?

Currently, the pretty skipped-test message mechanism added by `f7435bec6a` does not seem to work when xmlrunner is installed.

This PR fixes two things:

1. When `xmlrunner` is installed, it seems `xmlrunner` does not respect the `verbosity` level in unittests (default is level 1). So the output looks as below:

   ```
   Running tests...
   ----------------------------------------------------------------------
   SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
   ----------------------------------------------------------------------
   ```

   So it is not caught by our message detection mechanism.

2. If we manually set the `verbosity` level for `xmlrunner`, it prints messages as below:

   ```
   test_mixed_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
   test_mixed_udf_and_sql (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
   ...
   ```

   This is different on our Jenkins machine:

   ```
   test_createDataFrame_column_name_encoding (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
   test_createDataFrame_does_not_modify_input (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
   ...
   ```

   Note that the trailing `SKIP` is different. This PR fixes the regular expression to catch the `SKIP` case as well.

## How was this patch tested?

Manually tested.

Before:

```
Starting test(python2.7): pyspark....
Finished test(python2.7): pyspark.... (0s)
...
Tests passed in 562 seconds
========================================================================
...
```

After:

```
Starting test(python2.7): pyspark....
Finished test(python2.7): pyspark.... (48s) ... 93 tests were skipped
...
Tests passed in 560 seconds
Skipped tests pyspark.... with python2.7:
pyspark...(...) ... SKIP (0.000s)
...
========================================================================
...
```

Closes #24927 from HyukjinKwon/SPARK-28130.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

2019-06-24 09:58:17 +09:00
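The detection change described in the commit message can be illustrated with a small sketch. This is not the regular expression from the patch itself; the pattern `SKIP_LINE`, the helper `find_skipped`, and the passing `test_ok` sample line are hypothetical names and data introduced only to show how one pattern might match both the plain unittest form (`skipped '...'`) and the xmlrunner form (`SKIP (0.000s)`).

```python
import re

# Hypothetical pattern (an assumption, not the one in the patch): accept either
# unittest's "skipped '<reason>'" suffix or xmlrunner's "SKIP (<time>)" suffix.
SKIP_LINE = re.compile(r"^test_.* \(pyspark\..*\) \.\.\. (?:skipped|SKIP)\b")


def find_skipped(lines):
    """Return the per-test output lines that report a skipped test."""
    return [line for line in lines if SKIP_LINE.search(line)]


sample = [
    # xmlrunner-style line, as quoted in the commit message.
    "test_mixed_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests)"
    " ... SKIP (0.000s)",
    # Plain unittest-style line, as quoted in the commit message.
    "test_createDataFrame_column_name_encoding (pyspark.sql.tests.test_arrow.ArrowTests)"
    " ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'",
    # Hypothetical passing test, added here only as a negative case.
    "test_ok (pyspark.sql.tests.test_arrow.ArrowTests) ... ok",
]

skipped = find_skipped(sample)
print(len(skipped), "tests were skipped")
```

Matching on the suffix keyword rather than the full reason string keeps the check independent of which runner produced the output.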
..
linalg	[SPARK-9792] Make DenseMatrix equality semantical	2019-04-01 09:30:33 -07:00
param	[SPARK-24333][ML][PYTHON] Add fit with validation set to spark.ml GBT: Python API	2018-12-07 13:53:35 -08:00
tests	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
__init__.py	[SPARK-24477][SPARK-24454][ML][PYTHON] Imports submodule in ml/__init__.py and add ImageSchema into __all__	2018-06-08 09:32:11 -07:00
base.py	[SPARK-22922][ML][PYSPARK] Pyspark portion of the fit-multiple API	2017-12-29 16:31:25 -08:00
classification.py	[SPARK-27007][PYTHON] add rawPrediction to OneVsRest in PySpark	2019-03-02 09:09:28 -06:00
clustering.py	[SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00
common.py	[SPARK-17679] [PYSPARK] remove unnecessary Py4J ListConverter patch	2016-10-03 14:12:03 -07:00
evaluation.py	[SPARK-28044][ML][PYTHON] MulticlassClassificationEvaluator support more metrics	2019-06-19 08:56:15 -05:00
feature.py	[SPARK-18570][ML][R] RFormula support * and ^ operators	2019-06-04 08:59:30 -05:00
fpm.py	[SPARK-26640][CORE][ML][SQL][STREAMING][PYSPARK] Code cleanup from lgtm.com analysis	2019-01-17 19:40:39 -06:00
image.py	[SPARK-26559][ML][PYSPARK] ML image can't work with numpy versions prior to 1.9	2019-01-07 18:36:52 +08:00
pipeline.py	[SPARK-17025][ML][PYTHON] Persistence for Pipelines with Python-only Stages	2017-08-11 23:57:08 -07:00
recommendation.py	[SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00
regression.py	[SPARK-19591][ML][PYSPARK][FOLLOWUP] Add sample weights to decision trees	2019-02-27 21:11:30 -06:00
stat.py	[MINOR] Fix typos and misspellings	2018-11-05 17:34:23 -06:00
tuning.py	[SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00
util.py	[SPARK-26754][PYTHON] Add hasTrainingSummary to replace duplicate code in PySpark	2019-02-01 17:29:58 -06:00
wrapper.py	[SPARK-22798][PYTHON][ML] Add multiple column support to PySpark StringIndexer	2019-02-20 08:52:46 -06:00