spark-instrumented-optimizer

zero323 df22535bbd [SPARK-28985][PYTHON][ML][FOLLOW-UP] Add _AFTSurvivalRegressionParams

### What changes were proposed in this pull request?

Adds

```python
class _AFTSurvivalRegressionParams(HasFeaturesCol, HasLabelCol, HasPredictionCol, HasMaxIter, HasTol, HasFitIntercept, HasAggregationDepth): ...
```

with related Params and uses it to replace `HasFitIntercept`, `HasMaxIter`, `HasTol` and `HasAggregationDepth` in `AFTSurvivalRegression` base classes and `JavaPredictionModel` in `AFTSurvivalRegressionModel` base classes.

### Why are the changes needed?

Previous work (#25776) on [SPARK-28985](https://issues.apache.org/jira/browse/SPARK-28985) replaced `JavaEstimator`, `HasFeaturesCol`, `HasLabelCol`, `HasPredictionCol` in `AFTSurvivalRegression` and `JavaModel` in `AFTSurvivalRegressionModel` with the newly added `JavaPredictor`:

`e97b55d322/python/pyspark/ml/wrapper.py (L377)`

and `JavaPredictionModel`

`e97b55d322/python/pyspark/ml/wrapper.py (L405)`

respectively.

This, however, is inconsistent with the Scala counterpart, where both classes extend the private `AFTSurvivalRegressionBase`

`eb037a8180/mllib/src/main/scala/org/apache/spark/ml/regression/AFTSurvivalRegression.scala (L48-L50)`

This preserves some of the existing inconsistencies (variables as defined in [the official example](https://github.com/apache/spark/blob/master/examples/src/main/python/ml/aft_survival_regression.p))

```
from pyspark.ml.regression import AFTSurvivalRegression, AFTSurvivalRegressionModel
from pyspark.ml.param.shared import HasMaxIter, HasTol, HasFitIntercept, HasAggregationDepth
from pyspark.ml.param import Param

issubclass(AFTSurvivalRegressionModel, HasMaxIter)
# False
hasattr(model, "maxIter") and isinstance(model.maxIter, Param)
# True

issubclass(AFTSurvivalRegressionModel, HasTol)
# False
hasattr(model, "tol") and isinstance(model.tol, Param)
# True
```

and can cause problems in the future if the Predictor / PredictionModel API changes (unlike [`IsotonicRegression`](https://github.com/apache/spark/pull/26023), the current implementation is, technically speaking, correct, though incomplete).

### Does this PR introduce any user-facing change?

Yes, it adds a number of base classes to `AFTSurvivalRegressionModel`. These changes are purely additive and have negligible potential for breaking existing code (and none, compared to changes already made in #25776). Additionally, the affected API hasn't been released in its current form yet.

### How was this patch tested?

- Existing unit tests.
- Manual testing.

CC huaxingao, zhengruifeng

Closes #26024 from zero323/SPARK-28985-FOLLOW-UP-aftsurival-regression.

Authored-by: zero323 <mszymkiewicz@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>

2019-10-04 18:04:21 -05:00
..
linalg	[SPARK-28206][PYTHON] Remove the legacy Epydoc in PySpark API documentation	2019-07-05 10:08:22 -07:00
param	[SPARK-22797][ML][PYTHON] Bucketizer support multi-column	2019-09-17 11:52:20 +08:00
tests	[SPARK-28985][PYTHON][ML] Add common classes (JavaPredictor/JavaClassificationModel/JavaProbabilisticClassifier) in PYTHON	2019-09-19 08:17:25 -05:00
__init__.py	[SPARK-24477][SPARK-24454][ML][PYTHON] Imports submodule in ml/__init__.py and add ImageSchema into __all__	2018-06-08 09:32:11 -07:00
base.py	[SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations	2019-09-01 10:15:00 -05:00
classification.py	[SPARK-28985][PYTHON][ML] Add common classes (JavaPredictor/JavaClassificationModel/JavaProbabilisticClassifier) in PYTHON	2019-09-19 08:17:25 -05:00
clustering.py	[SPARK-29142][PYTHON][ML][FOLLOWUP][DOC] Replace incorrect :py:attr: applications	2019-10-03 02:45:44 -07:00
common.py	[SPARK-17679] [PYSPARK] remove unnecessary Py4J ListConverter patch	2016-10-03 14:12:03 -07:00
evaluation.py	[SPARK-29258][ML][PYSPARK] parity between ml.evaluator and mllib.metrics	2019-09-27 13:30:03 +08:00
feature.py	[SPARK-22796][PYTHON][ML] Add multiple columns support to PySpark QuantileDiscretizer	2019-09-18 12:16:06 -07:00
fpm.py	[SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations	2019-09-01 10:15:00 -05:00
image.py	[SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0	2019-07-31 14:26:18 +09:00
pipeline.py	[SPARK-17025][ML][PYTHON] Persistence for Pipelines with Python-only Stages	2017-08-11 23:57:08 -07:00
recommendation.py	[SPARK-28927][ML] Rethrow block mismatch exception in ALS when input data is nondeterministic	2019-09-18 09:22:13 -05:00
regression.py	[SPARK-28985][PYTHON][ML][FOLLOW-UP] Add _AFTSurvivalRegressionParams	2019-10-04 18:04:21 -05:00
stat.py	[SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations	2019-09-01 10:15:00 -05:00
tuning.py	[SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations	2019-09-01 10:15:00 -05:00
util.py	[SPARK-28985][PYTHON][ML] Add common classes (JavaPredictor/JavaClassificationModel/JavaProbabilisticClassifier) in PYTHON	2019-09-19 08:17:25 -05:00
wrapper.py	[SPARK-28985][PYTHON][ML] Add common classes (JavaPredictor/JavaClassificationModel/JavaProbabilisticClassifier) in PYTHON	2019-09-19 08:17:25 -05:00