spark-instrumented-optimizer

Huaxin Gao f83cb3cbb3 [SPARK-31925][ML] Summary.totalIterations greater than maxIters

### What changes were proposed in this pull request?
In LogisticRegression and LinearRegression, if maxIter=n is set, model.summary.totalIterations returns n+1 when the training procedure does not terminate early. This is because ```objectiveHistory.length``` is used as totalIterations, but ```objectiveHistory``` contains the initial state, so ```objectiveHistory.length``` is 1 larger than the number of training iterations.

### Why are the changes needed?
correctness

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added new tests and modified existing tests.

Closes #28786 from huaxingao/summary_iter.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Sean Owen <srowen@gmail.com>

2020-06-15 08:49:03 -05:00
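
The off-by-one described in the commit message can be illustrated with a small sketch in plain Python. This is not Spark's code: `train`, `objective_history`, `buggy_total`, and `fixed_total` are hypothetical names, and the toy optimizer (gradient descent on f(x) = x**2) only assumes, as the commit message states, that the objective history records the initial state before the first iteration.

```python
def train(max_iter, tol=0.0):
    """Toy gradient descent on f(x) = x**2 that records the objective value.

    Hypothetical stand-in for an optimizer; it is not Spark's implementation.
    """
    x = 10.0
    # The initial state is recorded before any training iteration runs.
    objective_history = [x * x]
    for _ in range(max_iter):
        x -= 0.1 * 2 * x  # one gradient step with learning rate 0.1
        objective_history.append(x * x)
        # Stop early once the objective no longer improves by more than tol.
        if abs(objective_history[-2] - objective_history[-1]) <= tol:
            break
    return objective_history


history = train(max_iter=5)

# Counting every entry also counts the initial state, giving max_iter + 1.
buggy_total = len(history)

# Subtracting the initial-state entry gives the number of iterations run.
fixed_total = len(history) - 1
```

With `max_iter=5` and no early stop, the history holds six values, so only the second count matches the iteration limit.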
__init__.py	[SPARK-26033][PYTHON][TESTS] Break large ml/tests.py file into smaller files	2018-11-18 16:02:15 +08:00
test_algorithms.py	[MINOR][ML] Change DecisionTreeClassifier to FMClassifier in OneVsRest setWeightCol test	2020-01-17 10:04:41 +08:00
test_base.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_evaluation.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_feature.py	[SPARK-23578][ML][PYSPARK] Binarizer support multi-column	2019-10-16 18:32:07 +08:00
test_image.py	[SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3	2019-09-09 10:19:40 -05:00
test_linalg.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_param.py	[SPARK-31652][ML][PYSPARK] Add ANOVASelector and FValueSelector to PySpark	2020-05-08 11:02:24 +08:00
test_persistence.py	[SPARK-30504][PYTHON][ML] Set weightCol in OneVsRest(Model) _to_java and _from_java	2020-01-15 08:42:24 -06:00
test_pipeline.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_stat.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_training_summary.py	[SPARK-31925][ML] Summary.totalIterations greater than maxIters	2020-06-15 08:49:03 -05:00
test_tuning.py	[SPARK-31497][ML][PYSPARK] Fix Pyspark CrossValidator/TrainValidationSplit with pipeline estimator cannot save and load model	2020-04-26 21:04:14 -07:00
test_wrapper.py	[SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's	2019-11-08 06:44:58 +09:00