spark-instrumented-optimizer/python/pyspark/ml/tests
John Bauer e804ed5e33 [SPARK-29691][ML][PYTHON] ensure Param objects are valid in fit, transform
Modify Params._copyValues to check that valid Param objects are supplied as extra.

### What changes were proposed in this pull request?

Estimator.fit() and Model.transform() accept a dictionary of extra parameters whose values overwrite those supplied at initialization or by default. Additionally, ParamGridBuilder.addGrid accepts a parameter and a list of values. The dictionary keys, and the parameter passed to addGrid, are presumed to be valid Param objects. This change adds a check that only Param objects are supplied.

### Why are the changes needed?

Param objects are created by and bound to an instance of Params (Estimator, Model, or Transformer). They may be obtained from their parent as attributes, or by name through getParam.
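The relationship described above can be sketched in plain Python. The `Param` and `Params` classes below are simplified stand-ins written for this illustration, not the classes from `pyspark.ml.param`; only the method name `getParam` is taken from the text. The sketch shows that a parameter looked up as an attribute and one looked up by name are the same object, bound to the instance that owns it.

```python
# Simplified stand-ins for illustration only; these are NOT the
# pyspark.ml.param classes, just a model of the binding described above.
class Param:
    """A named parameter bound to the instance that owns it."""

    def __init__(self, parent, name):
        self.parent = parent
        self.name = name


class Params:
    """Owner of Param objects, exposed both as attributes and by name."""

    def __init__(self, *names):
        # Each Param is created by, and bound to, this instance.
        for name in names:
            setattr(self, name, Param(self, name))

    def getParam(self, name):
        # Look the Param up by name; reject anything that is not a Param.
        param = getattr(self, name, None)
        if not isinstance(param, Param):
            raise ValueError("no Param named %r" % name)
        return param


estimator = Params("maxIter", "regParam")  # hypothetical parameter names
by_attribute = estimator.maxIter
by_name = estimator.getParam("maxIter")
```

In PySpark itself the equivalent lookups would be along the lines of `estimator.maxIter` and `estimator.getParam("maxIter")` on a real Estimator instance.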

The documentation does not state that keys must be valid Param objects, nor describe how one may be obtained. The current behavior is to silently ignore keys which are not valid Param objects.

### Does this PR introduce any user-facing change?

Yes. If the user passes anything other than a Param object as a key in `extra` for Estimator.fit() or Model.transform(), or as `param` for ParamGridBuilder.addGrid, an error is raised indicating that the object is invalid. Previously such keys were silently ignored.
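A minimal sketch of the kind of check this describes, in plain Python. The `Param` class and the `merge_extra` helper are hypothetical stand-ins, and the use of `TypeError` with this message is an assumption made for the sketch. The text above only states that an error is raised.

```python
# Hypothetical stand-ins for illustration; not the PySpark implementation.
class Param:
    """Minimal placeholder for a parameter object."""

    def __init__(self, name):
        self.name = name


def merge_extra(param_map, extra=None):
    """Return a copy of param_map overridden by extra, rejecting non-Param keys."""
    merged = dict(param_map)
    for key, value in (extra or {}).items():
        if not isinstance(key, Param):
            # Assumed error type and message; a plain string key such as
            # "maxIter" ends up here instead of being silently ignored.
            raise TypeError("expected a Param object as key, got %r" % (key,))
        merged[key] = value
    return merged


max_iter = Param("maxIter")
defaults = {max_iter: 10}

# A Param key overrides the stored value in the returned copy.
overridden = merge_extra(defaults, {max_iter: 50})

# A string key is rejected with an error.
try:
    merge_extra(defaults, {"maxIter": 50})
    rejected = False
except TypeError:
    rejected = True
```

Raising on the first invalid key, rather than collecting all of them, keeps the sketch short; either choice would surface the mistake that was previously hidden.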

### How was this patch tested?

Added the method test_copy_param_extras_check to test_param.py. Tested with Python 3.7.

Closes #26527 from JohnHBauer/paramExtra.

Authored-by: John Bauer <john.h.bauer@gmail.com>
Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
2019-11-19 14:15:00 -08:00
__init__.py [SPARK-26033][PYTHON][TESTS] Break large ml/tests.py file into smaller files 2018-11-18 16:02:15 +08:00
test_algorithms.py [SPARK-28736][SPARK-28735][PYTHON][ML] Fix PySpark ML tests to pass in JDK 11 2019-08-16 19:47:29 +09:00
test_base.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_evaluation.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_feature.py [SPARK-23578][ML][PYSPARK] Binarizer support multi-column 2019-10-16 18:32:07 +08:00
test_image.py [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 2019-09-09 10:19:40 -05:00
test_linalg.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_param.py [SPARK-29691][ML][PYTHON] ensure Param objects are valid in fit, transform 2019-11-19 14:15:00 -08:00
test_persistence.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_pipeline.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_stat.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_training_summary.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_tuning.py [SPARK-29691][ML][PYTHON] ensure Param objects are valid in fit, transform 2019-11-19 14:15:00 -08:00
test_wrapper.py [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's 2019-11-08 06:44:58 +09:00