spark-instrumented-optimizer/python/pyspark/ml
John Bauer e804ed5e33 [SPARK-29691][ML][PYTHON] ensure Param objects are valid in fit, transform
Modify Params._copyValues to check that the keys supplied in `extra` are valid Param objects.

### What changes were proposed in this pull request?

Estimator.fit() and Model.transform() accept a dictionary of extra parameters whose values overwrite those supplied at initialization or by default. Additionally, ParamGridBuilder.addGrid accepts a parameter and a list of values. In both cases the keys are presumed to be valid Param objects. This change adds a check that only Param objects are supplied as keys.

### Why are the changes needed?

Param objects are created by and bound to an instance of Params (Estimator, Model, or Transformer). They may be obtained from their parent as attributes, or by name through getParam.
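The binding between a Param and its parent can be illustrated with a small stand-alone sketch. The `Param`, `Params`, and `ToyEstimator` classes below are simplified stand-ins written for this illustration, not the PySpark implementation; they model only the two lookup routes described above (attribute access and `getParam`).

```python
class Param:
    """Stand-in for a parameter that remembers the instance it belongs to."""

    def __init__(self, parent, name, doc):
        self.parent = parent
        self.name = name
        self.doc = doc


class Params:
    """Stand-in for a component that exposes its Param objects as attributes."""

    def getParam(self, name):
        # Look the Param up by name; reject anything that is not a Param.
        param = getattr(self, name, None)
        if not isinstance(param, Param):
            raise ValueError("no param named %r" % name)
        return param


class ToyEstimator(Params):
    """Hypothetical estimator with a single parameter."""

    def __init__(self):
        self.maxIter = Param(self, "maxIter", "maximum number of iterations")


est = ToyEstimator()
by_attribute = est.maxIter            # obtained from the parent as an attribute
by_name = est.getParam("maxIter")     # obtained from the parent by name
```

Both routes return the same object, and that object is what belongs in the key position of an `extra` dictionary, rather than the string `"maxIter"`.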

The documentation does not state that keys must be valid Param objects, nor does it describe how one may be obtained. The current behavior is to silently ignore keys that are not valid Param objects.

### Does this PR introduce any user-facing change?

Yes. If a key in `extra` for Estimator.fit() or Model.transform(), or the `param` argument of ParamGridBuilder.addGrid, is not a Param object, an error is now raised indicating that the object is invalid.
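The kind of check described here can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the patch: `Param` is a stand-in class, `merge_extra` is a hypothetical helper, and the use of `TypeError` with this message is an assumption.

```python
class Param:
    """Stand-in for a parameter object; hashable by identity."""

    def __init__(self, parent, name):
        self.parent = parent
        self.name = name


def merge_extra(param_map, extra=None):
    """Return a copy of param_map overridden by extra.

    Hypothetical helper: raises instead of silently ignoring a key
    that is not a Param instance.
    """
    merged = dict(param_map)
    if extra is None:
        return merged
    if not isinstance(extra, dict):
        raise TypeError("expected a dict, got %s" % type(extra).__name__)
    for key, value in extra.items():
        if not isinstance(key, Param):
            raise TypeError("expected a Param key, got %r" % (key,))
        merged[key] = value
    return merged


max_iter = Param("toy-estimator", "maxIter")
defaults = {max_iter: 10}

# A Param key overrides the stored value without mutating the original map.
overridden = merge_extra(defaults, {max_iter: 50})

# A string key, which would previously have been ignored, is now rejected.
try:
    merge_extra(defaults, {"maxIter": 50})
    rejected = False
except TypeError:
    rejected = True
```

Failing loudly on the string key surfaces the mistake at the call site, instead of producing a model that was quietly fitted with the default value.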

### How was this patch tested?

Added method test_copy_param_extras_check to test_param.py. Tested with Python 3.7.

Closes #26527 from JohnHBauer/paramExtra.

Authored-by: John Bauer <john.h.bauer@gmail.com>
Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
2019-11-19 14:15:00 -08:00
linalg [SPARK-28206][PYTHON] Remove the legacy Epydoc in PySpark API documentation 2019-07-05 10:08:22 -07:00
param [SPARK-29691][ML][PYTHON] ensure Param objects are valid in fit, transform 2019-11-19 14:15:00 -08:00
tests [SPARK-29691][ML][PYTHON] ensure Param objects are valid in fit, transform 2019-11-19 14:15:00 -08:00
__init__.py [SPARK-24477][SPARK-24454][ML][PYTHON] Imports submodule in ml/__init__.py and add ImageSchema into __all__ 2018-06-08 09:32:11 -07:00
base.py [SPARK-29093][PYTHON][ML] Remove automatically generated param setters in _shared_params_code_gen.py 2019-10-28 11:36:10 +08:00
classification.py [SPARK-16872][ML][PYSPARK] Impl Gaussian Naive Bayes Classifier 2019-11-18 10:05:42 +08:00
clustering.py [SPARK-29867][ML][PYTHON] Add __repr__ in Python ML Models 2019-11-15 21:44:39 -08:00
common.py [SPARK-17679] [PYSPARK] remove unnecessary Py4J ListConverter patch 2016-10-03 14:12:03 -07:00
evaluation.py [SPARK-29093][PYTHON][ML] Remove automatically generated param setters in _shared_params_code_gen.py 2019-10-28 11:36:10 +08:00
feature.py [SPARK-29867][ML][PYTHON] Add __repr__ in Python ML Models 2019-11-15 21:44:39 -08:00
fpm.py [SPARK-29867][ML][PYTHON] Add __repr__ in Python ML Models 2019-11-15 21:44:39 -08:00
image.py [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 2019-07-31 14:26:18 +09:00
pipeline.py [SPARK-17025][ML][PYTHON] Persistence for Pipelines with Python-only Stages 2017-08-11 23:57:08 -07:00
recommendation.py [SPARK-29867][ML][PYTHON] Add __repr__ in Python ML Models 2019-11-15 21:44:39 -08:00
regression.py [SPARK-29867][ML][PYTHON] Add __repr__ in Python ML Models 2019-11-15 21:44:39 -08:00
stat.py [SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations 2019-09-01 10:15:00 -05:00
tree.py [SPARK-29867][ML][PYTHON] Add __repr__ in Python ML Models 2019-11-15 21:44:39 -08:00
tuning.py [SPARK-29691][ML][PYTHON] ensure Param objects are valid in fit, transform 2019-11-19 14:15:00 -08:00
util.py [SPARK-28985][PYTHON][ML] Add common classes (JavaPredictor/JavaClassificationModel/JavaProbabilisticClassifier) in PYTHON 2019-09-19 08:17:25 -05:00
wrapper.py [SPARK-29867][ML][PYTHON] Add __repr__ in Python ML Models 2019-11-15 21:44:39 -08:00