spark-instrumented-optimizer/python/pyspark/ml
Yanbo Liang c19680be1c [SPARK-19852][PYSPARK][ML] Python StringIndexer supports 'keep' to handle invalid data
## What changes were proposed in this pull request?
This PR maintains API parity with SPARK-17498 by adding PySpark support for the new
'keep' option in StringIndexer, which handles unseen labels and NULL values.
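
For illustration only, here is a pure-Python sketch of how the `handleInvalid` modes ('error', 'skip', 'keep') are understood to behave. It is not the Spark implementation and does not call PySpark. The helper names `fit_labels` and `index_values` are hypothetical. The sketch assumes labels are ordered by descending frequency (ties broken alphabetically, as a simplification) and that 'keep' places invalid data in one extra bucket at index `numLabels`.

```python
from collections import Counter


def fit_labels(values):
    """Return labels ordered by descending frequency.

    Simplifications (assumptions of this sketch): None values are ignored
    during fitting, and ties are broken alphabetically.
    """
    counts = Counter(v for v in values if v is not None)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ordered]


def index_values(values, labels, handle_invalid="error"):
    """Map each value to its label index, treating unseen labels and None as invalid."""
    lookup = {label: float(i) for i, label in enumerate(labels)}
    indexed = []
    for v in values:
        if v in lookup:
            indexed.append(lookup[v])
        elif handle_invalid == "keep":
            # Invalid data goes into one additional bucket at index numLabels.
            indexed.append(float(len(labels)))
        elif handle_invalid == "skip":
            # Rows with invalid data are dropped.
            continue
        elif handle_invalid == "error":
            raise ValueError("Unseen label or NULL value: %r" % (v,))
        else:
            raise ValueError("Unknown handle_invalid mode: %r" % (handle_invalid,))
    return indexed


labels = fit_labels(["a", "b", "a", "c", "a", "b"])  # frequencies: a=3, b=2, c=1
kept = index_values(["a", "d", None, "c"], labels, handle_invalid="keep")
skipped = index_values(["a", "d", None, "c"], labels, handle_invalid="skip")
```

With 'keep', the unseen label "d" and the NULL value share the extra index 3.0, so no rows are lost. With 'skip', those rows are dropped.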

Note: This is an updated version of #17237; the primary author of this PR is VinceShieh.
## How was this patch tested?
Unit tests.

Author: VinceShieh <vincent.xie@intel.com>
Author: Yanbo Liang <ybliang8@gmail.com>

Closes #18453 from yanboliang/spark-19852.
2017-07-02 16:17:03 +08:00
linalg [SPARK-20214][ML] Make sure converted csc matrix has sorted indices 2017-04-05 17:46:44 -07:00
param [SPARK-14772][PYTHON][ML] Fixed Params.copy method to match Scala implementation 2017-02-23 18:05:58 -08:00
__init__.py [SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLlib guide 2016-07-15 13:38:23 -07:00
base.py [SPARK-15364][ML][PYSPARK] Implement PySpark picklers for ml.Vector and ml.Matrix under spark.ml.python 2016-06-13 19:59:53 -07:00
classification.py [SPARK-18518][ML] HasSolver supports override 2017-07-01 15:37:41 +08:00
clustering.py [SPARK-19348][PYTHON] PySpark keyword_only decorator is not thread-safe 2017-03-03 16:43:45 -08:00
common.py [SPARK-17679] [PYSPARK] remove unnecessary Py4J ListConverter patch 2016-10-03 14:12:03 -07:00
evaluation.py [SPARK-19348][PYTHON] PySpark keyword_only decorator is not thread-safe 2017-03-03 16:43:45 -08:00
feature.py [SPARK-19852][PYSPARK][ML] Python StringIndexer supports 'keep' to handle invalid data 2017-07-02 16:17:03 +08:00
fpm.py [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert) param of PySpark FPGrowth. 2017-05-25 21:40:39 +08:00
pipeline.py [SPARK-19348][PYTHON] PySpark keyword_only decorator is not thread-safe 2017-03-03 16:43:45 -08:00
recommendation.py [SPARK-20300][ML][PYSPARK] Python API for ALSModel.recommendForAllUsers,Items 2017-05-02 10:49:13 +02:00
regression.py [SPARK-18518][ML] HasSolver supports override 2017-07-01 15:37:41 +08:00
stat.py [SPARK-20076][ML][PYSPARK] Add Python interface for ml.stats.Correlation 2017-04-07 11:00:10 +02:00
tests.py [SPARK-19852][PYSPARK][ML] Python StringIndexer supports 'keep' to handle invalid data 2017-07-02 16:17:03 +08:00
tuning.py [SPARK-20861][ML][PYTHON] Delegate looping over paramMaps to estimators 2017-05-23 20:56:01 -07:00
util.py [SPARK-20707][ML] ML deprecated APIs should be removed in major release. 2017-05-16 10:08:23 +08:00
wrapper.py [SPARK-17161][PYSPARK][ML] Add PySpark-ML JavaWrapper convenience function to create Py4J JavaArrays 2017-01-31 15:42:36 -08:00