spark-instrumented-optimizer/python/pyspark/ml
Eric Liang 922338812c [SPARK-9681] [ML] Support R feature interactions in RFormula
This integrates the Interaction feature transformer with SparkR R formula support (i.e. support `:`).

To generate reasonable ML attribute names for feature interactions, it was necessary to add the ability to read the original attribute names back from `StructField`, and also to specify custom group prefixes in `VectorAssembler`. This also has the side benefit of cleaning up the double underscores in the attributes generated for non-interaction terms.
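
As a rough illustration of what an interaction term such as `a:b` expands to, here is a plain-Python sketch. It is not Spark's implementation: the helper name `interact`, the `column_level` naming of one-hot features, the `:` separator in the output names, and the choice to keep every category level are all assumptions made for the example.

```python
from itertools import product


def interact(row, terms, categories):
    """Hypothetical helper: expand an R-style interaction like `a:b` for one row.

    row        -- dict mapping column name to value
    terms      -- column names joined by `:` in the formula
    categories -- dict mapping each categorical column to its list of levels
    Returns (names, values) for the generated interaction features.
    """
    parts = []  # one list of (feature_name, value) pairs per term
    for col in terms:
        if col in categories:
            # Categorical column: one-hot encode (assumption: no level is dropped).
            parts.append([
                (f"{col}_{level}", 1.0 if row[col] == level else 0.0)
                for level in categories[col]
            ])
        else:
            # Numeric column: a single feature carrying the raw value.
            parts.append([(col, float(row[col]))])

    names, values = [], []
    # Each output feature is the product of one feature taken from every term.
    for combo in product(*parts):
        names.append(":".join(name for name, _ in combo))
        value = 1.0
        for _, x in combo:
            value *= x
        values.append(value)
    return names, values


names, values = interact({"a": 2.0, "b": "x"}, ["a", "b"], {"b": ["x", "y"]})
print(names)   # ['a:b_x', 'a:b_y']
print(values)  # [2.0, 0.0]
```

Readable names of this kind are only possible if the original column and level names can be recovered, which is the reason the attribute names are read back from `StructField`.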

mengxr

Author: Eric Liang <ekl@databricks.com>

Closes #8830 from ericl/interaction-2.
2015-09-25 00:43:22 -07:00
param [DOC] [PYSPARK] [MLLIB] Added newlines to docstrings to fix parameter formatting 2015-09-21 14:24:19 -07:00
__init__.py [SPARK-7535] [.0] [MLLIB] Audit the pipeline APIs for 1.4 2015-05-21 22:57:33 -07:00
classification.py [SPARK-9773] [ML] [PySpark] Add Python API for MultilayerPerceptronClassifier 2015-09-11 08:52:28 -07:00
clustering.py [SPARK-10281] [ML] [PYSPARK] [DOCS] Add @since annotation to pyspark.ml.clustering 2015-09-17 08:47:21 -07:00
evaluation.py [SPARK-10188] [PYSPARK] Pyspark CrossValidator with RMSE selects incorrect model 2015-08-27 23:59:30 -07:00
feature.py [SPARK-9681] [ML] Support R feature interactions in RFormula 2015-09-25 00:43:22 -07:00
pipeline.py [DOC] [PYSPARK] [MLLIB] Added newlines to docstrings to fix parameter formatting 2015-09-21 14:24:19 -07:00
recommendation.py [SPARK-10282] [ML] [PYSPARK] [DOCS] Add @since annotation to pyspark.ml.recommendation 2015-09-17 08:51:19 -07:00
regression.py [SPARK-10283] [ML] [PYSPARK] [DOCS] Add @since annotation to pyspark.ml.regression 2015-09-17 08:45:20 -07:00
tests.py [SPARK-10615] [PYSPARK] change assertEquals to assertEqual 2015-09-18 09:53:52 -07:00
tuning.py [DOC] [PYSPARK] [MLLIB] Added newlines to docstrings to fix parameter formatting 2015-09-21 14:24:19 -07:00
util.py [SPARK-7380] [MLLIB] pipeline stages should be copyable in Python 2015-05-18 12:02:18 -07:00
wrapper.py [DOC] [PYSPARK] [MLLIB] Added newlines to docstrings to fix parameter formatting 2015-09-21 14:24:19 -07:00