spark-instrumented-optimizer/python
Joseph K. Bradley 7e3423b9c0 [SPARK-13951][ML][PYTHON] Nested Pipeline persistence
Adds support for saving and loading nested ML Pipelines from Python. Pipeline and PipelineModel do not extend JavaWrapper, but they can still use the JavaMLWriter and JavaMLReader implementations.

Also:
* Separates out interfaces from Java wrapper implementations for MLWritable, MLReadable, MLWriter, MLReader.
* Moves the methods _stages_java2py and _stages_py2java into Pipeline and PipelineModel as _transfer_stage_from_java and _transfer_stage_to_java, respectively.

Added a new unit test for nested Pipelines. Abstracted the validity check into a helper method shared by the 2 unit tests.
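
As a rough illustration of what a nested-Pipeline round trip and its validity check involve, here is a minimal pure-Python sketch. It is not the PySpark implementation and does not use the JavaMLWriter or JavaMLReader classes: the `Stage`, `Pipeline`, `load`, and `describe` names, the `metadata.json` file, and the `stages/<index>` directory layout are all hypothetical, chosen only to show the recursive idea.

```python
import json
import os
import tempfile


class Stage:
    """Hypothetical leaf stage identified only by a name."""

    def __init__(self, name):
        self.name = name

    def save(self, path):
        os.makedirs(path)
        with open(os.path.join(path, "metadata.json"), "w") as f:
            json.dump({"class": "Stage", "name": self.name}, f)


class Pipeline:
    """Hypothetical pipeline whose stages may themselves be pipelines."""

    def __init__(self, stages):
        self.stages = stages

    def save(self, path):
        os.makedirs(path)
        with open(os.path.join(path, "metadata.json"), "w") as f:
            json.dump({"class": "Pipeline", "numStages": len(self.stages)}, f)
        # Each stage gets its own subdirectory; nested pipelines recurse here.
        for i, stage in enumerate(self.stages):
            stage.save(os.path.join(path, "stages", str(i)))


def load(path):
    """Rebuild a Stage or a (possibly nested) Pipeline from its directory."""
    with open(os.path.join(path, "metadata.json")) as f:
        meta = json.load(f)
    if meta["class"] == "Stage":
        return Stage(meta["name"])
    stages = [
        load(os.path.join(path, "stages", str(i)))
        for i in range(meta["numStages"])
    ]
    return Pipeline(stages)


def describe(obj):
    """Helper for validity checks: reduce an object to nested lists of names."""
    if isinstance(obj, Pipeline):
        return [describe(s) for s in obj.stages]
    return obj.name


# A pipeline that contains another pipeline as its second stage.
nested = Pipeline([Stage("tokenizer"), Pipeline([Stage("hashingTF"), Stage("pca")])])
root = os.path.join(tempfile.mkdtemp(), "pipeline")
nested.save(root)
restored = load(root)
assert describe(restored) == describe(nested)
```

The `describe` helper plays the role of a shared validity check: both a flat and a nested test can compare the saved and reloaded structures with the same function.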

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #11866 from jkbradley/nested-pipeline-io.
Closes #11835
2016-03-22 12:11:37 -07:00
docs [SPARK-13848][SPARK-5185] Update to Py4J 0.9.2 in order to fix classloading issue 2016-03-14 12:22:02 -07:00
lib [SPARK-13848][SPARK-5185] Update to Py4J 0.9.2 in order to fix classloading issue 2016-03-14 12:22:02 -07:00
pyspark [SPARK-13951][ML][PYTHON] Nested Pipeline persistence 2016-03-22 12:11:37 -07:00
test_support [SPARK-13509][SPARK-13507][SQL] Support for writing CSV with a single function call 2016-02-29 09:44:29 -08:00
.gitignore [SPARK-3946] gitignore in /python includes wrong directory 2014-10-14 14:09:39 -07:00
pylintrc [SPARK-13596][BUILD] Move misc top-level build files into appropriate subdirs 2016-03-07 14:48:02 -08:00
run-tests [SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with dev/run-tests module system 2015-06-27 20:24:34 -07:00
run-tests.py [SPARK-12243][BUILD][PYTHON] PySpark tests are slow in Jenkins. 2016-03-07 12:06:46 -08:00