spark-instrumented-optimizer

Joseph K. Bradley 7e3423b9c0 [SPARK-13951][ML][PYTHON] Nested Pipeline persistence	2016-03-22 12:11:37 -07:00

Adds support for saving and loading nested ML Pipelines from Python. Pipeline and PipelineModel do not extend JavaWrapper, but they are able to utilize the JavaMLWriter, JavaMLReader implementations.

Also:
* Separates out interfaces from Java wrapper implementations for MLWritable, MLReadable, MLWriter, MLReader.
* Moves methods _stages_java2py, _stages_py2java into Pipeline, PipelineModel as _transfer_stage_from_java, _transfer_stage_to_java

Added new unit test for nested Pipelines. Abstracted validity check into a helper method for the 2 unit tests.

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #11866 from jkbradley/nested-pipeline-io.
Closes #11835

ml	[SPARK-13951][ML][PYTHON] Nested Pipeline persistence	2016-03-22 12:11:37 -07:00
mllib	[SPARK-13672][ML] Add python examples of BisectingKMeans in ML and MLLIB	2016-03-11 09:21:12 +02:00
sql	[SPARK-13953][SQL] Specifying the field name for corrupted record via option at JSON datasource	2016-03-22 20:30:48 +08:00
streaming	[SPARK-13843][STREAMING] Remove streaming-flume, streaming-mqtt, streaming-zeromq, streaming-akka, streaming-twitter to Spark packages	2016-03-14 16:56:04 -07:00
__init__.py	[SPARK-10380][SQL] Fix confusing documentation examples for astype/drop_duplicates.	2016-03-14 19:25:49 -07:00
accumulators.py	[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod()	2015-06-26 08:12:22 -07:00
broadcast.py	[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod()	2015-06-26 08:12:22 -07:00
cloudpickle.py	[SPARK-13697] [PYSPARK] Fix the missing module name of TransformFunctionSerializer.loads	2016-03-06 08:57:01 -08:00
conf.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
context.py	[SPARK-12617][PYSPARK] Move Py4jCallbackConnectionCleaner to Streaming	2016-01-06 12:03:01 -08:00
daemon.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
files.py	[SPARK-3309] [PySpark] Put all public API in __all__	2014-09-03 11:49:45 -07:00
heapq3.py	[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod()	2015-06-26 08:12:22 -07:00
java_gateway.py	[SPARK-9700] Pick default page size more intelligently.	2015-08-06 23:18:29 -07:00
join.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
profiler.py	[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod()	2015-06-26 08:12:22 -07:00
rdd.py	[SPARK-13467] [PYSPARK] abstract python function to simplify pyspark code	2016-02-24 12:44:54 -08:00
rddsampler.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
resultiterable.py	[SPARK-3074] [PySpark] support groupByKey() with single huge key	2015-04-09 17:07:23 -07:00
serializers.py	[SPARK-10542] [PYSPARK] fix serialize namedtuple	2015-09-14 19:46:34 -07:00
shell.py	[SPARK-12993][PYSPARK] Remove usage of ADD_FILES in pyspark	2016-01-26 14:58:39 -08:00
shuffle.py	[SPARK-10710] Remove ability to disable spilling in core and SQL	2015-09-19 21:40:21 -07:00
statcounter.py	[SPARK-6919] [PYSPARK] Add asDict method to StatCounter	2015-09-29 13:38:15 -07:00
status.py	[SPARK-4172] [PySpark] Progress API in Python	2015-02-17 13:36:43 -08:00
storagelevel.py	[SPARK-12091] [PYSPARK] Deprecate the JAVA-specific deserialized storage levels	2015-12-18 20:06:05 -08:00
tests.py	[SPARK-13697] [PYSPARK] Fix the missing module name of TransformFunctionSerializer.loads	2016-03-06 08:57:01 -08:00
traceback_utils.py	[SPARK-1087] Move python traceback utilities into new traceback_utils.py file.	2014-09-15 19:28:17 -07:00
worker.py	[SPARK-8976] [PYSPARK] fix open mode in python3	2015-08-13 17:33:37 -07:00