spark-instrumented-optimizer/python/pyspark
Davies Liu 091d32c52e [SPARK-3971] [MLLib] [PySpark] hotfix: Customized pickler should work in cluster mode
A customized pickler must be registered before unpickling, but on the executor there is no way to register the picklers before the tasks run.

So we need to register the picklers in the tasks themselves: duplicate javaToPython() and pythonToJava() in MLlib, and call SerDe.initialize() before pickling or unpickling.
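
A minimal sketch of the pattern (hypothetical names, not the actual MLlib source): initialize() guards the registration with a flag, so every task can safely call it on the executor before the first pickling or unpickling, and the registration still runs only once per JVM:

    object SerDeSketch {
      private var initialized = false

      // Safe to call from every task: registers at most once per JVM.
      def initialize(): Unit = synchronized {
        if (!initialized) {
          registerPicklers()  // hypothetical helper: register the custom
                              // picklers/unpicklers for MLlib types
          initialized = true
        }
      }

      private def registerPicklers(): Unit = {
        // e.g. picklers for Vector, LabeledPoint, Rating, ...
      }
    }

Each conversion path then calls initialize() inside the task closure (e.g. within a mapPartitions), so registration happens on the executor that runs the task rather than only on the driver.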

Author: Davies Liu <davies.liu@gmail.com>

Closes #2830 from davies/fix_pickle and squashes the following commits:

0c85fb9 [Davies Liu] revert the privacy change
6b94e15 [Davies Liu] use JavaConverters instead of JavaConversions
0f02050 [Davies Liu] hotfix: Customized pickler does not work in cluster
2014-10-16 14:56:50 -07:00
mllib [SPARK-3971] [MLLib] [PySpark] hotfix: Customized pickler should work in cluster mode 2014-10-16 14:56:50 -07:00
streaming [SPARK-2377] Python API for Streaming 2014-10-12 02:46:56 -07:00
__init__.py [SPARK-3412] [PySpark] Replace Epydoc with Sphinx to generate Python API docs 2014-10-07 18:09:27 -07:00
accumulators.py [SPARK-3478] [PySpark] Profile the Python tasks 2014-09-30 18:24:57 -07:00
broadcast.py [SPARK-3430] [PySpark] [Doc] generate PySpark API docs using Sphinx 2014-09-16 12:51:58 -07:00
cloudpickle.py [SPARK-3679] [PySpark] pickle the exact globals of functions 2014-09-24 13:00:05 -07:00
conf.py [SPARK-3412] [PySpark] Replace Epydoc with Sphinx to generate Python API docs 2014-10-07 18:09:27 -07:00
context.py [SPARK-3971] [MLLib] [PySpark] hotfix: Customized pickler should work in cluster mode 2014-10-16 14:56:50 -07:00
daemon.py [SPARK-3030] [PySpark] Reuse Python worker 2014-09-13 16:22:04 -07:00
files.py [SPARK-3309] [PySpark] Put all public API in __all__ 2014-09-03 11:49:45 -07:00
heapq3.py [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() 2014-08-26 16:57:40 -07:00
java_gateway.py [SPARK-3167] Handle special driver configs in Windows 2014-08-26 22:52:16 -07:00
join.py [SPARK-546] Add full outer join to RDD and DStream. 2014-09-24 20:39:09 -07:00
rdd.py [Spark] RDD take() method: overestimate too much 2014-10-13 13:11:55 -07:00
rddsampler.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
resultiterable.py [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
serializers.py [SPARK-2377] Python API for Streaming 2014-10-12 02:46:56 -07:00
shell.py [SPARK-3273][SPARK-3301] We should read the version information from the same place 2014-09-06 15:08:43 -07:00
shuffle.py [SPARK-3786] [PySpark] speedup tests 2014-10-06 14:07:53 -07:00
sql.py [SPARK-3909][PySpark][Doc] A corrupted format in Sphinx documents and building warnings 2014-10-11 11:51:59 -07:00
statcounter.py StatCounter on NumPy arrays [PYSPARK][SPARK-2012] 2014-08-01 22:33:25 -07:00
storagelevel.py [SPARK-3417] Use new-style classes in PySpark 2014-09-08 15:45:36 -07:00
tests.py [SPARK-3867][PySpark] ./python/run-tests failed when it run with Python 2.6 and unittest2 is not installed 2014-10-11 11:26:17 -07:00
traceback_utils.py [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. 2014-09-15 19:28:17 -07:00
worker.py [SPARK-3478] [PySpark] Profile the Python tasks 2014-09-30 18:24:57 -07:00