spark-instrumented-optimizer/python/pyspark
Josh Rosen 1381fc72f7 Switch from MUTF8 to UTF8 in PySpark serializers.
This fixes SPARK-1043, a bug introduced in 0.9.0
where PySpark couldn't serialize strings > 64kB.

This fix was written by @tyro89 and @bouk in #512.
This commit squashes and rebases their pull request
in order to fix some merge conflicts.
2014-01-28 20:20:08 -08:00
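The 64kB limit named in the commit message is consistent with how Java's `DataOutputStream.writeUTF` frames strings: a 2-byte unsigned length prefix, which caps the encoded payload at 65535 bytes. The sketch below is an illustration only, not the code from this commit. The function names are hypothetical, and the short-length writer models only the length limit, not the modified UTF-8 encoding itself. It contrasts a 2-byte length prefix with a 4-byte one over plain UTF-8 bytes.

```python
import io
import struct

# Largest payload a 2-byte unsigned length prefix can describe.
MAX_SHORT_LEN = 0xFFFF


def write_with_short_length(stream, s):
    """Hypothetical writer using a 2-byte length prefix (models the size limit only)."""
    data = s.encode("utf-8")
    if len(data) > MAX_SHORT_LEN:
        raise ValueError("encoded string too long: %d bytes" % len(data))
    stream.write(struct.pack("!H", len(data)))  # big-endian unsigned short
    stream.write(data)


def write_with_int_length(stream, s):
    """Hypothetical writer using a 4-byte length prefix followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    stream.write(struct.pack("!i", len(data)))  # big-endian signed 32-bit int
    stream.write(data)


def read_with_int_length(stream):
    """Read back a string written by write_with_int_length."""
    (length,) = struct.unpack("!i", stream.read(4))
    return stream.read(length).decode("utf-8")


if __name__ == "__main__":
    big = "x" * 70000  # larger than 64kB once encoded
    buf = io.BytesIO()
    write_with_int_length(buf, big)
    buf.seek(0)
    assert read_with_int_length(buf) == big
    try:
        write_with_short_length(io.BytesIO(), big)
    except ValueError as exc:
        print("short length prefix fails:", exc)
```

With a 4-byte prefix the practical limit moves from 64kB to roughly 2GB per string, at a cost of two extra bytes per value.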
mllib            Complain if Python and NumPy versions are too old for MLlib  2014-01-14 12:27:58 -08:00
__init__.py      Changes on top of Prashant's patch.  2014-01-03 18:30:17 -08:00
accumulators.py  Add custom serializer support to PySpark.  2013-11-10 16:45:38 -08:00
broadcast.py     Fix some Python docs and make sure to unset SPARK_TESTING in Python  2013-12-29 20:15:07 -05:00
cloudpickle.py   Rename top-level 'pyspark' directory to 'python'  2013-01-01 15:05:00 -08:00
conf.py          Merge pull request #462 from mateiz/conf-file-fix  2014-01-18 16:20:00 -08:00
context.py       Switch from MUTF8 to UTF8 in PySpark serializers.  2014-01-28 20:20:08 -08:00
daemon.py        Add Apache license headers and LICENSE and NOTICE files  2013-07-16 17:21:33 -07:00
files.py         Initial work to rename package to org.apache.spark  2013-09-01 14:13:13 -07:00
java_gateway.py  sbin/spark-class* -> bin/spark-class*  2014-01-03 15:08:01 +05:30
join.py          Change numSplits to numPartitions in PySpark.  2013-02-24 13:25:09 -08:00
rdd.py           Deprecate mapPartitionsWithSplit in PySpark.  2014-01-23 20:01:36 -08:00
rddsampler.py    RDD sample() and takeSample() prototypes for PySpark  2013-08-28 16:46:13 -07:00
serializers.py   Switch from MUTF8 to UTF8 in PySpark serializers.  2014-01-28 20:20:08 -08:00
shell.py         pyspark -> bin/pyspark  2014-01-02 18:50:12 +05:30
statcounter.py   Implementing SPARK-838: Add DoubleRDDFunctions methods to PySpark  2013-08-21 17:05:58 -07:00
storagelevel.py  Export StorageLevel and refactor  2013-09-07 14:41:31 -07:00
tests.py         Fix for SPARK-1025: PySpark hang on missing files.  2014-01-23 18:24:51 -08:00
worker.py        Switch from MUTF8 to UTF8 in PySpark serializers.  2014-01-28 20:20:08 -08:00