Josh Rosen
cbb7f04aef
Add custom serializer support to PySpark.
...
For now, this only adds MarshalSerializer, but it lays the groundwork
for supporting other custom serializers. Many of these mechanisms
can also be used to support deserialization of different data formats
sent by Java, such as data encoded by MsgPack.
This also fixes a bug in SparkContext.union().
2013-11-10 16:45:38 -08:00
Josh Rosen
7a9abb9ddc
Fix PySpark unit tests on Python 2.6.
2013-08-14 15:12:12 -07:00
Matei Zaharia
96b50e82dc
Allow python/run-tests to run from any directory
2013-07-29 02:51:43 -04:00
Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Josh Rosen
ef711902c1
Don't download files to master's working directory.
...
This should avoid exceptions caused by existing
files with different contents.
I also removed some unused code.
2013-01-21 17:34:17 -08:00
Josh Rosen
7ed1bf4b48
Add RDD checkpointing to Python API.
2013-01-20 13:19:19 -08:00
Matei Zaharia
61b6382a35
Launch accumulator tests in run-tests
2013-01-20 01:59:07 -08:00
Josh Rosen
1a64432ba5
Indicate success/failure in PySpark test script.
2013-01-09 20:30:36 -08:00
Josh Rosen
ce9f1bbe20
Add pyspark script to replace the other scripts.
...
Expand the PySpark programming guide.
2013-01-01 21:25:49 -08:00