Josh Rosen
2415c18f48
Fix reporting of PySpark doctest failures.
2013-02-03 06:44:11 +00:00
Josh Rosen
e211f405bc
Use spark.local.dir for PySpark temp files (SPARK-580).
2013-02-01 11:50:27 -08:00
Josh Rosen
9cc6ff9c4e
Do not launch JavaGateways on workers (SPARK-674).
...
The problem was that the gateway was being initialized whenever the
pyspark.context module was loaded. The fix uses lazy initialization
that occurs only when SparkContext instances are actually constructed.
I also made the gateway and jvm variables private.
This change results in ~3-4x performance improvement when running the
PySpark unit tests.
2013-02-01 11:13:10 -08:00
Matei Zaharia
a2f4891d1d
Merge pull request #396 from JoshRosen/spark-653
...
Make PySpark AccumulatorParam an abstract base class
2013-01-24 13:05:03 -08:00
Josh Rosen
ae2ed2947d
Allow PySpark's SparkFiles to be used from driver
...
Fix minor documentation formatting issues.
2013-01-23 10:58:50 -08:00
Josh Rosen
35168d9c89
Fix sys.path bug in PySpark SparkContext.addPyFile
2013-01-22 17:54:11 -08:00
Josh Rosen
c75ae3622e
Make AccumulatorParam an abstract base class.
2013-01-21 22:32:57 -08:00
Josh Rosen
ef711902c1
Don't download files to master's working directory.
...
This should avoid exceptions caused by existing
files with different contents.
I also removed some unused code.
2013-01-21 17:34:17 -08:00
Josh Rosen
5b6ea9e9a0
Update checkpointing API docs in Python/Java.
2013-01-20 15:31:41 -08:00
Josh Rosen
d0ba80dc72
Add checkpointFile() and more tests to PySpark.
2013-01-20 13:59:45 -08:00
Josh Rosen
7ed1bf4b48
Add RDD checkpointing to Python API.
2013-01-20 13:19:19 -08:00
Matei Zaharia
8e7f098a2c
Added accumulators to PySpark
2013-01-20 01:57:44 -08:00
Josh Rosen
49c74ba2af
Change PYSPARK_PYTHON_EXEC to PYSPARK_PYTHON.
2013-01-10 08:10:59 -08:00
Josh Rosen
33beba3965
Change PySpark RDD.take() to not call iterator().
2013-01-03 14:52:21 -08:00
Josh Rosen
b58340dbd9
Rename top-level 'pyspark' directory to 'python'
2013-01-01 15:05:00 -08:00