Patrick Wendell
3446d5c8d6
SPARK-673: Capture and re-throw Python exceptions
This patch alters the Python <-> executor protocol to pass on
exception data when exceptions occur in user Python code.
2013-01-31 18:06:11 -08:00
Matei Zaharia
55327a283e
Merge pull request #430 from pwendell/pyspark-guide
Minor improvements to PySpark docs
2013-01-30 15:35:29 -08:00
Patrick Wendell
3f945e3b83
Make module help available in python shell.
Also, adds a line in doc explaining how to use.
2013-01-30 15:04:06 -08:00
Stephen Haberman
7dfb82a992
Replace old 'master' term with 'driver'.
2013-01-25 11:03:00 -06:00
Matei Zaharia
a2f4891d1d
Merge pull request #396 from JoshRosen/spark-653
Make PySpark AccumulatorParam an abstract base class
2013-01-24 13:05:03 -08:00
Josh Rosen
b47d054cfc
Remove use of abc.ABCMeta due to cloudpickle issue.
cloudpickle runs into issues while pickling subclasses of AccumulatorParam,
which may be related to this Python issue:
http://bugs.python.org/issue7689
This seems hard to fix and the ABCMeta wasn't necessary, so I removed it.
2013-01-23 11:47:27 -08:00
Josh Rosen
ae2ed2947d
Allow PySpark's SparkFiles to be used from driver
Fix minor documentation formatting issues.
2013-01-23 10:58:50 -08:00
Josh Rosen
35168d9c89
Fix sys.path bug in PySpark SparkContext.addPyFile
2013-01-22 17:54:11 -08:00
Josh Rosen
c75ae3622e
Make AccumulatorParam an abstract base class.
2013-01-21 22:32:57 -08:00
Josh Rosen
ef711902c1
Don't download files to master's working directory.
This should avoid exceptions caused by existing
files with different contents.
I also removed some unused code.
2013-01-21 17:34:17 -08:00
Matei Zaharia
c7b5e5f1ec
Merge pull request #389 from JoshRosen/python_rdd_checkpointing
Add checkpointing to the Python API
2013-01-20 17:10:44 -08:00
Josh Rosen
9f211dd3f0
Fix PythonPartitioner equality; see SPARK-654.
PythonPartitioner did not take the Python-side partitioning function
into account when checking for equality, which might cause problems
in the future.
2013-01-20 15:41:42 -08:00
Josh Rosen
00d70cd660
Clean up setup code in PySpark checkpointing tests
2013-01-20 15:38:11 -08:00
Josh Rosen
5b6ea9e9a0
Update checkpointing API docs in Python/Java.
2013-01-20 15:31:41 -08:00
Josh Rosen
d0ba80dc72
Add checkpointFile() and more tests to PySpark.
2013-01-20 13:59:45 -08:00
Josh Rosen
7ed1bf4b48
Add RDD checkpointing to Python API.
2013-01-20 13:19:19 -08:00
Josh Rosen
17035db159
Add __repr__ to Accumulator; fix bug in sc.accumulator
2013-01-20 11:58:57 -08:00
Josh Rosen
9f54d7e1f5
Merge pull request #387 from mateiz/python-accumulators
Add accumulators to PySpark
2013-01-20 11:00:36 -08:00
Matei Zaharia
2a8c2a6790
Minor formatting fixes
2013-01-20 10:24:53 -08:00
Matei Zaharia
a23ed25f3c
Add a class comment to Accumulator
2013-01-20 02:10:25 -08:00
Matei Zaharia
61b6382a35
Launch accumulator tests in run-tests
2013-01-20 01:59:07 -08:00
Matei Zaharia
8e7f098a2c
Added accumulators to PySpark
2013-01-20 01:57:44 -08:00
Nick Pentreath
b77f7390a5
Python ALS example
2013-01-15 09:04:32 +02:00
Josh Rosen
49c74ba2af
Change PYSPARK_PYTHON_EXEC to PYSPARK_PYTHON.
2013-01-10 08:10:59 -08:00
Josh Rosen
d55f2b9882
Use take() instead of takeSample() in PySpark kmeans example.
This is a temporary change until we port takeSample().
2013-01-09 21:21:23 -08:00
Josh Rosen
1a64432ba5
Indicate success/failure in PySpark test script.
2013-01-09 20:30:36 -08:00
Josh Rosen
b57dd0f160
Add mapPartitionsWithSplit() to PySpark.
2013-01-08 16:05:02 -08:00
Josh Rosen
33beba3965
Change PySpark RDD.take() to not call iterator().
2013-01-03 14:52:21 -08:00
Josh Rosen
ce9f1bbe20
Add pyspark script to replace the other scripts.
Expand the PySpark programming guide.
2013-01-01 21:25:49 -08:00
Josh Rosen
b58340dbd9
Rename top-level 'pyspark' directory to 'python'
2013-01-01 15:05:00 -08:00
Josh Rosen
9abdfa6633
Fix Python 2.6 compatibility in Python API.
2012-09-17 00:09:16 -07:00
Josh Rosen
886b39de55
Add Python API.
2012-08-18 22:33:51 -07:00