spark-instrumented-optimizer/python
Matei Zaharia feba7ee540 SPARK-815. Python parallelize() should split lists before batching
One unfortunate consequence of this fix is that we materialize any
collections that are given to us as generators, but this seems necessary
to get reasonable behavior on small collections. We could add a
batchSize parameter later to bypass auto-computation of batch size if
this becomes a problem (e.g. if users really want to parallelize big
generators nicely).
2013-07-29 02:51:43 -04:00
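
A minimal sketch of why the order of splitting and batching matters on small collections. This is not the PySpark implementation. The helper names `split_then_batch` and `batch_then_split`, and the batch size of 1024 used in the example, are hypothetical and chosen only for illustration. It also shows why a generator has to be materialized: its length must be known before it can be cut into slices.

```python
def split_then_batch(data, num_slices, batch_size):
    """Hypothetical helper: cut the data into slices first, then batch each slice."""
    # list() materializes generators; the length is needed to compute slice bounds.
    items = list(data)
    n = len(items)
    slices = [items[i * n // num_slices:(i + 1) * n // num_slices]
              for i in range(num_slices)]
    # Batch inside each slice, so every slice keeps its share of the elements.
    return [[s[j:j + batch_size] for j in range(0, len(s), batch_size)]
            for s in slices]


def batch_then_split(data, num_slices, batch_size):
    """Hypothetical helper: batch the data first, then distribute whole batches."""
    items = list(data)
    batches = [items[j:j + batch_size] for j in range(0, len(items), batch_size)]
    n = len(batches)
    # A small collection yields a single batch, so only one slice receives data.
    return [batches[i * n // num_slices:(i + 1) * n // num_slices]
            for i in range(num_slices)]


if __name__ == "__main__":
    small = range(4)
    print(split_then_batch(small, 2, 1024))  # [[[0, 1]], [[2, 3]]]
    print(batch_then_split(small, 2, 1024))  # [[], [[0, 1, 2, 3]]]
```

With four elements and two slices, splitting first gives each slice two elements, while batching first leaves one slice empty because all four elements end up in a single batch.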
examples Add Apache license headers and LICENSE and NOTICE files 2013-07-16 17:21:33 -07:00
lib Rename top-level 'pyspark' directory to 'python' 2013-01-01 15:05:00 -08:00
pyspark SPARK-815. Python parallelize() should split lists before batching 2013-07-29 02:51:43 -04:00
test_support Allow PySpark's SparkFiles to be used from driver 2013-01-23 10:58:50 -08:00
.gitignore Rename top-level 'pyspark' directory to 'python' 2013-01-01 15:05:00 -08:00
epydoc.conf Add Apache license headers and LICENSE and NOTICE files 2013-07-16 17:21:33 -07:00
run-tests Allow python/run-tests to run from any directory 2013-07-29 02:51:43 -04:00