9952217749
Python DataFrame.head/take now requires scanning all the partitions. This pull request changes them to delegate the actual implementation to Scala DataFrame (by calling DataFrame.take). This is more of a hack for fixing this issue in 1.5.1. A more proper fix is to change executeCollect and executeTake to return InternalRow rather than Row, and thus eliminate the extra round-trip conversion. Author: Reynold Xin <rxin@databricks.com> Closes #8876 from rxin/SPARK-10731. |
||
---|---|---|
.. | ||
docs | ||
lib | ||
pyspark | ||
test_support | ||
.gitignore | ||
run-tests | ||
run-tests.py |