spark-instrumented-optimizer/python
Reynold Xin 9952217749 [SPARK-10731] [SQL] Delegate to Scala's DataFrame.take implementation in Python DataFrame.
Python DataFrame.head/take currently require scanning all the partitions. This pull request changes them to delegate the actual implementation to the Scala DataFrame (by calling DataFrame.take).

This is more of a hack to fix the issue in 1.5.1; the proper fix is to change executeCollect and executeTake to return InternalRow rather than Row, thereby eliminating the extra round-trip conversion. (A minimal sketch of the delegation appears below, after this commit entry.)

Author: Reynold Xin <rxin@databricks.com>

Closes #8876 from rxin/SPARK-10731.
2015-09-23 16:43:21 -07:00
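
For illustration only, the sketch below shows the general shape of such a delegation: the Python take asks the JVM-side Scala DataFrame to compute the rows instead of scanning partitions from Python. It assumes a wrapper object holding a Py4J handle to the Scala DataFrame (called `self._jdf` here) and a hypothetical helper `_java_rows_to_python` for converting the returned Java rows; it is not the exact code from this patch.

```python
# Illustrative sketch only, not the actual SPARK-10731 patch.
# Assumptions: `self._jdf` is a Py4J handle to the underlying Scala DataFrame,
# and `_java_rows_to_python` is a hypothetical helper that turns the Java Row
# objects returned over Py4J into Python Row objects.
def take(self, num):
    """Return the first `num` rows by delegating to the JVM-side take."""
    jrows = self._jdf.take(num)           # rows are computed on the Scala side
    return _java_rows_to_python(jrows)    # convert the results for Python use

def head(self, n=None):
    """head reuses take, so it benefits from the same delegation."""
    if n is None:
        rows = self.take(1)
        return rows[0] if rows else None
    return self.take(n)
```

Because head is defined in terms of take, a single delegation point covers both entry points; only the row-conversion step remains on the Python side.
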
docs [SPARK-10440] [STREAMING] [DOCS] Update python API stuff in the programming guides and python docs 2015-09-04 23:16:39 -10:00
lib [SPARK-2305] [PySpark] Update Py4J to version 0.8.2.1 2014-07-29 19:02:06 -07:00
pyspark [SPARK-10731] [SQL] Delegate to Scala's DataFrame.take implementation in Python DataFrame. 2015-09-23 16:43:21 -07:00
test_support [SPARK-10716] [BUILD] spark-1.5.0-bin-hadoop2.6.tgz file doesn't uncompress on OS X due to hidden file 2015-09-21 23:29:59 -07:00
.gitignore [SPARK-3946] gitignore in /python includes wrong directory 2014-10-14 14:09:39 -07:00
run-tests [SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with dev/run-tests module system 2015-06-27 20:24:34 -07:00
run-tests.py [SPARK-9572] [STREAMING] [PYSPARK] Added StreamingContext.getActiveOrCreate() in Python 2015-08-11 12:02:28 -07:00