spark-instrumented-optimizer/python
Wenchen Fan 962e9bcf94 [SPARK-12756][SQL] use hash expression in Exchange
This PR makes bucketing and exchange share one common hash algorithm, so that we can guarantee the data distribution is the same between shuffled data and a bucketed data source. This lets us shuffle only one side when joining a bucketed table with a normal one.

This PR also fixes the tests that are broken by the new hash behaviour in shuffle.
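
The co-partitioning idea in this commit message can be illustrated with a minimal sketch. It is not Spark's implementation: `zlib.crc32` stands in for the shared hash expression, and `NUM_PARTITIONS`, `partition_id`, `bucketize`, and `join_co_partitioned` are hypothetical names introduced only for this example. The point is that when both sides place rows with the same function and the same partition count, equal keys always land in the same partition index, so the pre-bucketed side can stay where it is.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical; assumed equal to the table's bucket count


def partition_id(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition using a deterministic stand-in hash (CRC32).

    This is NOT Spark's hash expression; it only illustrates the idea.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def bucketize(rows, num_partitions=NUM_PARTITIONS):
    """Group (key, value) rows into partitions by hashing the key."""
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[partition_id(key, num_partitions)].append((key, value))
    return parts


def join_co_partitioned(left_parts, right_parts):
    """Join partition i of one side only against partition i of the other."""
    out = []
    for left_part, right_part in zip(left_parts, right_parts):
        lookup = {}
        for key, value in left_part:
            lookup.setdefault(key, []).append(value)
        for key, value in right_part:
            for left_value in lookup.get(key, []):
                out.append((key, left_value, value))
    return out


# "Bucketed table": laid out ahead of time with the shared function.
bucketed = bucketize([(1, "a"), (2, "b"), (3, "c")])
# "Normal table": shuffled at query time with the same function.
shuffled = bucketize([(2, "x"), (3, "y"), (4, "z")])

result = sorted(join_co_partitioned(bucketed, shuffled))
```

If the two sides used different hash functions, matching keys could fall into different partition indexes and the partition-wise join above would silently drop rows, which is why both sides would then need to be shuffled.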

Author: Wenchen Fan <wenchen@databricks.com>

Closes #10703 from cloud-fan/use-hash-expr-in-shuffle.
2016-01-13 22:43:28 -08:00
docs [SPARK-12652][PYSPARK] Upgrade Py4J to 0.9.1 2016-01-12 14:27:05 -08:00
lib [SPARK-12652][PYSPARK] Upgrade Py4J to 0.9.1 2016-01-12 14:27:05 -08:00
pyspark [SPARK-12756][SQL] use hash expression in Exchange 2016-01-13 22:43:28 -08:00
test_support [SPARK-11292] [SQL] Python API for text data source 2015-10-28 14:28:38 -07:00
.gitignore [SPARK-3946] gitignore in /python includes wrong directory 2014-10-14 14:09:39 -07:00
run-tests [SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with dev/run-tests module system 2015-06-27 20:24:34 -07:00
run-tests.py [SPARK-12361][PYSPARK][TESTS] Should set PYSPARK_DRIVER_PYTHON before Python tests 2015-12-16 11:29:51 -08:00