spark-instrumented-optimizer


Davies Liu 5ccecc078a [SPARK-15392][SQL] fix default value of size estimation of logical plan	2016-05-19 12:12:42 -07:00

## What changes were proposed in this pull request?

We use autoBroadcastJoinThreshold + 1L as the default value of size estimation. That is not good in 2.0, because we calculate the size based on the size of the schema, so the estimate could be less than autoBroadcastJoinThreshold if you have a SELECT on top of a DataFrame created from an RDD. This PR changes the default value to Long.MaxValue.

## How was this patch tested?

Added regression tests.

Author: Davies Liu <davies@databricks.com>

Closes #13183 from davies/fix_default_size.
docs	[SPARK-14906][ML] Copy linalg in PySpark to new ML package	2016-05-17 00:08:02 -07:00
lib	[SPARK-15061][PYSPARK] Upgrade to Py4J 0.10.1	2016-05-13 08:59:18 +01:00
pyspark	[SPARK-15392][SQL] fix default value of size estimation of logical plan	2016-05-19 12:12:42 -07:00
test_support	[SPARK-14555] First cut of Python API for Structured Streaming	2016-04-20 10:32:01 -07:00
.gitignore	[SPARK-3946] gitignore in /python includes wrong directory	2014-10-14 14:09:39 -07:00
pylintrc	[SPARK-13596][BUILD] Move misc top-level build files into appropriate subdirs	2016-03-07 14:48:02 -08:00
run-tests	[SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with dev/run-tests module system	2015-06-27 20:24:34 -07:00
run-tests.py	[SPARK-13579][BUILD] Stop building the main Spark assembly.	2016-04-04 16:52:22 -07:00