spark-instrumented-optimizer

Achuth17 d36539741f [SPARK-24626][SQL] Improve location size calculation in Analyze Table command	2018-08-09 08:29:24 -07:00

## What changes were proposed in this pull request?

Currently, Analyze table calculates table size sequentially for each partition. We can parallelize size calculations over partitions.

Results: Tested on a table with 100 partitions and data stored in S3.

With changes:
- 10.429s
- 10.557s
- 10.439s
- 9.893s

Without changes:
- 110.034s
- 99.510s
- 100.743s
- 99.106s

## How was this patch tested?

Simple unit test.

Closes #21608 from Achuth17/improveAnalyze.

Lead-authored-by: Achuth17 <Achuth.narayan@gmail.com>
Co-authored-by: arajagopal17 <arajagopal@qubole.com>
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
benchmarks	[SPARK-24549][SQL] Support Decimal type push down to the parquet data sources	2018-07-16 15:44:51 +08:00
src	[SPARK-24626][SQL] Improve location size calculation in Analyze Table command	2018-08-09 08:29:24 -07:00
pom.xml	[SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules	2018-08-06 12:00:39 -07:00