d36539741f
## What changes were proposed in this pull request? Currently, Analyze table calculates table size sequentially for each partition. We can parallelize size calculations over partitions. Results : Tested on a table with 100 partitions and data stored in S3. With changes : - 10.429s - 10.557s - 10.439s - 9.893s Without changes : - 110.034s - 99.510s - 100.743s - 99.106s ## How was this patch tested? Simple unit test. Closes #21608 from Achuth17/improveAnalyze. Lead-authored-by: Achuth17 <Achuth.narayan@gmail.com> Co-authored-by: arajagopal17 <arajagopal@qubole.com> Signed-off-by: Xiao Li <gatorsmile@gmail.com> |
||
---|---|---|
.. | ||
benchmarks | ||
src | ||
pom.xml |