spark-instrumented-optimizer/core/benchmarks
fitermay 21db4336b0 [SPARK-27070] Fix performance bug in DefaultPartitionCoalescer
When trying to coalesce a UnionRDD of two large FileScanRDDs
(each with a few million partitions) into around 8k partitions,
the driver can stall for over an hour.

Profiling shows that over 90% of the time is spent in TimSort,
which is invoked by `pickBin`. This patch replaces the sort with a more
efficient `min` to find the least occupied PartitionGroup.
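
The idea behind the fix can be illustrated with a minimal sketch. It is written in Python rather than Scala and is not the Spark implementation: the `PartitionGroup` class, its fields, and both helper functions are hypothetical stand-ins. It only shows why a full sort is unnecessary when just the smallest element is needed.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PartitionGroup:
    """Hypothetical stand-in for a coalescer group; not Spark's class."""
    name: str
    partitions: list = field(default_factory=list)

    @property
    def num_partitions(self) -> int:
        return len(self.partitions)


def least_occupied_by_sort(groups):
    # O(n log n): orders every group only to read the first element.
    return sorted(groups, key=lambda g: g.num_partitions)[0]


def least_occupied_by_min(groups):
    # O(n): a single pass that tracks the smallest group seen so far.
    return min(groups, key=lambda g: g.num_partitions)


if __name__ == "__main__":
    random.seed(0)
    groups = [
        PartitionGroup(f"g{i}", list(range(random.randint(1, 50))))
        for i in range(1000)
    ]
    # Both strategies pick the same group; only the cost differs.
    assert least_occupied_by_sort(groups) is least_occupied_by_min(groups)
```

When the lookup runs once per partition being assigned, the sort-based variant repeats an O(n log n) pass each time, which is consistent with the profile described above; the `min` variant keeps each lookup linear in the number of groups.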

Closes #23986 from fitermay/SPARK-27070.

Authored-by: fitermay <fiterman@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-03-14 20:13:18 -05:00
CoalescedRDDBenchmark-results.txt [SPARK-27070] Fix performance bug in DefaultPartitionCoalescer 2019-03-14 20:13:18 -05:00
KryoBenchmark-results.txt [SPARK-25490][SQL][TEST] Fix OOM of KryoBenchmark due to large 2D array and refactor it to use main method 2018-10-24 16:56:17 -05:00
KryoSerializerBenchmark-results.txt [SPARK-25839][CORE] Implement use of KryoPool in KryoSerializer 2018-11-10 12:51:24 -06:00
XORShiftRandomBenchmark-results.txt [SPARK-26816][CORE][TEST] Add XORShiftRandom Benchmark 2019-02-10 13:52:24 -08:00