spark-instrumented-optimizer/core
Min Shen 552a332dd4 [SPARK-36423][SHUFFLE] Randomize order of blocks in a push request to improve block merge ratio for push-based shuffle
### What changes were proposed in this pull request?

On the client side, we currently randomize the order of push requests before processing each request. In addition, we can randomize the order of blocks within each push request before pushing them.
In our benchmark, this reduced the number of blocks that fail to be merged due to block collision by 60%-70% (the existing block merge ratio is already good in general, and this improves it further).
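
The two levels of randomization described above can be sketched as follows. This is a minimal illustration, not the code changed in this PR: the class `PushRequestShuffle`, its method names, and the use of plain string block IDs are all hypothetical stand-ins for the client-side push logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: names here are illustrative and are not Spark's API.
class PushRequestShuffle {

    // Returns a copy of one push request's blocks in random order.
    // The input list is left untouched.
    static List<String> randomizeBlocks(List<String> blocks, Random rng) {
        List<String> copy = new ArrayList<>(blocks);
        Collections.shuffle(copy, rng);
        return copy;
    }

    // Randomizes the order of the requests (the existing behavior described
    // above) and then the order of blocks inside each request (the addition).
    static List<List<String>> randomizeRequests(List<List<String>> requests, Random rng) {
        List<List<String>> shuffled = new ArrayList<>();
        for (List<String> request : requests) {
            shuffled.add(randomizeBlocks(request, rng));
        }
        Collections.shuffle(shuffled, rng);
        return shuffled;
    }
}
```

The intuition is that when many clients push blocks in the same order, they are more likely to contend for the same merge target at the same moment; shuffling the blocks within each request spreads that contention out. Passing the `Random` in explicitly is a design choice of this sketch so that the ordering can be made reproducible in tests.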

### Why are the changes needed?

Improve block merge ratio for push-based shuffle

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Straightforward small change, no additional test needed.

Closes #33649 from Victsm/SPARK-36423.

Lead-authored-by: Min Shen <mshen@linkedin.com>
Co-authored-by: Min Shen <victor.nju@gmail.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
(cherry picked from commit 6e729515fd)
Signed-off-by: Mridul Muralidharan <mridulatgmail.com>
2021-08-06 09:48:31 -05:00
..
benchmarks [SPARK-35670][BUILD] Upgrade ZSTD-JNI to 1.5.0-2 2021-06-17 11:06:50 -07:00
src [SPARK-36423][SHUFFLE] Randomize order of blocks in a push request to improve block merge ratio for push-based shuffle 2021-08-06 09:48:31 -05:00
pom.xml [SPARK-35928][BUILD] Upgrade ASM to 9.1 2021-06-29 10:27:51 -07:00