spark-instrumented-optimizer/resource-managers/kubernetes
Holden Karau ce6180c8c3 [SPARK-33154][CORE][K8S] Handle cleaned shuffles during migration
### What changes were proposed in this pull request?

If a block is removed between its discovery and its transfer, we short-circuit that block: it is removed from the list of blocks to transfer and the transferred-blocks counter is incremented. This is complicated because both RPC errors and local read errors may be reported with the same exception class.
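
The short-circuit described above can be sketched as follows. This is a minimal illustration, not the code in this patch: `MigrationState`, `TransferError`, `block_exists`, and `transfer` are hypothetical names, and the sketch assumes that checking whether the block still exists locally is how a cleaned-up block is told apart from a genuine RPC failure.

```python
from dataclasses import dataclass, field


class TransferError(Exception):
    """Hypothetical stand-in for the single exception class that may wrap
    either an RPC failure or a local read failure."""


@dataclass
class MigrationState:
    pending: list                               # blocks discovered, not yet transferred
    migrated: int = 0                           # blocks counted as done
    retry: list = field(default_factory=list)   # blocks that failed but still exist


def migrate(state, block_exists, transfer):
    """Attempt each pending block once, short-circuiting cleaned-up blocks."""
    while state.pending:
        block_id = state.pending.pop(0)
        try:
            transfer(block_id)
            state.migrated += 1
        except TransferError:
            if not block_exists(block_id):
                # The block was removed between discovery and transfer:
                # nothing is left to move, so count it as transferred.
                state.migrated += 1
            else:
                # The block is still present, so the failure was probably
                # on the RPC side; keep the block for a later attempt.
                state.retry.append(block_id)
    return state
```

Because the exception type alone does not say which side failed, the sketch decides after the fact by checking local state, which keeps a deleted block from being treated as a failure of the destination host.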

### Why are the changes needed?

Slow shuffle refreshes could waste time when decommissioning has already finished. In addition, if a deleted block fails to transfer to an otherwise live host, that host may be marked as "full", and decommissioning might then avoid transferring some blocks to it.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

New unit and integration tests.

Closes #30046 from holdenk/handle-cleaned-shuffles-during0migration.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-10-16 14:47:46 -07:00
core [SPARK-33155][K8S] spark.kubernetes.pyspark.pythonVersion allows only '3' 2020-10-15 01:51:01 -07:00
docker/src/main/dockerfiles/spark [SPARK-33155][K8S] spark.kubernetes.pyspark.pythonVersion allows only '3' 2020-10-15 01:51:01 -07:00
integration-tests [SPARK-33154][CORE][K8S] Handle cleaned shuffles during migration 2020-10-16 14:47:46 -07:00