spark-instrumented-optimizer/resource-managers/kubernetes
Latest commit 57d27e900f by Holden Karau:
[SPARK-31125][K8S] Terminating pods have a deletion timestamp but they are not yet dead
### What changes were proposed in this pull request?

Change what we consider a deleted pod so that it no longer includes pods that are still in the "Terminating" state.
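As a rough illustration of the distinction, here is a minimal sketch of such a classification predicate against the fabric8 Kubernetes client model (the name `isDeleted` and the exact conditions are illustrative, not necessarily the code as merged):

```scala
import java.util.Locale

import io.fabric8.kubernetes.api.model.Pod

// A deletion timestamp only means deletion has been *requested*. If the
// pod still reports a live phase (or a kubectl-style "terminating"), it
// is terminating, not yet deleted, so its executor should be kept around.
def isDeleted(pod: Pod): Boolean = {
  pod.getMetadata.getDeletionTimestamp != null && {
    val phase = Option(pod.getStatus)
      .flatMap(s => Option(s.getPhase))
      .map(_.toLowerCase(Locale.ROOT))
    !phase.exists(p => p == "terminating" || p == "running")
  }
}
```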

### Why are the changes needed?

If we get a new snapshot while a pod is in the process of being cleaned up, we shouldn't delete the executor until the pod is fully terminated.
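Concretely, under the sketch above, a pod that has been asked to terminate but is still running is classified as terminating rather than deleted (again assuming the fabric8 model; the pod below is hypothetical):

```scala
import io.fabric8.kubernetes.api.model.PodBuilder

// A hypothetical executor pod with a deletion timestamp set but still in
// the Running phase, i.e. mid-decommission.
val terminatingPod = new PodBuilder()
  .withNewMetadata()
    .withName("spark-exec-1")
    .withDeletionTimestamp("2020-03-17T19:00:00Z")
  .endMetadata()
  .withNewStatus()
    .withPhase("Running")
  .endStatus()
  .build()

assert(!isDeleted(terminatingPod)) // terminating, but not yet dead
```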

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

This should be covered by the existing decommissioning tests: they are currently flaky because we sometimes delete the executor instead of allowing it to decommission all the way.

I also ran the tests in a loop locally ~80 times; the only failures were in the PV suite, caused by unrelated minikube mount issues.

Closes #27905 from holdenk/SPARK-31125-Processing-state-snapshots-incorrect.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-03-17 12:04:06 -07:00
| Directory | Last commit | Date |
| --- | --- | --- |
| core | [SPARK-31125][K8S] Terminating pods have a deletion timestamp but they are not yet dead | 2020-03-17 12:04:06 -07:00 |
| docker/src/main/dockerfiles/spark | [SPARK-20628][CORE][K8S] Start to improve Spark decommissioning & preemption support | 2020-02-14 12:36:52 -08:00 |
| integration-tests | [SPARK-31062][K8S][TESTS] Improve spark decommissioning k8s test reliability | 2020-03-11 14:42:31 -07:00 |