0343854f54
This change makes a few improvements to the k8s pod allocator so that it behaves a little better when dynamic allocation is on; each change is sketched in code after this message.

(i) Allow the application to ramp up immediately when there's a change in the target number of executors. Without this change, scaling would only trigger when there was a change in the state of the cluster, e.g. an executor going down, or when the periodic snapshot was taken (by default every 30s).

(ii) Get rid of pending pod requests when they're no longer needed, both acknowledged (i.e. Spark knows that a pod is pending resource allocation) and unacknowledged (i.e. Spark has requested the pod but the API server hasn't created it yet). This avoids starting those executors only to remove them after the idle timeout, wasting resources in the meantime.

(iii) Rework some of the code to avoid unnecessary logging. While not bad without dynamic allocation, the existing logging was very chatty when dynamic allocation was on. With the changes, all the useful information is still there, but it's only logged when interesting changes happen.

(iv) Gracefully shut down executors when they become idle. Just deleting the pod causes a lot of ugly logs to show up, so it's better to ask pods to exit nicely. That also allows Spark to respect the "don't delete pods" option when dynamic allocation is on.

Tested on a small k8s cluster running different TPC-DS workloads.

Closes #25236 from vanzin/SPARK-28487.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
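A minimal sketch of change (i), assuming a snapshots store that exposes a notifySubscribers()-style hook which re-runs the allocation logic; PodAllocator, SnapshotsStore, and the method names are illustrative, not Spark's exact internals. When the target changes, the allocator is triggered right away instead of waiting for the next pod event or the periodic snapshot:

```scala
import java.util.concurrent.atomic.AtomicInteger

trait SnapshotsStore {
  // Hypothetical hook: re-run the allocator's snapshot subscribers now.
  def notifySubscribers(): Unit
}

class PodAllocator(snapshotsStore: SnapshotsStore) {
  private val totalExpectedExecutors = new AtomicInteger(0)

  def setTotalExpectedExecutors(total: Int): Unit = {
    // Only kick the allocator when the requested total actually changed.
    if (totalExpectedExecutors.getAndSet(total) != total) {
      snapshotsStore.notifySubscribers()
    }
  }
}
```

The atomic getAndSet comparison keeps the hook from firing when the scheduler re-submits an unchanged target, so steady state stays quiet.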
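Change (ii) in sketch form. The PodClient trait and deletePod helper are assumptions standing in for the real Kubernetes client calls; the point is the bookkeeping: cancel the newest requests first, starting with the unacknowledged ones, then pods that are still pending resource allocation.

```scala
import scala.collection.mutable

trait PodClient {
  // Hypothetical wrapper over the Kubernetes API delete call.
  def deletePod(execId: Long): Unit
}

class PendingRequestPruner(podClient: PodClient) {
  // Requests sent to the API server but not yet seen in any snapshot.
  private val unacknowledged = mutable.LinkedHashSet[Long]()
  // Pods the API server created that are still waiting for resources.
  private val pendingPods = mutable.LinkedHashSet[Long]()

  def podRequested(execId: Long): Unit = unacknowledged += execId

  def podSeenPending(execId: Long): Unit = {
    unacknowledged -= execId
    pendingPods += execId
  }

  def prune(targetExecutors: Int, runningExecutors: Int): Unit = {
    val known = runningExecutors + pendingPods.size + unacknowledged.size
    if (known > targetExecutors) {
      // Newest first: unacknowledged requests, then pending pods.
      val toDelete = (unacknowledged.toSeq.reverse ++ pendingPods.toSeq.reverse)
        .take(known - targetExecutors)
      toDelete.foreach { id =>
        unacknowledged -= id
        pendingPods -= id
        // Issue the delete even for in-flight requests, so the pod does not
        // start only to be killed after the idle timeout.
        podClient.deletePod(id)
      }
    }
  }
}
```

Dropping unacknowledged requests before pending pods removes the cheapest work first: those executors have consumed no cluster resources yet.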
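One way to read change (iii) is as diff-based logging: emit a line only when the tracked state differs from what was last logged. This is a sketch of the pattern, not the commit's actual logging code.

```scala
class AllocatorLogger {
  // The last (target, pending) pair that was written to the log.
  private var lastLogged: Option[(Int, Int)] = None

  def maybeLog(targetExecutors: Int, pendingPods: Int): Unit = {
    val current = (targetExecutors, pendingPods)
    if (!lastLogged.contains(current)) {
      // Log only on change; steady state produces no output.
      println(s"Requesting executors: target=$targetExecutors, pending=$pendingPods")
      lastLogged = Some(current)
    }
  }
}
```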
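A sketch of change (iv), with ExecutorRpc and PodDeleter as hypothetical stand-ins for the driver's RPC channel and the Kubernetes client: the driver first asks each idle executor to exit cleanly, and only deletes the pods when pod deletion has not been disabled.

```scala
trait ExecutorRpc {
  // Hypothetical "please exit" message to a running executor.
  def sendStop(execId: String): Unit
}

trait PodDeleter {
  def deletePod(execId: String): Unit
}

class GracefulKiller(rpc: ExecutorRpc, pods: PodDeleter, deletePodsOnStop: Boolean) {
  def killExecutors(execIds: Seq[String]): Unit = {
    // Ask each executor to shut itself down instead of yanking the pod,
    // which avoids the noisy error logs a hard delete produces.
    execIds.foreach(rpc.sendStop)
    // Respect the "don't delete pods" option: a gracefully exited executor
    // simply leaves a completed pod behind.
    if (deletePodsOnStop) {
      execIds.foreach(pods.deletePod)
    }
  }
}
```

Sending the stop message first means that when pod deletion is disabled, the pod still terminates on its own once the executor exits.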