spark-instrumented-optimizer/core
Thomas Graves e5c7f89082 [SPARK-30529][CORE] Improve error messages when Executor dies before registering with driver
…

### What changes were proposed in this pull request?

If the resource discovery goes bad, like it doesn't return enough GPUs,  currently it just throws an exception. This is hard for users to see because you have to go find the executor logs and its not reported back to the driver.  On yarn if you exit explicitly then the driver logs show the error thrown and its much more useful.  On yarn with the explicit exit with non-zero it also goes against failed executor launch attempts and the application will eventually exit. so if its fundamentally a bad configuration or bad discovery script it won't just hang forever.   I also tested on k8s and standalone and the behaviors there don't change, the executor cleanly exit with an error message in the logs. The standalone ui makes it easy to see failed executors.

### Why are the changes needed?

better user experience.

### Does this PR introduce any user-facing change?

no api changes

### How was this patch tested?

ran unit tests and manually tested on yarn, standalone, and k8s.

Closes #27385 from tgravescs/SPARK-30529.

Authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-29 15:37:11 -08:00
..
benchmarks [SPARK-29576][CORE] Use Spark's CompressionCodec for Ser/Deser of MapOutputStatus 2019-10-23 18:17:37 -07:00
src [SPARK-30529][CORE] Improve error messages when Executor dies before registering with driver 2020-01-29 15:37:11 -08:00
pom.xml [INFRA] Reverts commit 56dcd79 and c216ef1 2019-12-16 19:57:44 -07:00