spark-instrumented-optimizer/resource-managers
Liupengcheng eb6fd7eab7 [SPARK-26877][YARN] Support user-level app staging directory in yarn mode when spark.yarn…
Currently, when running applications on yarn mode, the app staging directory of  is controlled by `spark.yarn.stagingDir` config if specified, and this directory cannot separate different users, sometimes, it's inconvenient for file and quota management for users.

Sometimes, there might be an unexpected increasing of the staging files, two possible reasons are:
1. The `spark.yarn.preserve.staging.files` provided can be misused by users
2. cron task constantly starting new applications on non-existent yarn queue(wrong configuration).

But now, we are not easy to find out the which user obtains the most HDFS files or spaces.
what's more, even we want set HDFS name quota or space quota for each user to limit the increase is impossible.

So I propose to add user sub directories under this app staging directory which is more clear.

existing UT

Closes #23786 from liupc/Support-user-level-app-staging-dir.

Authored-by: Liupengcheng <liupengcheng@xiaomi.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-02-20 11:45:17 -08:00
..
kubernetes [SPARK-24894][K8S] Make sure valid host names are created for executors. 2019-02-19 15:19:59 -08:00
mesos [SPARK-26817][CORE] Use System.nanoTime to measure time intervals 2019-02-13 13:12:16 -06:00
yarn [SPARK-26877][YARN] Support user-level app staging directory in yarn mode when spark.yarn… 2019-02-20 11:45:17 -08:00