[SPARK-3685][CORE] Prints explicit warnings when configured local directories are set to URIs
## What changes were proposed in this pull request?

This PR proposes to print warnings before creating local directories with `java.io.File`. I think we can't simply disallow such values and throw an exception for cases like `hdfs:/tmp/foo`, because it might break compatibility. Note that `hdfs:/tmp/foo` creates a directory literally named `hdfs:/`.

There was much discussion here about whether we should support this in other file systems or not; however, since the JIRA targets "Spark's local dir should accept only local paths", here I tried to resolve it by simply printing warnings. I think we could open another JIRA and design doc if this is something we should support, separately. Another note, for your information: [SPARK-1529](https://issues.apache.org/jira/browse/SPARK-1529) was resolved as `Won't Fix`.

**Before**

```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```

This creates a local directory as below:

```
file:/
└── a
    └── b
        └── c
...
```

**After**

```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```

Now, it prints a warning as below:

```
...
17/12/09 21:58:49 WARN Utils: The configured local directories are not expected to be URIs; however, got suspicious values [file:/a/b/c]. Please check your configured local directories.
...
```

```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c,/tmp/a/b/c,hdfs:/a/b/c
```

It also works with comma-separated values:

```
...
17/12/09 22:05:01 WARN Utils: The configured local directories are not expected to be URIs; however, got suspicious values [file:/a/b/c, hdfs:/a/b/c]. Please check your configured local directories.
...
```

## How was this patch tested?

Manually tested:

```bash
./bin/spark-shell --conf spark.local.dir=C:\\a\\b\\c
./bin/spark-shell --conf spark.local.dir=/tmp/a/b/c
./bin/spark-shell --conf spark.local.dir=a/b/c
./bin/spark-shell --conf spark.local.dir=a/b/c,/tmp/a/b/c,C:\\a\\b\\c
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19934 from HyukjinKwon/SPARK-3685.
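The URI detection in this patch is a best-effort heuristic: a configured directory is treated as suspicious only if it parses as a `java.net.URI` with a non-null scheme. A minimal standalone sketch of that check (the helper name `looksLikeUri` is illustrative, not Spark API):

```scala
import java.net.URI
import scala.util.Try

object LocalDirCheck {
  // Best-effort guess: the value is URI-like if it parses and has a scheme,
  // e.g. "file:/a/b/c" or "hdfs:/a/b/c". Plain paths ("/tmp/a/b/c", "a/b/c")
  // have a null scheme, and Windows paths with backslashes fail to parse,
  // so Try(...).getOrElse(false) treats them as non-URIs.
  def looksLikeUri(path: String): Boolean =
    Try(new URI(path).getScheme != null).getOrElse(false)

  def main(args: Array[String]): Unit = {
    val dirs = Seq("file:/a/b/c", "/tmp/a/b/c", "hdfs:/a/b/c", "a/b/c")
    val suspicious = dirs.filter(looksLikeUri)
    println(suspicious.mkString(", "))  // file:/a/b/c, hdfs:/a/b/c
  }
}
```

Note that this is only a guess, as the PR comment says: a value like `C:/a/b/c` (forward slashes) would also parse with scheme `C`, which is why the patch warns rather than rejecting such values outright.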
parent ecc179ecaa
commit bc8933faf2
```diff
@@ -829,7 +829,18 @@ private[spark] object Utils extends Logging {
   }
 
   private def getOrCreateLocalRootDirsImpl(conf: SparkConf): Array[String] = {
-    getConfiguredLocalDirs(conf).flatMap { root =>
+    val configuredLocalDirs = getConfiguredLocalDirs(conf)
+    val uris = configuredLocalDirs.filter { root =>
+      // Here, we guess if the given value is a URI at its best - check if scheme is set.
+      Try(new URI(root).getScheme != null).getOrElse(false)
+    }
+    if (uris.nonEmpty) {
+      logWarning(
+        "The configured local directories are not expected to be URIs; however, got suspicious " +
+          s"values [${uris.mkString(", ")}]. Please check your configured local directories.")
+    }
+
+    configuredLocalDirs.flatMap { root =>
       try {
         val rootDir = new File(root)
         if (rootDir.exists || rootDir.mkdirs()) {
```