[SPARK-3685][CORE] Prints explicit warnings when configured local directories are set to URIs

## What changes were proposed in this pull request?

This PR proposes to print warnings before creating local by `java.io.File`.

I think we can't just simply disallow and throw an exception for such cases of `hdfs:/tmp/foo` case because it might break compatibility. Note that `hdfs:/tmp/foo` creates a directory called `hdfs:/`.

There were many discussion here about whether we should support this in other file systems or not; however, since the JIRA targets "Spark's local dir should accept only local paths", here I tried to resolve it by simply printing warnings.

I think we could open another JIRA and design doc if this is something we should support, separately. Another note, for your information, [SPARK-1529](https://issues.apache.org/jira/browse/SPARK-1529) is resolved as `Won't Fix`.

**Before**

```
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```

This creates a local directory as below:

```
 file:/
└── a
    └── b
        └── c
        ...
```

**After**

```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c
```

Now, it prints a warning as below:

```
...
17/12/09 21:58:49 WARN Utils: The configured local directories are not expected to be URIs; however, got suspicious values [file:/a/b/c]. Please check your configured local directories.
...
```

```bash
./bin/spark-shell --conf spark.local.dir=file:/a/b/c,/tmp/a/b/c,hdfs:/a/b/c
```

It also works with comma-separated ones:

```
...
17/12/09 22:05:01 WARN Utils: The configured local directories are not expected to be URIs; however, got suspicious values [file:/a/b/c, hdfs:/a/b/c]. Please check your configured local directories.
...
 ```

## How was this patch tested?

 Manually tested:

 ```
 ./bin/spark-shell --conf spark.local.dir=C:\\a\\b\\c
 ./bin/spark-shell --conf spark.local.dir=/tmp/a/b/c
 ./bin/spark-shell --conf spark.local.dir=a/b/c
 ./bin/spark-shell --conf spark.local.dir=a/b/c,/tmp/a/b/c,C:\\a\\b\\c
 ```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19934 from HyukjinKwon/SPARK-3685.
This commit is contained in:
hyukjinkwon 2017-12-12 17:02:04 +09:00
parent ecc179ecaa
commit bc8933faf2

View file

@ -829,7 +829,18 @@ private[spark] object Utils extends Logging {
}
private def getOrCreateLocalRootDirsImpl(conf: SparkConf): Array[String] = {
getConfiguredLocalDirs(conf).flatMap { root =>
val configuredLocalDirs = getConfiguredLocalDirs(conf)
val uris = configuredLocalDirs.filter { root =>
// Here, we guess if the given value is a URI at its best - check if scheme is set.
Try(new URI(root).getScheme != null).getOrElse(false)
}
if (uris.nonEmpty) {
logWarning(
"The configured local directories are not expected to be URIs; however, got suspicious " +
s"values [${uris.mkString(", ")}]. Please check your configured local directories.")
}
configuredLocalDirs.flatMap { root =>
try {
val rootDir = new File(root)
if (rootDir.exists || rootDir.mkdirs()) {