spark-instrumented-optimizer/common
Denis Tarima cfcd094147 [SPARK-36036][CORE] Fix cleanup of DownloadFile resources
### What changes were proposed in this pull request?

There was a regression since Spark started storing large remote files on disk (https://issues.apache.org/jira/browse/SPARK-22062). In 2018 a refactoring introduced a hidden reference preventing the auto-deletion of the files (a97001d217 (diff-42a673b8fa5f2b999371dc97a5de7ebd2c2ec19447353d39efb7e8ebc012fe32L1677)). Since then all underlying files of DownloadFile instances are kept on disk for the duration of the Spark application which sometimes results in "no space left" errors.

`ReferenceWithCleanup` class uses `file` (the `DownloadFile`) in `cleanUp(): Unit` method so it has to keep a reference to it which prevents it from being garbage-collected.
```
def cleanUp(): Unit = {
  logDebug(s"Clean up file $filePath")

  if (!file.delete()) {                                      <--- here
    logDebug(s"Fail to delete file $filePath")
  }
}
```

### Why are the changes needed?

Long-running Spark applications require freeing resources when they are not needed anymore, and iterative algorithms could use all the disk space quickly too.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added a test in BlockManagerSuite and tested manually.

Closes #33251 from dtarima/fix-download-file-cleanup.

Authored-by: Denis Tarima <dtarima@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2021-07-11 11:54:23 -05:00
..
kvstore [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT 2021-07-02 13:47:36 -07:00
network-common [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT 2021-07-02 13:47:36 -07:00
network-shuffle [SPARK-36036][CORE] Fix cleanup of DownloadFile resources 2021-07-11 11:54:23 -05:00
network-yarn [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT 2021-07-02 13:47:36 -07:00
sketch [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT 2021-07-02 13:47:36 -07:00
tags [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT 2021-07-02 13:47:36 -07:00
unsafe [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT 2021-07-02 13:47:36 -07:00