[SPARK-35501][SQL][TESTS] Add a feature for removing pulled container image for docker integration tests

### What changes were proposed in this pull request?

This PR adds a feature that removes the pulled container image after each docker integration test finishes.
The feature is controlled by the new property `spark.test.docker.removePulledImage` (default: `true`).
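
For reference, here is a condensed sketch of how the flag is consumed; the actual change lives in `DockerJDBCIntegrationSuite` (see the diff below), and the object and variable names here are purely illustrative.

```scala
// Illustrative sketch only, not the patch itself.
object RemovePulledImageSketch {
  // The flag is read from a JVM system property and defaults to true
  // (pom.xml forwards it to the forked test JVM via <systemProperties>).
  val removePulledImage: Boolean =
    sys.props.getOrElse("spark.test.docker.removePulledImage", "true").toBoolean

  def main(args: Array[String]): Unit = {
    // In beforeAll(), the suite remembers whether it had to pull the image:
    //   docker.pull(imageName); pulled = true   // only when the image is absent locally
    val pulled = true // pretend this run pulled the image
    // In the cleanup path, after the container is removed, the image is dropped
    // only when removal is enabled AND this run actually pulled it:
    if (removePulledImage && pulled) {
      println("would call docker.removeImage(imageName) here")
    }
  }
}
```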

### Why are the changes needed?

For idempotency.
I'm trying to add docker integration tests to GitHub Actions (GA) in SPARK-35483 (#32631), but I noticed that `jdbc.OracleIntegrationSuite` consistently fails (https://github.com/sarutak/spark/runs/2646707235?check_suite_focus=true).
I investigated the cause and found that the GA host runs short of storage capacity.
```
 ORACLE PASSWORD FOR SYS AND SYSTEM: oracle
The location '/opt/oracle' specified for database files has insufficient space.
Database creation needs at least '4.5GB' disk space.
Specify a different database file destination that has enough space in the configuration file '/etc/sysconfig/oracle-xe-18c.conf'.
mv: cannot stat '/opt/oracle/product/18c/dbhomeXE/dbs/spfileXE.ora': No such file or directory
mv: cannot stat '/opt/oracle/product/18c/dbhomeXE/dbs/orapwXE': No such file or directory
ORACLE_HOME = [/home/oracle] ? ORACLE_BASE environment variable is not being set since this
information is not available for the current user ID .
You can set ORACLE_BASE manually if it is required.
Resetting ORACLE_BASE to its previous value or ORACLE_HOME
The Oracle base remains unchanged with value /opt/oracle
#####################################
########### E R R O R ###############
DATABASE SETUP WAS NOT SUCCESSFUL!
Please check output for further info!
########### E R R O R ###############
#####################################
The following output is now a tail of the alert.log:
tail: cannot open '/opt/oracle/diag/rdbms/*/*/trace/alert*.log' for reading: No such file or directory
tail: no files remaining
```

With this feature, the pulled container image is removed after each test, which keeps enough capacity for `jdbc.OracleIntegrationSuite` on GA.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

I confirmed the following behaviors (the mapping to the added condition is sketched after this list).

* A container image that is absent from the local repository is removed after the test finishes if `spark.test.docker.removePulledImage` is `true`.
* A container image that is already present in the local repository is not removed after the test finishes, even if `spark.test.docker.removePulledImage` is `true`.
* A container image is not removed, regardless of whether it was present in the local repository, if `spark.test.docker.removePulledImage` is `false`.
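
The three cases above reduce to the single condition added to the teardown path, `removePulledImage && pulled`. Below is a minimal, self-contained check of that mapping; the helper name is hypothetical and not part of the patch.

```scala
object RemovePulledImageCases {
  // Mirrors the condition in the suite's finally block: remove the image only when
  // removal is enabled AND the suite itself pulled the image during this run.
  def shouldRemoveImage(removePulledImage: Boolean, pulledByTest: Boolean): Boolean =
    removePulledImage && pulledByTest

  def main(args: Array[String]): Unit = {
    assert(shouldRemoveImage(removePulledImage = true, pulledByTest = true))    // image was absent locally -> removed
    assert(!shouldRemoveImage(removePulledImage = true, pulledByTest = false))  // image already present -> kept
    assert(!shouldRemoveImage(removePulledImage = false, pulledByTest = true))  // removal disabled -> kept
    println("all three cases behave as described")
  }
}
```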

Closes #32652 from sarutak/docker-image-rm.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Kousuke Saruta 2021-05-26 17:24:29 +09:00 committed by Hyukjin Kwon
parent 50fefc6447
commit 116a97e153
2 changed files with 9 additions and 0 deletions


```diff
@@ -99,6 +99,8 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
   val connectionTimeout = timeout(5.minutes)
   val keepContainer =
     sys.props.getOrElse("spark.test.docker.keepContainer", "false").toBoolean
+  val removePulledImage =
+    sys.props.getOrElse("spark.test.docker.removePulledImage", "true").toBoolean
 
   private var docker: DockerClient = _
   // Configure networking (necessary for boot2docker / Docker Machine)
@@ -109,6 +111,7 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
       port
     }
   private var containerId: String = _
+  private var pulled: Boolean = false
   protected var jdbcUrl: String = _
 
   override def beforeAll(): Unit = {
@@ -130,6 +133,7 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
       case e: ImageNotFoundException =>
         log.warn(s"Docker image ${db.imageName} not found; pulling image from registry")
         docker.pull(db.imageName)
+        pulled = true
     }
     val hostConfigBuilder = HostConfig.builder()
       .privileged(db.privileged)
@@ -215,6 +219,9 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
        }
      } finally {
        docker.removeContainer(containerId)
+       if (removePulledImage && pulled) {
+         docker.removeImage(db.imageName)
+       }
      }
    }
  }
```


```diff
@@ -264,6 +264,7 @@
     <spark.test.home>${session.executionRootDirectory}</spark.test.home>
     <spark.test.webdriver.chrome.driver></spark.test.webdriver.chrome.driver>
     <spark.test.docker.keepContainer>false</spark.test.docker.keepContainer>
+    <spark.test.docker.removePulledImage>true</spark.test.docker.removePulledImage>
     <CodeCacheSize>1g</CodeCacheSize>
 
     <!-- Needed for consistent times -->
@@ -2728,6 +2729,7 @@
               <spark.unsafe.exceptionOnMemoryLeak>true</spark.unsafe.exceptionOnMemoryLeak>
               <spark.test.webdriver.chrome.driver>${spark.test.webdriver.chrome.driver}</spark.test.webdriver.chrome.driver>
               <spark.test.docker.keepContainer>${spark.test.docker.keepContainer}</spark.test.docker.keepContainer>
+              <spark.test.docker.removePulledImage>${spark.test.docker.removePulledImage}</spark.test.docker.removePulledImage>
               <!-- Needed by sql/hive tests. -->
               <test.src.tables>__not_used__</test.src.tables>
             </systemProperties>
```