[SPARK-34202][SQL][TEST] Add ability to fetch spark release package from internal environment in HiveExternalCatalogVersionsSuite
### What changes were proposed in this pull request? `HiveExternalCatalogVersionsSuite` can't run in orgs internal environment where access to outside internet is not allowed because `HiveExternalCatalogVersionsSuite` will download spark release package from internet. Similar to SPARK-32998, this pr add 1 environment variables `SPARK_RELEASE_MIRROR` to let user can specify an accessible download address of spark release package and run `HiveExternalCatalogVersionsSuite` in orgs internal environment. ### Why are the changes needed? Let `HiveExternalCatalogVersionsSuite` can run in orgs internal environment without relying on external spark release download address. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass the Jenkins or GitHub Action - Manual test with and without env variables set in internal environment can't access internet. execute ``` mvn clean install -Dhadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -PhPhive -pl sql/hive -am -DskipTests mvn clean install -Dhadoop-3.2 -Phive-2.3 -Phadoop-cloud -Pmesos -Pyarn -Pkinesis-asl -Phive-thriftserver -Pspark-ganglia-lgpl -Pkubernetes -PhPhive -pl sql/hive -DwildcardSuites=org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite -Dtest=none ``` **Without env** ``` HiveExternalCatalogVersionsSuite: 19:50:35.123 WARN org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite: Failed to download Spark 3.0.1 from https://archive.apache.org/dist/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz: Network is unreachable (connect failed) 19:50:35.126 WARN org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite: Failed to download Spark 3.0.1 from https://dist.apache.org/repos/dist/release/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz: Network is unreachable (connect failed) org.apache.spark.sql.hive.HiveExternalCatalogVersionsSuite *** ABORTED *** Exception encountered when invoking run on a nested suite - Unable to download Spark 3.0.1 (HiveExternalCatalogVersionsSuite.scala:125) Run completed in 2 seconds, 669 milliseconds. Total number of tests run: 0 Suites: completed 1, aborted 1 Tests: succeeded 0, failed 0, canceled 0, ignored 0, pending 0 ``` **With env** ``` export SPARK_RELEASE_MIRROR=${spark-release.internal.com}/dist/release/ ``` ``` HiveExternalCatalogVersionsSuite - backward compatibility Run completed in 1 minute, 32 seconds. Total number of tests run: 1 Suites: completed 2, aborted 0 Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0 All tests passed. ``` Closes #31294 from LuciferYang/SPARK-34202. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
This commit is contained in:
parent
134a7d7eb9
commit
e48a8ad1a2
|
@ -237,14 +237,15 @@ class HiveExternalCatalogVersionsSuite extends SparkSubmitTestUtils {
|
|||
}
|
||||
|
||||
object PROCESS_TABLES extends QueryTest with SQLTestUtils {
|
||||
val releaseMirror = "https://dist.apache.org/repos/dist/release"
|
||||
val releaseMirror = sys.env.getOrElse("SPARK_RELEASE_MIRROR",
|
||||
"https://dist.apache.org/repos/dist/release")
|
||||
// Tests the latest version of every release line.
|
||||
val testingVersions: Seq[String] = {
|
||||
import scala.io.Source
|
||||
val versions: Seq[String] = try {
|
||||
Source.fromURL(s"${releaseMirror}/spark").mkString
|
||||
.split("\n")
|
||||
.filter(_.contains("""<li><a href="spark-"""))
|
||||
.filter(_.contains("""<a href="spark-"""))
|
||||
.filterNot(_.contains("preview"))
|
||||
.map("""<a href="spark-(\d.\d.\d)/">""".r.findFirstMatchIn(_).get.group(1))
|
||||
.filter(_ < org.apache.spark.SPARK_VERSION)
|
||||
|
|
Loading…
Reference in a new issue