5b3277f4fc
### What changes were proposed in this pull request? This PR propose 1. Explicitly include xml-apis. xml-apis is already the part of xerces 2.12.0 (https://repo1.maven.org/maven2/xerces/xercesImpl/2.12.0/xercesImpl-2.12.0.pom). However, we're excluding it by setting `scope` to `test`. This seems causing `spark-shell`, built from Maven, to fail. Seems like previously xml-apis wasn't reached for some reasons but after we upgrade, it seems requiring. Therefore, this PR proposes to include it. 2. Pins `xerces` version in SBT as well. Seems this dependency is resolved differently from Maven. Note that Hadoop 3 does not looks requiring this as they replaced xerces as of [HDFS-12221](https://issues.apache.org/jira/browse/HDFS-12221). ### Why are the changes needed? To make `spark-shell` working from Maven build, and uses the same xerces version. ### Does this PR introduce any user-facing change? No, it's master only. ### How was this patch tested? **1.** ```bash ./build/mvn -DskipTests -Psparkr -Phive clean package ./bin/spark-shell ``` Before: ``` Exception in thread "main" java.lang.NoClassDefFoundError: org/w3c/dom/ElementTraversal at java.lang.ClassLoader.defineClass1(Native Method) at java.lang.ClassLoader.defineClass(ClassLoader.java:763) at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142) at java.net.URLClassLoader.defineClass(URLClassLoader.java:468) at java.net.URLClassLoader.access$100(URLClassLoader.java:74) at java.net.URLClassLoader$1.run(URLClassLoader.java:369) at java.net.URLClassLoader$1.run(URLClassLoader.java:363) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:362) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) at org.apache.xerces.parsers.AbstractDOMParser.startDocument(Unknown Source) at org.apache.xerces.xinclude.XIncludeHandler.startDocument(Unknown Source) at org.apache.xerces.impl.dtd.XMLDTDValidator.startDocument(Unknown Source) at org.apache.xerces.impl.XMLDocumentScannerImpl.startEntity(Unknown Source) at org.apache.xerces.impl.XMLVersionDetector.startDocumentParsing(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source) at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) at org.apache.xerces.parsers.DOMParser.parse(Unknown Source) at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source) at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150) at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2482) at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2470) at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2541) at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2494) at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2407) at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143) at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115) at org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456) at org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427) at org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342) at scala.Option.getOrElse(Option.scala:189) at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.lang.ClassNotFoundException: org.w3c.dom.ElementTraversal at java.net.URLClassLoader.findClass(URLClassLoader.java:382) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 42 more ``` After: ``` ... Welcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 3.1.0-SNAPSHOT /_/ Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_202) Type in expressions to have them evaluated. Type :help for more information. scala> ``` **2.** ``` ./build/sbt dependencyTree -Phadoop-2.7 -Phive-2.3 -Phive-thriftserver -Phive ./build/sbt dependencyTree -Phadoop-3.2 -Phive-2.3 -Phive-thriftserver -Phive ``` Closes #27808 from HyukjinKwon/SPARK-30994. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org> |
||
---|---|---|
.. | ||
spark-deps-hadoop-2.7-hive-1.2 | ||
spark-deps-hadoop-2.7-hive-2.3 | ||
spark-deps-hadoop-3.2-hive-2.3 |