[SPARK-26134][CORE] Upgrading Hadoop to 2.7.4 to fix java.version problem

## What changes were proposed in this pull request?

When I ran spark-shell on JDK 11+28 (2018-09-25), it failed with the error below.

```
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:80)
	at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:611)
	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
	at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:791)
	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:761)
	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:634)
	at org.apache.spark.util.Utils$.$anonfun$getCurrentUserName$1(Utils.scala:2427)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2427)
	at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:79)
	at org.apache.spark.deploy.SparkSubmit.secMgr$lzycompute$1(SparkSubmit.scala:359)
	at org.apache.spark.deploy.SparkSubmit.secMgr$1(SparkSubmit.scala:359)
	at org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$9(SparkSubmit.scala:367)
	at scala.Option.map(Option.scala:146)
	at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:367)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:143)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:927)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:936)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.StringIndexOutOfBoundsException: begin 0, end 3, length 2
	at java.base/java.lang.String.checkBoundsBeginEnd(String.java:3319)
	at java.base/java.lang.String.substring(String.java:1874)
	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:52)
```
This is caused by a Hadoop bug that fails to parse certain `java.version` strings. It has been fixed since Hadoop 2.7.4 (see [HADOOP-14586](https://issues.apache.org/jira/browse/HADOOP-14586)).
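The failure comes from Hadoop's `Shell` class taking the first three characters of the `java.version` system property in a static initializer, which breaks on the short version strings introduced in JDK 9 (`"9"`, `"11"`, ...). A minimal sketch of the pre-2.7.4 logic (class and method names here are illustrative, not Hadoop's actual code):

```java
// Illustrative reproduction of the pre-HADOOP-14586 check: Shell assumed
// java.version always starts like "1.8...", so taking the first three
// characters was enough to compare against "1.7".
public class JavaVersionCheck {
    static String versionPrefix(String javaVersion) {
        // Throws StringIndexOutOfBoundsException when the string is shorter
        // than 3 characters, e.g. "11" on JDK 11 ("begin 0, end 3, length 2").
        return javaVersion.substring(0, 3);
    }

    public static void main(String[] args) {
        System.out.println(versionPrefix("1.8.0_181")); // prints "1.8"
        try {
            versionPrefix("11"); // JDK 11 reports just "11"
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Because the parse happens in a static initializer, the `StringIndexOutOfBoundsException` surfaces as the `ExceptionInInitializerError` seen in the stack trace above.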

Note that Hadoop 2.7.5 and later have another problem with Spark ([SPARK-25330](https://issues.apache.org/jira/browse/SPARK-25330)), so upgrading to 2.7.4 is the right choice for now.

## How was this patch tested?
Existing tests.

Closes #23101 from tasanuma/SPARK-26134.

Authored-by: Takanobu Asanuma <tasanuma@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Commit: 15c0384977 (parent: 8d54bf79f2)
Date: 2018-11-21 23:09:57 -08:00
5 changed files with 20 additions and 19 deletions


```diff
@@ -9,4 +9,4 @@ This module is off by default. To activate it specify the profile in the command
 If you need to build an assembly for a different version of Hadoop the
 hadoop-version system property needs to be set as in this example:
-    -Dhadoop.version=2.7.3
+    -Dhadoop.version=2.7.4
```


```diff
@@ -64,21 +64,21 @@ gson-2.2.4.jar
 guava-14.0.1.jar
 guice-3.0.jar
 guice-servlet-3.0.jar
-hadoop-annotations-2.7.3.jar
-hadoop-auth-2.7.3.jar
-hadoop-client-2.7.3.jar
-hadoop-common-2.7.3.jar
-hadoop-hdfs-2.7.3.jar
-hadoop-mapreduce-client-app-2.7.3.jar
-hadoop-mapreduce-client-common-2.7.3.jar
-hadoop-mapreduce-client-core-2.7.3.jar
-hadoop-mapreduce-client-jobclient-2.7.3.jar
-hadoop-mapreduce-client-shuffle-2.7.3.jar
-hadoop-yarn-api-2.7.3.jar
-hadoop-yarn-client-2.7.3.jar
-hadoop-yarn-common-2.7.3.jar
-hadoop-yarn-server-common-2.7.3.jar
-hadoop-yarn-server-web-proxy-2.7.3.jar
+hadoop-annotations-2.7.4.jar
+hadoop-auth-2.7.4.jar
+hadoop-client-2.7.4.jar
+hadoop-common-2.7.4.jar
+hadoop-hdfs-2.7.4.jar
+hadoop-mapreduce-client-app-2.7.4.jar
+hadoop-mapreduce-client-common-2.7.4.jar
+hadoop-mapreduce-client-core-2.7.4.jar
+hadoop-mapreduce-client-jobclient-2.7.4.jar
+hadoop-mapreduce-client-shuffle-2.7.4.jar
+hadoop-yarn-api-2.7.4.jar
+hadoop-yarn-client-2.7.4.jar
+hadoop-yarn-common-2.7.4.jar
+hadoop-yarn-server-common-2.7.4.jar
+hadoop-yarn-server-web-proxy-2.7.4.jar
 hk2-api-2.4.0-b34.jar
 hk2-locator-2.4.0-b34.jar
 hk2-utils-2.4.0-b34.jar
@@ -117,6 +117,7 @@ jersey-guava-2.22.2.jar
 jersey-media-jaxb-2.22.2.jar
 jersey-server-2.22.2.jar
 jetty-6.1.26.jar
+jetty-sslengine-6.1.26.jar
 jetty-util-6.1.26.jar
 jline-2.14.6.jar
 joda-time-2.9.3.jar
```


```diff
@@ -118,7 +118,7 @@
 <sbt.project.name>spark</sbt.project.name>
 <slf4j.version>1.7.16</slf4j.version>
 <log4j.version>1.2.17</log4j.version>
-<hadoop.version>2.7.3</hadoop.version>
+<hadoop.version>2.7.4</hadoop.version>
 <protobuf.version>2.5.0</protobuf.version>
 <yarn.version>${hadoop.version}</yarn.version>
 <zookeeper.version>3.4.6</zookeeper.version>
```


```diff
@@ -107,7 +107,7 @@ properties to Maven. For example:
 mvn integration-test -am -pl :spark-kubernetes-integration-tests_2.11 \
     -Pkubernetes -Pkubernetes-integration-tests \
-    -Phadoop-2.7 -Dhadoop.version=2.7.3 \
+    -Phadoop-2.7 -Dhadoop.version=2.7.4 \
     -Dspark.kubernetes.test.sparkTgz=spark-3.0.0-SNAPSHOT-bin-example.tgz \
     -Dspark.kubernetes.test.imageTag=sometag \
     -Dspark.kubernetes.test.imageRepo=docker.io/somerepo \
```


```diff
@@ -65,7 +65,7 @@ private[hive] object IsolatedClientLoader extends Logging {
       case e: RuntimeException if e.getMessage.contains("hadoop") =>
         // If the error message contains hadoop, it is probably because the hadoop
        // version cannot be resolved.
-        val fallbackVersion = "2.7.3"
+        val fallbackVersion = "2.7.4"
         logWarning(s"Failed to resolve Hadoop artifacts for the version $hadoopVersion. We " +
           s"will change the hadoop version from $hadoopVersion to $fallbackVersion and try " +
           "again. Hadoop classes will not be shared between Spark and Hive metastore client. " +
```