a2409d1c8e
Cherry picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork. Tests not run yet: that's what the machines are for Author: Steve Loughran <stevel@hortonworks.com> Author: Cheng Lian <lian@databricks.com> Author: Michael Armbrust <michael@databricks.com> Author: Patrick Wendell <patrick@databricks.com> Closes #7191 from steveloughran/stevel/feature/SPARK-8064-hive-1.2-002 and squashes the following commits: 7556d85 [Cheng Lian] Updates .q files and corresponding golden files ef4af62 [Steve Loughran] Merge commit '6a92bb09f46a04d6cd8c41bdba3ecb727ebb9030' into stevel/feature/SPARK-8064-hive-1.2-002 6a92bb0 [Cheng Lian] Overrides HiveConf time vars dcbb391 [Cheng Lian] Adds com.twitter:parquet-hadoop-bundle:1.6.0 for Hive Parquet SerDe 0bbe475 [Steve Loughran] SPARK-8064 scalastyle rejects the standard Hadoop ASF license header... fdf759b [Steve Loughran] SPARK-8064 classpath dependency suite to be in sync with shading in final (?) hive-exec spark 7a6c727 [Steve Loughran] SPARK-8064 switch to second staging repo of the spark-hive artifacts. This one has the protobuf-shaded hive-exec jar 376c003 [Steve Loughran] SPARK-8064 purge duplicate protobuf declaration 2c74697 [Steve Loughran] SPARK-8064 switch to the protobuf shaded hive-exec jar with tests to chase it down cc44020 [Steve Loughran] SPARK-8064 remove hadoop.version from runtest.py, as profile will fix that automatically. 6901fa9 [Steve Loughran] SPARK-8064 explicit protobuf import da310dc [Michael Armbrust] Fixes for Hive tests. a775a75 [Steve Loughran] SPARK-8064 cherry-pick-incomplete 7404f34 [Patrick Wendell] Add spark-hive staging repo 832c164 [Steve Loughran] SPARK-8064 try to supress compiler warnings on Complex.java pasted-thrift-code 312c0d4 [Steve Loughran] SPARK-8064 maven/ivy dependency purge; calcite declaration needed fa5ae7b [Steve Loughran] HIVE-8064 fix up hive-thriftserver dependencies and cut back on evicted references in the hive- packages; this keeps mvn and ivy resolution compatible, as the reconciliation policy is "by hand" c188048 [Steve Loughran] SPARK-8064 manage the Hive depencencies to that -things that aren't needed are excluded -sql/hive built with ivy is in sync with the maven reconciliation policy, rather than latest-first 4c8be8d [Cheng Lian] WIP: Partial fix for Thrift server and CLI tests 314eb3c [Steve Loughran] SPARK-8064 deprecation warning noise in one of the tests 17b0341 [Steve Loughran] SPARK-8064 IDE-hinted cleanups of Complex.java to reduce compiler warnings. It's all autogenerated code, so still ugly. d029b92 [Steve Loughran] SPARK-8064 rely on unescaping to have already taken place, so go straight to map of serde options 23eca7e [Steve Loughran] HIVE-8064 handle raw and escaped property tokens 54d9b06 [Steve Loughran] SPARK-8064 fix compilation regression surfacing from rebase 0b12d5f [Steve Loughran] HIVE-8064 use subset of hive complex type whose types deserialize fce73b6 [Steve Loughran] SPARK-8064 poms rely implicitly on the version of kryo chill provides fd3aa5d [Steve Loughran] SPARK-8064 version of hive to d/l from ivy is 1.2.1 dc73ece [Steve Loughran] SPARK-8064 revert to master's determinstic pushdown strategy d3c1e4a [Steve Loughran] SPARK-8064 purge UnionType 051cc21 [Steve Loughran] SPARK-8064 switch to an unshaded version of hive-exec-core, which must have been built with Kryo 2.21. This currently looks for a (locally built) version 1.2.1.spark 6684c60 [Steve Loughran] SPARK-8064 ignore RTE raised in blocking process.exitValue() call e6121e5 [Steve Loughran] SPARK-8064 address review comments aa43dc6 [Steve Loughran] SPARK-8064 more robust teardown on JavaMetastoreDatasourcesSuite f2bff01 [Steve Loughran] SPARK-8064 better takeup of asynchronously caught error text 8b1ef38 [Steve Loughran] SPARK-8064: on failures executing spark-submit in HiveSparkSubmitSuite, print command line and all logged output. 5a9ce6b [Steve Loughran] SPARK-8064 add explicit reason for kv split failure, rather than array OOB. *does not address the issue* 642b63a [Steve Loughran] SPARK-8064 reinstate something cut briefly during rebasing 97194dc [Steve Loughran] SPARK-8064 add extra logging to the YarnClusterSuite classpath test. There should be no reason why this is failing on jenkins, but as it is (and presumably its CP-related), improve the logging including any exception raised. 335357f [Steve Loughran] SPARK-8064 fail fast on thrive process spawning tests on exit codes and/or error string patterns seen in log. 3ed872f [Steve Loughran] SPARK-8064 rename field double to dbl bca55e5 [Steve Loughran] SPARK-8064 missed one of the `date` escapes 41d6479 [Steve Loughran] SPARK-8064 wrap tests with withTable() calls to avoid table-exists exceptions 2bc29a4 [Steve Loughran] SPARK-8064 ParquetSuites to escape `date` field name 1ab9bc4 [Steve Loughran] SPARK-8064 TestHive to use sered2.thrift.test.Complex bf3a249 [Steve Loughran] SPARK-8064: more resubmit than fix; tighten startup timeout to 60s. Still no obvious reason why jersey server code in spark-assembly isn't being picked up -it hasn't been shaded c829b8f [Steve Loughran] SPARK-8064: reinstate yarn-rm-server dependencies to hive-exec to ensure that jersey server is on classpath on hadoop versions < 2.6 0b0f738 [Steve Loughran] SPARK-8064: thrift server startup to fail fast on any exception in the main thread 13abaf1 [Steve Loughran] SPARK-8064 Hive compatibilty tests sin sync with explain/show output from Hive 1.2.1 d14d5ea [Steve Loughran] SPARK-8064: DATE is now a predicate; you can't use it as a field in select ops 26eef1c [Steve Loughran] SPARK-8064: HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT 3d64523 [Steve Loughran] SPARK-8064 improve diagns on uknown token; fix scalastyle failure d0360f6 [Steve Loughran] SPARK-8064: delicate merge in of the branch vanzin/hive-1.1 1126e5a [Steve Loughran] SPARK-8064: name of unrecognized file format wasn't appearing in error text 8cb09c4 [Steve Loughran] SPARK-8064: test resilience/assertion improvements. Independent of the rest of the work; can be backported to earlier versions dec12cb [Steve Loughran] SPARK-8064: when a CLI suite test fails include the full output text in the raised exception; this ensures that the stdout/stderr is included in jenkins reports, so it becomes possible to diagnose the cause. 463a670 [Steve Loughran] SPARK-8064 run-tests.py adds a hadoop-2.6 profile, and changes info messages to say "w/Hive 1.2.1" in console output 2531099 [Steve Loughran] SPARK-8064 successful attempt to get rid of pentaho as a transitive dependency of hive-exec 1d59100 [Steve Loughran] SPARK-8064 (unsuccessful) attempt to get rid of pentaho as a transitive dependency of hive-exec 75733fc [Steve Loughran] SPARK-8064 change thrift binary startup message to "Starting ThriftBinaryCLIService on port" 3ebc279 [Steve Loughran] SPARK-8064 move strings used to check for http/bin thrift services up into constants c80979d [Steve Loughran] SPARK-8064: SparkSQLCLIDriver drops remote mode support. CLISuite Tests pass instead of timing out: undetected regression? 27e8370 [Steve Loughran] SPARK-8064 fix some style & IDE warnings 00e50d6 [Steve Loughran] SPARK-8064 stop excluding hive shims from dependency (commented out , for now) cb4f142 [Steve Loughran] SPARK-8054 cut pentaho dependency from calcite f7aa9cb [Steve Loughran] SPARK-8064 everything compiles with some commenting and moving of classes into a hive package 6c310b4 [Steve Loughran] SPARK-8064 subclass Hive ServerOptionsProcessor to make it public again f61a675 [Steve Loughran] SPARK-8064 thrift server switched to Hive 1.2.1, though it doesn't compile everywhere 4890b9d [Steve Loughran] SPARK-8064, build against Hive 1.2.1
267 lines
8.9 KiB
XML
267 lines
8.9 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!--
|
|
~ Licensed to the Apache Software Foundation (ASF) under one or more
|
|
~ contributor license agreements. See the NOTICE file distributed with
|
|
~ this work for additional information regarding copyright ownership.
|
|
~ The ASF licenses this file to You under the Apache License, Version 2.0
|
|
~ (the "License"); you may not use this file except in compliance with
|
|
~ the License. You may obtain a copy of the License at
|
|
~
|
|
~ http://www.apache.org/licenses/LICENSE-2.0
|
|
~
|
|
~ Unless required by applicable law or agreed to in writing, software
|
|
~ distributed under the License is distributed on an "AS IS" BASIS,
|
|
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
~ See the License for the specific language governing permissions and
|
|
~ limitations under the License.
|
|
-->
|
|
|
|
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
|
|
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
|
|
<modelVersion>4.0.0</modelVersion>
|
|
<parent>
|
|
<groupId>org.apache.spark</groupId>
|
|
<artifactId>spark-parent_2.10</artifactId>
|
|
<version>1.5.0-SNAPSHOT</version>
|
|
<relativePath>../../pom.xml</relativePath>
|
|
</parent>
|
|
|
|
<groupId>org.apache.spark</groupId>
|
|
<artifactId>spark-hive_2.10</artifactId>
|
|
<packaging>jar</packaging>
|
|
<name>Spark Project Hive</name>
|
|
<url>http://spark.apache.org/</url>
|
|
<properties>
|
|
<sbt.project.name>hive</sbt.project.name>
|
|
</properties>
|
|
|
|
<dependencies>
|
|
<!-- Added for Hive Parquet SerDe -->
|
|
<dependency>
|
|
<groupId>com.twitter</groupId>
|
|
<artifactId>parquet-hadoop-bundle</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.spark</groupId>
|
|
<artifactId>spark-core_${scala.binary.version}</artifactId>
|
|
<version>${project.version}</version>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.spark</groupId>
|
|
<artifactId>spark-core_${scala.binary.version}</artifactId>
|
|
<version>${project.version}</version>
|
|
<type>test-jar</type>
|
|
<scope>test</scope>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.spark</groupId>
|
|
<artifactId>spark-sql_${scala.binary.version}</artifactId>
|
|
<version>${project.version}</version>
|
|
</dependency>
|
|
<!--
|
|
<dependency>
|
|
<groupId>com.google.guava</groupId>
|
|
<artifactId>guava</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>com.google.protobuf</groupId>
|
|
<artifactId>protobuf-java</artifactId>
|
|
<version>${protobuf.version}</version>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>${hive.group}</groupId>
|
|
<artifactId>hive-common</artifactId>
|
|
</dependency>
|
|
-->
|
|
<dependency>
|
|
<groupId>${hive.group}</groupId>
|
|
<artifactId>hive-exec</artifactId>
|
|
<!--
|
|
<classifier>core</classifier>
|
|
-->
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>${hive.group}</groupId>
|
|
<artifactId>hive-metastore</artifactId>
|
|
</dependency>
|
|
<!--
|
|
<dependency>
|
|
<groupId>${hive.group}</groupId>
|
|
<artifactId>hive-serde</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>${hive.group}</groupId>
|
|
<artifactId>hive-shims</artifactId>
|
|
</dependency>
|
|
-->
|
|
<!-- hive-serde already depends on avro, but this brings in customized config of avro deps from parent -->
|
|
<dependency>
|
|
<groupId>org.apache.avro</groupId>
|
|
<artifactId>avro</artifactId>
|
|
</dependency>
|
|
<!-- use the build matching the hadoop api of avro-mapred (i.e. no classifier for hadoop 1 API,
|
|
hadoop2 classifier for hadoop 2 API. avro-mapred is a dependency of org.spark-project.hive:hive-serde -->
|
|
<dependency>
|
|
<groupId>org.apache.avro</groupId>
|
|
<artifactId>avro-mapred</artifactId>
|
|
<classifier>${avro.mapred.classifier}</classifier>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>commons-httpclient</groupId>
|
|
<artifactId>commons-httpclient</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.calcite</groupId>
|
|
<artifactId>calcite-avatica</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.calcite</groupId>
|
|
<artifactId>calcite-core</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.httpcomponents</groupId>
|
|
<artifactId>httpclient</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.codehaus.jackson</groupId>
|
|
<artifactId>jackson-mapper-asl</artifactId>
|
|
</dependency>
|
|
<!-- transitive dependencies of hive-exec-core doesn't declare -->
|
|
<dependency>
|
|
<groupId>commons-codec</groupId>
|
|
<artifactId>commons-codec</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>joda-time</groupId>
|
|
<artifactId>joda-time</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.jodd</groupId>
|
|
<artifactId>jodd-core</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>com.google.code.findbugs</groupId>
|
|
<artifactId>jsr305</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.datanucleus</groupId>
|
|
<artifactId>datanucleus-core</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.thrift</groupId>
|
|
<artifactId>libthrift</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.thrift</groupId>
|
|
<artifactId>libfb303</artifactId>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.scalacheck</groupId>
|
|
<artifactId>scalacheck_${scala.binary.version}</artifactId>
|
|
<scope>test</scope>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>junit</groupId>
|
|
<artifactId>junit</artifactId>
|
|
<scope>test</scope>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.spark</groupId>
|
|
<artifactId>spark-sql_${scala.binary.version}</artifactId>
|
|
<type>test-jar</type>
|
|
<version>${project.version}</version>
|
|
<scope>test</scope>
|
|
</dependency>
|
|
<dependency>
|
|
<groupId>org.apache.spark</groupId>
|
|
<artifactId>spark-catalyst_${scala.binary.version}</artifactId>
|
|
<type>test-jar</type>
|
|
<version>${project.version}</version>
|
|
<scope>test</scope>
|
|
</dependency>
|
|
</dependencies>
|
|
<profiles>
|
|
<profile>
|
|
<id>hive</id>
|
|
<build>
|
|
<plugins>
|
|
<plugin>
|
|
<groupId>org.codehaus.mojo</groupId>
|
|
<artifactId>build-helper-maven-plugin</artifactId>
|
|
<executions>
|
|
<execution>
|
|
<id>add-scala-test-sources</id>
|
|
<phase>generate-test-sources</phase>
|
|
<goals>
|
|
<goal>add-test-source</goal>
|
|
</goals>
|
|
<configuration>
|
|
<sources>
|
|
<source>compatibility/src/test/scala</source>
|
|
</sources>
|
|
</configuration>
|
|
</execution>
|
|
</executions>
|
|
</plugin>
|
|
</plugins>
|
|
</build>
|
|
</profile>
|
|
</profiles>
|
|
|
|
<build>
|
|
<outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
|
|
<testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
|
|
<plugins>
|
|
<plugin>
|
|
<groupId>org.scalatest</groupId>
|
|
<artifactId>scalatest-maven-plugin</artifactId>
|
|
<configuration>
|
|
<!-- Specially disable assertions since some Hive tests fail them -->
|
|
<argLine>-da -Xmx3g -XX:MaxPermSize=${MaxPermGen} -XX:ReservedCodeCacheSize=512m</argLine>
|
|
</configuration>
|
|
</plugin>
|
|
<plugin>
|
|
<groupId>org.codehaus.mojo</groupId>
|
|
<artifactId>build-helper-maven-plugin</artifactId>
|
|
<executions>
|
|
<execution>
|
|
<id>add-default-sources</id>
|
|
<phase>generate-sources</phase>
|
|
<goals>
|
|
<goal>add-source</goal>
|
|
</goals>
|
|
<configuration>
|
|
<sources>
|
|
<source>v${hive.version.short}/src/main/scala</source>
|
|
</sources>
|
|
</configuration>
|
|
</execution>
|
|
</executions>
|
|
</plugin>
|
|
|
|
<!-- Deploy datanucleus jars to the spark/lib_managed/jars directory -->
|
|
<plugin>
|
|
<groupId>org.apache.maven.plugins</groupId>
|
|
<artifactId>maven-dependency-plugin</artifactId>
|
|
<executions>
|
|
<execution>
|
|
<id>copy-dependencies</id>
|
|
<phase>package</phase>
|
|
<goals>
|
|
<goal>copy-dependencies</goal>
|
|
</goals>
|
|
<configuration>
|
|
<!-- basedir is spark/sql/hive/ -->
|
|
<outputDirectory>${basedir}/../../lib_managed/jars</outputDirectory>
|
|
<overWriteReleases>false</overWriteReleases>
|
|
<overWriteSnapshots>false</overWriteSnapshots>
|
|
<overWriteIfNewer>true</overWriteIfNewer>
|
|
<includeGroupIds>org.datanucleus</includeGroupIds>
|
|
</configuration>
|
|
</execution>
|
|
</executions>
|
|
</plugin>
|
|
</plugins>
|
|
</build>
|
|
</project>
|