spark-instrumented-optimizer/external/kafka-0-10-token-provider/pom.xml

<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements. See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License. You may obtain a copy of the License at
  ~
  ~    http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-parent_2.12</artifactId>
    <version>3.2.1-SNAPSHOT</version>
    <relativePath>../../pom.xml</relativePath>
  </parent>

  <groupId>org.apache.spark</groupId>
  <artifactId>spark-token-provider-kafka-0-10_2.12</artifactId>
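  <!-- sbt.project.name is the module name used by Spark's SBT build; it is assumed to be kept
       in sync with the project definitions in project/SparkBuild.scala. -->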
  <properties>
    <sbt.project.name>token-provider-kafka-0-10</sbt.project.name>
  </properties>
  <packaging>jar</packaging>
  <name>Kafka 0.10+ Token Provider for Streaming</name>
  <url>http://spark.apache.org/</url>

  <dependencies>
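    <!-- Provided scope: spark-core is supplied on the classpath by the Spark runtime at
         execution time, so it is not bundled into this module's jar. -->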
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_${scala.binary.version}</artifactId>
      <version>${project.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka-clients</artifactId>
      <version>${kafka.version}</version>
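      <!-- zstd-jni is excluded here on the assumption that Spark already manages its own
           zstd-jni version via spark-core, so Kafka's transitive copy would risk putting a
           conflicting duplicate on the classpath. -->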
      <exclusions>
        <exclusion>
          <groupId>com.github.luben</groupId>
          <artifactId>zstd-jni</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
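    <!-- Test-only; the mockito-core version is managed in the parent POM's
         dependencyManagement, so no version is declared here. -->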
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-core</artifactId>
      <scope>test</scope>
    </dependency>
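    <!-- The Hadoop artifact name is parameterized: it resolves to the shaded
         hadoop-client-runtime for the Hadoop 3.x profiles and to hadoop-client when the
         hadoop-2.7 profile is active. -->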
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>${hadoop-client-runtime.artifact}</artifactId>
      <scope>${hadoop.deps.scope}</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_${scala.binary.version}</artifactId>
    </dependency>
    <!--
      This spark-tags test dependency is needed even though it isn't used in this module;
      otherwise, test commands that exclude it will yield errors.
    -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-tags_${scala.binary.version}</artifactId>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
  </dependencies>
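  <!-- Output goes under a Scala-version-specific directory so that cross-builds for
       different Scala versions do not overwrite one another. -->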
  <build>
    <outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
    <testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
  </build>
</project>