spark-instrumented-optimizer/dev
jerryshao ab648c0004 [SPARK-14743][YARN] Add a configurable credential manager for Spark running on YARN
## What changes were proposed in this pull request?

Add a configurable token manager for Spark on running on yarn.

### Current Problems ###

1. Supported token provider is hard-coded, currently only hdfs, hbase and hive are supported and it is impossible for user to add new token provider without code changes.
2. Also this problem exits in timely token renewer and updater.

### Changes In This Proposal ###

In this proposal, to address the problems mentioned above and make the current code more cleaner and easier to understand, mainly has 3 changes:

1. Abstract a `ServiceTokenProvider` as well as `ServiceTokenRenewable` interface for token provider. Each service wants to communicate with Spark through token way needs to implement this interface.
2. Provide a `ConfigurableTokenManager` to manage all the register token providers, also token renewer and updater. Also this class offers the API for other modules to obtain tokens, get renewal interval and so on.
3. Implement 3 built-in token providers `HDFSTokenProvider`, `HiveTokenProvider` and `HBaseTokenProvider` to keep the same semantics as supported today. Whether to load in these built-in token providers is controlled by configuration "spark.yarn.security.tokens.${service}.enabled", by default for all the built-in token providers are loaded.

### Behavior Changes ###

For the end user there's no behavior change, we still use the same configuration `spark.yarn.security.tokens.${service}.enabled` to decide which token provider is enabled (hbase or hive).

For user implemented token provider (assume the name of token provider is "test") needs to add into this class should have two configurations:

1. `spark.yarn.security.tokens.test.enabled` to true
2. `spark.yarn.security.tokens.test.class` to the full qualified class name.

So we still keep the same semantics as current code while add one new configuration.

### Current Status ###

- [x] token provider interface and management framework.
- [x] implement built-in token providers (hdfs, hbase, hive).
- [x] Coverage of unit test.
- [x] Integrated test with security cluster.

## How was this patch tested?

Unit test and integrated test.

Please suggest and review, any comment is greatly appreciated.

Author: jerryshao <sshao@hortonworks.com>

Closes #14065 from jerryshao/SPARK-16342.
2016-08-10 15:39:30 -07:00
..
create-release [SPARK-16453][BUILD] release-build.sh is missing hive-thriftserver for scala 2.10 2016-07-08 15:56:46 -07:00
deps [SPARK-16770][BUILD] Fix JLine dependency management and version (Sca… 2016-08-03 17:07:10 -07:00
sparktestsupport [SPARK-15975] Fix improper Popen retcode code handling in dev/run-tests 2016-06-16 14:18:58 -07:00
tests [SPARK-10359] Enumerate dependencies in a file and diff against it for new pull requests 2015-12-30 12:47:42 -08:00
.gitignore [SPARK-6219] Reuse pep8.py 2015-04-18 16:46:28 -07:00
.rat-excludes [SPARK-14743][YARN] Add a configurable credential manager for Spark running on YARN 2016-08-10 15:39:30 -07:00
change-scala-version.sh [SPARK-9250] Make change-scala-version more helpful w.r.t. valid Scala versions 2015-07-24 17:09:33 +01:00
change-version-to-2.10.sh [SPARK-9304] [BUILD] Improve backwards compatibility of SPARK-8401 2015-07-25 11:05:08 +01:00
change-version-to-2.11.sh [SPARK-9304] [BUILD] Improve backwards compatibility of SPARK-8401 2015-07-25 11:05:08 +01:00
check-license [SPARK-13596][BUILD] Move misc top-level build files into appropriate subdirs 2016-03-07 14:48:02 -08:00
checkstyle-suppressions.xml [MINOR] Fix Java Lint errors introduced by #13286 and #13280 2016-06-08 14:51:00 +01:00
checkstyle.xml [SPARK-14868][BUILD] Enable NewLineAtEofChecker in checkstyle and fix lint-java errors 2016-04-24 20:40:03 -07:00
github_jira_sync.py Fix install jira-python 2015-05-23 09:14:07 -07:00
lint-java [SPARK-6990][BUILD] Add Java linting script; fix minor warnings 2015-12-04 12:03:45 -08:00
lint-python [SPARK-13887][PYTHON][TRIVIAL][BUILD] Make lint-python script fail fast 2016-03-25 12:53:34 +00:00
lint-r [SPARK-10328] [SPARKR] Fix generic for na.omit 2015-08-28 00:37:50 -07:00
lint-r.R [SPARK-14074][SPARKR] Specify commit sha1 ID when using install_github to install intr package. 2016-03-23 07:57:03 -07:00
lint-scala [SPARK-2627] [PySpark] have the build enforce PEP 8 automatically 2014-08-06 12:58:24 -07:00
make-distribution.sh [SPARK-15821][DOCS] Include parallel build info 2016-06-14 13:59:01 +01:00
merge_spark_pr.py [SPARK-9383][PROJECT-INFRA] PR merge script should reset back to previous branch when possible 2016-01-13 11:56:30 -08:00
mima [SPARK-13579][BUILD] Stop building the main Spark assembly. 2016-04-04 16:52:22 -07:00
README.md Merge pull request #565 from pwendell/dev-scripts. Closes #565. 2014-02-08 23:13:34 -08:00
requirements.txt [SPARK-10498][TOOLS][BUILD] Add requirements.txt file for dev python tools 2016-01-24 11:48:28 -08:00
run-tests [SPARK-5161] Parallelize Python test execution 2015-06-29 21:32:40 -07:00
run-tests-jenkins [SPARK-7018][BUILD] Refactor dev/run-tests-jenkins into Python 2015-10-18 22:45:27 -07:00
run-tests-jenkins.py [SPARK-12842][TEST-HADOOP2.7] Add Hadoop 2.7 build profile 2016-01-15 17:07:24 -08:00
run-tests.py [SPARK-15975] Fix improper Popen retcode code handling in dev/run-tests 2016-06-16 14:18:58 -07:00
scalastyle [SPARK-12152][PROJECT-INFRA] Speed up Scalastyle checks by only invoking SBT once 2015-12-06 17:35:01 -08:00
test-dependencies.sh [SPARK-12712] Fix failure in ./dev/test-dependencies when run against empty .m2 cache 2016-06-09 00:51:24 -07:00
tox.ini [SPARK-13596][BUILD] Move misc top-level build files into appropriate subdirs 2016-03-07 14:48:02 -08:00

Spark Developer Scripts

This directory contains scripts useful to developers when packaging, testing, or committing to Spark.

Many of these scripts require Apache credentials to work correctly.