spark-instrumented-optimizer/dev
Xinrong Meng f88874194a [SPARK-30491][INFRA] Enable dependency audit files to tell dependency classifier
### What changes were proposed in this pull request?
Make the dependency audit files state the artifact id, version, and classifier of each dependency.

For example, `avro-mapred-1.8.2-hadoop2.jar` should be expanded to `avro-mapred/1.8.2/hadoop2/avro-mapred-1.8.2-hadoop2.jar`, where `avro-mapred` is the artifact id, `1.8.2` is the version, and `hadoop2` is the classifier.
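
As a rough illustration of how a downstream tool might read one expanded entry, here is a minimal Python sketch. It assumes each entry has exactly the four slash-separated fields shown in the example above. The names `AuditEntry` and `parse_audit_line` are hypothetical, and treating an empty classifier segment as "no classifier" is an assumption, not something this description states.

```python
from typing import NamedTuple, Optional


class AuditEntry(NamedTuple):
    """One line of an expanded dependency audit file (hypothetical type)."""
    artifact_id: str
    version: str
    classifier: Optional[str]
    jar_name: str


def parse_audit_line(line: str) -> AuditEntry:
    """Split 'artifact-id/version/classifier/jar-name' into its fields."""
    parts = line.strip().split("/")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 '/'-separated fields, got {len(parts)}: {line!r}"
        )
    artifact_id, version, classifier, jar_name = parts
    # Assumption: an empty classifier segment means the jar has no classifier.
    return AuditEntry(artifact_id, version, classifier or None, jar_name)


entry = parse_audit_line("avro-mapred/1.8.2/hadoop2/avro-mapred-1.8.2-hadoop2.jar")
print(entry.artifact_id, entry.version, entry.classifier)
```

With the fields separated explicitly, the tool no longer has to guess where the version ends and the classifier begins.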

### Why are the changes needed?
Dependency audit files are expected to be consumed by automated tests or downstream tools.

However, the current dependency audit files under `dev/deps` only show jar names, and there is no simple rule for parsing a jar name into its separate fields. For example, `hadoop2` is the classifier of `avro-mapred-1.8.2-hadoop2.jar`, whereas `incubating` is part of the version of `htrace-core-3.1.0-incubating.jar`.

Reference: yhuai suggested a good example of a downstream tool that this change would enable:

> Say we have a Spark application that depends on a third-party dependency `foo`, which pulls in `jackson` as a transitive dependency. Unfortunately, `foo` depends on a different version of `jackson` than Spark. So, in the pom of this Spark application, we use the dependency management section to pin the version of `jackson`. By doing this, we are lifting `jackson` to the top-level dependency of my application and I want to have a way to keep tracking what Spark uses. What we can do is to cross-check my Spark application's classpath with what Spark uses. Then, with a test written in my code base, whenever my application bumps Spark version, this test will check what we define in the application and what Spark has, and then remind us to change our application's pom if needed. In my case, I am fine to directly access git to get these audit files.
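
The cross-check described in the quote could look roughly like the following Python sketch. It assumes the expanded `artifact-id/version/classifier/jar-name` line format and an empty classifier segment for jars without a classifier. The function names, the `pinned` mapping, and all sample versions are hypothetical and for illustration only; they are not taken from Spark's actual audit files.

```python
from typing import Dict, Iterable, Tuple


def spark_versions(audit_lines: Iterable[str]) -> Dict[str, str]:
    """Map artifact id -> version from expanded audit lines."""
    versions: Dict[str, str] = {}
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        artifact_id, version, _classifier, _jar_name = line.split("/")
        # Simplification: if an artifact appears with several classifiers,
        # the last line wins.
        versions[artifact_id] = version
    return versions


def find_version_mismatches(
    audit_lines: Iterable[str], pinned: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return {artifact: (application version, Spark version)} where they differ."""
    spark = spark_versions(audit_lines)
    return {
        artifact: (version, spark[artifact])
        for artifact, version in pinned.items()
        if artifact in spark and spark[artifact] != version
    }


# Made-up sample data, not copied from a real audit file.
audit = [
    "jackson-core/2.10.0//jackson-core-2.10.0.jar",
    "avro-mapred/1.8.2/hadoop2/avro-mapred-1.8.2-hadoop2.jar",
]
pinned = {"jackson-core": "2.9.8", "avro-mapred": "1.8.2"}
mismatches = find_version_mismatches(audit, pinned)
print(mismatches)
```

A test in the application's code base could assert that `mismatches` is empty and fail with the offending artifacts whenever the Spark version is bumped.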

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
The code changes are verified by the regenerated dependency audit files themselves, so no tests were added.

Closes #27177 from mengCareers/depsOptimize.

Lead-authored-by: Xinrong Meng <meng.careers@gmail.com>
Co-authored-by: mengCareers <meng.careers@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-15 20:19:44 -08:00
..
create-release [SPARK-30268][INFRA] Fix incorrect pyspark version when releasing preview versions 2019-12-17 10:22:29 +09:00
deps [SPARK-30491][INFRA] Enable dependency audit files to tell dependency classifier 2020-01-15 20:19:44 -08:00
sparktestsupport [SPARK-28198][PYTHON][FOLLOW-UP] Run the tests of MAP ITER UDF in Jenkins 2020-01-09 13:45:50 +09:00
tests [MINOR] Fix typos in dev/* scripts. 2018-01-31 07:37:25 +09:00
.gitignore [SPARK-23174][BUILD][PYTHON][FOLLOWUP] Add pycodestyle*.py to .gitignore file. 2018-01-31 00:51:00 +09:00
.rat-excludes [SPARK-29674][CORE] Update dropwizard metrics to 4.1.x for JDK 9+ 2019-11-03 15:13:06 -08:00
.scalafmt.conf [SPARK-26177] Config change followup to [] Automated formatting for Scala code 2018-12-03 10:03:51 -06:00
appveyor-guide.md [SPARK-26918][DOCS] All .md should have ASF license header 2019-03-30 19:49:45 -05:00
appveyor-install-dependencies.ps1 [SPARK-30453][BUILD][R] Update AppVeyor R version to 3.6.2 2020-01-07 18:43:21 -08:00
change-scala-version.sh [SPARK-30012][CORE][SQL] Change classes extending scala collection classes to work with 2.13 2019-12-03 08:59:43 -08:00
check-license [MINOR][BUILD] Upgrade apache-rat to 0.13 2019-04-01 16:44:42 +09:00
checkstyle-suppressions.xml [SPARK-29674][CORE] Update dropwizard metrics to 4.1.x for JDK 9+ 2019-11-03 15:13:06 -08:00
checkstyle.xml [MINOR] Fix google style guide address 2019-12-12 11:04:01 -06:00
github_jira_sync.py [SPARK-29731][INFRA] Use public JIRA REST API to read-only access 2019-11-03 11:17:53 -08:00
lint-java [SPARK-23063][K8S] K8s changes for publishing scripts (and a couple of other misses) 2018-01-13 21:34:28 -08:00
lint-python [SPARK-30450][INFRA][FOLLOWUP] Fix git folder regex for windows file separator 2020-01-08 19:38:20 -08:00
lint-r [SPARK-29932][R][TESTS] lint-r should do non-zero exit in case of errors 2019-11-17 10:09:46 -08:00
lint-r.R [SPARK-29936][R] Fix SparkR lint errors and add lint-r GitHub Action 2019-11-17 21:01:01 -08:00
lint-scala [SPARK-27158][BUILD] dev/mima and dev/scalastyle support dynamic profiles 2019-03-15 08:20:42 +09:00
make-distribution.sh Revert "[SPARK-30056][INFRA] Skip building test artifacts in dev/make-distribution.sh 2019-12-15 23:16:17 -07:00
merge_spark_pr.py [MINOR][BUILD] Decode output of commands during merge script as UTF-8 consistently 2019-10-02 11:28:55 +09:00
mima [SPARK-27158][BUILD] dev/mima and dev/scalastyle support dynamic profiles 2019-03-15 08:20:42 +09:00
pip-sanity-check.py [SPARK-29672][PYSPARK] update spark testing framework to use python3 2019-11-14 10:18:55 -08:00
README.md Merge pull request #565 from pwendell/dev-scripts. Closes #565. 2014-02-08 23:13:34 -08:00
requirements.txt [SPARK-25270] lint-python: Add flake8 to find syntax errors and undefined names 2018-09-07 09:35:25 -07:00
run-pip-tests [SPARK-29672][PYSPARK] update spark testing framework to use python3 2019-11-14 10:18:55 -08:00
run-tests [SPARK-29672][PYSPARK] update spark testing framework to use python3 2019-11-14 10:18:55 -08:00
run-tests-jenkins [SPARK-29672][PYSPARK] update spark testing framework to use python3 2019-11-14 10:18:55 -08:00
run-tests-jenkins.py [SPARK-25016][INFRA][FOLLOW-UP] Remove leftover for dropping Hadoop 2.6 in Jenkins's test script 2019-11-30 12:49:14 +09:00
run-tests.py [SPARK-29991][INFRA] Support Hive 1.2 and Hive 2.3 (default) in PR builder 2019-11-30 12:48:15 +09:00
sbt-checkstyle [SPARK-27158][BUILD] dev/mima and dev/scalastyle support dynamic profiles 2019-03-15 08:20:42 +09:00
scalafmt [SPARK-29293][BUILD] Move scalafmt to Scala 2.12 profile; bump to 0.12 2019-11-26 09:59:19 -08:00
scalastyle [SPARK-27158][BUILD] dev/mima and dev/scalastyle support dynamic profiles 2019-03-15 08:20:42 +09:00
test-dependencies.sh [SPARK-30491][INFRA] Enable dependency audit files to tell dependency classifier 2020-01-15 20:19:44 -08:00
tox.ini [SPARK-30450][INFRA] Exclude .git folder for python linter 2020-01-07 15:14:17 -08:00

# Spark Developer Scripts

This directory contains scripts useful to developers when packaging, testing, or committing to Spark.

Many of these scripts require Apache credentials to work correctly.