spark-instrumented-optimizer/assembly
Sean Owen 2e5a7cde22 SPARK-1827. LICENSE and NOTICE files need a refresh to contain transitive dependency info
LICENSE and NOTICE policy is explained here:

http://www.apache.org/dev/licensing-howto.html
http://www.apache.org/legal/3party.html

This leads to the following changes.

First, this change enables two extensions to maven-shade-plugin in assembly/ that will try to include and merge all NOTICE and LICENSE files. This can't hurt.

This generates a consolidated NOTICE file that I manually added to NOTICE.

Next, a list of all dependencies and their licenses was generated:
`mvn ... license:aggregate-add-third-party`
to create: `target/generated-sources/license/THIRD-PARTY.txt`

Each dependency is listed with one or more licenses. Determine the most-compatible license for each if there is more than one.

For "unknown" license dependencies, I manually evaluated their license. Many are actually Apache projects or components of projects covered already. The only non-trivial one was Colt, which has its own (compatible) license.

I ignored Apache-licensed and public domain dependencies as these require no further action (beyond NOTICE above).

BSD and MIT licenses (permissive Category A licenses) are evidently supposed to be mentioned in LICENSE, so I added a section with the appropriate output from the THIRD-PARTY.txt file.

Everything else (Category B licenses) is evidently mentioned in NOTICE (?). Same there.
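
The triage described above can be sketched as a small script. This is only an illustration, not the procedure actually used for this change (that was done by hand). It assumes each dependency line in THIRD-PARTY.txt has the shape `(License) [(License) ...] Name (group:artifact:version - url)`, that license names contain no nested parentheses, and that simple keyword matching is enough to tell the categories apart. The function names, bucket labels, and `org.example` sample entries are hypothetical.

```python
import re

# Buckets, ordered from least to most follow-up work (labels are illustrative).
BUCKETS = ["no action", "LICENSE", "NOTICE", "manual review"]


def rank(license_name):
    """Map one license name to an index into BUCKETS using keyword matching."""
    name = license_name.lower()
    if "unknown" in name:
        return 3  # needs a manual look
    if "apache" in name or "public domain" in name:
        return 0  # nothing further to do
    if re.search(r"\b(bsd|mit)\b", name):
        return 1  # permissive Category A: mention in LICENSE
    return 2      # treated as Category B here: mention in NOTICE


def parse_line(line):
    """Split a line into (licenses, dependency); None if it has no leading license."""
    text = line.strip()
    licenses = []
    while True:
        match = re.match(r"\(([^()]*)\)\s*", text)
        if match is None:
            break
        licenses.append(match.group(1))
        text = text[match.end():]
    return (licenses, text) if licenses else None


def triage(lines):
    """Group dependencies by bucket, using the most compatible license of each."""
    result = {bucket: [] for bucket in BUCKETS}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # header or blank line
        licenses, dependency = parsed
        best = min(rank(name) for name in licenses)
        result[BUCKETS[best]].append(dependency)
    return result


# Hypothetical sample input; real entries come from the generated file.
SAMPLE = [
    "Lists of 4 third-party dependencies.",
    "     (The Apache Software License, Version 2.0) Foo (org.example:foo:1.0 - http://example.org/foo)",
    "     (BSD License) (GNU General Public License) Bar (org.example:bar:2.0 - http://example.org/bar)",
    "     (Eclipse Public License 1.0) Baz (org.example:baz:3.0 - http://example.org/baz)",
    "     (Unknown license) Qux (org.example:qux:4.0 - no url defined)",
]

if __name__ == "__main__":
    for bucket, dependencies in triage(SAMPLE).items():
        print(bucket, dependencies)
```

For a dependency with several licenses, the sketch picks the lowest-ranked (most compatible) one, so the dual-licensed `Bar` entry lands in the LICENSE bucket. Anything in the "manual review" bucket still needs a human decision, as was the case for Colt.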

LICENSE contained some license statements for source code that is redistributed. I left this as I think that is the right place to put it.

Author: Sean Owen <sowen@cloudera.com>

Closes #770 from srowen/SPARK-1827 and squashes the following commits:

a764504 [Sean Owen] Add LICENSE and NOTICE info for all transitive dependencies as of 1.0
2014-05-14 09:38:33 -07:00
src SPARK-1184: Update the distribution tar.gz to include spark-assembly jar 2014-03-05 16:52:58 -08:00
pom.xml SPARK-1827. LICENSE and NOTICE files need a refresh to contain transitive dependency info 2014-05-14 09:38:33 -07:00
README Updating assembly README to reflect recent changes in the build. 2013-09-04 20:54:35 -07:00

This is an assembly module for the Spark project.

It creates a single tar.gz file that includes all needed dependencies of the project,
except for the org.apache.hadoop.* jars, which are supposed to be available from the
deployed Hadoop cluster.

This module is off by default. To activate it, specify the profile on the command line:
  -Pbigtop-dist

If you need to build an assembly for a different version of Hadoop, the
hadoop.version system property needs to be set, as in this example:
  -Dhadoop.version=2.0.6-alpha