Commit graph

97 commits

Author SHA1 Message Date
Sean Owen 425c4ada4c [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10
## What changes were proposed in this pull request?

- Remove Scala 2.10 build profiles and support (see the sketch after this list)
- Replace some 2.10 support in scripts with commented placeholders for 2.12 later
- Remove deprecated API calls from 2.10 support
- Remove usages of deprecated context bounds where possible
- Remove Scala 2.10 workarounds like ScalaReflectionLock
- Other minor Scala warning fixes
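
As a rough illustration of the build-level effect (a minimal sketch; the exact removed switches are an assumption based on the "Building Spark" docs of that era):

```bash
# Scala 2.11 is now the only supported Scala line (until 2.12 lands):
./dev/change-scala-version.sh 2.11
./build/mvn -DskipTests clean package
# ./dev/change-scala-version.sh 2.10 and the -Dscala-2.10 switch no longer exist
```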

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #17150 from srowen/SPARK-19810.
2017-07-13 17:06:24 +08:00
Holden Karau 1b85bcd929 [SPARK-20627][PYSPARK] Drop the hadoop distribution name from the Python version
## What changes were proposed in this pull request?

Drop the hadoop distribution name from the Python version (PEP 440 - https://www.python.org/dev/peps/pep-0440/). We've been using the local version string to disambiguate between different hadoop versions packaged with PySpark, but PEP 440 states that local versions should not be used when publishing upstream. Since we no longer make PySpark pip packages for different hadoop versions, we can simply drop the hadoop information. If at a later point we need to start publishing different hadoop versions, we can look at making different packages or similar.
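
For illustration, a minimal sketch of the rule being respected here (version strings are hypothetical):

```bash
# PEP 440: everything after '+' is a "local version" segment, e.g.
# 2.2.0.dev0+hadoop2.7 - fine for local builds, not for publishing upstream.
VERSION="2.2.0.dev0"
case "$VERSION" in
  *+*) echo "refusing to publish a local version: $VERSION" >&2; exit 1 ;;
  *)   echo "publishable version: $VERSION" ;;
esac
```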

## How was this patch tested?

Ran `make-distribution` locally

Author: Holden Karau <holden@us.ibm.com>

Closes #17885 from holdenk/SPARK-20627-remove-pip-local-version-string.
2017-05-09 11:25:29 -07:00
Josh Rosen 314cf51ded [SPARK-20102] Fix nightly packaging and RC packaging scripts w/ two minor build fixes
## What changes were proposed in this pull request?

The master snapshot publisher builds are currently broken due to two minor build issues:

1. For unknown reasons, the LFTP `mkdir -p` command began throwing errors when the remote directory already exists. This change of behavior might have been caused by configuration changes in the ASF's SFTP server, but I'm not entirely sure of that. To work around this problem, this patch updates the script to ignore errors from the `lftp mkdir -p` commands (see the sketch after this list).
2. The PySpark `setup.py` file references a non-existent `pyspark.ml.stat` module, causing Python packaging to fail by complaining about a missing directory. The fix is to simply drop that line from the setup script.
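
A minimal sketch of the workaround for item 1 (host and path variables are illustrative, not the actual script lines):

```bash
LFTP_HOST="sftp://user@home.apache.org"   # illustrative
REMOTE_DIR="public_html/spark-nightly"    # illustrative
# '|| true' swallows the error lftp now raises when the directory already
# exists, so a 'set -e' script keeps going
lftp "$LFTP_HOST" -e "mkdir -p $REMOTE_DIR; bye" || true
```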

## How was this patch tested?

The LFTP fix was tested by manually running the failing commands on AMPLab Jenkins against the ASF SFTP server. The PySpark fix was tested locally.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #17437 from JoshRosen/spark-20102.
2017-03-27 10:23:28 -07:00
Sean Owen 0e2405490f [SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support
- Move external/java8-tests tests into core, streaming, sql, and remove that module
- Remove MaxPermGen and related options (see the sketch after this list)
- Fix some reflection / TODOs around Java 8+ methods
- Update doc references to 1.7/1.8 differences
- Remove Java 7/8 related build profiles
- Update some plugins for better Java 8 compatibility
- Fix a few Java-related warnings
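
A minimal sketch of what the Java 8 baseline means for a local build (the JDK path is illustrative):

```bash
# Java 8 removed PermGen entirely, so -XX:MaxPermSize is obsolete and can be
# dropped from MAVEN_OPTS and the launcher scripts
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
./build/mvn -DskipTests clean package
```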

For the future:

- Update Java 8 examples to fully use Java 8
- Update Java tests to use lambdas for simplicity
- Update Java internal implementations to use lambdas

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #16871 from srowen/SPARK-19493.
2017-02-16 12:32:45 +00:00
Sean Owen e8d3fca450 [SPARK-19464][CORE][YARN][TEST-HADOOP2.6] Remove support for Hadoop 2.5 and earlier
## What changes were proposed in this pull request?

- Remove support for Hadoop 2.5 and earlier
- Remove reflection and code constructs only needed to support multiple versions at once
- Update docs to reflect newer versions
- Remove older versions' builds and profiles.

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #16810 from srowen/SPARK-19464.
2017-02-08 12:20:07 +00:00
Yin Huai 0c92318588 Update known_translations for contributor names
## What changes were proposed in this pull request?
Update known_translations per https://github.com/apache/spark/pull/16423#issuecomment-269739634

Author: Yin Huai <yhuai@databricks.com>

Closes #16628 from yhuai/known_translations.
2017-01-18 18:18:51 -08:00
hyukjinkwon 46b2126024 [SPARK-19002][BUILD][PYTHON] Check pep8 against all Python scripts
## What changes were proposed in this pull request?

This PR proposes to check pep8 against all other Python scripts and fix the errors as below (the check itself is sketched after the list):

```bash
./dev/create-release/generate-contributors.py
./dev/create-release/releaseutils.py
./dev/create-release/translate-contributors.py
./dev/lint-python
./python/docs/epytext.py
./examples/src/main/python/mllib/decision_tree_classification_example.py
./examples/src/main/python/mllib/decision_tree_regression_example.py
./examples/src/main/python/mllib/gradient_boosting_classification_example.py
./examples/src/main/python/mllib/gradient_boosting_regression_example.py
./examples/src/main/python/mllib/linear_regression_with_sgd_example.py
./examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
./examples/src/main/python/mllib/naive_bayes_example.py
./examples/src/main/python/mllib/random_forest_classification_example.py
./examples/src/main/python/mllib/random_forest_regression_example.py
./examples/src/main/python/mllib/svm_with_sgd_example.py
./examples/src/main/python/streaming/network_wordjoinsentiments.py
./sql/hive/src/test/resources/data/scripts/cat.py
./sql/hive/src/test/resources/data/scripts/cat_error.py
./sql/hive/src/test/resources/data/scripts/doubleescapedtab.py
./sql/hive/src/test/resources/data/scripts/dumpdata_script.py
./sql/hive/src/test/resources/data/scripts/escapedcarriagereturn.py
./sql/hive/src/test/resources/data/scripts/escapednewline.py
./sql/hive/src/test/resources/data/scripts/escapedtab.py
./sql/hive/src/test/resources/data/scripts/input20_script.py
./sql/hive/src/test/resources/data/scripts/newline.py
```
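
For reference, the check applied to each file amounts to roughly the following (the line-length limit is an assumption based on Spark's style config):

```bash
pep8 --max-line-length=100 ./dev/create-release/translate-contributors.py
```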

## How was this patch tested?

- `./python/docs/epytext.py`

  ```bash
  cd ./python/docs && make html
  ```

- pep8 check (Python 2.7 / Python 3.3.6)

  ```
  ./dev/lint-python
  ```

- `./dev/merge_spark_pr.py` (Python 2.7 only / Python 3.3.6 not working)

  ```bash
  python -m doctest -v ./dev/merge_spark_pr.py
  ```

- `./dev/create-release/releaseutils.py` `./dev/create-release/generate-contributors.py` `./dev/create-release/translate-contributors.py` (Python 2.7 only / Python 3.3.6 not working)

  ```bash
  python generate-contributors.py
  python translate-contributors.py
  ```

- Examples (Python 2.7 / Python 3.3.6)

  ```bash
  ./bin/spark-submit examples/src/main/python/mllib/decision_tree_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/decision_tree_regression_example.py
  ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/gradient_boosting_regression_example.py
  ./bin/spark-submit examples/src/main/python/mllib/random_forest_classification_example.py
  ./bin/spark-submit examples/src/main/python/mllib/random_forest_regression_example.py
  ```

- Examples (Python 2.7 only / Python 3.3.6 not working)
  ```
  ./bin/spark-submit examples/src/main/python/mllib/linear_regression_with_sgd_example.py
  ./bin/spark-submit examples/src/main/python/mllib/logistic_regression_with_lbfgs_example.py
  ./bin/spark-submit examples/src/main/python/mllib/naive_bayes_example.py
  ./bin/spark-submit examples/src/main/python/mllib/svm_with_sgd_example.py
  ```

- `sql/hive/src/test/resources/data/scripts/*.py` (Python 2.7 / Python 3.3.6 with the suggested changes)

  Manually tested only changed ones.

- `./dev/github_jira_sync.py` (Python 2.7 only / Python 3.3.6 not working)

Manually tested this after disabling the code that actually adds comments and links.

And also via Jenkins tests.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #16405 from HyukjinKwon/minor-pep8.
2017-01-02 15:23:19 +00:00
Yin Huai 63036aee22 Update known_translations for contributor names and also fix a small issue in translate-contributors.py
## What changes were proposed in this pull request?
This PR updates dev/create-release/known_translations to add more contributor name mappings. It also fixes a small issue in translate-contributors.py.

## How was this patch tested?
manually tested

Author: Yin Huai <yhuai@databricks.com>

Closes #16423 from yhuai/contributors.
2016-12-29 14:20:56 -08:00
Shivaram Venkataraman be5fc6ef72 [MINOR][SPARKR] Fix SparkR regex in copy command
Fix SparkR package copy regex. The existing code leads to
```
Copying release tarballs to /home/****/public_html/spark-nightly/spark-branch-2.1-bin/spark-2.1.1-SNAPSHOT-2016_12_08_22_38-e8f351f-bin
mput: SparkR-*: no files found
```
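
Since `R CMD build` names its output `SparkR_<version>.tar.gz`, a plausible reading of the fix is switching the glob to an underscore (a sketch; the surrounding command in release-build.sh is assumed):

```bash
# inside the lftp session that uploads the release tarballs:
mput SparkR_*.tar.gz
```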

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #16231 from shivaram/typo-sparkr-build.
2016-12-09 10:12:56 -08:00
Felix Cheung c074c96dc5 Copy pyspark and SparkR packages to latest release dir too
## What changes were proposed in this pull request?

Copy pyspark and SparkR packages to latest release dir, as per comment [here](https://github.com/apache/spark/pull/16226#discussion_r91664822)

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #16227 from felixcheung/pyrftp.
2016-12-08 22:52:34 -08:00
Shivaram Venkataraman 934035ae7c Copy the SparkR source package with LFTP
This PR adds a line in release-build.sh to copy the SparkR source archive using LFTP

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #16226 from shivaram/fix-sparkr-copy-build.
2016-12-08 22:21:24 -08:00
Shivaram Venkataraman 202fcd21ce [SPARK-18590][SPARKR] Change the R source build to Hadoop 2.6
This PR changes the SparkR source release tarball to be built using the Hadoop 2.6 profile. Previously it was built using the without-hadoop profile, which led to an error as discussed in https://github.com/apache/spark/pull/16014#issuecomment-265843991

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #16218 from shivaram/fix-sparkr-release-build.
2016-12-08 13:01:46 -08:00
Felix Cheung c3d3a9d0e8 [SPARK-18590][SPARKR] build R source package when making distribution
## What changes were proposed in this pull request?

This PR has 2 key changes. One, we are building a source package (aka bundle package) for SparkR which could be released on CRAN. Two, the official Spark binary distributions should include SparkR installed from this source package instead, which has the help/vignette rds files needed for those features to work when the SparkR package is loaded in R, whereas the earlier devtools approach did not.

But, because of various differences in how R performs different tasks, this PR is a fair bit more complicated. More details below.

This PR also includes a few minor fixes.

### more details

These are the additional steps in make-distribution; please see [here](https://github.com/apache/spark/blob/master/R/CRAN_RELEASE.md) on what goes into a CRAN release, which is now run during make-distribution.sh. (A shell sketch of steps 1-4 follows the notes below.)
1. The package needs to be installed because the first code block in the vignettes is `library(SparkR)` without a lib path
2. `R CMD build` will build vignettes (this process runs Spark/SparkR code and captures outputs into pdf documentation)
3. `R CMD check` on the source package will install the package and build vignettes again (this time from the source package) - this is a key step required to release an R package on CRAN
 (tests are skipped here, but they will need to pass for the CRAN release process to succeed - ideally, during release signoff we should install from the R source package and run tests)
4. `R CMD INSTALL` on the source package (this is the only way to generate doc/vignettes rds files correctly, not in step #1)
 (the output of this step is what we package into the Spark dist and sparkr.zip)

Alternatively, `R CMD build` should already be installing the package in a temp directory, though it might just be a matter of finding this location and setting it as the lib.loc parameter; another approach is perhaps calling `R CMD INSTALL --build pkg` instead.
But in any case, despite installing the package multiple times, this is relatively fast.
Building vignettes takes a while though.
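
A hedged shell sketch of steps 1-4 above (run from the Spark source root; tarball names vary by version):

```bash
R CMD INSTALL R/pkg                      # step 1: so vignettes can library(SparkR)
R CMD build R/pkg                        # step 2: builds SparkR_<ver>.tar.gz, runs vignettes
R CMD check --no-tests SparkR_*.tar.gz   # step 3: check the source package, tests skipped
R CMD INSTALL SparkR_*.tar.gz            # step 4: install from the source package
```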

## How was this patch tested?

Manually, CI.

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #16014 from felixcheung/rdist.
2016-12-08 11:29:31 -08:00
Reynold Xin 37e52f8793 [SPARK-18639] Build only a single pip package
## What changes were proposed in this pull request?
We currently build 5 separate pip binary tarballs, doubling the release script runtime. It'd be better to build one, especially for use cases that are just using Spark locally. In the long run, it would make more sense to have Hadoop support be pluggable.

## How was this patch tested?
N/A - this is a release build script that doesn't have any automated test coverage. We will know if it goes wrong when we prepare releases.

Author: Reynold Xin <rxin@databricks.com>

Closes #16072 from rxin/SPARK-18639.
2016-12-01 17:58:28 -08:00
Holden Karau a36a76ac43 [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed
## What changes were proposed in this pull request?

This PR aims to provide a pip installable PySpark package. This does a bunch of work to copy the jars over and package them with the Python code (to prevent the mismatches that arise from using one version of the Python code with a different version of the jars). It does not currently publish to PyPI but that is the natural follow up (SPARK-18129).
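
In practice the result looks roughly like this (a sketch; artifact names vary by version):

```bash
cd python
python setup.py sdist               # builds dist/pyspark-<version>.tar.gz with the jars bundled
pip install dist/pyspark-*.tar.gz   # or 'pip install -e .' for development
```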

Done:
- pip installable on conda [manually tested]
- setup.py installed on a non-pip managed system (RHEL) with YARN [manually tested]
- Automated testing of this (virtualenv)
- packaging and signing with release-build*

Possible follow up work:
- release-build update to publish to PyPI (SPARK-18128)
- figure out who owns the pyspark package name on prod PyPI (is it someone within the project, or should we ask PyPI, or should we choose a different name to publish with like ApachePySpark?)
- Windows support and/or testing (SPARK-18136)
- investigate details of wheel caching and see if we can avoid cleaning the wheel cache during our test
- consider how we want to number our dev/snapshot versions

Explicitly out of scope:
- Using pip installed PySpark to start a standalone cluster
- Using pip installed PySpark for non-Python Spark programs

*I've done some work to test release-build locally but as a non-committer I've just done local testing.
## How was this patch tested?

Automated testing with virtualenv, manual testing with conda, a system wide install, and YARN integration.

release-build changes tested locally as a non-committer (no testing of upload artifacts to Apache staging websites)

Author: Holden Karau <holden@us.ibm.com>
Author: Juliet Hougland <juliet@cloudera.com>
Author: Juliet Hougland <not@myemail.com>

Closes #15659 from holdenk/SPARK-1267-pip-install-pyspark.
2016-11-16 14:22:15 -08:00
Holden Karau 1386fd28da [SPARK-18418] Fix flags for make_binary_release for hadoop profile
## What changes were proposed in this pull request?

Fix the flags used to specify the hadoop version
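
A hedged sketch of what a corrected invocation in release-build.sh might look like (the profile list and the function's signature are assumptions):

```bash
# distribution name, then the Maven flags that select the Hadoop version
make_binary_release "hadoop2.7" "-Phadoop-2.7 -Phive -Phive-thriftserver -Pyarn"
```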

## How was this patch tested?

Manually tested as part of https://github.com/apache/spark/pull/15659 by having the build succeed.

cc joshrosen

Author: Holden Karau <holden@us.ibm.com>

Closes #15860 from holdenk/minor-fix-release-build-script.
2016-11-12 14:50:37 -08:00
Adam Roberts 3f8a0222e2 [SPARK-17828][DOCS] Remove unused generate-changelist.py
## What changes were proposed in this pull request?
We can remove this file based on the discussion at https://issues.apache.org/jira/browse/SPARK-17828; it's evident this file has been redundant for a while, as JIRA release notes already serve this purpose for us.

For ease of future reference you can find detailed release notes at, for example:

http://spark.apache.org/downloads.html -> http://spark.apache.org/releases/spark-release-2-0-1.html -> "Detailed changes" which links to https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315420&version=12336857

## How was this patch tested?
Searched the codebase and saw nothing referencing this; it hasn't been used in a while (probably manually invoked a long time ago).

Author: Adam Roberts <aroberts@uk.ibm.com>

Closes #15419 from a-roberts/patch-7.
2016-10-10 23:16:40 +02:00
Shivaram Venkataraman 7c382524a9 [SPARK-17651][SPARKR] Set R package version number along with mvn
## What changes were proposed in this pull request?

This PR sets the R package version while tagging releases. Note that since R doesn't accept `-SNAPSHOT` in the version number field, we remove it while setting the next development version.
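
A minimal sketch of the idea (R/pkg/DESCRIPTION is the real file; the exact sed lines are an assumption):

```bash
# R's DESCRIPTION field rejects -SNAPSHOT, so strip it before writing the version
R_VERSION=$(echo "$RELEASE_VERSION" | sed 's/-SNAPSHOT//')
sed -i.bak "s/^Version:.*/Version: $R_VERSION/" R/pkg/DESCRIPTION
```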

## How was this patch tested?

Tested manually by running locally

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #15223 from shivaram/sparkr-version-change.
2016-09-23 14:35:18 -07:00
Michael Gummelt 8e5475be3c [SPARK-16967] move mesos to module
## What changes were proposed in this pull request?

Move Mesos code into a mvn module

## How was this patch tested?

unit tests
manually submitting a client mode and cluster mode job
spark/mesos integration test suite

Author: Michael Gummelt <mgummelt@mesosphere.io>

Closes #14637 from mgummelt/mesos-module.
2016-08-26 12:25:22 -07:00
Yin Huai 60ba436b70 [SPARK-16453][BUILD] release-build.sh is missing hive-thriftserver for scala 2.10
## What changes were proposed in this pull request?
This PR adds the hive-thriftserver profile to the Scala 2.10 build created by release-build.sh.

Author: Yin Huai <yhuai@databricks.com>

Closes #14108 from yhuai/SPARK-16453.
2016-07-08 15:56:46 -07:00
Sean Owen 0eea12a3d9 [SPARK-13596][BUILD] Move misc top-level build files into appropriate subdirs
## What changes were proposed in this pull request?

Move many top-level files into dev/ or another appropriate directory. In particular, put `make-distribution.sh` in `dev` and update docs accordingly. Remove the deprecated `sbt/sbt`.

I was (so far) unable to figure out how to move `tox.ini`. `scalastyle-config.xml` should be movable, but edits to the project `.sbt` files didn't work; the config file location is updatable for the compile scope but not the test scope.

## How was this patch tested?

`./dev/run-tests` to verify RAT and checkstyle work. Jenkins tests for the rest.

Author: Sean Owen <sowen@cloudera.com>

Closes #11522 from srowen/SPARK-13596.
2016-03-07 14:48:02 -08:00
Josh Rosen f77dc4e1e2 [SPARK-13474][PROJECT INFRA] Update packaging scripts to push artifacts to home.apache.org
Due to the people.apache.org -> home.apache.org migration, we need to update our packaging scripts to publish artifacts to the new server. Because the new server only supports sftp instead of ssh, we need to update the scripts to use lftp instead of ssh + rsync.
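
A hedged sketch of the transport change (host names are from the description; paths and flags are illustrative):

```bash
# before: ssh + rsync to people.apache.org
rsync -avz -e ssh spark-bin/ user@people.apache.org:public_html/
# after: home.apache.org only speaks sftp, so drive the upload through lftp
lftp sftp://user@home.apache.org -e "mirror -R spark-bin/ public_html/; bye"
```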

Author: Josh Rosen <joshrosen@databricks.com>

Closes #11350 from JoshRosen/update-release-scripts-for-apache-home.
2016-02-26 18:40:00 -08:00
Josh Rosen 289373b28c [SPARK-6363][BUILD] Make Scala 2.11 the default Scala version
This patch changes Spark's build to make Scala 2.11 the default Scala version. To be clear, this does not mean that Spark will stop supporting Scala 2.10: users will still be able to compile Spark for Scala 2.10 by following the instructions on the "Building Spark" page; however, it does mean that Scala 2.11 will be the default Scala version used by our CI builds (including pull request builds).
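
For reference, compiling for 2.10 after this change follows the documented two-step recipe (a sketch; the property name is per the "Building Spark" docs of that era):

```bash
./dev/change-scala-version.sh 2.10
./build/mvn -Dscala-2.10 -DskipTests clean package
```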

The Scala 2.11 compiler is faster than 2.10, so I think we'll be able to look forward to a slight speedup in our CI builds (it looks like it's about 2X faster for the Maven compile-only builds, for instance).

After this patch is merged, I'll update Jenkins to add new compile-only jobs to ensure that Scala 2.10 compilation doesn't break.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10608 from JoshRosen/SPARK-6363.
2016-01-30 00:20:28 -08:00
Josh Rosen 8dbbf3e75e [SPARK-12842][TEST-HADOOP2.7] Add Hadoop 2.7 build profile
This patch adds a Hadoop 2.7 build profile in order to let us automate tests against that version.
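
Exercising the new profile looks roughly like this (the hadoop.version value is an assumption):

```bash
./build/mvn -Phadoop-2.7 -Dhadoop.version=2.7.1 -DskipTests clean package
```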

/cc rxin srowen

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10775 from JoshRosen/add-hadoop-2.7-profile.
2016-01-15 17:07:24 -08:00
Reynold Xin 5b0d544339 [SPARK-12735] Consolidate & move spark-ec2 to AMPLab managed repository.
Author: Reynold Xin <rxin@databricks.com>

Closes #10673 from rxin/SPARK-12735.
2016-01-09 20:28:20 -08:00
Reynold Xin 0a38637d05 [SPARK-11807] Remove support for Hadoop < 2.2
i.e. Hadoop 1 and Hadoop 2.0

Author: Reynold Xin <rxin@databricks.com>

Closes #10404 from rxin/SPARK-11807.
2015-12-21 22:15:52 -08:00
Josh Rosen d0482f6af3 [SPARK-10932] [PROJECT INFRA] Port two minor changes to release-build.sh from scripts' old repo
Spark's release packaging scripts used to live in a separate repository. Although these scripts are now part of the Spark repo, there are some minor patches made against the old repos that are missing in Spark's copy of the script. This PR ports those changes.

/cc shivaram, who originally submitted these changes against https://github.com/rxin/spark-utils

Author: Josh Rosen <joshrosen@databricks.com>

Closes #8986 from JoshRosen/port-release-build-fixes-from-rxin-repo.
2015-10-13 15:18:20 -07:00
Luciano Resende 1894653edc [SPARK-10511] [BUILD] Reset git repository before packaging source distro
The calculation of the Spark version downloads Scala and Zinc into the build directory, which inflates the size of the source distribution.

Resetting the repo before packaging the source distribution fixes this issue.
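
A minimal sketch of the reset (standard git flags; exactly where the script invokes this is an assumption):

```bash
git reset --hard HEAD
git clean -d -f -x   # drops untracked files, e.g. the downloaded Scala/Zinc under build/
```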

Author: Luciano Resende <lresende@apache.org>

Closes #8774 from lresende/spark-10511.
2015-09-16 10:47:30 +01:00
Holden Karau 48817cc111 [SPARK-10497] [BUILD] [TRIVIAL] Handle both locations for JIRAError with python-jira
The location of JIRAError has moved between old and new versions of the python-jira package.
Longer term it probably makes sense to pin to specific versions (as mentioned in https://issues.apache.org/jira/browse/SPARK-10498), but for now this makes the release tools work with both new and old versions of python-jira.

Author: Holden Karau <holden@pigscanfly.ca>

Closes #8661 from holdenk/SPARK-10497-release-utils-does-not-work-with-new-jira-python.
2015-09-10 16:42:12 +02:00
Reynold Xin ae74c3fa84 [RELEASE] Add more contributors & only show names in release notes.
Author: Reynold Xin <rxin@databricks.com>

Closes #8660 from rxin/contrib.
2015-09-08 17:36:00 -07:00
Josh Rosen 12de348332 [SPARK-10126] [PROJECT INFRA] Fix typo in release-build.sh which broke snapshot publishing for Scala 2.11
The current `release-build.sh` has a typo which breaks snapshot publication for Scala 2.11. We should change the Scala version to 2.11 and clean before building a 2.11 snapshot.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #8325 from JoshRosen/fix-2.11-snapshots.
2015-08-20 11:31:03 -07:00
Patrick Wendell 3ef0f32928 [SPARK-1517] Refactor release scripts to facilitate nightly publishing
This update contains some code changes to the release scripts that allow easier nightly publishing. I've been using these new scripts on Jenkins for cutting and publishing nightly snapshots for the last month or so, and it has been going well. I'd like to get them merged back upstream so this can be maintained by the community.

The main changes are:
1. Separates the release tagging from various build possibilities for an already tagged release (`release-tag.sh` and `release-build.sh`).
2. Allow for injecting credentials through the environment, including GPG keys. This is then paired with secure key injection in Jenkins.
3. Support for copying build results to a remote directory, and also "rotating" results, e.g. the ability to keep the last N copies of binary or doc builds.

I'm happy if anyone wants to take a look at this - it's not user facing but an internal utility used for generating releases.

Author: Patrick Wendell <patrick@databricks.com>

Closes #7411 from pwendell/release-script-updates and squashes the following commits:

74f9beb [Patrick Wendell] Moving maven build command to a variable
233ce85 [Patrick Wendell] [SPARK-1517] Refactor release scripts to facilitate nightly publishing
2015-08-11 21:16:48 -07:00
Sean Owen 6e5fd613ea [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated
Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile. Fix management of shade plugin version so children inherit it; bump assembly plugin version while here

See https://issues.apache.org/jira/browse/SPARK-8819

I verified that `mvn clean package -DskipTests` works with Maven 3.3.3.

pwendell are you up for trying this for the 1.5.0 release?

Author: Sean Owen <sowen@cloudera.com>

Closes #7826 from srowen/SPARK-9507 and squashes the following commits:

e0b0fd2 [Sean Owen] Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile. Fix management of shade plugin version so children inherit it; bump assembly plugin version while here
2015-07-31 21:51:55 +01:00
Michael Allman f5b6dc5e3e [SPARK-8401] [BUILD] Scala version switching build enhancements
These commits address a few minor issues in the Scala cross-version support in the build:

  1. Correct two missing `${scala.binary.version}` pom file substitutions.
  2. Don't update `scala.binary.version` in parent POM. This property is set through profiles.
  3. Update the source of the generated scaladocs in `docs/_plugins/copy_api_dirs.rb`.
  4. Factor common code out of `dev/change-version-to-*.sh` and add some validation. We also test `sed` to see if it's GNU sed and try `gsed` as an alternative if not. This prevents the script from running with a non-GNU sed.

This is my original work and I license this work to the Spark project under the Apache License.

Author: Michael Allman <michael@videoamp.com>

Closes #6832 from mallman/scala-versions and squashes the following commits:

cde2f17 [Michael Allman] Delete dev/change-version-to-*.sh, replacing them with single dev/change-scala-version.sh script that takes a version as argument
02296f2 [Michael Allman] Make the scala version change scripts cross-platform by restricting ourselves to POSIX sed syntax instead of looking for GNU sed
ad9b40a [Michael Allman] Factor change-scala-version.sh out of change-version-to-*.sh, adding command line argument validation and testing for GNU sed
bdd20bf [Michael Allman] Update source of scaladocs when changing Scala version
475088e [Michael Allman] Replace jackson-module-scala_2.10 with jackson-module-scala_${scala.binary.version}
2015-07-21 11:14:31 +01:00
Patrick Wendell 5572fd0c51 [HOTFIX] Adding new names to known contributors 2015-07-14 21:44:47 -07:00
Patrick Wendell 1cb2629f1a [HOTFIX] Rename release-profile to release
when publishing releases. We named it 'release-profile' because that is the Maven convention. However, it turns out this special name causes several other undesirable things to kick in when we are creating releases. For instance, it triggers the javadoc plugin to run, which actually fails in our current build set-up.

The fix is just to rename this to a different profile so that its use has no collateral damage.
2015-07-06 22:17:30 -07:00
Andrew Or 9eae5fa642 [SPARK-8819] Fix build for maven 3.3.x
This is a workaround for MSHADE-148, which leads to an infinite loop when building Spark with maven 3.3.x. This was originally caused by #6441, which added a bunch of test dependencies on the spark-core test module. Recently, it was revealed by #7193.

This patch adds a `-Prelease` profile. If present, it will set `createDependencyReducedPom` to true. The consequences are:
- If you are releasing Spark with this profile, you are fine as long as you use maven 3.2.x or before.
- If you are releasing Spark without this profile, you will run into SPARK-8781.
- If you are not releasing Spark but you are using this profile, you may run into SPARK-8819.
- If you are not releasing Spark and you did not include this profile, you are fine.

This is all documented in `pom.xml` and tested locally with both versions of maven.
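
A minimal sketch of the intended usage (per the notes above, with Maven 3.2.x or earlier):

```bash
# only the release profile flips createDependencyReducedPom on, so day-to-day
# builds with Maven 3.3.x avoid the MSHADE-148 infinite loop
mvn -Prelease -DskipTests clean install
```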

Author: Andrew Or <andrew@databricks.com>

Closes #7219 from andrewor14/fix-maven-build and squashes the following commits:

1d37e87 [Andrew Or] Merge branch 'master' of github.com:apache/spark into fix-maven-build
3574ae4 [Andrew Or] Review comments
f39199c [Andrew Or] Create a -Prelease profile that flags `createDependencyReducedPom`
2015-07-06 19:22:30 -07:00
Patrick Wendell e84545fa77 [HOTFIX] Fixing errors in name mappings 2015-06-10 22:56:36 -07:00
Patrick Wendell a777eb04bf [HOTFIX] Adding more contributor name bindings 2015-06-10 21:14:03 -07:00
Shivaram Venkataraman 3dc005282a [SPARK-8027] [SPARKR] Move man pages creation to install-dev.sh
This also helps us get rid of the sparkr-docs maven profile as docs are now built by just using -Psparkr when the roxygen2 package is available

Related to discussion in #6567

cc pwendell srowen -- Let me know if this looks better

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #6593 from shivaram/sparkr-pom-cleanup and squashes the following commits:

b282241 [Shivaram Venkataraman] Remove sparkr-docs from release script as well
8f100a5 [Shivaram Venkataraman] Move man pages creation to install-dev.sh This also helps us get rid of the sparkr-docs maven profile as docs are now built by just using -Psparkr when the roxygen2 package is available
2015-06-04 12:52:16 -07:00
Shivaram Venkataraman cae9306c4f [SPARK-8027] [SPARKR] Add maven profile to build R package docs
Also use that profile in create-release.sh

cc pwendell -- Note that this means that we need `knitr` and `roxygen` installed on the machines used for building the release. Let me know if you need help with that.

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #6567 from shivaram/SPARK-8027 and squashes the following commits:

8dc8ecf [Shivaram Venkataraman] Add maven profile to build R package docs Also use that profile in create-release.sh
2015-06-01 21:21:45 -07:00
Davies Liu a4df0f2d84 Fix install jira-python
The jira-python package should be installed with

  sudo pip install jira

cc pwendell

Author: Davies Liu <davies@databricks.com>

Closes #6367 from davies/fix_jira_python2 and squashes the following commits:

fbb3c8e [Davies Liu] Fix install jira-python
2015-05-23 09:14:07 -07:00
Shivaram Venkataraman 017b3404a5 [MINOR] Add SparkR to create-release script
Enables the SparkR profiles for all the binary builds we create

cc pwendell

Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #6371 from shivaram/sparkr-create-release and squashes the following commits:

ca5a0b2 [Shivaram Venkataraman] Add -Psparkr to create-release.sh
2015-05-22 22:33:49 -07:00
FavioVazquez 7fb715de6d [SPARK-7249] Updated Hadoop dependencies due to inconsistency in the versions
Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons.

Changes proposed by vanzin, resulting from the previous pull request https://github.com/apache/spark/pull/5783 which did not fix the problem correctly.

Please let me know if this is the correct way of doing this; vanzin's comments are in the pull request mentioned above.

Author: FavioVazquez <favio.vazquezp@gmail.com>

Closes #5786 from FavioVazquez/update-hadoop-dependencies and squashes the following commits:

11670e5 [FavioVazquez] - Added missing instance of -Phadoop-2.2 in create-release.sh
379f50d [FavioVazquez] - Added instances of -Phadoop-2.2 in create-release.sh, run-tests, scalastyle and building-spark.md - Reconstructed docs to not ask users to rely on default behavior
3f9249d [FavioVazquez] Merge branch 'master' of https://github.com/apache/spark into update-hadoop-dependencies
31bdafa [FavioVazquez] - Added missing instances in -Phadoop-1 in create-release.sh, run-tests and in the building-spark documentation
cbb93e8 [FavioVazquez] - Added comment related to SPARK-3710 about  hadoop-yarn-server-tests in Hadoop 2.2 that fails to pull some needed dependencies
83dc332 [FavioVazquez] - Cleaned up the main POM concerning the yarn profile - Erased hadoop-2.2 profile from yarn/pom.xml and its content was integrated into yarn/pom.xml
93f7624 [FavioVazquez] - Deleted unnecessary comments and <activation> tag on the YARN profile in the main POM
668d126 [FavioVazquez] - Moved <dependencies> <activation> and <properties> sections of the hadoop-2.2 profile in the YARN POM to the YARN profile in the root POM - Erased unnecessary hadoop-2.2 profile from the YARN POM
fda6a51 [FavioVazquez] - Updated hadoop1 releases in create-release.sh  due to changes in the default hadoop version set - Erased unnecessary instance of -Dyarn.version=2.2.0 in create-release.sh - Prettify comment in yarn/pom.xml
0470587 [FavioVazquez] - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in create-release.sh - Updated how the releases are made in the create-release.sh now that the default hadoop version is 2.2.0 - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in scalastyle - Erased unnecessary instance of -Phadoop-2.2 -Dhadoop.version=2.2.0 in run-tests - Better example given in the hadoop-third-party-distributions.md now that the default hadoop version is 2.2.0
a650779 [FavioVazquez] - Default value of avro.mapred.classifier has been set to hadoop2 in pom.xml - Cleaned up hadoop-2.3 and 2.4 profiles due to change in the default set in avro.mapred.classifier in pom.xml
199f40b [FavioVazquez] - Erased unnecessary CDH5-specific note in docs/building-spark.md - Remove example of instance -Phadoop-2.2 -Dhadoop.version=2.2.0 in docs/building-spark.md - Enabled hadoop-2.2 profile when the Hadoop version is 2.2.0, which is now the default. Added a comment in the yarn/pom.xml to specify that.
88a8b88 [FavioVazquez] - Simplified Hadoop profiles due to new setting of global properties in the pom.xml file - Added comment to specify that the hadoop-2.2 profile is now the default hadoop profile in the pom.xml file - Erased hadoop-2.2 from related hadoop profiles now that is a no-op in the make-distribution.sh file
70b8344 [FavioVazquez] - Fixed typo in the make-distribution.sh file and added hadoop-1 in the Related profiles
287fa2f [FavioVazquez] - Updated documentation about specifying the hadoop version in building-spark. Now is clear that Spark will build against Hadoop 2.2.0 by default. - Added Cloudera CDH 5.3.3 without MapReduce example in the building-spark doc.
1354292 [FavioVazquez] - Fixed hadoop-1 version to match jenkins build profile in hadoop1.0 tests and documentation
6b4bfaf [FavioVazquez] - Cleanup in hadoop-2.x profiles since they contained mostly redundant stuff.
7e9955d [FavioVazquez] - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons
660decc [FavioVazquez] - Updated Hadoop dependencies due to inconsistency in the versions. Now the global properties are the ones used by the hadoop-2.2 profile, and the profile was set to empty but kept for backwards compatibility reasons
ec91ce3 [FavioVazquez] - Updated protobuf-java version of com.google.protobuf dependency to fix blocking error when connecting to HDFS via the Hadoop Cloudera HDFS CDH5 (fix for 2.5.0-cdh5.3.3 version)
2015-05-14 15:22:58 +01:00
Misha Chernetsov 998aac21f0 [SPARK-4925] Publish Spark SQL hive-thriftserver maven artifact
Turned on the hive-thriftserver profile in the release script.

Author: Misha Chernetsov <chernetsov@gmail.com>

Closes #5429 from chernetsov/master and squashes the following commits:

9cc36af [Misha Chernetsov] [SPARK-4925] Publish Spark SQL hive-thriftserver maven artifact turned on hive-thriftserver profile in release script for scala 2.10
2015-04-27 11:27:56 -07:00
Patrick Wendell 3980ebdf18 HOTFIX: Changes to release script.
This fixes a bug in the release script and also properly sets things up so that Zinc launches multiple processes. I had done something similar in 0c9a8e but it didn't fully work.
2015-03-12 18:37:19 -07:00
Patrick Wendell e921a665c5 BUILD: Adding more known contributor names 2015-03-11 22:24:32 -07:00
Patrick Wendell 0c9a8eaed7 BUILD: Minor tweaks to internal build scripts
This adds two features:
1. The ability to publish with a different maven version than
   that specified in the release source.
2. Forking of different Zinc instances during the parallel dist
   creation (to help with some stability issues).
2015-03-03 01:53:48 -08:00
Davies Liu f3f4c87b3d [SPARK-5944] [PySpark] fix version in Python API docs
use RELEASE_VERSION when building the Python API docs

Author: Davies Liu <davies@databricks.com>

Closes #4731 from davies/api_version and squashes the following commits:

c9744c9 [Davies Liu] Update create-release.sh
08cbc3f [Davies Liu] fix python docs
2015-02-25 15:13:34 -08:00
Patrick Wendell 0ef38f5fad SPARK-5542: Decouple publishing, packaging, and tagging in release script
These are some changes to the build script to allow parts of it to be run independently. This has already been tested during the 1.2.1 release cycle.

Author: Patrick Wendell <patrick@databricks.com>
Author: Patrick Wendell <pwendell@gmail.com>

Closes #4319 from pwendell/release-updates and squashes the following commits:

dfe7ed9 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into release-updates
478b072 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into release-updates
126dd0c [Patrick Wendell] Allow decoupling Maven publishing from cutting release
2015-02-02 21:00:30 -08:00