135 commits

Author SHA1 Message Date
Sandy Ryza 912563aa35 SPARK-4338. [YARN] Ditch yarn-alpha.
Sorry if this is a little premature with 1.2 still not out the door, but it will make other work like SPARK-4136 and SPARK-2089 a lot easier.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #3215 from sryza/sandy-spark-4338 and squashes the following commits:

1c5ac08 [Sandy Ryza] Update building Spark docs and remove unnecessary newline
9c1421c [Sandy Ryza] SPARK-4338. Ditch yarn-alpha.
2014-12-09 11:02:43 -08:00
Patrick Wendell 8dae26f838 [HOTFIX] Fixing two issues with the release script.
1. The version replacement was still producing some false changes.
2. Uploads to the staging repo specifically.

Author: Patrick Wendell <pwendell@gmail.com>

Closes #3608 from pwendell/release-script and squashes the following commits:

3c63294 [Patrick Wendell] Fixing two issues with the release script:
2014-12-04 12:11:41 -08:00
Andrew Or a4dfb4efef [Release] Correctly translate contributors name in release notes
This commit involves three main changes:

(1) It separates the translation of contributor names from the
generation of the contributors list. This is largely motivated
by the Github API limit; even if we exceed this limit, we should
at least be able to proceed manually as before. This is why the
translation logic is abstracted into its own script
translate-contributors.py.

(2) When we look for candidate replacements for invalid author
names, we should look for the assignees of the associated JIRAs
too. As a result, the intermediate file must keep track of these.

(3) This provides an interactive mode with which the user can
sit at the terminal and manually pick the candidate replacement
that he/she thinks makes the most sense. As before, there is a
non-interactive mode that picks the first candidate that the
script considers "valid."
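
The two modes in (3) could be sketched roughly as follows. This is not the code from translate-contributors.py; `is_valid_name`, `pick_replacement`, and the validity rule itself are hypothetical names and assumptions made up for illustration.

```python
def is_valid_name(name):
    # Hypothetical validity rule, for illustration only: treat a name as
    # usable if it has at least two whitespace-separated parts.
    return len(name.split()) >= 2


def pick_replacement(candidates, interactive=False, choose=None):
    """Pick a replacement author name from a list of candidates.

    Non-interactive mode returns the first candidate that passes
    is_valid_name, or None if there is none. Interactive mode hands the
    decision to `choose`, a callable that receives the candidate list
    (for example, a function that prompts at the terminal).
    """
    if interactive:
        return choose(candidates)
    for name in candidates:
        if is_valid_name(name):
            return name
    return None


print(pick_replacement(["andrewor14", "Andrew Or"]))
```

Passing the chooser in as a callable keeps the prompt logic separate from the selection logic, so the non-interactive path can be exercised without a terminal.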

TODO: We should have a known_contributors file that stores
known mappings so we don't have to go through all of this
translation every time. This is also valuable because some
contributors simply cannot be automatically translated.
2014-12-03 19:10:07 -08:00
Andrew Or 5da21f07d8 [Release] Translate unknown author names automatically 2014-12-02 16:36:12 -08:00
Takayuki Hasegawa 4316a7b010 SPARK-4507: PR merge script should support closing multiple JIRA tickets
This will fix SPARK-4507.

For pull requests that reference multiple JIRAs in their titles, it would be helpful if the PR merge script offered to close all of them.
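
A minimal sketch of the idea, assuming JIRA keys always look like `SPARK-` followed by digits. The function name `jira_ids_in_title` and the regular expression are illustrative assumptions, not taken from the merge script.

```python
import re

# Assumed key format: "SPARK-" followed by one or more digits.
JIRA_KEY = re.compile(r"SPARK-[0-9]+")


def jira_ids_in_title(title):
    """Return each distinct JIRA key in a PR title, in order of first appearance."""
    seen = []
    for key in JIRA_KEY.findall(title):
        if key not in seen:
            seen.append(key)
    return seen


title = "[SPARK-1455] [SPARK-3534] [Build] When possible, run SQL tests only."
for key in jira_ids_in_title(title):
    print("Would offer to close", key)
```

A title with no keys yields an empty list, so the caller can fall back to asking for a JIRA id by hand.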

Author: Takayuki Hasegawa <takayuki.hasegawa0311@gmail.com>

Closes #3428 from hase1031/SPARK-4507 and squashes the following commits:

bf6d64b [Takayuki Hasegawa] SPARK-4507: try to resolve issue when no JIRAs in title
401224c [Takayuki Hasegawa] SPARK-4507: moved codes as before
ce89021 [Takayuki Hasegawa] SPARK-4507: PR merge script should support closing multiple JIRA tickets
2014-11-29 23:12:10 -05:00
Andrew Or c86e9bc4fd [Release] Automate generation of contributors list
This commit provides a script that computes the contributors list
by linking the github commits with JIRA issues. Automatically
translating github usernames remains a TODO at this point.
2014-11-26 23:16:23 -08:00
Patrick Wendell 4d95526a75 [HOTFIX]: Adding back without-hive dist 2014-11-25 23:10:46 -05:00
Patrick Wendell c6e0c2ab1c SPARK-4466: Provide support for publishing Scala 2.11 artifacts to Maven
The maven release plug-in does not have support for publishing two separate sets of artifacts for a single release. Because of the way that Scala 2.11 support in Spark works, we have to write some customized code to do this. The good news is that the Maven release API is just a thin wrapper on doing git commits and pushing artifacts to the HTTP API of Apache's Sonatype server and this might overall make our deployment easier to understand.

This was already used for the 1.2 snapshot, so I think it is working well. One other nice thing is this could be pretty easily extended to publish nightly snapshots.

Author: Patrick Wendell <pwendell@gmail.com>

Closes #3332 from pwendell/releases and squashes the following commits:

2fedaed [Patrick Wendell] Automate the opening and closing of Sonatype repos
e2a24bb [Patrick Wendell] Fixing issue where we overrode non-spark version numbers
9df3a50 [Patrick Wendell] Adding TODO
1cc1749 [Patrick Wendell] Don't build the thriftserver for 2.11
933201a [Patrick Wendell] Make tagging of release commit eager
d0388a6 [Patrick Wendell] Support Scala 2.11 build
4f4dc62 [Patrick Wendell] Change to 2.11 should not be included when committing new patch
bf742e1 [Patrick Wendell] Minor fixes
ffa1df2 [Patrick Wendell] Adding a Scala 2.11 package to test it
9ac4381 [Patrick Wendell] Addressing TODO
b3105ff [Patrick Wendell] Removing commented out code
d906803 [Patrick Wendell] Small fix
3f4d985 [Patrick Wendell] More work
fcd54c2 [Patrick Wendell] Consolidating use of keys
df2af30 [Patrick Wendell] Changes to release stuff
2014-11-17 21:07:50 -08:00
Andrew Or 723a86b04c [Release] Bring audit scripts up-to-date
This involves a few main changes:
- Log all output messages to the log file. Previously the log file
  was not useful because it did not indicate progress.
- Remove hive-site.xml in sbt_hive_app to avoid interference
- Add the appropriate repositories for new dependencies
2014-11-12 16:35:39 -08:00
Andrew Or c3afd3266d [Release] Correct make-distribution.sh log path 2014-11-12 13:46:26 -08:00
Prashant Sharma daaca14c16 Support cross building for Scala 2.11
Let's give this another go using a version of Hive that shades its JLine dependency.

Author: Prashant Sharma <prashant.s@imaginea.com>
Author: Patrick Wendell <pwendell@gmail.com>

Closes #3159 from pwendell/scala-2.11-prashant and squashes the following commits:

e93aa3e [Patrick Wendell] Restoring -Phive-thriftserver profile and cleaning up build script.
f65d17d [Patrick Wendell] Fixing build issue due to merge conflict
a8c41eb [Patrick Wendell] Reverting dev/run-tests back to master state.
7a6eb18 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into scala-2.11-prashant
583aa07 [Prashant Sharma] REVERT ME: removed hive thirftserver
3680e58 [Prashant Sharma] Revert "REVERT ME: Temporarily removing some Cli tests."
935fb47 [Prashant Sharma] Revert "Fixed by disabling a few tests temporarily."
925e90f [Prashant Sharma] Fixed by disabling a few tests temporarily.
2fffed3 [Prashant Sharma] Exclude groovy from sbt build, and also provide a way for such instances in future.
8bd4e40 [Prashant Sharma] Switched to gmaven plus, it fixes random failures observer with its predecessor gmaven.
5272ce5 [Prashant Sharma] SPARK_SCALA_VERSION related bugs.
2121071 [Patrick Wendell] Migrating version detection to PySpark
b1ed44d [Patrick Wendell] REVERT ME: Temporarily removing some Cli tests.
1743a73 [Patrick Wendell] Removing decimal test that doesn't work with Scala 2.11
f5cad4e [Patrick Wendell] Add Scala 2.11 docs
210d7e1 [Patrick Wendell] Revert "Testing new Hive version with shaded jline"
48518ce [Patrick Wendell] Remove association of Hive and Thriftserver profiles.
e9d0a06 [Patrick Wendell] Revert "Enable thritfserver for Scala 2.10 only"
67ec364 [Patrick Wendell] Guard building of thriftserver around Scala 2.10 check
8502c23 [Patrick Wendell] Enable thritfserver for Scala 2.10 only
e22b104 [Patrick Wendell] Small fix in pom file
ec402ab [Patrick Wendell] Various fixes
0be5a9d [Patrick Wendell] Testing new Hive version with shaded jline
4eaec65 [Prashant Sharma] Changed scripts to ignore target.
5167bea [Prashant Sharma] small correction
a4fcac6 [Prashant Sharma] Run against scala 2.11 on jenkins.
80285f4 [Prashant Sharma] MAven equivalent of setting spark.executor.extraClasspath during tests.
034b369 [Prashant Sharma] Setting test jars on executor classpath during tests from sbt.
d4874cb [Prashant Sharma] Fixed Python Runner suite. null check should be first case in scala 2.11.
6f50f13 [Prashant Sharma] Fixed build after rebasing with master. We should use ${scala.binary.version} instead of just 2.10
e56ca9d [Prashant Sharma] Print an error if build for 2.10 and 2.11 is spotted.
937c0b8 [Prashant Sharma] SCALA_VERSION -> SPARK_SCALA_VERSION
cb059b0 [Prashant Sharma] Code review
0476e5e [Prashant Sharma] Scala 2.11 support with repl and all build changes.
2014-11-11 21:36:48 -08:00
Andrew Or 2ddb1415e2 [Release] Log build output for each distribution 2014-11-11 18:02:59 -08:00
Cheng Lian 534b231417 [SPARK-4000][Build] Uploads HiveCompatibilitySuite logs
This is a follow up of #2845. In addition to unit-tests.log files, also upload failure output files generated by `HiveCompatibilitySuite` to Jenkins master. These files can be very helpful to debug Hive compatibility test failures.

/cc pwendell marmbrus

Author: Cheng Lian <lian@databricks.com>

Closes #2993 from liancheng/upload-hive-compat-logs and squashes the following commits:

8e6247f [Cheng Lian] Uploads HiveCompatibilitySuite logs
2014-11-10 16:17:52 -08:00
Xiangrui Meng 1a9c6cddad [SPARK-3573][MLLIB] Make MLlib's Vector compatible with SQL's SchemaRDD
Register MLlib's Vector as a SQL user-defined type (UDT) in both Scala and Python. With this PR, we can easily map an RDD[LabeledPoint] to a SchemaRDD, and then select columns or save to a Parquet file. Examples in Scala/Python are attached. The Scala code was copied from jkbradley.

~~This PR contains the changes from #3068 . I will rebase after #3068 is merged.~~

marmbrus jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes #3070 from mengxr/SPARK-3573 and squashes the following commits:

3a0b6e5 [Xiangrui Meng] organize imports
236f0a0 [Xiangrui Meng] register vector as UDT and provide dataset examples
2014-11-03 22:29:48 -08:00
wangfei 7c41d13570 [SPARK-3826][SQL]enable hive-thriftserver to support hive-0.13.1
In #2241, hive-thriftserver is not enabled. This patch enables hive-thriftserver to support hive-0.13.1 by using a shim layer, following the approach in #2241.

 1. A light shim layer (code in sql/hive-thriftserver/hive-version) for each hive version to handle API compatibility

 2. New pom profiles "hive-default" and "hive-versions" (copied from #2241) to activate different hive versions

 3. SBT commands for the different versions are as follows:
   hive-0.12.0 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.12.0 assembly
   hive-0.13.1 --- sbt/sbt -Phive,hadoop-2.3 -Phive-0.13.1 assembly

 4. Since hive-thriftserver depends on the hive subproject, this patch should be merged with #2241 to enable hive-0.13.1 for hive-thriftserver

Author: wangfei <wangfei1@huawei.com>
Author: scwf <wangfei1@huawei.com>

Closes #2685 from scwf/shim-thriftserver1 and squashes the following commits:

f26f3be [wangfei] remove clean to save time
f5cac74 [wangfei] remove local hivecontext test
578234d [wangfei] use new shaded hive
18fb1ff [wangfei] exclude kryo in hive pom
fa21d09 [wangfei] clean package assembly/assembly
8a4daf2 [wangfei] minor fix
0d7f6cf [wangfei] address comments
f7c93ae [wangfei] adding build with hive 0.13 before running tests
bcf943f [wangfei] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver1
c359822 [wangfei] reuse getCommandProcessor in hiveshim
52674a4 [scwf] sql/hive included since examples depend on it
3529e98 [scwf] move hive module to hive profile
f51ff4e [wangfei] update and fix conflicts
f48d3a5 [scwf] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver1
41f727b [scwf] revert pom changes
13afde0 [scwf] fix small bug
4b681f4 [scwf] enable thriftserver in profile hive-0.13.1
0bc53aa [scwf] fixed when result filed is null
dfd1c63 [scwf] update run-tests to run hive-0.12.0 default now
c6da3ce [scwf] Merge branch 'master' of https://github.com/apache/spark into shim-thriftserver
7c66b8e [scwf] update pom according spark-2706
ae47489 [scwf] update and fix conflicts
2014-10-31 11:27:59 -07:00
GuoQiang Li 89e8a5d8ba [SPARK-3997][Build]scalastyle should output the error location
Author: GuoQiang Li <witgo@qq.com>

Closes #2846 from witgo/SPARK-3997 and squashes the following commits:

d6a57f8 [GuoQiang Li] scalastyle should output the error location
2014-10-26 16:24:50 -07:00
Michael Armbrust 879a165858 [HOTFIX][SQL] Temporarily turn off hive-server tests.
The Thrift server is not available in the default (hive13) profile yet, which is breaking all SQL-only PRs. This turns off these tests until #2685 is merged.

Author: Michael Armbrust <michael@databricks.com>

Closes #2950 from marmbrus/fixTests and squashes the following commits:

1a6dfee [Michael Armbrust] [HOTFIX][SQL] Temporarily turn of hive-server tests.
2014-10-26 15:24:41 -07:00
Michael Armbrust 3a845d3c04 [SQL] Update Hive test harness for Hive 12 and 13
As part of the upgrade I also copy the newest version of the query tests, and whitelist a bunch of new ones that are now passing.

Author: Michael Armbrust <michael@databricks.com>

Closes #2936 from marmbrus/fix13tests and squashes the following commits:

d9cbdab [Michael Armbrust] Remove user specific tests
65801cd [Michael Armbrust] style and rat
8f6b09a [Michael Armbrust] Update test harness to work with both Hive 12 and 13.
f044843 [Michael Armbrust] Update Hive query tests and golden files to 0.13
2014-10-24 18:36:35 -07:00
Zhan Zhang 7c89a8f0c8 [SPARK-2706][SQL] Enable Spark to support Hive 0.13
Given that a lot of users are trying to use hive 0.13 in spark, and given the API-level incompatibility between hive-0.12 and hive-0.13, I want to propose the following approach, which has no or minimal impact on existing hive-0.12 support but can jumpstart the development of support for hive-0.13 and future versions.

Approach: Introduce a “hive-version” property and manipulate the pom.xml files to support different hive versions at compile time through a shim layer, e.g., hive-0.12.0 and hive-0.13.1. More specifically,

1. For each different hive version, there is a very light layer of shim code to handle API differences, sitting in sql/hive/hive-version, e.g., sql/hive/v0.12.0 or sql/hive/v0.13.1

2. Add a new profile hive-default active by default, which picks up all existing configuration and hive-0.12.0 shim (v0.12.0)  if no hive.version is specified.

3. If the user specifies a different version (currently only 0.13.1, via -Dhive.version=0.13.1), the hive-versions profile will be activated, which picks up the hive-version specific shim layer and configuration, mainly the hive jars and hive-version shim, e.g., v0.13.1.

4. With this approach, nothing is changed with current hive-0.12 support.
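
The selection rule described in points 1 to 3 can be sketched as a small lookup. The real mechanism lives in Maven profiles, not Python; `shim_dir` and the constant names are hypothetical, and only the two versions and the `sql/hive/v<version>` layout come from the description above.

```python
# Versions and default as described in this commit message.
DEFAULT_HIVE_VERSION = "0.12.0"
SUPPORTED_HIVE_VERSIONS = ("0.12.0", "0.13.1")


def shim_dir(hive_version=None):
    """Return the shim source directory for the requested hive version.

    With no version given, fall back to the default, mirroring the
    behaviour described for the hive-default profile.
    """
    version = hive_version or DEFAULT_HIVE_VERSION
    if version not in SUPPORTED_HIVE_VERSIONS:
        raise ValueError("unsupported hive version: " + version)
    return "sql/hive/v" + version


print(shim_dir())          # no -Dhive.version given
print(shim_dir("0.13.1"))  # -Dhive.version=0.13.1
```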

No change by default: sbt/sbt -Phive
For example: sbt/sbt -Phive -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 assembly

To enable hive-0.13: sbt/sbt -Dhive.version=0.13.1
For example: sbt/sbt -Dhive.version=0.13.1 -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 assembly

Note that in hive-0.13, hive-thriftserver is not enabled, which should be fixed by another JIRA, and we don’t need -Phive with -Dhive.version when building (we should probably use -Phive -Dhive.version=xxx instead once the thrift server is also supported in hive-0.13.1).

Author: Zhan Zhang <zhazhan@gmail.com>
Author: zhzhan <zhazhan@gmail.com>
Author: Patrick Wendell <pwendell@gmail.com>

Closes #2241 from zhzhan/spark-2706 and squashes the following commits:

3ece905 [Zhan Zhang] minor fix
410b668 [Zhan Zhang] solve review comments
cbb4691 [Zhan Zhang] change run-test for new options
0d4d2ed [Zhan Zhang] rebase
497b0f4 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
8fad1cf [Zhan Zhang] change the pom file and make hive-0.13.1 as the default
ab028d1 [Zhan Zhang] rebase
4a2e36d [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
4cb1b93 [zhzhan] Merge pull request #1 from pwendell/pr-2241
b0478c0 [Patrick Wendell] Changes to simplify the build of SPARK-2706
2b50502 [Zhan Zhang] rebase
a72c0d4 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
cb22863 [Zhan Zhang] correct the typo
20f6cf7 [Zhan Zhang] solve compatability issue
f7912a9 [Zhan Zhang] rebase and solve review feedback
301eb4a [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
10c3565 [Zhan Zhang] address review comments
6bc9204 [Zhan Zhang] rebase and remove temparory repo
d3aa3f2 [Zhan Zhang] Merge branch 'master' into spark-2706
cedcc6f [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
3ced0d7 [Zhan Zhang] rebase
d9b981d [Zhan Zhang] rebase and fix error due to rollback
adf4924 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
3dd50e8 [Zhan Zhang] solve conflicts and remove unnecessary implicts
d10bf00 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
dc7bdb3 [Zhan Zhang] solve conflicts
7e0cc36 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
d7c3e1e [Zhan Zhang] Merge branch 'master' into spark-2706
68deb11 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
d48bd18 [Zhan Zhang] address review comments
3ee3b2b [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
57ea52e [Zhan Zhang] Merge branch 'master' into spark-2706
2b0d513 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
9412d24 [Zhan Zhang] address review comments
f4af934 [Zhan Zhang] rebase
1ccd7cc [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
128b60b [Zhan Zhang] ignore 0.12.0 test cases for the time being
af9feb9 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
5f5619f [Zhan Zhang] restructure the directory and different hive version support
05d3683 [Zhan Zhang] solve conflicts
e4c1982 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
94b4fdc [Zhan Zhang] Spark-2706: hive-0.13.1 support on spark
87ebf3b [Zhan Zhang] Merge branch 'master' into spark-2706
921e914 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
f896b2a [Zhan Zhang] Merge branch 'master' into spark-2706
789ea21 [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
cb53a2c [Zhan Zhang] Merge branch 'master' of https://github.com/apache/spark
f6a8a40 [Zhan Zhang] revert
ba14f28 [Zhan Zhang] test
dbedff3 [Zhan Zhang] Merge remote-tracking branch 'upstream/master'
70964fe [Zhan Zhang] revert
fe0f379 [Zhan Zhang] Merge branch 'master' of https://github.com/zhzhan/spark
70ffd93 [Zhan Zhang] revert
42585ec [Zhan Zhang] test
7d5fce2 [Zhan Zhang] test
2014-10-24 11:03:17 -07:00
Cheng Lian a29c9bd614 [SPARK-4000][BUILD] Sends archived unit tests logs to Jenkins master
This PR sends archived unit tests logs to the build history directory in Jenkins master, so that we can serve it via HTTP later to help debugging Jenkins build failures.

pwendell JoshRosen Please help review, thanks!

Author: Cheng Lian <lian@databricks.com>

Closes #2845 from liancheng/log-archive and squashes the following commits:

ac8d9d4 [Cheng Lian] Includes build number in messages posted to GitHub
68c7010 [Cheng Lian] Logs backup should be implemented in dev/run-tests-jenkins
4b912f7 [Cheng Lian] Sends archived unit tests logs to Jenkins master
2014-10-23 22:15:03 -07:00
Kousuke Saruta add174aa56 [SPARK-3843][Minor] Cleanup scalastyle.txt at the end of running dev/scalastyle
dev/scalastyle creates a log file 'scalastyle.txt'. It is overwritten on each run but never deleted, even though dev/mima and dev/lint-python delete their log files.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #2702 from sarutak/scalastyle-txt-cleanup and squashes the following commits:

d6e238e [Kousuke Saruta] Fixed dev/scalastyle to cleanup scalastyle.txt
2014-10-08 15:19:19 -07:00
Patrick Wendell bc4418727b HOTFIX: Use correct Hadoop profile in build 2014-10-08 13:43:34 -07:00
Nicholas Chammas 69c3f441a9 [SPARK-3479] [Build] Report failed test category
This PR allows SparkQA (i.e. Jenkins) to report in its posts to GitHub what category of test failed, if one can be determined.

The failure categories are:
* general failure
* RAT checks failed
* Scala style checks failed
* Python style checks failed
* Build failed
* Spark unit tests failed
* PySpark unit tests failed
* MiMa checks failed

This PR also fixes the diffing logic used to determine if a patch introduces new classes.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2606 from nchammas/report-failed-test-category and squashes the following commits:

d67df03 [Nicholas Chammas] report what test category failed
2014-10-06 14:19:06 -07:00
Patrick Wendell e222221e24 HOTFIX: Fix unicode error in merge script.
The merge script builds up a big command array and sometimes
this contains both unicode and ascii strings. This doesn't work
if you try to join them into a single string. Longer term a solution
is to go and make sure the source of all strings is unicode.

This patch provides a simpler solution... just print the array
rather than joining. I actually prefer printing an array here
anyways since joining on spaces is lossy in the case of arguments
that themselves contain spaces.
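
The lossiness mentioned above is easy to demonstrate. The sketch below is Python 3, so it does not reproduce the original unicode/ascii mixing problem; the command list and the helper name `naive_join` are made up for illustration.

```python
import shlex


def naive_join(cmd):
    # Joining on spaces discards the boundaries between arguments.
    return " ".join(cmd)


cmd = ["git", "commit", "-m", "Fix unicode error"]

# Splitting the joined string back on spaces yields more pieces than the
# original list, because the commit message itself contains spaces.
print(naive_join(cmd).split(" "))

# Printing the list keeps every argument boundary visible.
print(cmd)

# If a single string is wanted, shlex.join (Python 3.8+) quotes as needed.
print(shlex.join(cmd))
```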

Author: Patrick Wendell <pwendell@gmail.com>

Closes #2645 from pwendell/merge-script and squashes the following commits:

167b792 [Patrick Wendell] HOTFIX: Fix unicode error in merge script.
2014-10-05 13:22:40 -07:00
Nicholas Chammas d3a3840e07 [Build] Post commit hash with timeout messages
[By request](https://github.com/apache/spark/pull/2588#issuecomment-57266871), and because it also makes sense.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2597 from nchammas/timeout-commit-hash and squashes the following commits:

3d90714 [Nicholas Chammas] Revert "testing: making timeout 1 minute"
2353c95 [Nicholas Chammas] testing: making timeout 1 minute
e3a477e [Nicholas Chammas] post commit hash with timeout
2014-09-30 13:28:41 -07:00
shane knapp a01a30927d SPARK-3745 - fix check-license to properly download and check jar
for details, see: https://issues.apache.org/jira/browse/SPARK-3745

Author: shane knapp <incomplete@gmail.com>

Closes #2596 from shaneknapp/SPARK-3745 and squashes the following commits:

c95eea9 [shane knapp] SPARK-3745 - fix check-license to properly download and check jar
2014-09-30 13:11:25 -07:00
Nicholas Chammas c429126066 [Build] Diff from branch point
Sometimes Jenkins posts [spurious reports of new classes being added](https://github.com/apache/spark/pull/2339#issuecomment-56570170). I believe this stems from diffing the patch against `master`, as opposed to against `master...`, which starts from the commit the PR was branched from.

This patch fixes that behavior.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2512 from nchammas/diff-only-commits-ahead and squashes the following commits:

c065599 [Nicholas Chammas] comment typo fix
a453c67 [Nicholas Chammas] diff from branch point
2014-09-24 11:33:58 -07:00
Nicholas Chammas 99b06b6fd2 [Build] Fix passing of args to sbt
Simple mistake, simple fix:
```shell
args="arg1 arg2 arg3"

sbt $args    # sbt sees 3 arguments
sbt "$args"  # sbt sees 1 argument
```

Should fix the problems we are seeing [here](https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-SBT/694/AMPLAB_JENKINS_BUILD_PROFILE=hadoop1.0,label=centos/console), for example.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2462 from nchammas/fix-sbt-master-build and squashes the following commits:

4500c86 [Nicholas Chammas] warn about quoting
10018a6 [Nicholas Chammas] Revert "test hadoop1 build"
7d5356c [Nicholas Chammas] Revert "re-add bad quoting for testing"
061600c [Nicholas Chammas] re-add bad quoting for testing
b2de56c [Nicholas Chammas] test hadoop1 build
43fb854 [Nicholas Chammas] unquote profile args
2014-09-19 15:44:47 -07:00
Nicholas Chammas 5547fa1ee9 [SPARK-3534] Add hive-thriftserver to SQL tests
Addresses the problem pointed out in [this comment](https://github.com/apache/spark/pull/2441#issuecomment-55990116).

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2442 from nchammas/patch-1 and squashes the following commits:

7e68b60 [Nicholas Chammas] [SPARK-3534] Add hive-thriftserver to SQL tests
2014-09-17 22:37:11 -07:00
Nicholas Chammas 7fc3bb7c88 [SPARK-3534] Fix expansion of testing arguments to sbt
Testing arguments to `sbt` need to be passed as an array, not a single, long string.
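
The same distinction can be shown in Python by counting how many arguments a child process receives. This is only an analogy to the shell script being fixed; a small Python child stands in for `sbt`, and the argument string is an example.

```python
import subprocess
import sys

# Stand-in for sbt: a child process that prints how many arguments it got.
COUNT_ARGS = [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)"]


def count_received(args):
    """Run the stand-in child with `args` and return the number it reports."""
    result = subprocess.run(
        COUNT_ARGS + args, capture_output=True, text=True, check=True
    )
    return int(result.stdout)


test_args = "-Phive -Pyarn assembly"

print(count_received([test_args]))        # passed as one long string
print(count_received(test_args.split()))  # passed as separate arguments
```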

Fixes a bug introduced in #2420.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2437 from nchammas/selective-testing and squashes the following commits:

a9f9c1c [Nicholas Chammas] fix printing of sbt test arguments
cf57cbf [Nicholas Chammas] fix sbt test arguments
e33b978 [Nicholas Chammas] Merge pull request #2 from apache/master
0b47ca4 [Nicholas Chammas] Merge branch 'master' of github.com:nchammas/spark
8051486 [Nicholas Chammas] Merge pull request #1 from apache/master
03180a4 [Nicholas Chammas] Merge branch 'master' of github.com:nchammas/spark
d4c5f43 [Nicholas Chammas] Merge pull request #6 from apache/master
2014-09-17 15:14:04 -07:00
Nicholas Chammas 5044e4953a [SPARK-1455] [SPARK-3534] [Build] When possible, run SQL tests only.
If the only files changed are related to SQL, then only run the SQL tests.
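
As a rough sketch of that rule: the prefix list and the name `only_sql_changed` are assumptions for illustration, and the actual test runner script may decide this differently.

```python
# Hypothetical set of path prefixes that count as "related to SQL".
SQL_PREFIXES = ("sql/",)


def only_sql_changed(changed_files):
    """Return True if there is at least one changed file and all are SQL-related."""
    return bool(changed_files) and all(
        path.startswith(SQL_PREFIXES) for path in changed_files
    )


changed = ["sql/core/src/main/scala/Foo.scala", "sql/hive/pom.xml"]
if only_sql_changed(changed):
    print("Running SQL tests only")
else:
    print("Running the full test suite")
```

An empty change list falls through to the full suite, which is the safer default when the diff cannot be determined.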

This patch includes some cosmetic/maintainability refactoring. I would be more than happy to undo some of these changes if they are inappropriate.

We can accept this patch mostly as-is and address the immediate need documented in [SPARK-3534](https://issues.apache.org/jira/browse/SPARK-3534), or we can keep it open until a satisfactory solution along the lines [discussed here](https://issues.apache.org/jira/browse/SPARK-1455?focusedCommentId=14136424&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14136424) is reached.

Note: I had to hack this patch up to test it locally, so what I'm submitting here and what I tested are technically different.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2420 from nchammas/selective-testing and squashes the following commits:

db3fa2d [Nicholas Chammas] diff against master!
f9e23f6 [Nicholas Chammas] when possible, run SQL tests only
2014-09-17 12:44:44 -07:00
Prashant Sharma ecf0c02935 [SPARK-3433][BUILD] Fix for Mima false-positives with @DeveloperAPI and @Experimental annotations.
The reported false positive was actually due to the MiMa generator not picking up the new jars in the presence of old jars (theoretically this should not have happened). As a workaround, this runs them both separately and appends the results together.

Author: Prashant Sharma <prashant@apache.org>
Author: Prashant Sharma <prashant.s@imaginea.com>

Closes #2285 from ScrapCodes/mima-fix and squashes the following commits:

093c76f [Prashant Sharma] Update mima
59012a8 [Prashant Sharma] Update mima
35b6c71 [Prashant Sharma] SPARK-3433 Fix for Mima false-positives with @DeveloperAPI and @Experimental annotations.
2014-09-15 21:14:00 -07:00
Matthew Farrellee fe2b1d6a20 [SPARK-3425] do not set MaxPermSize for OpenJDK 1.8
Closes #2387

Author: Matthew Farrellee <matt@redhat.com>

Closes #2301 from mattf/SPARK-3425 and squashes the following commits:

20f3c09 [Matthew Farrellee] [SPARK-3425] do not set MaxPermSize for OpenJDK 1.8
2014-09-15 10:57:59 -07:00
Cheng Lian ce5cb32587 [Build] Removed -Phive-thriftserver since this profile has been removed
Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #2269 from liancheng/clean-run-tests-profile and squashes the following commits:

08617bd [Cheng Lian] Removed -Phive-thriftserver since this profile has been removed
2014-09-09 00:50:59 -07:00
Prashant Sharma e16a8e7db5 SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.
...

Tested! TBH, it isn't a great idea to have a directory with spaces in its name, because emacs doesn't like it, then hadoop doesn't like it, and so on...

Author: Prashant Sharma <prashant.s@imaginea.com>

Closes #2229 from ScrapCodes/SPARK-3337/quoting-shell-scripts and squashes the following commits:

d4ad660 [Prashant Sharma] SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.
2014-09-08 10:24:15 -07:00
Nicholas Chammas 9422c4ee0e [SPARK-3361] Expand PEP 8 checks to include EC2 script and Python examples
This PR resolves [SPARK-3361](https://issues.apache.org/jira/browse/SPARK-3361) by expanding the PEP 8 checks to cover the remaining Python code base:
* The EC2 script
* All Python / PySpark examples

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2297 from nchammas/pep8-rulez and squashes the following commits:

1e5ac9a [Nicholas Chammas] PEP 8 fixes to Python examples
c3dbeff [Nicholas Chammas] PEP 8 fixes to EC2 script
65ef6e8 [Nicholas Chammas] expand PEP 8 checks
2014-09-05 23:08:54 -07:00
Nicholas Chammas 19f61c1659 [Build] suppress curl/wget progress bars
In the Jenkins console output, `curl` gives us mountains of `#` symbols as it tries to show its download progress.

![noise from curl in Jenkins output](http://i.imgur.com/P2E7yUw.png)

I don't think this is useful so I've changed things to suppress these progress bars. If there is actually some use to this, feel free to reject this proposal.

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2279 from nchammas/trim-test-output and squashes the following commits:

14a720c [Nicholas Chammas] suppress curl/wget progress bars
2014-09-05 21:46:45 -07:00
Kousuke Saruta dc1ba9e9fc [SPARK-3378] [DOCS] Replace the word "SparkSQL" with right word "Spark SQL"
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #2251 from sarutak/SPARK-3378 and squashes the following commits:

0bfe234 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-3378
bb5938f [Kousuke Saruta] Replaced rest of "SparkSQL" with "Spark SQL"
6df66de [Kousuke Saruta] Replaced "SparkSQL" with "Spark SQL"
2014-09-04 15:06:08 -07:00
Sean Owen 32ec0a8cd4 SPARK-3331 [BUILD] PEP8 tests fail because they check unzipped py4j code
PEP8 tests run on files under "./python", but unzipped py4j code is found at "./python/build/py4j". Py4J code fails style checks and can fail ./dev/run-tests if this code is present locally.

Author: Sean Owen <sowen@cloudera.com>

Closes #2222 from srowen/SPARK-3331 and squashes the following commits:

34711ec [Sean Owen] Restrict lint check to pyspark/, since the local directory can contain unzipped py4j code in build/py4j
2014-09-02 10:30:26 -07:00
Nicholas Chammas c567a68a59 [Spark QA] only check code files for new classes
Look only at code files (`.py`, `.java`, and `.scala`) for new classes.
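
A minimal sketch of such a filter, assuming the list of changed paths is already available. The name `code_files` is hypothetical; only the three extensions come from the description above.

```python
import os

# Extensions named in this commit message.
CODE_EXTENSIONS = {".py", ".java", ".scala"}


def code_files(paths):
    """Keep only paths whose final extension marks them as source code."""
    return [p for p in paths if os.path.splitext(p)[1] in CODE_EXTENSIONS]


changed = ["core/src/Foo.scala", "README.md", "python/pyspark/rdd.py", "docs/Foo.java.md"]
print(code_files(changed))
```

Checking only the final extension means a documentation file such as `docs/Foo.java.md` is not mistaken for Java source.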

Should get rid of false alarms like [the one reported here](https://github.com/apache/spark/pull/2014#issuecomment-52912040).

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #2184 from nchammas/jenkins-ignore-noncode and squashes the following commits:

33786ac [Nicholas Chammas] break up long line
3f91a14 [Nicholas Chammas] rename array of source files
8b82a26 [Nicholas Chammas] [Spark QA] only check code files for new classes
2014-08-30 21:11:48 -07:00
Patrick Wendell a004a8d879 BUILD: Adding back CDH4 as per user requests 2014-08-29 22:24:35 -07:00
nchammas 3c517a812e [Spark QA] Link to console output on test time out
When tests time out we should link to the Jenkins console output for easy review. We already do this for when tests start or complete normally.

Here's [a recent example](https://github.com/apache/spark/pull/2109#issuecomment-53374032) of where this would be helpful.

Author: nchammas <nicholas.chammas@gmail.com>

Closes #2140 from nchammas/patch-1 and squashes the following commits:

3b26c8d [nchammas] [Spark QA] Link to console output on test time out
2014-08-28 18:08:28 -07:00
Matthew Farrellee 64d8ecbbe9 Add line continuation for script to work w/ py2.7.5
Error was -

$ SPARK_HOME=$PWD/dist ./dev/create-release/generate-changelist.py
  File "./dev/create-release/generate-changelist.py", line 128
    if day < SPARK_REPO_CHANGE_DATE1 or
                                      ^
SyntaxError: invalid syntax
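
For illustration, a condition that ends a line with `or` needs an explicit continuation. In the sketch below only `SPARK_REPO_CHANGE_DATE1` appears in the traceback; the second constant, both date values, and the function names are hypothetical.

```python
from datetime import date

# Placeholder values; the real constants in generate-changelist.py differ.
SPARK_REPO_CHANGE_DATE1 = date(2014, 1, 1)
SPARK_REPO_CHANGE_DATE2 = date(2014, 6, 1)  # hypothetical second bound


def outside_window(day):
    # A trailing backslash continues the statement onto the next line.
    if day < SPARK_REPO_CHANGE_DATE1 or \
            day > SPARK_REPO_CHANGE_DATE2:
        return True
    return False


def outside_window_parens(day):
    # Wrapping the condition in parentheses works without a backslash.
    return (day < SPARK_REPO_CHANGE_DATE1 or
            day > SPARK_REPO_CHANGE_DATE2)


print(outside_window(date(2013, 12, 31)))
```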

Author: Matthew Farrellee <matt@redhat.com>

Closes #2139 from mattf/master-fix-generate-changelist.py-0 and squashes the following commits:

6b3a900 [Matthew Farrellee] Add line continuation for script to work w/ py2.7.5
2014-08-27 15:50:30 -07:00
Patrick Wendell 8712653f11 HOTFIX: Don't build with YARN support for Mapr3 2014-08-27 15:41:09 -07:00
Cheng Lian cf46e72581 [SPARK-3126][SPARK-3127][SQL] Fixed HiveThriftServer2Suite
This PR fixes two issues:

1. Fixes wrongly quoted command line option in `HiveThriftServer2Suite` that makes test cases hang until timeout.
2. Asks `dev/run-tests` to run Spark SQL tests when `bin/spark-sql` and/or `sbin/start-thriftserver.sh` are modified.

Author: Cheng Lian <lian.cs.zju@gmail.com>

Closes #2036 from liancheng/fix-thriftserver-test and squashes the following commits:

f38c4eb [Cheng Lian] Fixed the same quotation issue in CliSuite
26b82a0 [Cheng Lian] Run SQL tests when dff contains bin/spark-sql and/or sbin/start-thriftserver.sh
a87f83d [Cheng Lian] Extended timeout
e5aa31a [Cheng Lian] Fixed metastore JDBC URI quotation
2014-08-20 12:57:39 -07:00
Patrick Wendell ceb19830b8 BUILD: Bump Hadoop versions in the release build.
Also, minor modifications to the MapR profile.
2014-08-20 12:19:19 -07:00
Patrick Wendell f2f26c2a1d SPARK-3092 [SQL]: Always include the thriftserver when -Phive is enabled.
Currently we have a separate profile called hive-thriftserver. I originally suggested this in case users did not want to bundle the thriftserver, but it has ultimately led to a lot of confusion. Since the thriftserver is only a few classes, I don't see a really good reason to isolate it from the rest of Hive. So let's go ahead and just include it in the same profile to simplify things.

This has been suggested in the past by liancheng.

Author: Patrick Wendell <pwendell@gmail.com>

Closes #2006 from pwendell/hiveserver and squashes the following commits:

742ea40 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into hiveserver
034ad47 [Patrick Wendell] SPARK-3092: Always include the thriftserver when -Phive is enabled.
2014-08-20 12:13:31 -07:00
Josh Rosen 1f1819b20f [SPARK-3114] [PySpark] Fix Python UDFs in Spark SQL.
This fixes SPARK-3114, an issue where we inadvertently broke Python UDFs in Spark SQL.

This PR modifies the test runner script to always run the PySpark SQL tests, irrespective of whether Spark SQL itself has been modified.  It also includes Davies' fix for the bug.

Closes #2026.

Author: Josh Rosen <joshrosen@apache.org>
Author: Davies Liu <davies.liu@gmail.com>

Closes #2027 from JoshRosen/pyspark-sql-fix and squashes the following commits:

9af2708 [Davies Liu] bugfix: disable compression of command
0d8d3a4 [Josh Rosen] Always run Python Spark SQL tests.
2014-08-18 20:42:19 -07:00
Patrick Wendell 5173f3c40f SPARK-2884: Create binary builds in parallel with release script. 2014-08-17 22:31:04 -07:00
Nicholas Chammas 4bdfaa16fc [SPARK-3076] [Jenkins] catch & report test timeouts
* Remove unused code to get jq
* Set timeout on tests and report gracefully on them
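
The general pattern of running a command under a timeout and reporting the outcome instead of hanging might look like the sketch below. The Jenkins script itself is not shown in this log and may work quite differently; `run_with_timeout` and its return values are assumptions for illustration.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run a command and report how it ended.

    Returns ("ok", return_code) if the command finished in time, or
    ("timeout", None) if it was still running when the limit was hit.
    subprocess.run kills the child before raising TimeoutExpired.
    """
    try:
        completed = subprocess.run(argv, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return ("timeout", None)
    return ("ok", completed.returncode)


status, code = run_with_timeout([sys.executable, "-c", "pass"], 30)
print(status, code)
```

Returning a status instead of raising lets the caller post a clear "tests timed out" message rather than a generic failure.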

Author: Nicholas Chammas <nicholas.chammas@gmail.com>

Closes #1974 from nchammas/master and squashes the following commits:

d1f1b6b [Nicholas Chammas] set timeout to realistic number
8b1ea41 [Nicholas Chammas] fix formatting
279526e [Nicholas Chammas] [SPARK-3076] catch & report test timeouts
2014-08-16 12:43:36 -07:00