Commit graph

71 commits

Author SHA1 Message Date
Ismaël Mejía 8a552bfc76 [SPARK-34778][BUILD] Upgrade to Avro 1.10.2
### What changes were proposed in this pull request?
Update the  Avro version to 1.10.2

### Why are the changes needed?
To stay up to date with upstream and catch compatibility issues with zstd

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Unit tests

Closes #31866 from iemejia/SPARK-27733-upgrade-avro-1.10.2.

Authored-by: Ismaël Mejía <iemejia@gmail.com>
Signed-off-by: Yuming Wang <yumwang@ebay.com>
2021-03-22 19:30:14 +08:00
Ismaël Mejía e9e81f798f [SPARK-27733][CORE] Upgrade Avro to version 1.10.1
### What changes were proposed in this pull request?

Update Avro dependency to version 1.10.1

### Why are the changes needed?

To catch up multiple improvements of Avro as well as fix security issues on transitive dependencies.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Since there were no API changes required we just run the tests

Closes #31232 from iemejia/SPARK-27733-avro-upgrade.

Authored-by: Ismaël Mejía <iemejia@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2021-01-20 15:42:27 -08:00
HyukjinKwon ad99f14b42 [SPARK-33109][BUILD][FOLLOW-UP] Remove the obsolete comment about bringing sbt-dependency-graph back
### What changes were proposed in this pull request?

This PR proposes to remove an obsolete comment about adding the `sbt-dependency-graph` back in SBT plugins.

### Why are the changes needed?

sbt-dependency-graph is now built-in from SBT 1.4.0, see https://github.com/sbt/sbt/releases/tag/v1.4.0.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manually tested `./build/sbt dependencyTree`.

Closes #30085 from HyukjinKwon/SPARK-33109.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-10-18 09:24:44 -07:00
Dongjoon Hyun dfb7790a9d [SPARK-33108][BUILD] Remove sbt-dependency-graph SBT plugin
### What changes were proposed in this pull request?

This PR aims to remove `sbt-dependency-graph` SBT plugin.

### Why are the changes needed?

`sbt-dependency-graph` officially doesn't support SBT 1.3.x and it's broken due to `NoSuchMethodError`. This cannot be fixed in `sbt-dependency-graph` side at SBT 1.3.x
- https://github.com/sbt/sbt-dependency-graph
    > Note: Under sbt >= 1.3.x some features might currently not work as expected or not at all (like dependencyLicenses).

```
$ build/sbt dependencyTree
Launching sbt from build/sbt-launch-1.3.13.jar
[info] welcome to sbt 1.3.13 (AdoptOpenJDK Java 1.8.0_252)
...
[error] java.lang.NoSuchMethodError: sbt.internal.LibraryManagement$.cachedUpdate(Lsbt/librarymanagement/DependencyResolution;Lsbt/librarymanagement/ModuleDescriptor;Lsbt/util/CacheStoreFactory;Ljava/lang/String;Lsbt/librarymanagement/UpdateConfiguration;Lscala/Function1;ZZZLsbt/librarymanagement/UnresolvedWarningConfiguration;Lsbt/librarymanagement/EvictionWarningOptions;ZLsbt/internal/librarymanagement/CompatibilityWarningOptions;Lsbt/util/Logger;)Lsbt/librarymanagement/UpdateReport;
```

**ALTERNATIVES**
- One alternative is `coursier`, but it requires `coursier-based sbt launcher` which is more intrusive.
  - https://get-coursier.io/docs/sbt-coursier.html#sbt-13x
    > you'll have to use the coursier-based sbt launcher, via its custom sbt-extras launcher for example.

- Another alternative is moving to `SBT 1.4.0` which uses `sbt-dependency-graph` as a built-in, but it's still new and will requires many change.

So, this PR aims to remove the broken plugin simply.

### Does this PR introduce _any_ user-facing change?

No. This is a dev-only change.

### How was this patch tested?

Manual.
```
$ build/sbt dependencyTree
...
[error] Not a valid command: dependencyTree
[error] Not a valid project ID: dependencyTree
[error] Not a valid key: dependencyTree (similar: dependencyOverrides, sbtDependency, dependencyResolution)
[error] dependencyTree
[error]               ^
```

Closes #29997 from dongjoon-hyun/remove_depedencyTree.

Lead-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-10-09 22:35:12 -07:00
Denis Pyshev 6daa2aeb01 [SPARK-21708][BUILD] Migrate build to sbt 1.x
### What changes were proposed in this pull request?

Migrate sbt-launcher URL to download one for sbt 1.x.
Update plugins versions where required by sbt update.
Change sbt version to be used to latest released at the moment, 1.3.13
Adjust build settings according to plugins and sbt changes.

### Why are the changes needed?

Migration to sbt 1.x:
1. enhances dev experience in development
2. updates build plugins to bring there new features/to fix bugs in them
3. enhances build performance on sbt side
4. eases movement to Scala 3 / dotty

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

All existing tests passed, both on Jenkins and via Github Actions, also manually for Scala 2.13 profile.

Closes #29286 from gemelen/feature/sbt-1.x.

Authored-by: Denis Pyshev <git@gemelen.net>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-10-07 15:28:00 -07:00
Dongjoon Hyun 1ac6bd9f79 [SPARK-29729][BUILD] Upgrade ASM to 7.2
### What changes were proposed in this pull request?

This PR aims to upgrade ASM to 7.2.
- https://issues.apache.org/jira/browse/XBEAN-322 (Upgrade to ASM 7.2)
- https://asm.ow2.io/versions.html

### Why are the changes needed?

This will bring the following patches.
- 317875: Infinite loop when parsing invalid method descriptor
- 317873: Add support for RET instruction in AdviceAdapter
- 317872: Throw an exception if visitFrame used incorrectly
- add support for Java 14

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins with the existing UTs.

Closes #26373 from dongjoon-hyun/SPARK-29729.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-03 10:42:38 -08:00
Dongjoon Hyun f23c5d7f67 [SPARK-29560][BUILD] Add typesafe bintray repo for sbt-mima-plugin
### What changes were proposed in this pull request?

This add `typesafe` bintray repo for `sbt-mima-plugin`.

### Why are the changes needed?

Since Oct 21, the following plugin causes [Jenkins failures](https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.4-test-sbt-hadoop-2.6/611/console
) due to the missing jar.

- `branch-2.4`: `sbt-mima-plugin:0.1.17` is missing.
- `master`: `sbt-mima-plugin:0.3.0` is missing.

These versions of `sbt-mima-plugin` seems to be removed from the old repo.

```
$ rm -rf ~/.ivy2/

$ build/sbt scalastyle test:scalastyle
...
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::
[warn] 	::          UNRESOLVED DEPENDENCIES         ::
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::
[warn] 	:: com.typesafe#sbt-mima-plugin;0.1.17: not found
[warn] 	::::::::::::::::::::::::::::::::::::::::::::::
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Check `GitHub Action` linter result. This PR should pass. Or, manual check.
(Note that Jenkins PR builder didn't fail until now due to the local cache.)

Closes #26217 from dongjoon-hyun/SPARK-29560.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-22 16:30:29 -07:00
Dongjoon Hyun 39d53d3e74 [SPARK-29470][BUILD] Update plugins to latest versions
### What changes were proposed in this pull request?

This PR updates plugins to latest versions.

### Why are the changes needed?

This brings bug fixes like the following.
- https://issues.apache.org/jira/projects/MCOMPILER/versions/12343484 (maven-compiler-plugin)
- https://issues.apache.org/jira/projects/MJAVADOC/versions/12345060 (maven-javadoc-plugin)
- https://issues.apache.org/jira/projects/MCHECKSTYLE/versions/12342397 (maven-checkstyle-plugin)
- https://checkstyle.sourceforge.io/releasenotes.html#Release_8.25 (checkstyle)
- https://checkstyle.sourceforge.io/releasenotes.html#Release_8.24 (checkstyle)

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins building and testing with the existing code.

Closes #26117 from dongjoon-hyun/SPARK-29470.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-15 11:55:52 -07:00
Fokko Driesprong d8dd5719b4 [SPARK-28713][BUILD] Bump checkstyle from 8.14 to 8.23
## What changes were proposed in this pull request?

Fixes a vulnerability from the GitHub Security Advisory Database:

_Moderate severity vulnerability that affects com.puppycrawl.tools:checkstyle_
Checkstyle prior to 8.18 loads external DTDs by default, which can potentially lead to denial of service attacks or the leaking of confidential information.

https://github.com/checkstyle/checkstyle/issues/6474

Affected versions: < 8.18

## How was this patch tested?

Ran checkstyle locally.

Closes #25432 from Fokko/SPARK-28713.

Authored-by: Fokko Driesprong <fokko@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-08-13 11:09:14 -07:00
Dongjoon Hyun 810be5dd20 [SPARK-27493][BUILD][FOLLOWUP] Upgrade ASM to 7.1 in plugins.sbt
## What changes were proposed in this pull request?

This is a follow-up of https://github.com/apache/spark/pull/24395. This PR update `plugins.sbt`, too.

## How was this patch tested?

Pass the Jenkins.

Closes #24444 from dongjoon-hyun/SPARK-ASM71-2.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2019-04-23 18:18:02 +00:00
Yuming Wang 0cbef34ede [MINOR][BUILD] Add ASF license header to plugins.sbt
## What changes were proposed in this pull request?

This PR add ASF license header to plugins.sbt, otherwise:
![image](https://user-images.githubusercontent.com/5399861/55273959-670b8800-530d-11e9-9b6f-214a3cde802e.png)

## How was this patch tested?
Warning disappears after adding ASF license header:
![image](https://user-images.githubusercontent.com/5399861/55273961-6c68d280-530d-11e9-9d15-5fb73a1b991e.png)

Closes #24248 from wangyum/plugins.sbt.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-03-30 12:47:02 -05:00
Dilip Biswal b15423361b [SPARK-27016][SQL][BUILD][FOLLOW-UP] Treat all antlr warnings as errors while generating the parser.
## What changes were proposed in this pull request?
Use the sbt maven plugin option`antlr4TreatWarningsAsErrors` to make sure the warnings are treated as errors while generating the parser. In the absence of it, we may inadvertently introduce problems while making grammar changes. Please refer to PR-23897 to know more about the context. We made a change in [pr-23925](https://github.com/apache/spark/pull/23925) which handled only the maven build.

In this PR, we handle the sbt build. I had submitted [PR-23](https://github.com/ihji/sbt-antlr4/pull/23) to enhance the sbt-antlr plugin to make is possible to pass the error on warning option.

## How was this patch tested?
Force an warning in the grammar file to check if the build fails. Then remove the warning to verify the build succeeds.

Closes #24060 from dilipbiswal/sbt_build_antlr.

Authored-by: Dilip Biswal <dbiswal@us.ibm.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-03-11 21:27:51 -07:00
Sean Owen 47851056c2 [SPARK-26124][BUILD] Update plugins to latest versions
## What changes were proposed in this pull request?

Update many plugins we use to the latest version, especially MiMa, which entails excluding some new errors on old changes.

## How was this patch tested?

N/A

Closes #23087 from srowen/Plugins.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2018-11-20 18:05:39 -06:00
hyukjinkwon 4a14dc0aff [SPARK-22269][BUILD] Run Java linter via SBT for Jenkins
## What changes were proposed in this pull request?

This PR proposes to check Java lint via SBT for Jenkins. It uses the SBT wrapper for checkstyle.

I manually tested. If we build the codes once, running this script takes 2 mins at maximum in my local:

Test codes:

```
Checkstyle failed at following occurrences:
[error] Checkstyle error found in /.../spark/core/src/test/java/test/org/apache/spark/JavaAPISuite.java:82: Line is longer than 100 characters (found 103).
[error] 1 issue(s) found in Checkstyle report: /.../spark/core/target/checkstyle-test-report.xml
[error] Checkstyle error found in /.../spark/sql/hive/src/test/java/org/apache/spark/sql/hive/JavaDataFrameSuite.java:84: Line is longer than 100 characters (found 115).
[error] 1 issue(s) found in Checkstyle report: /.../spark/sql/hive/target/checkstyle-test-report.xml
...
```

Main codes:

```
Checkstyle failed at following occurrences:
[error] Checkstyle error found in /.../spark/sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/InputPartition.java:39: Line is longer than 100 characters (found 104).
[error] Checkstyle error found in /.../spark/sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/InputPartitionReader.java:26: Line is longer than 100 characters (found 110).
[error] Checkstyle error found in /.../spark/sql/core/src/main/java/org/apache/spark/sql/sources/v2/reader/InputPartitionReader.java:30: Line is longer than 100 characters (found 104).
...
```

## How was this patch tested?

Manually tested. Jenkins build should test this.

Author: hyukjinkwon <gurwls223@apache.org>

Closes #21399 from HyukjinKwon/SPARK-22269.
2018-05-24 14:19:32 +08:00
pj.fanning 1ff41d8693 [SPARK-21708][BUILD] update some sbt plugins
## What changes were proposed in this pull request?

These are just some straightforward upgrades to use the latest versions of some sbt plugins that also support sbt 1.0.
The remaining sbt plugins that need upgrading will require bigger changes.

## How was this patch tested?

Tested sbt use manually.

Author: pj.fanning <pj.fanning@workday.com>

Closes #19609 from pjfanning/SPARK-21708.
2017-10-31 08:16:54 +00:00
hyukjinkwon 7f3c6ff4ff [SPARK-21903][BUILD] Upgrade scalastyle to 1.0.0.
## What changes were proposed in this pull request?

1.0.0 fixes an issue with import order, explicit type for public methods, line length limitation and comment validation:

```
[error] .../spark/repl/scala-2.11/src/main/scala/org/apache/spark/repl/Main.scala:50:16: Are you sure you want to println? If yes, wrap the code block with
[error]       // scalastyle:off println
[error]       println(...)
[error]       // scalastyle:on println
[error] .../spark/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala:49: File line length exceeds 100 characters
[error] .../spark/repl/scala-2.11/src/main/scala/org/apache/spark/repl/SparkILoop.scala:22:21: Are you sure you want to println? If yes, wrap the code block with
[error]       // scalastyle:off println
[error]       println(...)
[error]       // scalastyle:on println
[error] .../spark/streaming/src/test/java/org/apache/spark/streaming/JavaTestUtils.scala:35:6: Public method must have explicit type
[error] .../spark/streaming/src/test/java/org/apache/spark/streaming/JavaTestUtils.scala:51:6: Public method must have explicit type
[error] .../spark/streaming/src/test/java/org/apache/spark/streaming/JavaTestUtils.scala:93:15: Public method must have explicit type
[error] .../spark/streaming/src/test/java/org/apache/spark/streaming/JavaTestUtils.scala:98:15: Public method must have explicit type
[error] .../spark/streaming/src/test/java/org/apache/spark/streaming/JavaTestUtils.scala:47:2: Insert a space after the start of the comment
[error] .../spark/streaming/src/test/java/org/apache/spark/streaming/JavaTestUtils.scala:26:43: JavaDStream should come before JavaDStreamLike.
```

This PR also fixes the workaround added in SPARK-16877 for `org.scalastyle.scalariform.OverrideJavaChecker` feature, added from 0.9.0.

## How was this patch tested?

Manually tested.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19116 from HyukjinKwon/scalastyle-1.0.0.
2017-09-05 19:40:05 +09:00
Marcelo Vanzin 3f958a9992 [SPARK-21731][BUILD] Upgrade scalastyle to 0.9.
This version fixes a few issues in the import order checker; it provides
better error messages, and detects more improper ordering (thus the need
to change a lot of files in this patch). The main fix is that it correctly
complains about the order of packages vs. classes.

As part of the above, I moved some "SparkSession" import in ML examples
inside the "$example on$" blocks; that didn't seem consistent across
different source files to start with, and avoids having to add more on/off blocks
around specific imports.

The new scalastyle also seems to have a better header detector, so a few
license headers had to be updated to match the expected indentation.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #18943 from vanzin/SPARK-21731.
2017-08-15 13:59:00 -07:00
pj.fanning c0e333dbed [SPARK-21709][BUILD] sbt 0.13.16 and some plugin updates
## What changes were proposed in this pull request?

Update sbt version to 0.13.16. I think this is a useful stepping stone to getting to sbt 1.0.0.

## How was this patch tested?

Existing Build.

Author: pj.fanning <pj.fanning@workday.com>

Closes #18921 from pjfanning/SPARK-21709.
2017-08-12 20:01:20 +01:00
Weiqing Yang 9338aa4f89
[SPARK-18697][BUILD] Upgrade sbt plugins
## What changes were proposed in this pull request?

This PR is to upgrade sbt plugins. The following sbt plugins will be upgraded:
```
sbteclipse-plugin: 4.0.0 -> 5.0.1
sbt-mima-plugin: 0.1.11 -> 0.1.12
org.ow2.asm/asm: 5.0.3 -> 5.1
org.ow2.asm/asm-commons: 5.0.3 -> 5.1
```
## How was this patch tested?
Pass the Jenkins build.

Author: Weiqing Yang <yangweiqing001@gmail.com>

Closes #16223 from weiqingy/SPARK_18697.
2016-12-09 14:13:01 +08:00
Sean Owen 4cc8d8906d
Revert "[SPARK-18697][BUILD] Upgrade sbt plugins"
This reverts commit 7f31d378c4.
2016-12-07 09:03:20 +08:00
Weiqing Yang 7f31d378c4
[SPARK-18697][BUILD] Upgrade sbt plugins
## What changes were proposed in this pull request?

This PR is to upgrade sbt plugins. The following sbt plugins will be upgraded:
```
sbt-assembly: 0.11.2 -> 0.14.3
sbteclipse-plugin: 4.0.0 -> 5.0.1
sbt-mima-plugin: 0.1.11 -> 0.1.12
org.ow2.asm/asm: 5.0.3 -> 5.1
org.ow2.asm/asm-commons: 5.0.3 -> 5.1
```
All other plugins are up-to-date.

## How was this patch tested?
Pass the Jenkins build.

Author: Weiqing Yang <yangweiqing001@gmail.com>

Closes #16159 from weiqingy/SPARK-18697.
2016-12-07 06:11:39 +08:00
Josh Rosen b3b4b95422 [SPARK-18034] Upgrade to MiMa 0.1.11 to fix flakiness
We should upgrade to the latest release of MiMa (0.1.11) in order to include a fix for a bug which led to flakiness in the MiMa checks (https://github.com/typesafehub/migration-manager/issues/115).

Author: Josh Rosen <joshrosen@databricks.com>

Closes #15571 from JoshRosen/SPARK-18034.
2016-10-21 11:25:01 -07:00
Josh Rosen f74b77713e [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central
Spark's SBT build currently uses a fork of the sbt-pom-reader plugin but depends on that fork via a SBT subproject which is cloned from https://github.com/scrapcodes/sbt-pom-reader/tree/ignore_artifact_id. This unnecessarily slows down the initial build on fresh machines and is also risky because it risks a build breakage in case that GitHub repository ever changes or is deleted.

In order to address these issues, I have published a pre-built binary of our forked sbt-pom-reader plugin to Maven Central under the `org.spark-project` namespace and have updated Spark's build to use that artifact. This published artifact was built from https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark, which contains the contents of ScrapCodes's branch plus an additional patch to configure the build for artifact publication.

/cc srowen ScrapCodes for review.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #13564 from JoshRosen/use-published-fork-of-pom-reader.
2016-06-09 11:04:08 -07:00
Herman van Hovell 527499b624 [SPARK-15525][SQL][BUILD] Upgrade ANTLR4 SBT plugin
## What changes were proposed in this pull request?
The ANTLR4 SBT plugin has been moved from its own repo to one on bintray. The version was also changed from `0.7.10` to `0.7.11`. The latter actually broke our build (ihji has fixed this by also adding `0.7.10` and others to the bin-tray repo).

This PR upgrades the SBT-ANTLR4 plugin and ANTLR4 to their most recent versions (`0.7.11`/`4.5.3`). I have also removed a few obsolete build configurations.

## How was this patch tested?
Manually running SBT/Maven builds.

Author: Herman van Hovell <hvanhovell@questtec.nl>

Closes #13299 from hvanhovell/SPARK-15525.
2016-05-25 15:35:38 -07:00
Josh Rosen 6466d6c8a4 Revert "[SPARK-14683][DOCUMENTATION] Configure external links in ScalaDoc"
This reverts commit 3f49afee93.
2016-04-27 15:49:21 -07:00
杨博 (Yang Bo) 3f49afee93 [SPARK-14683][DOCUMENTATION] Configure external links in ScalaDoc
Right now Spark's Scaladoc does not link to Scala standard library and other dependencies. This would bother Spark starters because they may be not experienced Scala programmers.

This patch fixes these links in ScalaDoc.

Author: 杨博 (Yang Bo) <pop.atry@gmail.com>

Closes #12444 from Atry/patch-1.
2016-04-16 11:44:12 -07:00
Luciano Resende a172e11cba [SPARK-14366] Remove sbt-idea plugin
## What changes were proposed in this pull request?

Remove sbt-idea plugin as importing sbt project provides much better support.

Author: Luciano Resende <lresende@apache.org>

Closes #12151 from lresende/SPARK-14366.
2016-04-04 16:55:59 -07:00
Herman van Hovell a9b93e0739 [SPARK-14211][SQL] Remove ANTLR3 based parser
### What changes were proposed in this pull request?

This PR removes the ANTLR3 based parser, and moves the new ANTLR4 based parser into the `org.apache.spark.sql.catalyst.parser package`.

### How was this patch tested?

Existing unit tests.

cc rxin andrewor14 yhuai

Author: Herman van Hovell <hvanhovell@questtec.nl>

Closes #12071 from hvanhovell/SPARK-14211.
2016-03-31 09:25:09 -07:00
Herman van Hovell 600c0b69ca [SPARK-13713][SQL] Migrate parser from ANTLR3 to ANTLR4
### What changes were proposed in this pull request?
The current ANTLR3 parser is quite complex to maintain and suffers from code blow-ups. This PR introduces a new parser that is based on ANTLR4.

This parser is based on the [Presto's SQL parser](https://github.com/facebook/presto/blob/master/presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4). The current implementation can parse and create Catalyst and SQL plans. Large parts of the HiveQl DDL and some of the DML functionality is currently missing, the plan is to add this in follow-up PRs.

This PR is a work in progress, and work needs to be done in the following area's:

- [x] Error handling should be improved.
- [x] Documentation should be improved.
- [x] Multi-Insert needs to be tested.
- [ ] Naming and package locations.

### How was this patch tested?

Catalyst and SQL unit tests.

Author: Herman van Hovell <hvanhovell@questtec.nl>

Closes #11557 from hvanhovell/ngParser.
2016-03-28 12:31:12 -07:00
Dongjoon Hyun 473263f959 [SPARK-13834][BUILD] Update sbt and sbt plugins for 2.x.
## What changes were proposed in this pull request?

For 2.0.0, we had better make **sbt** and **sbt plugins** up-to-date. This PR checks the status of each plugins and bumps the followings.

* sbt: 0.13.9 --> 0.13.11
* sbteclipse-plugin: 2.2.0 --> 4.0.0
* sbt-dependency-graph: 0.7.4 --> 0.8.2
* sbt-mima-plugin: 0.1.6 --> 0.1.9
* sbt-revolver: 0.7.2 --> 0.8.0

All other plugins are up-to-date. (Note that `sbt-avro` seems to be change from 0.3.2 to 1.0.1, but it's not published in the repository.)

During upgrade, this PR also updated the following MiMa error. Note that the related excluding filter is already registered correctly. It seems due to the change of MiMa exception result.
```
 // SPARK-12896 Send only accumulator updates to driver, not TaskMetrics
 ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.Accumulable.this"),
-ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.Accumulator.this"),
+ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.Accumulator.this"),
```

## How was this patch tested?

Pass the Jenkins build.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #11669 from dongjoon-hyun/update_mima.
2016-03-13 18:47:04 -07:00
Josh Rosen 090d691323 [SPARK-4628][BUILD] Remove all non-Maven-Central repositories from build
This patch removes all non-Maven-central repositories from Spark's build, thereby avoiding any risk of future build-breaks due to us accidentally depending on an artifact which is not present in an immutable public Maven repository.

I tested this by running

```
build/mvn \
        -Phive \
        -Phive-thriftserver \
        -Pkinesis-asl \
        -Pspark-ganglia-lgpl \
        -Pyarn \
        dependency:go-offline
```

inside of a fresh Ubuntu Docker container with no Ivy or Maven caches (I did a similar test for SBT).

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10659 from JoshRosen/SPARK-4628.
2016-01-08 20:58:53 -08:00
Herman van Hovell 970635a9f8 [SPARK-12362][SQL][WIP] Inline Hive Parser
This PR inlines the Hive SQL parser in Spark SQL.

The previous (merged) incarnation of this PR passed all tests, but had and still has problems with the build. These problems are caused by a the fact that - for some reason - in some cases the ANTLR generated code is not included in the compilation fase.

This PR is a WIP and should not be merged until we have sorted out the build issues.

Author: Herman van Hovell <hvanhovell@questtec.nl>
Author: Nong Li <nong@databricks.com>
Author: Nong Li <nongli@gmail.com>

Closes #10525 from hvanhovell/SPARK-12362.
2016-01-01 23:22:50 -08:00
Reynold Xin 27af6157f9 Revert "[SPARK-12362][SQL][WIP] Inline Hive Parser"
This reverts commit b600bccf41 due to non-deterministic build breaks.
2015-12-30 00:08:44 -08:00
Nong Li b600bccf41 [SPARK-12362][SQL][WIP] Inline Hive Parser
This is a WIP. The PR has been taken over from nongli (see https://github.com/apache/spark/pull/10420). I have removed some additional dead code, and fixed a few issues which were caused by the fact that the inlined Hive parser is newer than the Hive parser we currently use in Spark.

I am submitting this PR in order to get some feedback and testing done. There is quite a bit of work to do:
- [ ] Get it to pass jenkins build/test.
- [ ] Aknowledge Hive-project for using their parser.
- [ ] Refactorings between HiveQl and the java classes.
  - [ ] Create our own ASTNode and integrate the current implicit extentions.
  - [ ] Move remaining ```SemanticAnalyzer``` and ```ParseUtils``` functionality to ```HiveQl```.
- [ ] Removing Hive dependencies from the parser. This will require some edits in the grammar files.
  - [ ] Introduce our own context which needs to contain a ```TokenRewriteStream```.
  - [ ] Add ```useSQL11ReservedKeywordsForIdentifier``` and ```allowQuotedId``` to the catalyst or sql configuration.
  - [ ] Remove ```HiveConf``` from grammar files &HiveQl, and pass in our own configuration.
- [ ] Moving the parser into sql/core.

cc nongli rxin

Author: Herman van Hovell <hvanhovell@questtec.nl>
Author: Nong Li <nong@databricks.com>
Author: Nong Li <nongli@gmail.com>

Closes #10509 from hvanhovell/SPARK-12362.
2015-12-29 18:47:41 -08:00
Josh Rosen b7204e1d41 [SPARK-12112][BUILD] Upgrade to SBT 0.13.9
We should upgrade to SBT 0.13.9, since this is a requirement in order to use SBT's new Maven-style resolution features (which will be done in a separate patch, because it's blocked by some binary compatibility issues in the POM reader plugin).

I also upgraded Scalastyle to version 0.8.0, which was necessary in order to fix a Scala 2.10.5 compatibility issue (see https://github.com/scalastyle/scalastyle/issues/156). The newer Scalastyle is slightly stricter about whitespace surrounding tokens, so I fixed the new style violations.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #10112 from JoshRosen/upgrade-to-sbt-0.13.9.
2015-12-05 08:15:30 +08:00
Ahir Reddy 9bbe33f318 [SPARK-10556] Remove explicit Scala version for sbt project build files
Previously, project/plugins.sbt explicitly set scalaVersion to 2.10.4. This can cause issues when using a version of sbt that is compiled against a different version of Scala (for example sbt 0.13.9 uses 2.10.5). Removing this explicit setting will cause build files to be compiled and run against the same version of Scala that sbt is compiled against.

Note that this only applies to the project build files (items in project/), it is distinct from the version of Scala we target for the actual spark compilation.

Author: Ahir Reddy <ahirreddy@gmail.com>

Closes #8709 from ahirreddy/sbt-scala-version-fix.
2015-09-11 13:06:14 +01:00
Imran Rashid a46594435e [SPARK-6782] add sbt-revolver plugin
to make it easier to start & stop http servers in sbt
https://issues.apache.org/jira/browse/SPARK-6782

Author: Imran Rashid <irashid@cloudera.com>

Closes #5426 from squito/SPARK-6782 and squashes the following commits:

dc4fb19 [Imran Rashid] add sbt-revolved plugin, to make it easier to start & stop http servers in sbt
2015-06-17 13:34:26 -07:00
Xiangrui Meng 2b258e1c07 [SPARK-5610] [DOC] update genjavadocSettings to use the patched version of genjavadoc
This PR updates `genjavadocSettings` to use a patched version of `genjavadoc-plugin` that hides package private classes/methods/interfaces in the generated Java API doc. The patch can be found at: https://github.com/typesafehub/genjavadoc/compare/master...mengxr:spark-1.4.

It wasn't merged into the main repo because there exist corner cases where a package private Scala class has to be a Java public class in order to compile. This doesn't seem to apply to the Spark codebase. So we release a patched version under `org.spark-project` and use it in the Spark build. brkyvz is publishing the artifacts to Maven Central.

Need more people audit the generated APIs and make sure we don't have false negatives.

Current listed classes under `org.apache.spark.rdd`:
![screen shot 2015-05-29 at 12 48 52 pm](https://cloud.githubusercontent.com/assets/829644/7891396/28fb9daa-0601-11e5-8ed8-4e9522d25a71.png)

After this PR:
![screen shot 2015-05-29 at 12 48 23 pm](https://cloud.githubusercontent.com/assets/829644/7891408/408e210e-0601-11e5-975c-ff0a02eb5c91.png)

cc: pwendell rxin srowen

Author: Xiangrui Meng <meng@databricks.com>

Closes #6506 from mengxr/SPARK-5610 and squashes the following commits:

489c785 [Xiangrui Meng] update genjavadocSettings to use the patched version of genjavadoc
2015-05-30 17:21:41 -07:00
Reynold Xin 1232215914 [SPARK-6750] Upgrade ScalaStyle to 0.7.
0.7 fixes a bug that's pretty useful, i.e. inline functions no longer return explicit type definition.

Author: Reynold Xin <rxin@databricks.com>

Closes #5399 from rxin/style0.7 and squashes the following commits:

54c41b2 [Reynold Xin] Actually update the version.
09c759c [Reynold Xin] [SPARK-6750] Upgrade ScalaStyle to 0.7.
2015-04-07 12:37:33 -07:00
GuoQiang Li 89e8a5d8ba [SPARK-3997][Build]scalastyle should output the error location
Author: GuoQiang Li <witgo@qq.com>

Closes #2846 from witgo/SPARK-3997 and squashes the following commits:

d6a57f8 [GuoQiang Li] scalastyle should output the error location
2014-10-26 16:24:50 -07:00
Prashant Sharma d6a3025392 [BUILD] Fixed resolver for scalastyle plugin and upgrade sbt version.
Author: Prashant Sharma <prashant.s@imaginea.com>

Closes #2877 from ScrapCodes/scalastyle-fix and squashes the following commits:

a17b9fe [Prashant Sharma] [BUILD] Fixed resolver for scalastyle plugin.
2014-10-22 22:13:11 -07:00
prudhvi 044583a241 [Core] Upgrading ScalaStyle version to 0.5 and removing SparkSpaceAfterCommentStartChecker.
Author: prudhvi <prudhvi953@gmail.com>

Closes #2799 from prudhvije/ScalaStyle/space-after-comment-start and squashes the following commits:

fc263a1 [prudhvi] [Core] Using scalastyle to check the space after comment start
2014-10-16 02:05:44 -04:00
Marcelo Vanzin c9f743957f [SPARK-2848] Shade Guava in uber-jars.
For further discussion, please check the JIRA entry.

This change moves Guava classes to a different package so that they don't conflict with the user-provided Guava (or the Hadoop-provided one). Since one class (Optional) was exposed through Spark's public API, that class was forked from Guava at the current dependency version (14.0.1) so that it can be kept going forward (until the API is cleaned).

Note this change has a few implications:
- *all* classes in the final jars will reference the relocated classes. If Hadoop classes are included (i.e. "-Phadoop-provided" is not activated), those will also reference the Guava 14 classes (instead of the Guava 11 classes from the Hadoop classpath).
- if the Guava version in Spark is ever changed, the new Guava will still reference the forked Optional class; this may or may not be a problem, but in the long term it's better to think about removing Optional from the public API.

For the end user, there are two visible implications:

- Guava is not provided as a transitive dependency anymore (since it's "provided" in Spark)
- At runtime, unless they provide their own, they'll either have no Guava or Hadoop's version of Guava (11), depending on how they set up their classpath.

Note that this patch does not change the sbt deliverables; those will still contain guava in its original package, and provide guava as a compile-time dependency. This assumes that maven is the canonical build, and sbt-built artifacts are not (officially) published.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #1813 from vanzin/SPARK-2848 and squashes the following commits:

9bdffb0 [Marcelo Vanzin] Undo sbt build changes.
819b445 [Marcelo Vanzin] Review feedback.
05e0a3d [Marcelo Vanzin] Merge branch 'master' into SPARK-2848
fef4370 [Marcelo Vanzin] Unfork Optional.java.
d3ea8e1 [Marcelo Vanzin] Exclude asm classes from final jar.
637189b [Marcelo Vanzin] Add hacky filter to prefer Spark's copy of Optional.
2fec990 [Marcelo Vanzin] Shade Guava in the sbt build.
616998e [Marcelo Vanzin] Shade Guava in the maven build, fork Guava's Optional.java.
2014-08-20 16:23:10 -07:00
Prashant Sharma 32096c2aed SPARK-2899 Doc generation is back to working in new SBT Build.
The reason for this bug was introduciton of OldDeps project. It had to be excluded to prevent unidocs from trying to put it on "docs compile" classpath.

Author: Prashant Sharma <prashant.s@imaginea.com>

Closes #1830 from ScrapCodes/doc-fix and squashes the following commits:

e5d52e6 [Prashant Sharma] SPARK-2899 Doc generation is back to working in new SBT Build.
2014-08-07 16:24:22 -07:00
Hari Shreedharan 800ecff4b1 [STREAMING] SPARK-1729. Make Flume pull data from source, rather than the current pu...
...sh model

Currently Spark uses Flume's internal Avro Protocol to ingest data from Flume. If the executor running the
receiver fails, it currently has to be restarted on the same node to be able to receive data.

This commit adds a new Sink which can be deployed to a Flume agent. This sink can be polled by a new
DStream that is also included in this commit. This model ensures that data can be pulled into Spark from
Flume even if the receiver is restarted on a new node. This also allows the receiver to receive data on
multiple threads for better performance.

Author: Hari Shreedharan <harishreedharan@gmail.com>
Author: Hari Shreedharan <hshreedharan@apache.org>
Author: Tathagata Das <tathagata.das1565@gmail.com>
Author: harishreedharan <hshreedharan@cloudera.com>

Closes #807 from harishreedharan/master and squashes the following commits:

e7f70a3 [Hari Shreedharan] Merge remote-tracking branch 'asf-git/master'
96cfb6f [Hari Shreedharan] Merge remote-tracking branch 'asf/master'
e48d785 [Hari Shreedharan] Documenting flume-sink being ignored for Mima checks.
5f212ce [Hari Shreedharan] Ignore Spark Sink from mima.
981bf62 [Hari Shreedharan] Merge remote-tracking branch 'asf/master'
7a1bc6e [Hari Shreedharan] Fix SparkBuild.scala
a082eb3 [Hari Shreedharan] Merge remote-tracking branch 'asf/master'
1f47364 [Hari Shreedharan] Minor fixes.
73d6f6d [Hari Shreedharan] Cleaned up tests a bit. Added some docs in multiple places.
65b76b4 [Hari Shreedharan] Fixing the unit test.
e59cc20 [Hari Shreedharan] Use SparkFlumeEvent instead of the new type. Also, Flume Polling Receiver now uses the store(ArrayBuffer) method.
f3c99d1 [Hari Shreedharan] Merge remote-tracking branch 'asf/master'
3572180 [Hari Shreedharan] Adding a license header, making Jenkins happy.
799509f [Hari Shreedharan] Fix a compile issue.
3c5194c [Hari Shreedharan] Merge remote-tracking branch 'asf/master'
d248d22 [harishreedharan] Merge pull request #1 from tdas/flume-polling
10b6214 [Tathagata Das] Changed public API, changed sink package, and added java unit test to make sure Java API is callable from Java.
1edc806 [Hari Shreedharan] SPARK-1729. Update logging in Spark Sink.
8c00289 [Hari Shreedharan] More debug messages
393bd94 [Hari Shreedharan] SPARK-1729. Use LinkedBlockingQueue instead of ArrayBuffer to keep track of connections.
120e2a1 [Hari Shreedharan] SPARK-1729. Some test changes and changes to utils classes.
9fd0da7 [Hari Shreedharan] SPARK-1729. Use foreach instead of map for all Options.
8136aa6 [Hari Shreedharan] Adding TransactionProcessor to map on returning batch of data
86aa274 [Hari Shreedharan] Merge remote-tracking branch 'asf/master'
205034d [Hari Shreedharan] Merging master in
4b0c7fc [Hari Shreedharan] FLUME-1729. New Flume-Spark integration.
bda01fc [Hari Shreedharan] FLUME-1729. Flume-Spark integration.
0d69604 [Hari Shreedharan] FLUME-1729. Better Flume-Spark integration.
3c23c18 [Hari Shreedharan] SPARK-1729. New Spark-Flume integration.
70bcc2a [Hari Shreedharan] SPARK-1729. New Flume-Spark integration.
d6fa3aa [Hari Shreedharan] SPARK-1729. New Flume-Spark integration.
e7da512 [Hari Shreedharan] SPARK-1729. Fixing import order
9741683 [Hari Shreedharan] SPARK-1729. Fixes based on review.
c604a3c [Hari Shreedharan] SPARK-1729. Optimize imports.
0f10788 [Hari Shreedharan] SPARK-1729. Make Flume pull data from source, rather than the current push model
87775aa [Hari Shreedharan] SPARK-1729. Make Flume pull data from source, rather than the current push model
8df37e4 [Hari Shreedharan] SPARK-1729. Make Flume pull data from source, rather than the current push model
03d6c1c [Hari Shreedharan] SPARK-1729. Make Flume pull data from source, rather than the current push model
08176ad [Hari Shreedharan] SPARK-1729. Make Flume pull data from source, rather than the current push model
d24d9d4 [Hari Shreedharan] SPARK-1729. Make Flume pull data from source, rather than the current push model
6d6776a [Hari Shreedharan] SPARK-1729. Make Flume pull data from source, rather than the current push model
2014-07-29 11:11:29 -07:00
DB Tsai ac9cdc116e [SPARK-2413] Upgrade junit_xml_listener to 0.5.1
which fixes the following issues

1) fix the class name to be fully qualified classpath
2) make sure the the reporting time is in second not in miliseond, which causing JUnit HTML to report incorrect number
3) make sure the duration of the tests are accumulative.

Author: DB Tsai <dbtsai@alpinenow.com>

Closes #1333 from dbtsai/dbtsai-junit and squashes the following commits:

bbeac4b [DB Tsai] Upgrade junit_xml_listener to 0.5.1 which fixes the following issues
2014-07-08 17:50:36 -07:00
Kalpit Shah 5473aa7c02 sbt 0.13.X should be using sbt-assembly 0.11.X
https://github.com/sbt/sbt-assembly/blob/master/README.md

Author: Kalpit Shah <shahkalpit84@gmail.com>

Closes #555 from kalpit/upgrade/sbtassembly and squashes the following commits:

1fa7324 [Kalpit Shah] sbt 0.13.X should be using sbt-assembly 0.11.X
2014-06-05 13:07:26 -07:00
Matei Zaharia fc78384704 [SPARK-1439, SPARK-1440] Generate unified Scaladoc across projects and Javadocs
I used the sbt-unidoc plugin (https://github.com/sbt/sbt-unidoc) to create a unified Scaladoc of our public packages, and generate Javadocs as well. One limitation is that I haven't found an easy way to exclude packages in the Javadoc; there is a SBT task that identifies Java sources to run javadoc on, but it's been very difficult to modify it from outside to change what is set in the unidoc package. Some SBT-savvy people should help with this. The Javadoc site also lacks package-level descriptions and things like that, so we may want to look into that. We may decide not to post these right now if it's too limited compared to the Scala one.

Example of the built doc site: http://people.csail.mit.edu/matei/spark-unified-docs/

Author: Matei Zaharia <matei@databricks.com>

This patch had conflicts when merged, resolved by
Committer: Patrick Wendell <pwendell@gmail.com>

Closes #457 from mateiz/better-docs and squashes the following commits:

a63d4a3 [Matei Zaharia] Skip Java/Scala API docs for Python package
5ea1f43 [Matei Zaharia] Fix links to Java classes in Java guide, fix some JS for scrolling to anchors on page load
f05abc0 [Matei Zaharia] Don't include java.lang package names
995e992 [Matei Zaharia] Skip internal packages and class names with $ in JavaDoc
a14a93c [Matei Zaharia] typo
76ce64d [Matei Zaharia] Add groups to Javadoc index page, and a first package-info.java
ed6f994 [Matei Zaharia] Generate JavaDoc as well, add titles, update doc site to use unified docs
acb993d [Matei Zaharia] Add Unidoc plugin for the projects we want Unidoced
2014-04-21 21:57:40 -07:00
Xiangrui Meng 8517911efb [FIX] update sbt-idea to version 1.6.0
I saw `No "scala-library*.jar" in Scala compiler library` error in IDEA. It seems upgrading `sbt-idea` to 1.6.0 fixed the problem.

Author: Xiangrui Meng <meng@databricks.com>

Closes #419 from mengxr/idea-plugin and squashes the following commits:

fb3c35f [Xiangrui Meng] update sbt-idea to version 1.6.0
2014-04-15 19:37:32 -07:00
Mark Hamstra 764353d2c5 [SPARK-1342] Scala 2.10.4
Just a Scala version increment

Author: Mark Hamstra <markhamstra@gmail.com>

Closes #259 from markhamstra/scala-2.10.4 and squashes the following commits:

fbec547 [Mark Hamstra] [SPARK-1342] Bumped Scala version to 2.10.4
2014-04-01 18:35:50 -07:00