1030 commits

Author SHA1 Message Date
Gengliang Wang 4bd358474b Preparing development version 3.2.1-SNAPSHOT 2021-09-28 10:53:42 +00:00
Gengliang Wang dde73e2e1c Preparing Spark release v3.2.0-rc6 2021-09-28 10:53:35 +00:00
Gengliang Wang 0c57bb8f7f Preparing development version 3.2.1-SNAPSHOT 2021-09-27 08:24:50 +00:00
Gengliang Wang 49aea14c5a Preparing Spark release v3.2.0-rc5 2021-09-27 08:24:44 +00:00
Chao Sun 228d12e30e [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Move spark.yarn.isHadoopProvided to parent pom
### What changes were proposed in this pull request?

Move `spark.yarn.isHadoopProvided` to the Spark parent pom, so that under `resource-managers/yarn` we can make `hadoop-3.2` the default profile.

### Why are the changes needed?

Currently under `resource-managers/yarn` there are 3 Maven profiles: `hadoop-provided`, `hadoop-2.7`, and `hadoop-3.2`, of which `hadoop-3.2` is activated by default (via `activeByDefault`). The default activation, however, doesn't work when other profiles are activated explicitly. Specifically, if users build Spark with `hadoop-provided`, Maven will fail because it can't find the Hadoop 3.2 related dependencies, which are defined in the `hadoop-3.2` profile section.

To fix the issue, this proposes to move the `hadoop-provided` section to the parent pom. Currently this is only used to define a property `spark.yarn.isHadoopProvided`, and it shouldn't matter where we define it.
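
The activation rule behind this failure can be modeled in a few lines. This is a minimal sketch, not Maven's implementation: the class `ProfileActivation` and its method are hypothetical names, and the only rule it assumes is the documented one that an `activeByDefault` profile is switched off as soon as another profile in the same POM is activated explicitly.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of Maven's activeByDefault rule for a single POM.
final class ProfileActivation {

    // profiles: profile id -> whether it is marked activeByDefault in this POM.
    // requested: profile ids passed on the command line with -P.
    static Set<String> activeProfiles(Map<String, Boolean> profiles, Set<String> requested) {
        Set<String> active = new LinkedHashSet<>();
        for (String id : profiles.keySet()) {
            if (requested.contains(id)) {
                active.add(id);
            }
        }
        // Defaults apply only when no profile of this POM was activated explicitly.
        if (active.isEmpty()) {
            for (Map.Entry<String, Boolean> entry : profiles.entrySet()) {
                if (entry.getValue()) {
                    active.add(entry.getKey());
                }
            }
        }
        return active;
    }

    public static void main(String[] args) {
        // Module POM before the change: hadoop-provided is defined next to the default.
        Map<String, Boolean> before = new LinkedHashMap<>();
        before.put("hadoop-provided", false);
        before.put("hadoop-2.7", false);
        before.put("hadoop-3.2", true);

        // Module POM after the change: hadoop-provided lives in the parent POM only.
        Map<String, Boolean> after = new LinkedHashMap<>();
        after.put("hadoop-2.7", false);
        after.put("hadoop-3.2", true);

        Set<String> requested = new LinkedHashSet<>(Arrays.asList("yarn", "hadoop-provided"));
        System.out.println("before: " + activeProfiles(before, requested));
        System.out.println("after:  " + activeProfiles(after, requested));
    }
}
```

In this model, requesting `hadoop-provided` suppresses the `hadoop-3.2` default while both are defined in the module POM; once `hadoop-provided` is defined only in the parent, the module's default stays active.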

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tested via running the command:
```
build/mvn clean package -DskipTests -B -Pmesos -Pyarn -Pkubernetes -Pscala-2.12 -Phadoop-provided
```
which was failing before this PR but is succeeding with it.

Also checked active profiles with the command:
```
build/mvn -Pyarn -Phadoop-provided help:active-profiles
```
and it shows that `hadoop-3.2` is active for `spark-yarn` module now.

Closes #34110 from sunchao/SPARK-36835-followup2.

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit f9efdeea8c)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
2021-09-27 15:17:28 +08:00
Gengliang Wang 2348cce37e Preparing development version 3.2.1-SNAPSHOT 2021-09-26 12:28:46 +00:00
Gengliang Wang 2ed8c08c5b Preparing Spark release v3.2.0-rc5 2021-09-26 12:28:40 +00:00
Chao Sun 540e45c3cc [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom
### What changes were proposed in this pull request?

Fix an issue where Maven may get stuck in an infinite loop when building Spark with the Hadoop 2.7 profile.

### Why are the changes needed?

After re-enabling `createDependencyReducedPom` for `maven-shade-plugin`, the Spark build stopped working for the Hadoop 2.7 profile and gets stuck in an infinite loop, likely due to a Maven shade plugin bug similar to https://issues.apache.org/jira/browse/MSHADE-148. This seems to be caused by the fact that, under the `hadoop-2.7` profile, the variables `hadoop-client-runtime.artifact` and `hadoop-client-api.artifact` are both `hadoop-client`, which triggers the issue.

As a workaround, this changes `hadoop-client-runtime.artifact` to be `hadoop-yarn-api` when using `hadoop-2.7`. Since `hadoop-yarn-api` is a dependency of `hadoop-client`, this essentially moves the former to the same level as the latter. It should have no effect as both are dependencies of Spark.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

N/A

Closes #34100 from sunchao/SPARK-36835-followup.

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit 937a74e6e7)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
2021-09-26 13:39:50 +08:00
Gengliang Wang da722d43cb Preparing development version 3.2.1-SNAPSHOT 2021-09-24 10:03:23 +00:00
Gengliang Wang 9e35703211 Preparing Spark release v3.2.0-rc5 2021-09-24 10:03:16 +00:00
Chao Sun 09283d3210 [SPARK-36835][BUILD] Enable createDependencyReducedPom for Maven shaded plugin
### What changes were proposed in this pull request?

Enable `createDependencyReducedPom` for Spark's Maven shade plugin so that the effective pom no longer contains shaded artifacts such as `org.eclipse.jetty`.

### Why are the changes needed?

At the moment, the effective pom leaks the transitive dependencies of those shaded artifacts to downstream apps, which can cause issues.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

I manually tested and the `core/dependency-reduced-pom.xml` no longer contains dependencies such as `jetty-XX`.
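
A manual check like this can also be scripted. The sketch below is an illustration under assumptions, not part of the patch: the class `ReducedPomCheck`, its method names, and the sample POM string are hypothetical. It lists every `<dependency>` whose `groupId` starts with a given prefix, so an empty result for `org.eclipse.jetty` would correspond to the observation above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: scans POM text for dependencies of a given group.
final class ReducedPomCheck {

    // Returns the artifactId of every <dependency> whose groupId starts with groupPrefix.
    // Entries under <dependencyManagement> are included too; this sketch does not filter them.
    static List<String> dependenciesWithGroup(String pomXml, String groupPrefix) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(pomXml)));
        NodeList deps = doc.getElementsByTagName("dependency");
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < deps.getLength(); i++) {
            Element dep = (Element) deps.item(i);
            if (firstText(dep, "groupId").startsWith(groupPrefix)) {
                hits.add(firstText(dep, "artifactId"));
            }
        }
        return hits;
    }

    // Text of the first descendant with the given tag, or "" if there is none.
    private static String firstText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample input; a real check would read core/dependency-reduced-pom.xml.
        String samplePom =
                "<project><dependencies>"
              + "<dependency><groupId>org.eclipse.jetty</groupId>"
              + "<artifactId>jetty-server</artifactId></dependency>"
              + "<dependency><groupId>org.apache.commons</groupId>"
              + "<artifactId>commons-lang3</artifactId></dependency>"
              + "</dependencies></project>";
        System.out.println(dependenciesWithGroup(samplePom, "org.eclipse.jetty"));
    }
}
```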

Closes #34085 from sunchao/SPARK-36835.

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit ed88e610f0)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
2021-09-24 10:16:46 +08:00
Gengliang Wang 0fb7127f85 Preparing development version 3.2.1-SNAPSHOT 2021-09-23 08:46:28 +00:00
Gengliang Wang b609f2fe0c Preparing Spark release v3.2.0-rc4 2021-09-23 08:46:22 +00:00
Gengliang Wang affd7a4d47 [SPARK-36670][FOLLOWUP][TEST] Remove brotli-codec dependency
### What changes were proposed in this pull request?

Remove `com.github.rdblue:brotli-codec:0.1.1` dependency.

### Why are the changes needed?

As Stephen Coy pointed out on the dev list, we should not depend on `com.github.rdblue:brotli-codec:0.1.1`, which is not available on Maven Central. Removing it avoids possible artifact changes on `Jitpack.io`.
Also, the dependency is for tests only. I suggest that we remove it now to unblock the 3.2.0 release ASAP.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

GA tests.

Closes #34059 from gengliangwang/removeDeps.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit ba5708d944)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-09-21 10:57:34 -07:00
Gengliang Wang b0249851f6 Preparing development version 3.2.1-SNAPSHOT 2021-09-18 11:30:12 +00:00
Gengliang Wang 96044e9735 Preparing Spark release v3.2.0-rc3 2021-09-18 11:30:06 +00:00
Dongjoon Hyun fbd24621ce [SPARK-36759][BUILD][FOLLOWUP] Update version in scala-2.12 profile and doc
### What changes were proposed in this pull request?

This is a follow-up to fix leftovers from switching the Scala version.

### Why are the changes needed?

This should be consistent.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

This is not tested by UT. We need to check manually. There is no more `2.12.14`.
```
$ git grep 2.12.14
R/pkg/tests/fulltests/test_sparkSQL.R:               c(as.Date("2012-12-14"), as.Date("2013-12-15"), as.Date("2014-12-16")))
data/mllib/ridge-data/lpsa.data:3.5307626,0.987291634724086 -0.36279314978779 -0.922212414640967 0.232904453212813 -0.522940888712441 1.79270085261407 0.342627053981254 1.26288870310799
sql/hive/src/test/resources/data/files/over10k:-3|454|65705|4294967468|62.12|14.32|true|mike white|2013-03-01 09:11:58.703087|40.18|joggying
```
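
The three remaining hits in the `git grep` output above are not version strings: the pattern is interpreted as a regular expression, so each unescaped `.` matches any character. The sketch below reproduces this with `java.util.regex`; it assumes that, for this particular pattern, Java's regex behaves like grep's default syntax. `VersionGrep` and its methods are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper contrasting regex matching with literal matching.
final class VersionGrep {

    // Treats the version as a regular expression, so '.' matches any character.
    static boolean matchesAsRegex(String line, String version) {
        return Pattern.compile(version).matcher(line).find();
    }

    // Quotes the version first, so '.' only matches a literal dot.
    static boolean matchesLiterally(String line, String version) {
        return Pattern.compile(Pattern.quote(version)).matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "c(as.Date(\"2012-12-14\"), as.Date(\"2013-12-15\"))";
        System.out.println(matchesAsRegex(line, "2.12.14"));   // true: "2-12-14" matches
        System.out.println(matchesLiterally(line, "2.12.14")); // false: no literal "2.12.14"
    }
}
```

On the command line, `git grep -F 2.12.14` searches for the fixed string instead and would skip such lines.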

Closes #34020 from dongjoon-hyun/SPARK-36759-2.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit adbea252db)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-09-16 05:11:05 -07:00
Dongjoon Hyun 63b8417794 [SPARK-36732][SQL][BUILD] Upgrade ORC to 1.6.11
### What changes were proposed in this pull request?

This PR aims to upgrade Apache ORC to 1.6.11 to bring the latest bug fixes.

### Why are the changes needed?

Apache ORC 1.6.11 has the following fixes.
- https://issues.apache.org/jira/projects/ORC/versions/12350499

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #33971 from dongjoon-hyun/SPARK-36732.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit c217797297)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-09-15 23:36:36 -07:00
Dongjoon Hyun 2067661869 [SPARK-36759][BUILD] Upgrade Scala to 2.12.15
### What changes were proposed in this pull request?

This PR aims to upgrade Scala to 2.12.15 to support Java 17/18 better.

### Why are the changes needed?

Scala 2.12.15 improves compatibility with JDK 17 and 18:

https://github.com/scala/scala/releases/tag/v2.12.15

- Avoids IllegalArgumentException in JDK 17+ for lambda deserialization
- Upgrades to ASM 9.2, for JDK 18 support in optimizer

### Does this PR introduce _any_ user-facing change?

Yes, this is a Scala version change.

### How was this patch tested?

Pass the CIs

Closes #33999 from dongjoon-hyun/SPARK-36759.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 16f1f71ba5)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-09-15 13:43:36 -07:00
Chao Sun a7dc8242ea [SPARK-36726] Upgrade Parquet to 1.12.1
### What changes were proposed in this pull request?

Upgrade Apache Parquet to 1.12.1

### Why are the changes needed?

Parquet 1.12.1 contains the following bug fixes:
- PARQUET-2064: Make Range public accessible in RowRanges
- PARQUET-2022: ZstdDecompressorStream should close `zstdInputStream`
- PARQUET-2052: Integer overflow when writing huge binary using dictionary encoding
- PARQUET-1633: Fix integer overflow
- PARQUET-2054: fix TCP leaking when calling ParquetFileWriter.appendFile
- PARQUET-2072: Do Not Determine Both Min/Max for Binary Stats
- PARQUET-2073: Fix estimate remaining row count in ColumnWriteStoreBase
- PARQUET-2078: Failed to read parquet file after writing with the same

In particular PARQUET-2078 is a blocker for the upcoming Apache Spark 3.2.0 release.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing tests + a new test for the issue in SPARK-36696

Closes #33969 from sunchao/upgrade-parquet-12.1.

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
(cherry picked from commit a927b0836b)
Signed-off-by: DB Tsai <d_tsai@apple.com>
2021-09-15 19:17:49 +00:00
Lukas Rytz 2e7583799e [SPARK-36712][BUILD] Make scala-parallel-collections in 2.13 POM a direct dependency (not in maven profile)
As [reported on `dev@spark.apache.org`](https://lists.apache.org/thread.html/r84cff66217de438f1389899e6d6891b573780159cd45463acf3657aa%40%3Cdev.spark.apache.org%3E), the POMs published when building with Scala 2.13 have the `scala-parallel-collections` dependency only in the `scala-2.13` profile of the pom.

### What changes were proposed in this pull request?

This PR proposes working around this by un-commenting the `scala-parallel-collections` dependency when switching to 2.13 using the `change-scala-version.sh` script.

I included an upgrade to scala-parallel-collections version 1.0.3. The changes compared to 0.2.0 are minor:
  - removed OSGi metadata
  - renamed some internal inner classes
  - added `Automatic-Module-Name`

### Why are the changes needed?

According to the posts, this solves issues for developers that write unit tests for their applications.

Stephen Coy suggested using the https://www.mojohaus.org/flatten-maven-plugin. While this sounds like a more principled solution, it is possibly too risky to do at this specific point in time?

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Locally

Closes #33948 from lrytz/parCollDep.

Authored-by: Lukas Rytz <lukas.rytz@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit 1a62e6a2c1)
Signed-off-by: Sean Owen <srowen@gmail.com>
2021-09-13 11:06:58 -05:00
Kousuke Saruta dad566c1f2 [SPARK-36729][BUILD] Upgrade Netty from 4.1.63 to 4.1.68
### What changes were proposed in this pull request?

This PR upgrades Netty from `4.1.63` to `4.1.68`.

All the changes from `4.1.64` to `4.1.68` are as follows.

* 4.1.64 and 4.1.65
  * https://netty.io/news/2021/05/19/4-1-65-Final.html
* 4.1.66
  * https://netty.io/news/2021/07/16/4-1-66-Final.html
* 4.1.67
  * https://netty.io/news/2021/08/16/4-1-67-Final.html
* 4.1.68
  * https://netty.io/news/2021/09/09/4-1-68-Final.html

### Why are the changes needed?

Recently Netty `4.1.68` was released, which includes official M1 Mac support.
* Add support for mac m1
  * https://github.com/netty/netty/pull/11666

`4.1.65` also includes a critical bug fix for an issue that might affect Spark.
* JNI classloader deadlock with latest JDK version
  * https://github.com/netty/netty/issues/11209

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

CIs.

Closes #33970 from sarutak/upgrade-netty-4.1.68.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit e1e19619b7)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-09-12 10:07:40 -07:00
Liang-Chi Hsieh e39948fada [SPARK-36670][SQL][TEST] Add FileSourceCodecSuite
### What changes were proposed in this pull request?

This patch mainly proposes to add some e2e test cases in Spark for the codecs used by the main data sources.

### Why are the changes needed?

We found that there are no e2e test cases available for the main data sources such as Parquet and ORC. This makes it harder for developers to identify possible bugs early. We should add such tests in Spark.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added tests.

Closes #33912 from viirya/SPARK-36670.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(cherry picked from commit 5a0ae694d0)
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
2021-09-07 16:53:25 -07:00
Dongjoon Hyun cdf21f729a [SPARK-36629][BUILD] Upgrade aircompressor to 1.21
### What changes were proposed in this pull request?

This PR aims to upgrade `aircompressor` dependency from 1.19 to 1.21.

### Why are the changes needed?

This will bring the latest bug fix which exists in `aircompressor` 1.17 ~ 1.20.
- 1e364f7133

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #33883 from dongjoon-hyun/SPARK-36629.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit ff8cc4b800)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-08-31 22:35:53 -07:00
Gengliang Wang 1bad04d028 Preparing development version 3.2.1-SNAPSHOT 2021-08-31 17:04:14 +00:00
Gengliang Wang 03f5d23e96 Preparing Spark release v3.2.0-rc2 2021-08-31 17:04:08 +00:00
Gengliang Wang 69be513c5e Preparing development version 3.2.1-SNAPSHOT 2021-08-20 12:40:47 +00:00
Gengliang Wang 6bb3523d8e Preparing Spark release v3.2.0-rc1 2021-08-20 12:40:40 +00:00
Gengliang Wang fafdc1482b Revert "Preparing Spark release v3.2.0-rc1"
This reverts commit 8e58fafb05.
2021-08-20 20:07:02 +08:00
Gengliang Wang c829ed53ff Revert "Preparing development version 3.2.1-SNAPSHOT"
This reverts commit 4f1d21571d.
2021-08-20 20:07:01 +08:00
Gengliang Wang 6357b22ba8 [SPARK-36547][BUILD] Downgrade scala-maven-plugin to 4.3.0
### What changes were proposed in this pull request?

When preparing Spark 3.2.0 RC1, I hit the same issue as in https://github.com/apache/spark/pull/31031.
```
[INFO] Compiling 21 Scala sources and 3 Java sources to /opt/spark-rm/output/spark-3.1.0-bin-hadoop2.7/resource-managers/yarn/target/scala-2.12/test-classes ...
[ERROR] ## Exception when compiling 24 sources to /opt/spark-rm/output/spark-3.1.0-bin-hadoop2.7/resource-managers/yarn/target/scala-2.12/test-classes
java.lang.SecurityException: class "javax.servlet.SessionCookieConfig"'s signer information does not match signer information of other classes in the same package
java.lang.ClassLoader.checkCerts(ClassLoader.java:891)
java.lang.ClassLoader.preDefineClass(ClassLoader.java:661)
```
This PR is to apply the same fix again by downgrading scala-maven-plugin to 4.3.0

### Why are the changes needed?

To unblock the release process.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Build test

Closes #33791 from gengliangwang/downgrade.

Authored-by: Gengliang Wang <gengliang@apache.org>
Signed-off-by: Gengliang Wang <gengliang@apache.org>
(cherry picked from commit f0775d215e)
Signed-off-by: Gengliang Wang <gengliang@apache.org>
2021-08-20 10:45:35 +08:00
Gengliang Wang 4f1d21571d Preparing development version 3.2.1-SNAPSHOT 2021-08-19 14:08:32 +00:00
Gengliang Wang 8e58fafb05 Preparing Spark release v3.2.0-rc1 2021-08-19 14:08:26 +00:00
Sean Owen b8c1014e23 Update Spark key negotiation protocol 2021-08-14 09:08:29 -05:00
William Hyun 1a371fbfa1 [SPARK-36482][BUILD] Bump orc to 1.6.10
### What changes were proposed in this pull request?
This PR aims to bump ORC to 1.6.10

### Why are the changes needed?
This will bring the latest bug fixes.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass the CIs.

Closes #33712 from williamhyun/orc.

Authored-by: William Hyun <william@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit aff1b5594a)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-08-11 11:32:18 -07:00
Sajith Ariyarathna f0844f70b1 [SPARK-36432][BUILD] Upgrade Jetty version to 9.4.43
### What changes were proposed in this pull request?
This PR upgrades Jetty version to `9.4.43.v20210629`.

### Why are the changes needed?
To address vulnerability https://nvd.nist.gov/vuln/detail/CVE-2021-34429 which affects Jetty `9.4.42.v20210604`.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI

Closes #33656 from this/upgrade-jetty-9.4.43.

Lead-authored-by: Sajith Ariyarathna <sajith.janaprasad@gmail.com>
Co-authored-by: Sajith Ariyarathna <this@users.noreply.github.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 5a22f9ceaf)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-08-09 10:14:16 +09:00
Liang-Chi Hsieh 712c311736 [SPARK-36393][BUILD] Try to raise memory for GHA
### What changes were proposed in this pull request?

According to the feedback from GitHub, the change causing memory issue has been rolled back. We can try to raise memory again for GA.

### Why are the changes needed?

Trying higher memory settings for GA. It could speed up the testing time.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

GA

Closes #33623 from viirya/increasing-mem-ga.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(cherry picked from commit 7d13ac177b)
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
2021-08-05 01:31:45 -07:00
Sean Owen c7d246ba4e [SPARK-35310][MLLIB] Update to breeze 1.2
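### What changes were proposed in this pull request?

Update to the latest breeze 1.2

### Why are the changes needed?

Minor bug fixes

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing tests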

Closes #33449 from srowen/SPARK-35310.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2021-07-24 08:20:25 -05:00
Liang-Chi Hsieh a6418a3463 [SPARK-36270][BUILD] Change memory settings for enabling GA
### What changes were proposed in this pull request?

Trying to adjust build memory settings and serial execution to re-enable GA.

### Why are the changes needed?

GA tests have been failing recently with return code 137. We need to adjust the build settings to make GA work.
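
For context, return code 137 follows the common shell convention of 128 plus a signal number, which here means signal 9 (`SIGKILL`). A process killed for using too much memory is one typical cause, though the message above does not state which component sent the signal. The sketch below only decodes the number; `ExitCodes` and `signalOf` are hypothetical names.

```java
// Hypothetical helper decoding shell-style exit statuses.
final class ExitCodes {

    // Returns the signal number encoded in an exit status (128 + signal), or -1 if none.
    static int signalOf(int exitCode) {
        return (exitCode > 128 && exitCode < 256) ? exitCode - 128 : -1;
    }

    public static void main(String[] args) {
        System.out.println(signalOf(137)); // 9, i.e. SIGKILL
        System.out.println(signalOf(1));   // -1, an ordinary failure
    }
}
```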

### Does this PR introduce _any_ user-facing change?

No, dev only.

### How was this patch tested?

GA

Closes #33447 from viirya/test-ga.

Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit fd36ed4550)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-07-23 19:11:09 +09:00
Dongjoon Hyun 60566f9d8e [SPARK-36262][BUILD] Upgrade ZSTD-JNI to 1.5.0-4
### What changes were proposed in this pull request?

This PR aims to upgrade ZSTD-JNI to 1.5.0-4.

### Why are the changes needed?

ZSTD-JNI 1.5.0-3 has a packaging issue. 1.5.0-4 is recommended to be used instead.
- https://github.com/luben/zstd-jni/issues/181#issuecomment-885138495

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #33483 from dongjoon-hyun/SPARK-36262.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit a1a197403b)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-07-22 14:04:14 -07:00
Kousuke Saruta fef7bf9fcc [SPARK-36244][BUILD] Upgrade zstd-jni to 1.5.0-3 to avoid a bug about buffer size calculation
### What changes were proposed in this pull request?

This PR upgrades `zstd-jni` from `1.5.0-2` to `1.5.0-3`.
`1.5.0-3` was released a few days ago.
This release resolves an issue about buffer size calculation, which can affect usage in Spark.
https://github.com/luben/zstd-jni/releases/tag/v1.5.0-3

### Why are the changes needed?

A skip length greater than `2^31 - 1` might be a corner case, but it could still affect Spark.
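
To illustrate why lengths beyond `2^31 - 1` are a hazard in JVM code in general, the sketch below shows what happens when a `long` length is narrowed to `int`. It is not the `zstd-jni` code and makes no claim about where the actual bug was; `SkipLength` and its methods are hypothetical names.

```java
// Hypothetical illustration of narrowing a long length to int.
final class SkipLength {

    // Unchecked narrowing silently wraps around for values above Integer.MAX_VALUE.
    static int narrowUnchecked(long length) {
        return (int) length;
    }

    // Checked narrowing throws ArithmeticException instead of wrapping.
    static int narrowChecked(long length) {
        return Math.toIntExact(length);
    }

    public static void main(String[] args) {
        long skip = Integer.MAX_VALUE + 1L; // 2^31, one past the int range
        System.out.println(narrowUnchecked(skip)); // -2147483648
        try {
            narrowChecked(skip);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```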

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

CI.

Closes #33464 from sarutak/upgrade-zstd-jni-1.5.0-3.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit dcb7db5370)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-07-21 19:37:18 -07:00
William Hyun b93fa15ce2 [SPARK-36199][BUILD] Bump scalatest-maven-plugin to 2.0.2
### What changes were proposed in this pull request?
This PR aims to upgrade scalatest-maven-plugin to version 2.0.2.

### Why are the changes needed?
2.0.2 supports build on JDK 11 officially.
- f45ce192f3

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass the CIs.

Closes #33408 from williamhyun/SMP.

Authored-by: William Hyun <william@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit df8bae0689)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-07-18 22:14:43 -07:00
Dongjoon Hyun 8059a7e5e6 [SPARK-36195][BUILD] Set MaxMetaspaceSize JVM option to 2g
### What changes were proposed in this pull request?

This PR aims to set `MaxMetaspaceSize` to `2g` because, by default, Metaspace can grow without limit and keeps increasing native memory consumption. This unbounded growth causes GitHub Action flakiness. The value I observed during the `hive` module test was over 1.8G and still growing.

- https://docs.oracle.com/javase/10/gctuning/other-considerations.htm#JSGCT-GUID-BFB89453-60C0-42AC-81CA-87D59B0ACE2E
> Starting with JDK 8, the permanent generation was removed and the class metadata is allocated in native memory. The amount of native memory that can be used for class metadata is by default unlimited. Use the option -XX:MaxMetaspaceSize to put an upper limit on the amount of native memory used for class metadata.

In addition, I raised the following memory limits to 4g in two places for consistency.
```xml
- <jvmArg>-Xms2048m</jvmArg>
- <jvmArg>-Xmx2048m</jvmArg>
+ <jvmArg>-Xms4g</jvmArg>
+ <jvmArg>-Xmx4g</jvmArg>
```

```scala
- javaOptions += "-Xmx3g",
+ javaOptions ++= "-Xmx4g -XX:MaxMetaspaceSize=2g".split(" ").toSeq,
```

### Why are the changes needed?

This will reduce the flakiness in CI environment by limiting the memory usage explicitly.

When we limit it to `1g`, the Hive module fails with an `OOM` error like the following.
```
java.lang.OutOfMemoryError: Metaspace
Error: Exception in thread "dispatcher-event-loop-110" java.lang.OutOfMemoryError: Metaspace
```
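
One way to observe Metaspace usage and the configured limit from inside a JVM is the standard `java.lang.management` API. This is a minimal sketch, not how the numbers above were measured: `MetaspaceProbe` is a hypothetical name, and it assumes a HotSpot-style JVM that exposes a memory pool named `Metaspace`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical probe for the JVM's Metaspace memory pool.
final class MetaspaceProbe {

    // Returns {usedBytes, maxBytes} for the pool named "Metaspace", or null when
    // the JVM exposes no such pool. maxBytes is -1 when no maximum is defined.
    static long[] metaspaceUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                if (usage == null) {
                    return null; // pool is no longer valid
                }
                return new long[] {usage.getUsed(), usage.getMax()};
            }
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = metaspaceUsage();
        if (usage == null) {
            System.out.println("No Metaspace pool reported by this JVM");
        } else {
            System.out.println("used=" + usage[0] + " bytes, max=" + usage[1] + " bytes");
        }
    }
}
```

Running it with and without `-XX:MaxMetaspaceSize=2g` should show whether a maximum is in effect for the pool.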

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #33405 from dongjoon-hyun/SPARK-36195.

Lead-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Co-authored-by: Kyle Bendickson <kbendickson@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit d7df7a805f)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-07-18 10:15:26 -07:00
Kousuke Saruta ca8d2670b7 [SPARK-36129][BUILD] Upgrade commons-compress to 1.21 to deal with CVEs
### What changes were proposed in this pull request?

This PR upgrades `commons-compress` from `1.20` to `1.21` to deal with CVEs.

### Why are the changes needed?

Some CVEs which affect `commons-compress 1.20` are reported and fixed in `1.21`.
https://commons.apache.org/proper/commons-compress/security-reports.html

* CVE-2021-35515
* CVE-2021-35516
* CVE-2021-35517
* CVE-2021-36090

The severities are reported as low for all the CVEs but it would be better to deal with them just in case.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

CI.

Closes #33333 from sarutak/upgrade-commons-compress-1.21.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit fd06cc211d)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-07-13 22:53:22 -07:00
Wenchen Fan c1d8ccfb64 Revert "[SPARK-35253][SPARK-35398][SQL][BUILD] Bump up the janino version to v3.1.4"
### What changes were proposed in this pull request?

This PR reverts https://github.com/apache/spark/pull/32455 and its followup https://github.com/apache/spark/pull/32536 , because the new janino version has a bug that is not fixed yet: https://github.com/janino-compiler/janino/pull/148

### Why are the changes needed?

avoid regressions

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests

Closes #33302 from cloud-fan/revert.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit ae6199af44)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-07-13 12:14:21 +09:00
Dongjoon Hyun d7990943c3 [SPARK-35992][BUILD] Upgrade ORC to 1.6.9
### What changes were proposed in this pull request?

This PR aims to upgrade Apache ORC to 1.6.9.

### Why are the changes needed?

This is required to bring ORC-804 in order to fix ORC encryption masking bug.

### Does this PR introduce _any_ user-facing change?

No. This is not released yet.

### How was this patch tested?

Pass the newly added test case.

Closes #33189 from dongjoon-hyun/SPARK-35992.

Lead-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit c55b9fd1e0)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-07-02 09:50:00 -07:00
Holden Karau 34286ae5bf [SPARK-35960][BUILD][TEST] Bump the scalatest version to 3.2.9
### What changes were proposed in this pull request?

Bump the scalatest version to 3.2.9

### Why are the changes needed?

With the scalatestplus change to 3.2.9.0, recent sbt fails to resolve the mismatch between scalatest and scalatestplus, resulting in test:compile errors because the org.scalatest package cannot be found.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

sbt tags/test:compile failed before and passes with this change.

Closes #33163 from holdenk/SPARK-35960-test-compile-sbt-issue.

Authored-by: Holden Karau <hkarau@netflix.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-06-30 21:39:12 -07:00
Kousuke Saruta 7ad682aaa1 Revert "[SPARK-34549][BUILD] Upgrade aws kinesis to 1.14.0 and java sdk 1.11.844"
### What changes were proposed in this pull request?

This PR reverts the change of SPARK-34549 ( #31658).

### Why are the changes needed?

See #33133.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Closes #33145 from sarutak/revert-SPARK-34549.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
2021-06-30 10:45:41 +09:00
Dongjoon Hyun 7e7028282c [SPARK-35928][BUILD] Upgrade ASM to 9.1
### What changes were proposed in this pull request?

This PR aims to upgrade ASM to 9.1

### Why are the changes needed?

The latest `xbean-asm9-shaded` is built with ASM 9.1.

- https://mvnrepository.com/artifact/org.apache.xbean/xbean-asm9-shaded/4.20
- 5e0e3c0c64/pom.xml (L67)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #33130 from dongjoon-hyun/SPARK-35928.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-06-29 10:27:51 -07:00
Dongjoon Hyun c45a6f5d09 [SPARK-35922][BUILD] Upgrade maven-shade-plugin to 3.2.4
### What changes were proposed in this pull request?

This PR aims to upgrade `maven-shade-plugin` to 3.2.4.

### Why are the changes needed?

This is required to build with Java 17-ea.

`maven-shade-plugin` has used `asm` 8.0 since 3.2.3, so we should remove our custom `7.3.1` dependency.
- https://mvnrepository.com/artifact/org.apache.maven.plugins/maven-shade-plugin/3.2.4
- https://mvnrepository.com/artifact/org.apache.maven.plugins/maven-shade-plugin/3.2.3

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CIs.

Closes #33122 from dongjoon-hyun/SPARK-35922.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2021-06-28 22:08:44 -07:00