### What changes were proposed in this pull request?
This PR aims to increase the JVM CodeCacheSize from 0.5G to 1G.
### Why are the changes needed?
After upgrading to `Scala 2.12.10`, the following is observed during building.
```
2019-09-18T20:49:23.5030586Z OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
2019-09-18T20:49:23.5032920Z OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=
2019-09-18T20:49:23.5034959Z CodeCache: size=524288Kb used=521399Kb max_used=521423Kb free=2888Kb
2019-09-18T20:49:23.5035472Z bounds [0x00007fa62c000000, 0x00007fa64c000000, 0x00007fa64c000000]
2019-09-18T20:49:23.5035781Z total_blobs=156549 nmethods=155863 adapters=592
2019-09-18T20:49:23.5036090Z compilation: disabled (not enough contiguous free space left)
```
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Manually check the Jenkins or GitHub Action build log (which should not have the above).
Closes#25836 from dongjoon-hyun/SPARK-CODE-CACHE-1G.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
This PR upgrade Scala to **2.12.10**.
Release notes:
- Fix regression in large string interpolations with non-String typed splices
- Revert "Generate shallower ASTs in pattern translation"
- Fix regression in classpath when JARs have 'a.b' entries beside 'a/b'
- Faster compiler: 5–10% faster since 2.12.8
- Improved compatibility with JDK 11, 12, and 13
- Experimental support for build pipelining and outline type checking
More details:
https://github.com/scala/scala/releases/tag/v2.12.10https://github.com/scala/scala/releases/tag/v2.12.9
## How was this patch tested?
Existing tests
Closes#25404 from wangyum/SPARK-28683.
Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
It upgrades ORC from 1.5.5 to 1.5.6 and adds closes the ORC readers when they aren't used to
create RecordReaders.
## How was this patch tested?
The changed unit tests were run.
Closes#25006 from omalley/spark-28208.
Lead-authored-by: Owen O'Malley <omalley@apache.org>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR allows `bin/spark-submit --version` to show the correct information while the previous versions, which were created by `dev/create-release/do-release-docker.sh`, show incorrect information.
There are two root causes to show incorrect information:
1. Did not pass `USER` environment variable to the docker container
1. Did not keep `.git` directory in the work directory
### Why are the changes needed?
The information is missing while the previous versions show the correct information.
### Does this PR introduce any user-facing change?
Yes, the following is the console output in branch-2.3
```
$ bin/spark-submit --version
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.3.4
/_/
Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_212
Branch HEAD
Compiled by user ishizaki on 2019-09-02T02:18:10Z
Revision 8c6f8150f3
Url https://gitbox.apache.org/repos/asf/spark.git
Type --help for more information.
```
Without this PR, the console output is as follows
```
$ spark-submit --version
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.3.4
/_/
Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_212
Branch
Compiled by user on 2019-08-26T08:29:39Z
Revision
Url
Type --help for more information.
```
### How was this patch tested?
After building the package, I manually executed `bin/spark-submit --version`
Closes#25655 from kiszk/SPARK-28906.
Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
### What changes were proposed in this pull request?
- Remove SQLContext.createExternalTable and Catalog.createExternalTable, deprecated in favor of createTable since 2.2.0, plus tests of deprecated methods
- Remove HiveContext, deprecated in 2.0.0, in favor of `SparkSession.builder.enableHiveSupport`
- Remove deprecated KinesisUtils.createStream methods, plus tests of deprecated methods, deprecate in 2.2.0
- Remove deprecated MLlib (not Spark ML) linear method support, mostly utility constructors and 'train' methods, and associated docs. This includes methods in LinearRegression, LogisticRegression, Lasso, RidgeRegression. These have been deprecated since 2.0.0
- Remove deprecated Pyspark MLlib linear method support, including LogisticRegressionWithSGD, LinearRegressionWithSGD, LassoWithSGD
- Remove 'runs' argument in KMeans.train() method, which has been a no-op since 2.0.0
- Remove deprecated ChiSqSelector isSorted protected method
- Remove deprecated 'yarn-cluster' and 'yarn-client' master argument in favor of 'yarn' and deploy mode 'cluster', etc
Notes:
- I was not able to remove deprecated DataFrameReader.json(RDD) in favor of DataFrameReader.json(Dataset); the former was deprecated in 2.2.0, but, it is still needed to support Pyspark's .json() method, which can't use a Dataset.
- Looks like SQLContext.createExternalTable was not actually deprecated in Pyspark, but, almost certainly was meant to be? Catalog.createExternalTable was.
- I afterwards noted that the toDegrees, toRadians functions were almost removed fully in SPARK-25908, but Felix suggested keeping just the R version as they hadn't been technically deprecated. I'd like to revisit that. Do we really want the inconsistency? I'm not against reverting it again, but then that implies leaving SQLContext.createExternalTable just in Pyspark too, which seems weird.
- I *kept* LogisticRegressionWithSGD, LinearRegressionWithSGD, LassoWithSGD, RidgeRegressionWithSGD in Pyspark, though deprecated, as it is hard to remove them (still used by StreamingLogisticRegressionWithSGD?) and they are not fully removed in Scala. Maybe should not have been deprecated.
### Why are the changes needed?
Deprecated items are easiest to remove in a major release, so we should do so as much as possible for Spark 3. This does not target items deprecated 'recently' as of Spark 2.3, which is still 18 months old.
### Does this PR introduce any user-facing change?
Yes, in that deprecated items are removed from some public APIs.
### How was this patch tested?
Existing tests.
Closes#25684 from srowen/SPARK-28980.
Lead-authored-by: Sean Owen <sean.owen@databricks.com>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
### What changes were proposed in this pull request?
Upgrade netty-all to latest in the 4.1.x line which is 4.1.39-Final.
### Why are the changes needed?
Currency of dependencies.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Existing unit-tests against master branch.
Closes#25712 from n-marion/master.
Authored-by: Nicholas Marion <nmarion@us.ibm.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
We are adding other resource type support to the executors and Spark. We should show the resource information for each executor on the UI Executors page.
This also adds a toggle button to show the resources column. It is off by default.
![executorui1](https://user-images.githubusercontent.com/4563792/63891432-c815b580-c9aa-11e9-9f41-62975649efbc.png)
![Screenshot from 2019-08-28 14-56-26](https://user-images.githubusercontent.com/4563792/63891516-fd220800-c9aa-11e9-9fe4-89fcdca37306.png)
### Why are the changes needed?
to show user what resources the executors have. Like Gpus, fpgas, etc
### Does this PR introduce any user-facing change?
Yes introduces UI and rest api changes to show the resources
### How was this patch tested?
Unit tests and manual UI tests on yarn and standalone modes.
Closes#25613 from tgravescs/SPARK-27489-gpu-ui-latest.
Authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
### What changes were proposed in this pull request?
Upgrade kubernetes client from 4.1.2 to 4.4.2
### Why are the changes needed?
To fix compatibility issue with EKS since Amazon rolled out some security patches over the past week; 1.15.3, 1.14.6, 1.13.10, 1.12.10, and 1.11.10.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
Pass the Jenkins and manually test on EKS.
Closes#25640 from andygrove/SPARK-28921.
Authored-by: Andy Grove <andygrove73@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR adds a release announce template.
### Why are the changes needed?
- We want to use a formal template including HTTPS in the future release.
- The future release managers don't need to search mailing list to find this form.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
N/A.
Closes#25656 from dongjoon-hyun/SPARK-28951.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
i broke run-tests.py for non-PRB builds in this PR:
https://github.com/apache/spark/pull/25423
### Why are the changes needed?
to fix what i broke
### Does this PR introduce any user-facing change?
no
### How was this patch tested?
the build system will test this
Closes#25585 from shaneknapp/fix-run-tests.
Authored-by: shane knapp <incomplete@gmail.com>
Signed-off-by: shane knapp <incomplete@gmail.com>
## What changes were proposed in this pull request?
we need to add the ability to test PRBs against java11.
see comments here: https://github.com/apache/spark/pull/25405
## How was this patch tested?
the build system will test this.
Closes#25423 from shaneknapp/spark-prb-java11.
Authored-by: shane knapp <incomplete@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This PR aims to specify Jekyll Version explicitly in our release docker image.
### Why are the changes needed?
Recently, Jekyll 4.0 is released and it dropped Ruby 2.3 support.
This breaks our release docker image build.
```
Building native extensions. This could take a while...
ERROR: Error installing jekyll:
jekyll-sass-converter requires Ruby version >= 2.4.0.
```
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
The following should succeed.
```
$ docker build -t spark-rm:test --build-arg UID=501 dev/create-release/spark-rm
...
Successfully tagged spark-rm:test
```
Closes#25578 from dongjoon-hyun/SPARK-28868.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR aims to clean up the commit logs by removing the comments of our PR template.
### Why are the changes needed?
Apache Spark PR template has comments. Sometime we forget to clean up them because GitHub hides them nicely. It would be great if we clean up this. Otherwise, this makes the commit logs too verbose. (There are a few commits already.)
### Does this PR introduce any user-facing change?
No. (only for committers)
### How was this patch tested?
Manually with Python2/Python3.
Closes#25564 from dongjoon-hyun/SPARK-28857.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
Spark uses Netty 4 directly, but also includes Netty 3 only because transitive dependencies do. The dependencies (Hadoop HDFS, Zookeeper, Avro) don't seem to need this dependency as used in Spark. I think we can forcibly remove it to slim down the dependencies.
Previous attempts were blocked by its usage in Flume, but that dependency has gone away.
https://github.com/apache/spark/pull/15436
### Why are the changes needed?
Mostly to reduce the transitive dependency size and complexity a little bit and avoid triggering spurious security alerts on Netty 3.x usage.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
Existing tests
Closes#25544 from srowen/SPARK-17875.
Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
Update Jersey to 2.27+, ideally 2.29, for possible JDK 11 fixes.
## How was this patch tested?
Existing tests.
Closes#25455 from srowen/SPARK-28737.
Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR aims to upgrade `Janino` from `3.0.13` to `3.0.15` in order to bring the bug fixes. Please note that `3.1.0` is a major refactoring instead of bug fixes. We had better use `3.0.15` and wait for the stabler 3.1.x.
### Why are the changes needed?
This brings the following bug fixes.
**3.0.15 (2019-07-28)**
- Fix overloaded single static method import
**3.0.14 (2019-07-05)**
- Conflict in sbt-assembly
- Overloaded static on-demand imported methods cause a CompileException: Ambiguous static method import
- Handle overloaded static on-demand imports
- Major refactoring of the Java 8 and Java 9 retrofit mechanism
- Added tests for "JLS8 8.6 Instance Initializers" and "JLS8 8.7 Static Initializers"
- Local variables in instance initializers don't work
- Provide an option to keep generated code files
- Added compile error handler and warning handler to ICompiler
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Pass the Jenkins with the existing tests.
Closes#25474 from dongjoon-hyun/SPARK-28758.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
Update Jackson databind to the latest version for some latest changes.
## How was this patch tested?
Pass the Jenkins.
Closes#25451 from Fokko/fd-bump-jackson-databind.
Lead-authored-by: Fokko Driesprong <fokko@apache.org>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
R version 3.6.1 (Action of the Toes) was released on 2019-07-05. This PR aims to upgrade R installation for AppVeyor CI environment.
## How was this patch tested?
Pass the AppVeyor CI.
Closes#25441 from dongjoon-hyun/SPARK-28720.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
## What changes were proposed in this pull request?
I made an audit and update all dev scripts to support python3. (except `merge_spark_pr.py` which already updated)
## How was this patch tested?
Manual.
Closes#25289 from WeichenXu123/dev_py3.
Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
## What changes were proposed in this pull request?
This PR aims to improve the `merge-spark-pr` script in the following two ways.
1. `[WIP]` is useful when we show that a PR is not ready for merge. Apache Spark allows merging `WIP` PRs. However, sometime, we accidentally forgot to clean up the title for the completed PRs. We had better warn once more during merging stage and get a confirmation from the committers.
2. We have two kinds of PR titles in terms of the ending period. This PR aims to remove the trailing `dot` since the shorter is the better in the commit title. Also, the PR titles without the trailing `dot` is dominant in the Apache Spark commit logs.
```
$ git log --oneline | grep '[.]$' | wc -l
4090
$ git log --oneline | grep '[^.]$' | wc -l
20747
```
## How was this patch tested?
Manual.
```
$ dev/merge_spark_pr.py
git rev-parse --abbrev-ref HEAD
Which pull request would you like to merge? (e.g. 34): 25157
The PR title has `[WIP]`:
[WIP][SPARK-28396][SQL] Add PathCatalog for data source V2
Continue? (y/n):
```
```
$ dev/merge_spark_pr.py
git rev-parse --abbrev-ref HEAD
Which pull request would you like to merge? (e.g. 34): 25304
I've re-written the title as follows to match the standard format:
Original: [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API.
Modified: [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API
Would you like to use the modified title? (y/n):
```
Closes#25356 from dongjoon-hyun/SPARK-28616.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
CRAN repo changed the key and it causes our release script failure. This is a release blocker for Apache Spark 2.4.4 and 3.0.0.
- https://cran.r-project.org/bin/linux/ubuntu/README.html
```
Err:1 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease
The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 51716619E084DAB9
...
W: GPG error: https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 51716619E084DAB9
E: The repository 'https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease' is not signed.
```
Note that they are reusing `cran35` for R 3.6 although they changed the key.
```
Even though R has moved to version 3.6, for compatibility the sources.list entry still uses the cran3.5 designation.
```
This PR aims to recover the docker image generation first. We will verify the R doc generation in a separate JIRA and PR.
## How was this patch tested?
Manual. After `docker-build.log`, it should continue to the next stage, `Building v3.0.0-rc1`.
```
$ dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n -s docs
...
Log file: docker-build.log
Building v3.0.0-rc1; output will be at /tmp/spark-3.0.0/output
```
Closes#25339 from dongjoon-hyun/SPARK-28606.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
## What changes were proposed in this pull request?
Prior to this change, in an executor, on each heartbeat, memory metrics are polled and sent in the heartbeat. The heartbeat interval is 10s by default. With this change, in an executor, memory metrics can optionally be polled in a separate poller at a shorter interval.
For each executor, we use a map of (stageId, stageAttemptId) to (count of running tasks, executor metric peaks) to track what stages are active as well as the per-stage memory metric peaks. When polling the executor memory metrics, we attribute the memory to the active stage(s), and update the peaks. In a heartbeat, we send the per-stage peaks (for stages active at that time), and then reset the peaks. The semantics would be that the per-stage peaks sent in each heartbeat are the peaks since the last heartbeat.
We also keep a map of taskId to memory metric peaks. This tracks the metric peaks during the lifetime of the task. The polling thread updates this as well. At end of a task, we send the peak metric values in the task result. In case of task failure, we send the peak metric values in the `TaskFailedReason`.
We continue to do the stage-level aggregation in the EventLoggingListener.
For the driver, we still only poll on heartbeats. What the driver sends will be the current values of the metrics in the driver at the time of the heartbeat. This is semantically the same as before.
## How was this patch tested?
Unit tests. Manually tested applications on an actual system and checked the event logs; the metrics appear in the SparkListenerTaskEnd and SparkListenerStageExecutorMetrics events.
Closes#23767 from wypoon/wypoon_SPARK-26329.
Authored-by: Wing Yew Poon <wypoon@cloudera.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
## What changes were proposed in this pull request?
This PR aims to update `zstd-jni` library to bring the latest improvement and bug fixes in `1.4.1` and `1.4.2`.
- https://github.com/facebook/zstd/releases/tag/v1.4.1 (4.5 ~ 11.8% performance improvement from v1.4.0 and bug fixes)
- https://github.com/facebook/zstd/releases/tag/v1.4.2 (bug fixes)
## How was this patch tested?
Pass the Jenkins.
Closes#25275 from dongjoon-hyun/SPARK-28544.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
Unlike the other versions, `x.x.0-SNAPSHOT` causes `x.x.-1`. Although this will not happen in the tags (there is no `SNAPSHOT` postfix), we had better fix this.
```
$ dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n
Output directory already exists. Overwrite and continue? [y/n] y
Branch [branch-2.4]: master
Current branch version is 3.0.0-SNAPSHOT.
Release [3.0.-1]:
```
Since we already have `RELEASE_VERSION` by removing `SNAPSHOT`. This PR uses `RELEASE_VERSION` instead of `VERSION`.
```
$ dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n
Branch [branch-2.4]: master
Current branch version is 3.0.0-SNAPSHOT.
Release [3.0.0]:
```
## How was this patch tested?
Manually do `dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n` and see the default value of `Release`.
Closes#25254 from dongjoon-hyun/SPARK-28511.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
There are two cases when we use `dry run`.
First, when the tag already exists, we can ask `confirmation` on the existing tag name.
```
$ dev/create-release/do-release-docker.sh -d /tmp/spark-2.4.4 -n -s docs
Output directory already exists. Overwrite and continue? [y/n] y
Branch [branch-2.4]:
Current branch version is 2.4.4-SNAPSHOT.
Release [2.4.4]: 2.4.3
RC # [1]:
v2.4.3-rc1 already exists. Continue anyway [y/n]? y
This is a dry run. Please confirm the ref that will be built for testing.
Ref [v2.4.3-rc1]:
```
Second, when the tag doesn't exist, we had better ask `confirmation` on the branch name. If we do not change the default value, it will fail eventually.
```
$ dev/create-release/do-release-docker.sh -d /tmp/spark-2.4.4 -n -s docs
Branch [branch-2.4]:
Current branch version is 2.4.4-SNAPSHOT.
Release [2.4.4]:
RC # [1]:
This is a dry run. Please confirm the ref that will be built for testing.
Ref [v2.4.4-rc1]:
```
This PR improves the second case by providing the branch name instead. This helps the release testing before tagging.
## How was this patch tested?
Manually do the following and check the default value of `Ref` field.
```
$ dev/create-release/do-release-docker.sh -d /tmp/spark-2.4.4 -n -s docs
Branch [branch-2.4]:
Current branch version is 2.4.4-SNAPSHOT.
Release [2.4.4]:
RC # [1]:
This is a dry run. Please confirm the ref that will be built for testing.
Ref [branch-2.4]:
...
```
Closes#25240 from dongjoon-hyun/SPARK-28496.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
## What changes were proposed in this pull request?
This upgraded to a newer version of Pyrolite. Most updates [1] in the newer version are for dotnot. For java, it includes a bug fix to Unpickler regarding cleaning up Unpickler memo, and support of protocol 5.
After upgrading, we can remove the fix at SPARK-27629 for the bug in Unpickler.
[1] https://github.com/irmen/Pyrolite/compare/pyrolite-4.23...master
## How was this patch tested?
Manually tested on Python 3.6 in local on existing tests.
Closes#25143 from viirya/upgrade-pyrolite.
Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
## What changes were proposed in this pull request?
`dev/merge_spark_pr.py` script always fail for some users because they have different `name` and `key`.
- https://issues.apache.org/jira/rest/api/2/user?username=yumwang
JIRA Client expects `name`, but we are using `key`. This PR fixes it.
```python
# This is JIRA client code `/usr/local/lib/python2.7/site-packages/jira/client.py`
def assign_issue(self, issue, assignee):
"""Assign an issue to a user. None will set it to unassigned. -1 will set it to Automatic.
:param issue: the issue ID or key to assign
:param assignee: the user to assign the issue to
:type issue: int or str
:type assignee: str
:rtype: bool
"""
url = self._options['server'] + \
'/rest/api/latest/issue/' + str(issue) + '/assignee'
payload = {'name': assignee}
r = self._session.put(
url, data=json.dumps(payload))
raise_on_error(r)
return True
```
## How was this patch tested?
Manual with the committer ID/password.
```python
import jira.client
asf_jira = jira.client.JIRA({'server': 'https://issues.apache.org/jira'}, basic_auth=('yourid', 'passwd'))
asf_jira.assign_issue("SPARK-28354", "q79969786") # This will raise exception.
asf_jira.assign_issue("SPARK-28354", "yumwang") # This works.
```
Closes#25120 from dongjoon-hyun/SPARK-28354.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
## What changes were proposed in this pull request?
Mainly change logs:
### Version 3.0.13:
- Support for JDK 9/10 in Full Compiler
- The syntax elements that can have modifiers now all have sets of "is...()" methods that check for each modifier. Some also have methods "getAccess()" and/or "getAnnotations()".
- Implement "type annotations" (JLS8 9.7.4)
- Implemented parsing (but not compilation) of "modular compilation units" (JLS11 7.3).
- Replaced all "assert...Uncookable(..., Pattern messageRegex)" and "assert...Uncookable(..., String messageInfix)" method pairs with a single "assert...Uncookable(..., String messageRegex)" method.
Minor refactoring: Allowed modifiers are now checked in the Parser, not in Java.*. This saves a lot of THROWS clauses.
- Parse Type inference syntax: Type inference for generic instance creation implemented, test cases added.
- Parse MethodReference, ClassInstanceCreationReference and ArrayCreationReference
### Version 3.0.12
- Fixed: Operator "&" not defined on types "java.lang.Long" and "int"
- Major bug in JavaSourceClassLoader: When loading the second and following classes, CUs were compiled again, leading to an inconsistent class hierarchy.
- Fixed: Java 9 added "Override public final CharBuffer CharBuffer.rewind() { ..." -- leads easily to a java.lang.NoSuchMethodError
- Changed all occurences of the words "Java bytecode" to "JVM bytecode" to make clearer that the generated bytecode is for the JVMS and not suitable for, e.g. DALVIK.
http://janino-compiler.github.io/janino/changelog.html
## How was this patch tested?
Existing test
Closes#25021 from wangyum/SPARK-28221.
Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
This PR proposes to remove cloned `pyspark-coverage-site` repo.
it doesn't looks a problem in PR builder but somehow it's problematic in `spark-master-test-sbt-hadoop-2.7`.
## How was this patch tested?
Jenkins.
Closes#23729 from HyukjinKwon/followup-coverage.
Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: shane knapp <incomplete@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
## What changes were proposed in this pull request?
This PR aims to update `xbean-asm7-shaded` to bring [XBEAN-318](https://issues.apache.org/jira/browse/XBEAN-318) which is helpful to log the class definition reading failures.
- https://issues.apache.org/jira/projects/XBEAN/versions/12345220
## How was this patch tested?
Pass the Jenkins.
Closes#24914 from dongjoon-hyun/SPARK-28111.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
We're using an old-ish jQuery, 1.12.4, and should probably update for Spark 3 to keep up in general, but also to keep up with CVEs. In fact, we know of at least one resolved in only 3.4.0+ (https://nvd.nist.gov/vuln/detail/CVE-2019-11358). They may not affect Spark, but, if the update isn't painful, maybe worthwhile in order to make future 3.x updates easier.
jQuery 1 -> 2 doesn't sound like a breaking change, as 2.0 is supposed to maintain compatibility with 1.9+ (https://blog.jquery.com/2013/04/18/jquery-2-0-released/)
2 -> 3 has breaking changes: https://jquery.com/upgrade-guide/3.0/. It's hard to evaluate each one, but the most likely area for problems is in ajax(). However, our usage of jQuery (and plugins) is pretty simple.
Update jquery to 3.4.1; update jquery blockUI and mustache to latest
## How was this patch tested?
Manual testing of docs build (except R docs), worker/master UI, spark application UI.
Note: this really doesn't guarantee it works, as our tests can't test javascript, and this is merely anecdotal testing, although I clicked about every link I could find. There's a risk this breaks a minor part of the UI; it does seem to work fine in the main.
Closes#24843 from srowen/SPARK-28004.
Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
This PR aims to expose JIRA issue component types at GitHub PRs.
## How was this patch tested?
Manual.
```
$ export GITHUB_OAUTH_KEY=...
$ export JIRA_PASSWORD=...
$ export GITHUB_API_BASE='https://api.github.com/repos/your-id/spark'
$ dev/github_jira_sync.py
```
Please note that the existing script will raise the following exceptions if your repo has less than 100 PRs. This will be handled at #24874 .
```
Traceback (most recent call last):
File "dev/github_jira_sync.py", line 139, in <module>
jira_prs = get_jira_prs()
File "dev/github_jira_sync.py", line 83, in get_jira_prs
link_header = filter(lambda k: k.startswith("Link"), page.info().headers)[0]
IndexError: list index out of range
```
That is beyond the scope of this PR.
Closes#24871 from dongjoon-hyun/SPARK-28051.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
Currently, `github_jira_sync.py` assumes that there is `Link` always. However, it will fail when the number of the open PR is less than 100 (the default paging number). It will not happen in Apache Spark, but we had better fix that because it happens during review process for `github_jira_sync.py` script.
```
Traceback (most recent call last):
File "dev/github_jira_sync.py", line 139, in <module>
jira_prs = get_jira_prs()
File "dev/github_jira_sync.py", line 83, in get_jira_prs
link_header = filter(lambda k: k.startswith("Link"), page.info().headers)[0]
IndexError: list index out of range
```
## How was this patch tested?
Manually check with another repo which has small number of open PRs (< 100).
```
$ export JIRA_PASSWORD=...
$ export GITHUB_API_BASE='https://api.github.com/repos/your-id/spark'
$ dev/github_jira_sync.py
```
Closes#24874 from dongjoon-hyun/SPARK-28053.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
## What changes were proposed in this pull request?
This is a second try of #24824.
Since Apache Spark 2.0.0, SPARK-14867 deprecated `--force` option and made it ignored. This PR cleans up the related code completely at 3.0.0.
**BEFORE (Jenkins)**
```
========================================================================
Building Spark
========================================================================
[info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests
WARNING: '--force' is deprecated and ignored.
...
========================================================================
Running Spark unit tests
========================================================================
[info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Phive-thriftserver -Phive -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end
WARNING: '--force' is deprecated and ignored.
```
**AFTER (Jenkins)**
```
========================================================================
Building Spark
========================================================================
[info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests
...
========================================================================
Running Spark unit tests
========================================================================
[info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pyarn -Pspark-ganglia-lgpl -Phive -Pkinesis-asl -Pmesos -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end
```
## How was this patch tested?
Manually check the Jenkins logs.
Closes#24833 from dongjoon-hyun/SPARK-FORCE-2.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
This PR introduces the necessary Maven modules for the new [Spark Graph](https://issues.apache.org/jira/browse/SPARK-25994) feature for Spark 3.0.
* `spark-graph` is a parent module that users depend on to get all graph functionalities (Cypher and Graph Algorithms)
* `spark-graph-api` defines the [Property Graph API](https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI) that is being shared between Cypher and Algorithms
* `spark-cypher` contains a Cypher query engine implementation
Both, `spark-graph-api` and `spark-cypher` depend on Spark SQL.
Note, that the Maven module for Graph Algorithms is not part of this PR and will be introduced in https://issues.apache.org/jira/browse/SPARK-27302
A PoC for a running Cypher implementation can be found in this WIP PR https://github.com/apache/spark/pull/24297
## How was this patch tested?
Pass the Jenkins with all profiles and manually build and check the followings.
```
$ ls assembly/target/scala-2.12/jars/spark-cypher*
assembly/target/scala-2.12/jars/spark-cypher_2.12-3.0.0-SNAPSHOT.jar
$ ls assembly/target/scala-2.12/jars/spark-graph* | grep -v graphx
assembly/target/scala-2.12/jars/spark-graph-api_2.12-3.0.0-SNAPSHOT.jar
assembly/target/scala-2.12/jars/spark-graph_2.12-3.0.0-SNAPSHOT.jar
```
Closes#24490 from s1ck/SPARK-27300.
Lead-authored-by: Martin Junghanns <martin.junghanns@neotechnology.com>
Co-authored-by: Max Kießling <max@kopfueber.org>
Co-authored-by: Martin Junghanns <martin.junghanns@neo4j.com>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
Since Apache Spark 2.0.0, SPARK-14867 deprecated `--force` option and made it ignored. This PR cleans up the related code completely at 3.0.0.
**BEFORE (Jenkins)**
```
========================================================================
Building Spark
========================================================================
[info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests
WARNING: '--force' is deprecated and ignored.
...
========================================================================
Running Spark unit tests
========================================================================
[info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Phive-thriftserver -Phive -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end
WARNING: '--force' is deprecated and ignored.
```
**AFTER (Jenkins)**
```
========================================================================
Building Spark
========================================================================
[info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests
...
========================================================================
Running Spark unit tests
========================================================================
[info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pyarn -Pspark-ganglia-lgpl -Phive -Pkinesis-asl -Pmesos -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end
```
## How was this patch tested?
Manually check the Jenkins logs.
Closes#24824 from dongjoon-hyun/SPARK-27979.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
Move to json4s version 3.6.6
Add scala-xml 1.2.0
## How was this patch tested?
Pass the Jenkins
Closes#24736 from igreenfield/master.
Authored-by: Izek Greenfield <igreenfield@axiomsl.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
## What changes were proposed in this pull request?
This fixes CVE-2019-12086 on Databind: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9.9
## How was this patch tested?
Existing tests
Closes#24646 from Fokko/SPARK-27757.
Authored-by: Fokko Driesprong <fokko@apache.org>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
## What changes were proposed in this pull request?
R 3.6.0 is released 2019-04-26. This PR targets to change R version from 3.5.1 to 3.6.0 in AppVeyor.
This PR sets `R_REMOTES_NO_ERRORS_FROM_WARNINGS` to `true` to avoid the warnings below:
```
Error in strptime(xx, f, tz = tz) :
(converted from warning) unable to identify current timezone 'C':
please set environment variable 'TZ'
Error in i.p(...) :
(converted from warning) installation of package 'praise' had non-zero exit status
Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p
Execution halted
```
## How was this patch tested?
AppVeyor
Closes#24716 from HyukjinKwon/SPARK-27848.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
## What changes were proposed in this pull request?
Use https URL for CRAN repo (and for a Scala download in a Dockerfile)
## How was this patch tested?
Existing tests.
Closes#24664 from srowen/SPARK-27794.
Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
Tighten up some key links to the project and download pages to use HTTPS
## How was this patch tested?
N/A
Closes#24665 from srowen/HTTPSURLs.
Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
This PR excludes 'hive-thriftserver' in modules to test for hadoop3.2 for now as well
## How was this patch tested?
Manually tested via `run-tests.py`
Closes#24644 from HyukjinKwon/SPARK-27402.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
## What changes were proposed in this pull request?
This PR aims to update `zstd-jni` library to `1.4.0-1` which improves the `level 1 compression speed` performance by 6% in most scenarios. The following is the full release note.
- https://github.com/facebook/zstd/releases/tag/v1.4.0
## How was this patch tested?
Pass the Jenkins.
Closes#24632 from dongjoon-hyun/SPARK-27755.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>