ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Dongjoon Hyun	3bf43fb60d	[SPARK-29159][BUILD] Increase ReservedCodeCacheSize to 1G ### What changes were proposed in this pull request? This PR aims to increase the JVM CodeCacheSize from 0.5G to 1G. ### Why are the changes needed? After upgrading to `Scala 2.12.10`, the following is observed during building. ``` 2019-09-18T20:49:23.5030586Z OpenJDK 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled. 2019-09-18T20:49:23.5032920Z OpenJDK 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize= 2019-09-18T20:49:23.5034959Z CodeCache: size=524288Kb used=521399Kb max_used=521423Kb free=2888Kb 2019-09-18T20:49:23.5035472Z bounds [0x00007fa62c000000, 0x00007fa64c000000, 0x00007fa64c000000] 2019-09-18T20:49:23.5035781Z total_blobs=156549 nmethods=155863 adapters=592 2019-09-18T20:49:23.5036090Z compilation: disabled (not enough contiguous free space left) ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Manually check the Jenkins or GitHub Action build log (which should not have the above). Closes #25836 from dongjoon-hyun/SPARK-CODE-CACHE-1G. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-09-19 00:24:15 -07:00
Yuming Wang	8c3f27ceb4	[SPARK-28683][BUILD] Upgrade Scala to 2.12.10 ## What changes were proposed in this pull request? This PR upgrades Scala to 2.12.10. Release notes: - Fix regression in large string interpolations with non-String typed splices - Revert "Generate shallower ASTs in pattern translation" - Fix regression in classpath when JARs have 'a.b' entries beside 'a/b' - Faster compiler: 5–10% faster since 2.12.8 - Improved compatibility with JDK 11, 12, and 13 - Experimental support for build pipelining and outline type checking More details: https://github.com/scala/scala/releases/tag/v2.12.10 https://github.com/scala/scala/releases/tag/v2.12.9 ## How was this patch tested? Existing tests Closes #25404 from wangyum/SPARK-28683. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-09-18 13:30:36 -07:00
Owen O'Malley	dfb0a8bb04	[SPARK-28208][BUILD][SQL] Upgrade to ORC 1.5.6 including closing the ORC readers ## What changes were proposed in this pull request? It upgrades ORC from 1.5.5 to 1.5.6 and closes the ORC readers when they aren't used to create RecordReaders. ## How was this patch tested? The changed unit tests were run. Closes #25006 from omalley/spark-28208. Lead-authored-by: Owen O'Malley <omalley@apache.org> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-09-18 09:32:43 -07:00
Kazuaki Ishizaki	8d1b5ba766	[SPARK-28906][BUILD] Fix incorrect information in bin/spark-submit --version ### What changes were proposed in this pull request? This PR allows `bin/spark-submit --version` to show the correct information while the previous versions, which were created by `dev/create-release/do-release-docker.sh`, show incorrect information. There are two root causes to show incorrect information: 1. Did not pass `USER` environment variable to the docker container 1. Did not keep `.git` directory in the work directory ### Why are the changes needed? The information is missing while the previous versions show the correct information. ### Does this PR introduce any user-facing change? Yes, the following is the console output in branch-2.3 ``` $ bin/spark-submit --version Welcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.3.4 /_/ Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_212 Branch HEAD Compiled by user ishizaki on 2019-09-02T02:18:10Z Revision `8c6f8150f3` Url https://gitbox.apache.org/repos/asf/spark.git Type --help for more information. ``` Without this PR, the console output is as follows ``` $ spark-submit --version Welcome to ____ __ / __/__ ___ _____/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.3.4 /_/ Using Scala version 2.11.8, OpenJDK 64-Bit Server VM, 1.8.0_212 Branch Compiled by user on 2019-08-26T08:29:39Z Revision Url Type --help for more information. ``` ### How was this patch tested? After building the package, I manually executed `bin/spark-submit --version` Closes #25655 from kiszk/SPARK-28906. Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-09-11 08:12:44 -05:00
Sean Owen	6378d4bc06	[SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 ### What changes were proposed in this pull request? - Remove SQLContext.createExternalTable and Catalog.createExternalTable, deprecated in favor of createTable since 2.2.0, plus tests of deprecated methods - Remove HiveContext, deprecated in 2.0.0, in favor of `SparkSession.builder.enableHiveSupport` - Remove deprecated KinesisUtils.createStream methods, plus tests of deprecated methods, deprecate in 2.2.0 - Remove deprecated MLlib (not Spark ML) linear method support, mostly utility constructors and 'train' methods, and associated docs. This includes methods in LinearRegression, LogisticRegression, Lasso, RidgeRegression. These have been deprecated since 2.0.0 - Remove deprecated Pyspark MLlib linear method support, including LogisticRegressionWithSGD, LinearRegressionWithSGD, LassoWithSGD - Remove 'runs' argument in KMeans.train() method, which has been a no-op since 2.0.0 - Remove deprecated ChiSqSelector isSorted protected method - Remove deprecated 'yarn-cluster' and 'yarn-client' master argument in favor of 'yarn' and deploy mode 'cluster', etc Notes: - I was not able to remove deprecated DataFrameReader.json(RDD) in favor of DataFrameReader.json(Dataset); the former was deprecated in 2.2.0, but, it is still needed to support Pyspark's .json() method, which can't use a Dataset. - Looks like SQLContext.createExternalTable was not actually deprecated in Pyspark, but, almost certainly was meant to be? Catalog.createExternalTable was. - I afterwards noted that the toDegrees, toRadians functions were almost removed fully in SPARK-25908, but Felix suggested keeping just the R version as they hadn't been technically deprecated. I'd like to revisit that. Do we really want the inconsistency? I'm not against reverting it again, but then that implies leaving SQLContext.createExternalTable just in Pyspark too, which seems weird. 
- I kept LogisticRegressionWithSGD, LinearRegressionWithSGD, LassoWithSGD, RidgeRegressionWithSGD in Pyspark, though deprecated, as it is hard to remove them (still used by StreamingLogisticRegressionWithSGD?) and they are not fully removed in Scala. Maybe should not have been deprecated. ### Why are the changes needed? Deprecated items are easiest to remove in a major release, so we should do so as much as possible for Spark 3. This does not target items deprecated 'recently' as of Spark 2.3, which is still 18 months old. ### Does this PR introduce any user-facing change? Yes, in that deprecated items are removed from some public APIs. ### How was this patch tested? Existing tests. Closes #25684 from srowen/SPARK-28980. Lead-authored-by: Sean Owen <sean.owen@databricks.com> Co-authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-09-09 10:19:40 -05:00
Nicholas Marion	6fb5ef108e	[SPARK-29011][BUILD] Update netty-all from 4.1.30-Final to 4.1.39-Final ### What changes were proposed in this pull request? Upgrade netty-all to latest in the 4.1.x line which is 4.1.39-Final. ### Why are the changes needed? Currency of dependencies. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Existing unit-tests against master branch. Closes #25712 from n-marion/master. Authored-by: Nicholas Marion <nmarion@us.ibm.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-09-06 17:48:53 -07:00
Thomas Graves	4c8f114783	[SPARK-27489][WEBUI] UI updates to show executor resource information ### What changes were proposed in this pull request? We are adding other resource type support to the executors and Spark. We should show the resource information for each executor on the UI Executors page. This also adds a toggle button to show the resources column. It is off by default. ![executorui1](https://user-images.githubusercontent.com/4563792/63891432-c815b580-c9aa-11e9-9f41-62975649efbc.png) ![Screenshot from 2019-08-28 14-56-26](https://user-images.githubusercontent.com/4563792/63891516-fd220800-c9aa-11e9-9fe4-89fcdca37306.png) ### Why are the changes needed? To show users what resources the executors have, such as GPUs, FPGAs, etc. ### Does this PR introduce any user-facing change? Yes, it introduces UI and REST API changes to show the resources. ### How was this patch tested? Unit tests and manual UI tests on yarn and standalone modes. Closes #25613 from tgravescs/SPARK-27489-gpu-ui-latest. Authored-by: Thomas Graves <tgraves@nvidia.com> Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>	2019-09-04 09:45:44 +08:00
Xiao Li	2856398de9	[SPARK-28961][HOT-FIX][BUILD] Upgrade Maven from 3.6.1 to 3.6.2 ### What changes were proposed in this pull request? This PR upgrades the Maven dependency from 3.6.1 to 3.6.2. ### Why are the changes needed? All the builds are broken because 3.6.1 is not available. http://ftp.wayne.edu/apache//maven/maven-3/ - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-maven-hadoop-3.2/485/ - https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Compile/job/spark-master-compile-maven-hadoop-2.7/10536/ ![image](https://user-images.githubusercontent.com/11567269/64196667-36d69100-ce39-11e9-8f93-40eb333d595d.png) ### Does this PR introduce any user-facing change? No ### How was this patch tested? N/A Closes #25665 from gatorsmile/upgradeMVN. Authored-by: Xiao Li <gatorsmile@gmail.com> Signed-off-by: Xiao Li <gatorsmile@gmail.com>	2019-09-03 11:06:57 -07:00
Andy Grove	35d4edffa2	[SPARK-28921][BUILD][K8S] Upgrade kubernetes client to 4.4.2 ### What changes were proposed in this pull request? Upgrade kubernetes client from 4.1.2 to 4.4.2 ### Why are the changes needed? To fix compatibility issue with EKS since Amazon rolled out some security patches over the past week; 1.15.3, 1.14.6, 1.13.10, 1.12.10, and 1.11.10. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Pass the Jenkins and manually test on EKS. Closes #25640 from andygrove/SPARK-28921. Authored-by: Andy Grove <andygrove73@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-09-02 16:50:58 -07:00
Dongjoon Hyun	560df0ea8e	[SPARK-28951][INFRA] Add release announce template ### What changes were proposed in this pull request? This PR adds a release announce template. ### Why are the changes needed? - We want to use a formal template including HTTPS in the future release. - The future release managers don't need to search mailing list to find this form. ### Does this PR introduce any user-facing change? No. ### How was this patch tested? N/A. Closes #25656 from dongjoon-hyun/SPARK-28951. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-09-02 14:55:05 -07:00
shane knapp	84d4f94596	[SPARK-28701][INFRA][FOLLOWUP] Fix the key error when looking in os.environ ### What changes were proposed in this pull request? i broke run-tests.py for non-PRB builds in this PR: https://github.com/apache/spark/pull/25423 ### Why are the changes needed? to fix what i broke ### Does this PR introduce any user-facing change? no ### How was this patch tested? the build system will test this Closes #25585 from shaneknapp/fix-run-tests. Authored-by: shane knapp <incomplete@gmail.com> Signed-off-by: shane knapp <incomplete@gmail.com>	2019-08-26 12:40:31 -07:00
shane knapp	13fd32c9a9	[SPARK-28701][TEST-HADOOP3.2][TEST-JAVA11][K8S] adding java11 support for pull request builds ## What changes were proposed in this pull request? we need to add the ability to test PRBs against java11. see comments here: https://github.com/apache/spark/pull/25405 ## How was this patch tested? the build system will test this. Closes #25423 from shaneknapp/spark-prb-java11. Authored-by: shane knapp <incomplete@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-08-27 00:48:01 +09:00
Dongjoon Hyun	6214b6a541	[SPARK-28868][INFRA] Specify Jekyll version to 3.8.6 in release docker image ### What changes were proposed in this pull request? This PR aims to specify the Jekyll version explicitly in our release docker image. ### Why are the changes needed? Jekyll 4.0 was recently released and dropped Ruby 2.3 support. This breaks our release docker image build. ``` Building native extensions. This could take a while... ERROR: Error installing jekyll: jekyll-sass-converter requires Ruby version >= 2.4.0. ``` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? The following should succeed. ``` $ docker build -t spark-rm:test --build-arg UID=501 dev/create-release/spark-rm ... Successfully tagged spark-rm:test ``` Closes #25578 from dongjoon-hyun/SPARK-28868. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-25 15:38:41 -07:00
Dongjoon Hyun	1fd7f290ab	[SPARK-28857][INFRA] Clean up the comments of PR template during merging ### What changes were proposed in this pull request? This PR aims to clean up the commit logs by removing the comments of our PR template. ### Why are the changes needed? The Apache Spark PR template has comments. Sometimes we forget to clean them up because GitHub hides them nicely. It would be great to clean this up. Otherwise, this makes the commit logs too verbose. (There are a few commits already.) ### Does this PR introduce any user-facing change? No. (only for committers) ### How was this patch tested? Manually with Python2/Python3. Closes #25564 from dongjoon-hyun/SPARK-28857. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-08-23 18:08:10 +09:00
Sean Owen	9ea37b09cf	[SPARK-17875][CORE][BUILD] Remove dependency on Netty 3 ### What changes were proposed in this pull request? Spark uses Netty 4 directly, but also includes Netty 3 only because transitive dependencies do. The dependencies (Hadoop HDFS, Zookeeper, Avro) don't seem to need this dependency as used in Spark. I think we can forcibly remove it to slim down the dependencies. Previous attempts were blocked by its usage in Flume, but that dependency has gone away. https://github.com/apache/spark/pull/15436 ### Why are the changes needed? Mostly to reduce the transitive dependency size and complexity a little bit and avoid triggering spurious security alerts on Netty 3.x usage. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Existing tests Closes #25544 from srowen/SPARK-17875. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-21 21:27:56 -07:00
Sean Owen	c9b49f3978	[SPARK-28737][CORE] Update Jersey to 2.29 ## What changes were proposed in this pull request? Update Jersey to 2.27+, ideally 2.29, for possible JDK 11 fixes. ## How was this patch tested? Existing tests. Closes #25455 from srowen/SPARK-28737. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-16 15:08:04 -07:00
Dongjoon Hyun	43101c7328	[SPARK-28758][BUILD][SQL] Upgrade Janino to 3.0.15 ### What changes were proposed in this pull request? This PR aims to upgrade `Janino` from `3.0.13` to `3.0.15` in order to bring the bug fixes. Please note that `3.1.0` is a major refactoring instead of bug fixes. We had better use `3.0.15` and wait for the stabler 3.1.x. ### Why are the changes needed? This brings the following bug fixes. 3.0.15 (2019-07-28) - Fix overloaded single static method import 3.0.14 (2019-07-05) - Conflict in sbt-assembly - Overloaded static on-demand imported methods cause a CompileException: Ambiguous static method import - Handle overloaded static on-demand imports - Major refactoring of the Java 8 and Java 9 retrofit mechanism - Added tests for "JLS8 8.6 Instance Initializers" and "JLS8 8.7 Static Initializers" - Local variables in instance initializers don't work - Provide an option to keep generated code files - Added compile error handler and warning handler to ICompiler ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Pass the Jenkins with the existing tests. Closes #25474 from dongjoon-hyun/SPARK-28758. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-16 11:33:02 -07:00
Fokko Driesprong	babdba0f9e	[SPARK-28728][BUILD] Bump Jackson Databind to 2.9.9.3 ## What changes were proposed in this pull request? Update Jackson databind to the latest version for some latest changes. ## How was this patch tested? Pass the Jenkins. Closes #25451 from Fokko/fd-bump-jackson-databind. Lead-authored-by: Fokko Driesprong <fokko@apache.org> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-16 03:40:41 -07:00
Dongjoon Hyun	f1d6b19de5	[SPARK-28720][BUILD][R] Update AppVeyor R version to 3.6.1 ## What changes were proposed in this pull request? R version 3.6.1 (Action of the Toes) was released on 2019-07-05. This PR aims to upgrade R installation for AppVeyor CI environment. ## How was this patch tested? Pass the AppVeyor CI. Closes #25441 from dongjoon-hyun/SPARK-28720. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: DB Tsai <d_tsai@apple.com>	2019-08-13 22:56:53 +00:00
WeichenXu	f21bc1874a	[SPARK-27889][INFRA] Make development scripts under dev/ support Python 3 ## What changes were proposed in this pull request? I audited and updated all dev scripts to support Python 3 (except `merge_spark_pr.py`, which was already updated). ## How was this patch tested? Manual. Closes #25289 from WeichenXu123/dev_py3. Authored-by: WeichenXu <weichen.xu@databricks.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-08-09 18:55:48 +09:00
Dongjoon Hyun	ae08387b4c	[SPARK-28616][INFRA] Improve merge-spark-pr script to warn WIP PRs and strip trailing dots ## What changes were proposed in this pull request? This PR aims to improve the `merge-spark-pr` script in the following two ways. 1. `[WIP]` is useful when we show that a PR is not ready for merge. Apache Spark allows merging `WIP` PRs. However, sometimes we accidentally forget to clean up the title for the completed PRs. We had better warn once more during the merging stage and get a confirmation from the committers. 2. We have two kinds of PR titles in terms of the ending period. This PR aims to remove the trailing `dot` since shorter is better in a commit title. Also, PR titles without the trailing `dot` are dominant in the Apache Spark commit logs. ``` $ git log --oneline | grep '[.]$' | wc -l 4090 $ git log --oneline | grep '[^.]$' | wc -l 20747 ``` ## How was this patch tested? Manual. ``` $ dev/merge_spark_pr.py git rev-parse --abbrev-ref HEAD Which pull request would you like to merge? (e.g. 34): 25157 The PR title has `[WIP]`: [WIP][SPARK-28396][SQL] Add PathCatalog for data source V2 Continue? (y/n): ``` ``` $ dev/merge_spark_pr.py git rev-parse --abbrev-ref HEAD Which pull request would you like to merge? (e.g. 34): 25304 I've re-written the title as follows to match the standard format: Original: [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API. Modified: [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API Would you like to use the modified title? (y/n): ``` Closes #25356 from dongjoon-hyun/SPARK-28616. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-04 21:23:54 -07:00
Dongjoon Hyun	0c6874fb37	[SPARK-28606][INFRA] Update CRAN key to recover docker image generation ## What changes were proposed in this pull request? CRAN repo changed the key and it causes our release script failure. This is a release blocker for Apache Spark 2.4.4 and 3.0.0. - https://cran.r-project.org/bin/linux/ubuntu/README.html ``` Err:1 https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 51716619E084DAB9 ... W: GPG error: https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 51716619E084DAB9 E: The repository 'https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/ InRelease' is not signed. ``` Note that they are reusing `cran35` for R 3.6 although they changed the key. ``` Even though R has moved to version 3.6, for compatibility the sources.list entry still uses the cran3.5 designation. ``` This PR aims to recover the docker image generation first. We will verify the R doc generation in a separate JIRA and PR. ## How was this patch tested? Manual. After `docker-build.log`, it should continue to the next stage, `Building v3.0.0-rc1`. ``` $ dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n -s docs ... Log file: docker-build.log Building v3.0.0-rc1; output will be at /tmp/spark-3.0.0/output ``` Closes #25339 from dongjoon-hyun/SPARK-28606. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: DB Tsai <d_tsai@apple.com>	2019-08-02 23:41:00 +00:00
HyukjinKwon	24c1bc2483	[SPARK-28586][INFRA] Make merge-spark-pr script compatible with Python 3 ## What changes were proposed in this pull request? This PR proposes to make `merge_spark_pr.py` script Python 3 compatible. ## How was this patch tested? Manually tested against my forked remote with the PR and JIRA below: https://github.com/apache/spark/pull/25321 https://github.com/apache/spark/pull/25286 https://issues.apache.org/jira/browse/SPARK-28153 Closes #25322 from HyukjinKwon/merge-script. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-08-01 10:17:17 -07:00
Wing Yew Poon	80ab19b9fd	[SPARK-26329][CORE] Faster polling of executor memory metrics. ## What changes were proposed in this pull request? Prior to this change, in an executor, on each heartbeat, memory metrics are polled and sent in the heartbeat. The heartbeat interval is 10s by default. With this change, in an executor, memory metrics can optionally be polled in a separate poller at a shorter interval. For each executor, we use a map of (stageId, stageAttemptId) to (count of running tasks, executor metric peaks) to track what stages are active as well as the per-stage memory metric peaks. When polling the executor memory metrics, we attribute the memory to the active stage(s), and update the peaks. In a heartbeat, we send the per-stage peaks (for stages active at that time), and then reset the peaks. The semantics would be that the per-stage peaks sent in each heartbeat are the peaks since the last heartbeat. We also keep a map of taskId to memory metric peaks. This tracks the metric peaks during the lifetime of the task. The polling thread updates this as well. At end of a task, we send the peak metric values in the task result. In case of task failure, we send the peak metric values in the `TaskFailedReason`. We continue to do the stage-level aggregation in the EventLoggingListener. For the driver, we still only poll on heartbeats. What the driver sends will be the current values of the metrics in the driver at the time of the heartbeat. This is semantically the same as before. ## How was this patch tested? Unit tests. Manually tested applications on an actual system and checked the event logs; the metrics appear in the SparkListenerTaskEnd and SparkListenerStageExecutorMetrics events. Closes #23767 from wypoon/wypoon_SPARK-26329. Authored-by: Wing Yew Poon <wypoon@cloudera.com> Signed-off-by: Imran Rashid <irashid@cloudera.com>	2019-08-01 09:09:46 -05:00
Dongjoon Hyun	a428f40669	[SPARK-28549][BUILD][CORE][SQL] Use `text.StringEscapeUtils` instead `lang3.StringEscapeUtils` ## What changes were proposed in this pull request? `org.apache.commons.lang3.StringEscapeUtils` was deprecated over two years ago at [LANG-1316](https://issues.apache.org/jira/browse/LANG-1316). There have been no bug fixes since then. ```java /** * <p>Escapes and unescapes {@code String}s for * Java, Java Script, HTML and XML.</p> * * <p>#ThreadSafe#</p> * @since 2.0 * @deprecated as of 3.6, use commons-text * <a href="https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/StringEscapeUtils.html"> * StringEscapeUtils</a> instead */ @Deprecated public class StringEscapeUtils { ``` This PR aims to use the latest one from the `commons-text` module, which has more bug fixes like [TEXT-100](https://issues.apache.org/jira/browse/TEXT-100), [TEXT-118](https://issues.apache.org/jira/browse/TEXT-118) and [TEXT-120](https://issues.apache.org/jira/browse/TEXT-120), by the following replacement. ```scala -import org.apache.commons.lang3.StringEscapeUtils +import org.apache.commons.text.StringEscapeUtils ``` This will add a new dependency to the `hadoop-2.7` profile distribution. In the `hadoop-3.2` profile, we already have it. ``` +commons-text-1.6.jar ``` ## How was this patch tested? Pass the Jenkins with the existing tests. - [Hadoop 2.7](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108281) - [Hadoop 3.2](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108282) Closes #25281 from dongjoon-hyun/SPARK-28549. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-07-29 11:45:29 +09:00
Dongjoon Hyun	33e6e4703d	[SPARK-28544][BUILD] Update zstd-jni to 1.4.2-1 ## What changes were proposed in this pull request? This PR aims to update `zstd-jni` library to bring the latest improvement and bug fixes in `1.4.1` and `1.4.2`. - https://github.com/facebook/zstd/releases/tag/v1.4.1 (4.5 ~ 11.8% performance improvement from v1.4.0 and bug fixes) - https://github.com/facebook/zstd/releases/tag/v1.4.2 (bug fixes) ## How was this patch tested? Pass the Jenkins. Closes #25275 from dongjoon-hyun/SPARK-28544. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-07-27 18:08:20 -07:00
Dongjoon Hyun	dbd0a2aa37	[SPARK-28511][INFRA] Get REV from RELEASE_VERSION instead of VERSION ## What changes were proposed in this pull request? Unlike the other versions, `x.x.0-SNAPSHOT` causes `x.x.-1`. Although this will not happen in the tags (there is no `SNAPSHOT` postfix), we had better fix this. ``` $ dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n Output directory already exists. Overwrite and continue? [y/n] y Branch [branch-2.4]: master Current branch version is 3.0.0-SNAPSHOT. Release [3.0.-1]: ``` Since we already have `RELEASE_VERSION`, which is `VERSION` with `SNAPSHOT` removed, this PR uses `RELEASE_VERSION` instead of `VERSION`. ``` $ dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n Branch [branch-2.4]: master Current branch version is 3.0.0-SNAPSHOT. Release [3.0.0]: ``` ## How was this patch tested? Manually do `dev/create-release/do-release-docker.sh -d /tmp/spark-3.0.0 -n` and see the default value of `Release`. Closes #25254 from dongjoon-hyun/SPARK-28511. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-07-25 10:54:24 -07:00
Dongjoon Hyun	cfca26e973	[SPARK-28496][INFRA] Use branch name instead of tag during dry-run ## What changes were proposed in this pull request? There are two cases when we use `dry run`. First, when the tag already exists, we can ask `confirmation` on the existing tag name. ``` $ dev/create-release/do-release-docker.sh -d /tmp/spark-2.4.4 -n -s docs Output directory already exists. Overwrite and continue? [y/n] y Branch [branch-2.4]: Current branch version is 2.4.4-SNAPSHOT. Release [2.4.4]: 2.4.3 RC # [1]: v2.4.3-rc1 already exists. Continue anyway [y/n]? y This is a dry run. Please confirm the ref that will be built for testing. Ref [v2.4.3-rc1]: ``` Second, when the tag doesn't exist, we had better ask `confirmation` on the branch name. If we do not change the default value, it will fail eventually. ``` $ dev/create-release/do-release-docker.sh -d /tmp/spark-2.4.4 -n -s docs Branch [branch-2.4]: Current branch version is 2.4.4-SNAPSHOT. Release [2.4.4]: RC # [1]: This is a dry run. Please confirm the ref that will be built for testing. Ref [v2.4.4-rc1]: ``` This PR improves the second case by providing the branch name instead. This helps the release testing before tagging. ## How was this patch tested? Manually do the following and check the default value of `Ref` field. ``` $ dev/create-release/do-release-docker.sh -d /tmp/spark-2.4.4 -n -s docs Branch [branch-2.4]: Current branch version is 2.4.4-SNAPSHOT. Release [2.4.4]: RC # [1]: This is a dry run. Please confirm the ref that will be built for testing. Ref [branch-2.4]: ... ``` Closes #25240 from dongjoon-hyun/SPARK-28496. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-07-24 14:20:25 -07:00
Liang-Chi Hsieh	591de42351	[SPARK-28381][PYSPARK] Upgraded version of Pyrolite to 4.30 ## What changes were proposed in this pull request? This upgrades to a newer version of Pyrolite. Most updates [1] in the newer version are for dotnet. For Java, it includes a bug fix to Unpickler regarding cleaning up the Unpickler memo, and support for protocol 5. After upgrading, we can remove the fix at SPARK-27629 for the bug in Unpickler. [1] https://github.com/irmen/Pyrolite/compare/pyrolite-4.23...master ## How was this patch tested? Manually ran the existing tests locally on Python 3.6. Closes #25143 from viirya/upgrade-pyrolite. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-07-15 12:29:58 +09:00
Dongjoon Hyun	13ae9ebb38	[SPARK-28354][INFRA] Use JIRA user name instead of JIRA user key ## What changes were proposed in this pull request? The `dev/merge_spark_pr.py` script always fails for some users because they have different `name` and `key`. - https://issues.apache.org/jira/rest/api/2/user?username=yumwang The JIRA client expects `name`, but we are using `key`. This PR fixes it. ```python # This is JIRA client code `/usr/local/lib/python2.7/site-packages/jira/client.py` def assign_issue(self, issue, assignee): """Assign an issue to a user. None will set it to unassigned. -1 will set it to Automatic. :param issue: the issue ID or key to assign :param assignee: the user to assign the issue to :type issue: int or str :type assignee: str :rtype: bool """ url = self._options['server'] + \ '/rest/api/latest/issue/' + str(issue) + '/assignee' payload = {'name': assignee} r = self._session.put( url, data=json.dumps(payload)) raise_on_error(r) return True ``` ## How was this patch tested? Manual with the committer ID/password. ```python import jira.client asf_jira = jira.client.JIRA({'server': 'https://issues.apache.org/jira'}, basic_auth=('yourid', 'passwd')) asf_jira.assign_issue("SPARK-28354", "q79969786") # This will raise exception. asf_jira.assign_issue("SPARK-28354", "yumwang") # This works. ``` Closes #25120 from dongjoon-hyun/SPARK-28354. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-07-12 18:44:29 +09:00
Yuming Wang	4ad0c33be4	[SPARK-28221][BUILD] Upgrade janino to 3.0.13 ## What changes were proposed in this pull request? Mainly change logs: ### Version 3.0.13: - Support for JDK 9/10 in Full Compiler - The syntax elements that can have modifiers now all have sets of "is...()" methods that check for each modifier. Some also have methods "getAccess()" and/or "getAnnotations()". - Implement "type annotations" (JLS8 9.7.4) - Implemented parsing (but not compilation) of "modular compilation units" (JLS11 7.3). - Replaced all "assert...Uncookable(..., Pattern messageRegex)" and "assert...Uncookable(..., String messageInfix)" method pairs with a single "assert...Uncookable(..., String messageRegex)" method. Minor refactoring: Allowed modifiers are now checked in the Parser, not in Java.*. This saves a lot of THROWS clauses. - Parse Type inference syntax: Type inference for generic instance creation implemented, test cases added. - Parse MethodReference, ClassInstanceCreationReference and ArrayCreationReference ### Version 3.0.12 - Fixed: Operator "&" not defined on types "java.lang.Long" and "int" - Major bug in JavaSourceClassLoader: When loading the second and following classes, CUs were compiled again, leading to an inconsistent class hierarchy. - Fixed: Java 9 added "Override public final CharBuffer CharBuffer.rewind() { ..." -- leads easily to a java.lang.NoSuchMethodError - Changed all occurences of the words "Java bytecode" to "JVM bytecode" to make clearer that the generated bytecode is for the JVMS and not suitable for, e.g. DALVIK. http://janino-compiler.github.io/janino/changelog.html ## How was this patch tested? Existing test Closes #25021 from wangyum/SPARK-28221. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-07-06 10:02:42 -07:00
Marcelo Vanzin	11e21cc17a	[SPARK-28187][BUILD] Add support for hadoop-cloud to the PR builder. Closes #24987 from vanzin/SPARK-28187. Authored-by: Marcelo Vanzin <vanzin@cloudera.com> Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>	2019-06-27 15:59:05 -07:00
Hyukjin Kwon	1d36b892ab	[SPARK-7721][INFRA][FOLLOW-UP] Remove cloned coverage repo after posting HTMLs ## What changes were proposed in this pull request? This PR proposes to remove the cloned `pyspark-coverage-site` repo. It doesn't look like a problem in the PR builder, but somehow it's problematic in `spark-master-test-sbt-hadoop-2.7`. ## How was this patch tested? Jenkins. Closes #23729 from HyukjinKwon/followup-coverage. Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org> Co-authored-by: shane knapp <incomplete@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-06-25 09:18:32 +09:00
Sean Owen	67042e90e7	[MINOR][BUILD] Exclude pyspark-coverage-site/ dir from RAT ## What changes were proposed in this pull request? Looks like a directory `pyspark-coverage-site/` is now (?) generated and fails RAT checks. It should just be excluded. See: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/6029/console ## How was this patch tested? N/A Closes #24950 from srowen/pysparkcoveragesite. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-06-24 14:07:41 -05:00
Dongjoon Hyun	ea0e119f84	[SPARK-28111][BUILD] Upgrade `xbean-asm7-shaded` to 4.14 ## What changes were proposed in this pull request? This PR aims to update `xbean-asm7-shaded` to bring [XBEAN-318](https://issues.apache.org/jira/browse/XBEAN-318) which is helpful to log the class definition reading failures. - https://issues.apache.org/jira/projects/XBEAN/versions/12345220 ## How was this patch tested? Pass the Jenkins. Closes #24914 from dongjoon-hyun/SPARK-28111. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-20 07:59:59 -07:00
Sean Owen	15462e1a8f	[SPARK-28004][UI] Update jquery to 3.4.1 ## What changes were proposed in this pull request? We're using an old-ish jQuery, 1.12.4, and should probably update for Spark 3 to keep up in general, but also to keep up with CVEs. In fact, we know of at least one resolved in only 3.4.0+ (https://nvd.nist.gov/vuln/detail/CVE-2019-11358). They may not affect Spark, but, if the update isn't painful, maybe worthwhile in order to make future 3.x updates easier. jQuery 1 -> 2 doesn't sound like a breaking change, as 2.0 is supposed to maintain compatibility with 1.9+ (https://blog.jquery.com/2013/04/18/jquery-2-0-released/) 2 -> 3 has breaking changes: https://jquery.com/upgrade-guide/3.0/. It's hard to evaluate each one, but the most likely area for problems is in ajax(). However, our usage of jQuery (and plugins) is pretty simple. Update jquery to 3.4.1; update jquery blockUI and mustache to latest ## How was this patch tested? Manual testing of docs build (except R docs), worker/master UI, spark application UI. Note: this really doesn't guarantee it works, as our tests can't test javascript, and this is merely anecdotal testing, although I clicked about every link I could find. There's a risk this breaks a minor part of the UI; it does seem to work fine in the main. Closes #24843 from srowen/SPARK-28004. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-14 22:19:20 -07:00
Dongjoon Hyun	fd8240d10c	[SPARK-28051][INFRA] Exposing JIRA issue component types at GitHub PRs ## What changes were proposed in this pull request? This PR aims to expose JIRA issue component types at GitHub PRs. ## How was this patch tested? Manual. ``` $ export GITHUB_OAUTH_KEY=... $ export JIRA_PASSWORD=... $ export GITHUB_API_BASE='https://api.github.com/repos/your-id/spark' $ dev/github_jira_sync.py ``` Please note that the existing script will raise the following exception if your repo has fewer than 100 PRs. This will be handled in #24874. ``` Traceback (most recent call last): File "dev/github_jira_sync.py", line 139, in <module> jira_prs = get_jira_prs() File "dev/github_jira_sync.py", line 83, in get_jira_prs link_header = filter(lambda k: k.startswith("Link"), page.info().headers)[0] IndexError: list index out of range ``` That is beyond the scope of this PR. Closes #24871 from dongjoon-hyun/SPARK-28051. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-14 20:36:45 -07:00
Dongjoon Hyun	7533cccc5d	[SPARK-28053][INFRA] Handle a corner case where there is no `Link` header ## What changes were proposed in this pull request? Currently, `github_jira_sync.py` assumes that a `Link` header is always present. However, it fails when the number of open PRs is less than 100 (the default page size). This does not happen in Apache Spark itself, but we had better fix it because it happens during the review process for the `github_jira_sync.py` script. ``` Traceback (most recent call last): File "dev/github_jira_sync.py", line 139, in <module> jira_prs = get_jira_prs() File "dev/github_jira_sync.py", line 83, in get_jira_prs link_header = filter(lambda k: k.startswith("Link"), page.info().headers)[0] IndexError: list index out of range ``` ## How was this patch tested? Manually check with another repo which has a small number of open PRs (< 100). ``` $ export JIRA_PASSWORD=... $ export GITHUB_API_BASE='https://api.github.com/repos/your-id/spark' $ dev/github_jira_sync.py ``` Closes #24874 from dongjoon-hyun/SPARK-28053. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-06-14 16:33:34 +09:00
Dongjoon Hyun	e5d95117e4	[SPARK-27979][BUILD][test-maven] Remove deprecated `--force` option in `build/mvn` and `run-tests.py` ## What changes were proposed in this pull request? This is a second try of #24824. Since Apache Spark 2.0.0, SPARK-14867 deprecated `--force` option and made it ignored. This PR cleans up the related code completely at 3.0.0. BEFORE (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests WARNING: '--force' is deprecated and ignored. ... ======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Phive-thriftserver -Phive -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end WARNING: '--force' is deprecated and ignored. ``` AFTER (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests ... 
======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pyarn -Pspark-ganglia-lgpl -Phive -Pkinesis-asl -Pmesos -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end ``` ## How was this patch tested? Manually check the Jenkins logs. Closes #24833 from dongjoon-hyun/SPARK-FORCE-2. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-10 18:40:46 -07:00
Dongjoon Hyun	742f805177	Revert "[SPARK-27979][BUILD][test-maven] Remove deprecated `--force` option in `build/mvn` and `run-tests.py`" This reverts commit `354ec254c5`.	2019-06-09 08:33:21 -07:00
Martin Junghanns	709387d660	[SPARK-27300][GRAPH] Add Spark Graph modules and dependencies ## What changes were proposed in this pull request? This PR introduces the necessary Maven modules for the new [Spark Graph](https://issues.apache.org/jira/browse/SPARK-25994) feature for Spark 3.0. * `spark-graph` is a parent module that users depend on to get all graph functionalities (Cypher and Graph Algorithms) * `spark-graph-api` defines the [Property Graph API](https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI) that is being shared between Cypher and Algorithms * `spark-cypher` contains a Cypher query engine implementation Both `spark-graph-api` and `spark-cypher` depend on Spark SQL. Note that the Maven module for Graph Algorithms is not part of this PR and will be introduced in https://issues.apache.org/jira/browse/SPARK-27302 A PoC for a running Cypher implementation can be found in this WIP PR https://github.com/apache/spark/pull/24297 ## How was this patch tested? Pass the Jenkins with all profiles and manually build and check the following. ``` $ ls assembly/target/scala-2.12/jars/spark-cypher* assembly/target/scala-2.12/jars/spark-cypher_2.12-3.0.0-SNAPSHOT.jar $ ls assembly/target/scala-2.12/jars/spark-graph* | grep -v graphx assembly/target/scala-2.12/jars/spark-graph-api_2.12-3.0.0-SNAPSHOT.jar assembly/target/scala-2.12/jars/spark-graph_2.12-3.0.0-SNAPSHOT.jar ``` Closes #24490 from s1ck/SPARK-27300. Lead-authored-by: Martin Junghanns <martin.junghanns@neotechnology.com> Co-authored-by: Max Kießling <max@kopfueber.org> Co-authored-by: Martin Junghanns <martin.junghanns@neo4j.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-09 00:26:26 -07:00
Dongjoon Hyun	354ec254c5	[SPARK-27979][BUILD][test-maven] Remove deprecated `--force` option in `build/mvn` and `run-tests.py` ## What changes were proposed in this pull request? Since Apache Spark 2.0.0, SPARK-14867 deprecated `--force` option and made it ignored. This PR cleans up the related code completely at 3.0.0. BEFORE (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests WARNING: '--force' is deprecated and ignored. ... ======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Phive-thriftserver -Phive -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end WARNING: '--force' is deprecated and ignored. ``` AFTER (Jenkins) ``` ======================================================================== Building Spark ======================================================================== [info] Building Spark using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pkinesis-asl -Pyarn -Pspark-ganglia-lgpl -Phive -Pmesos clean package -DskipTests ... 
======================================================================== Running Spark unit tests ======================================================================== [info] Running Spark tests using Maven with these arguments: -Phadoop-2.7 -Pkubernetes -Phive-thriftserver -Pyarn -Pspark-ganglia-lgpl -Phive -Pkinesis-asl -Pmesos -Dtest.exclude.tags=org.apache.spark.tags.ExtendedHiveTest,org.apache.spark.tags.ExtendedYarnTest test --fail-at-end ``` ## How was this patch tested? Manually check the Jenkins logs. Closes #24824 from dongjoon-hyun/SPARK-27979. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-06-08 08:17:12 -07:00
Yuming Wang	3f102a8229	[SPARK-27749][SQL] hadoop-3.2 support hive-thriftserver ## What changes were proposed in this pull request? This PR mainly makes the following changes to make `hadoop-3.2` support `sql/hive-thriftserver`: 1. Upgrade [`TCLIService.thrift`](https://github.com/apache/hive/blob/rel/release-2.3.5/service-rpc/if/TCLIService.thrift) and related code to Hive 2.3.5 because of [HIVE-12442](https://issues.apache.org/jira/browse/HIVE-12442) (note that we only migrate code without adding features, such as [HIVE-4924](https://issues.apache.org/jira/browse/HIVE-4924) and [HIVE-15473](https://issues.apache.org/jira/browse/HIVE-15473)). 2. Use slf4j as logging facade because of [HIVE-12237](https://issues.apache.org/jira/browse/HIVE-12237). 3. Port [HIVE-13169](https://issues.apache.org/jira/browse/HIVE-13169) to be compatible with Hive 2.3. ## How was this patch tested? Existing tests Closes #24628 from wangyum/SPARK-27749. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: gatorsmile <gatorsmile@gmail.com>	2019-06-05 08:40:05 -07:00
Izek Greenfield	c647f9011c	[SPARK-27862][BUILD] Move to json4s 3.6.6 ## What changes were proposed in this pull request? Move to json4s version 3.6.6 Add scala-xml 1.2.0 ## How was this patch tested? Pass the Jenkins Closes #24736 from igreenfield/master. Authored-by: Izek Greenfield <igreenfield@axiomsl.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-05-30 19:42:56 -05:00
Fokko Driesprong	bd87323003	[SPARK-27757][CORE] Bump Jackson to 2.9.9 ## What changes were proposed in this pull request? This fixes CVE-2019-12086 on Databind: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.9.9 ## How was this patch tested? Existing tests Closes #24646 from Fokko/SPARK-27757. Authored-by: Fokko Driesprong <fokko@apache.org> Signed-off-by: Sean Owen <sean.owen@databricks.com>	2019-05-30 09:35:20 -05:00
HyukjinKwon	90b6cda9af	[SPARK-25944][R][BUILD] AppVeyor change to latest R version (3.6.0) ## What changes were proposed in this pull request? R 3.6.0 was released on 2019-04-26. This PR changes the R version from 3.5.1 to 3.6.0 in AppVeyor. This PR sets `R_REMOTES_NO_ERRORS_FROM_WARNINGS` to `true` to avoid the errors below, which are converted from warnings: ``` Error in strptime(xx, f, tz = tz) : (converted from warning) unable to identify current timezone 'C': please set environment variable 'TZ' Error in i.p(...) : (converted from warning) installation of package 'praise' had non-zero exit status Calls: <Anonymous> ... with_rprofile_user -> with_envvar -> force -> force -> i.p Execution halted ``` ## How was this patch tested? AppVeyor Closes #24716 from HyukjinKwon/SPARK-27848. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-05-28 14:42:03 +09:00
Sean Owen	6c5827c723	[SPARK-27794][R][DOCS] Use https URL for CRAN repo ## What changes were proposed in this pull request? Use https URL for CRAN repo (and for a Scala download in a Dockerfile) ## How was this patch tested? Existing tests. Closes #24664 from srowen/SPARK-27794. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-22 14:28:21 -07:00
Sean Owen	eed6de1a65	[MINOR][DOCS] Tighten up some key links to the project and download pages to use HTTPS ## What changes were proposed in this pull request? Tighten up some key links to the project and download pages to use HTTPS ## How was this patch tested? N/A Closes #24665 from srowen/HTTPSURLs. Authored-by: Sean Owen <sean.owen@databricks.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-21 10:56:42 -07:00
HyukjinKwon	b7bf4fd123	[SPARK-27402][INFRA][FOLLOW-UP] Exclude 'hive-thriftserver' in modules to test for hadoop3.2 for now ## What changes were proposed in this pull request? This PR excludes 'hive-thriftserver' in modules to test for hadoop3.2 for now as well ## How was this patch tested? Manually tested via `run-tests.py` Closes #24644 from HyukjinKwon/SPARK-27402. Authored-by: HyukjinKwon <gurwls223@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-20 07:53:19 -07:00
Dongjoon Hyun	141a3bfc8d	[SPARK-27755][BUILD] Update zstd-jni to 1.4.0-1 ## What changes were proposed in this pull request? This PR aims to update `zstd-jni` library to `1.4.0-1` which improves the `level 1 compression speed` performance by 6% in most scenarios. The following is the full release note. - https://github.com/facebook/zstd/releases/tag/v1.4.0 ## How was this patch tested? Pass the Jenkins. Closes #24632 from dongjoon-hyun/SPARK-27755. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2019-05-17 08:34:45 -07:00
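
The SPARK-28053 entry above describes `github_jira_sync.py` failing with an `IndexError` when a GitHub response carries no `Link` header, which happens when all open PRs fit on one page. The following is a minimal sketch of a tolerant lookup, not the code from that patch: the function name `next_page_url` is hypothetical, and it assumes the headers are available as a plain dict keyed by `"Link"` and that URLs contain no commas.

```python
def next_page_url(headers):
    """Return the URL tagged rel="next" in a Link header, or None.

    Hypothetical helper for illustration; `headers` is assumed to be a
    plain dict of response headers with a case-sensitive "Link" key.
    """
    link = headers.get("Link")
    if not link:
        # No Link header: the result fits on a single page.
        return None
    # Each entry looks like: <https://...>; rel="next"
    for part in link.split(","):
        segments = part.split(";")
        url = segments[0].strip()
        rels = [s.strip() for s in segments[1:]]
        if 'rel="next"' in rels and url.startswith("<") and url.endswith(">"):
            return url[1:-1]
    # A Link header exists but has no "next" relation: this is the last page.
    return None
```

A paging loop can then stop as soon as the helper returns `None`, instead of indexing into a filtered header list that may be empty.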

733 commits