ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Kousuke Saruta	7b78d56f34	[SPARK-35870][BUILD] Upgrade Jetty to 9.4.42 ### What changes were proposed in this pull request? This PR upgrades Jetty to `9.4.42`. In the current master, `9.4.40` is used. `9.4.41` and `9.4.42` include the following updates. https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.41.v20210516 https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.42.v20210604 ### Why are the changes needed? Mainly for CVE-2021-28169. https://nvd.nist.gov/vuln/detail/CVE-2021-28169 This CVE probably has little effect on Spark, but we upgrade just in case. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? CI. Closes #33053 from sarutak/upgrade-jetty-9.4.42. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>	2021-06-25 03:32:32 +09:00
William Hyun	89dbf514f5	[SPARK-35850][BUILD] Upgrade scala-maven-plugin to 4.5.3 ### What changes were proposed in this pull request? This PR aims to upgrade the scala-maven-plugin version to 4.5.3. ### Why are the changes needed? This will upgrade `sbt-compiler-bridge` from 1.3.1 to 1.5.5 in order to bring the latest bug fixes. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. Closes #33007 from williamhyun/scalamvnplugin. Authored-by: William Hyun <william@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2021-06-21 21:35:42 -07:00
Gengliang Wang	74d647d2ca	[SPARK-35825][INFRA] Increase the heap and stack size for Maven build ### What changes were proposed in this pull request? Increase memory configuration for Maven build. Stack size: 64MB => 128MB Initial heap size: 1024MB => 2048MB Maximum heap size: 1024MB => 2048MB The SBT builds are OK, so let's keep their current configuration. ### Why are the changes needed? The Jenkins jobs are unstable due to stack overflow errors: https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2-jdk-11/ https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/2274/ ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Jenkins test Closes #32961 from gengliangwang/increaseXss. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2021-06-19 10:44:46 -07:00
David Christle	7fcb127674	[SPARK-35670][BUILD] Upgrade ZSTD-JNI to 1.5.0-2 ### What changes were proposed in this pull request? This PR aims to upgrade `zstd-jni` to 1.5.0-2, which uses `zstd` version 1.5.0. ### Why are the changes needed? Major improvements to Zstd support are targeted for the upcoming 3.2.0 release of Spark. Zstd 1.5.0 introduces significant compression (+25% to 140%) and decompression (~15%) speed improvements in benchmarks described in more detail on the releases page: - https://github.com/facebook/zstd/releases/tag/v1.5.0 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Build passes build tests, but the benchmark tests seem flaky. I am unsure if this change is responsible. The error is: ``` Running org.apache.spark.rdd.CoalescedRDDBenchmark: 21/06/08 18:53:10 ERROR SparkContext: Failed to add file:/home/runner/work/spark/spark/./core/target/scala-2.12/spark-core_2.12-3.2.0-SNAPSHOT-tests.jar to Spark environment java.lang.IllegalArgumentException: requirement failed: File spark-core_2.12-3.2.0-SNAPSHOT-tests.jar was already registered with a different path (old path = /home/runner/work/spark/spark/core/target/scala-2.12/spark-core_2.12-3.2.0-SNAPSHOT-tests.jar, new path = /home/runner/work/spark/spark/./core/target/scala-2.12/spark-core_2.12-3.2.0-SNAPSHOT-tests.jar ``` https://github.com/dchristle/spark/runs/2776123749?check_suite_focus=true cc: dongjoon-hyun Closes #32826 from dchristle/ZSTD150. Lead-authored-by: David Christle <dchristle@squareup.com> Co-authored-by: David Christle <dchristle@users.noreply.github.com> Co-authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2021-06-17 11:06:50 -07:00
Chao Sun	506ef9aad7	[SPARK-29250][BUILD] Upgrade to Hadoop 3.3.1 ### What changes were proposed in this pull request? This upgrades the default Hadoop version from 3.2.1 to 3.3.1. The changes here simply update the version number and the dependency file. ### Why are the changes needed? Hadoop 3.3.1 just came out, which comes with many client-side improvements such as for S3A/ABFS (20% faster when accessing S3). These are important for users who want to use Spark in a cloud environment. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Existing unit tests in Spark - Manually tested using my S3 bucket for event log dir: ``` bin/spark-shell \ -c spark.hadoop.fs.s3a.access.key=$AWS_ACCESS_KEY_ID \ -c spark.hadoop.fs.s3a.secret.key=$AWS_SECRET_ACCESS_KEY \ -c spark.eventLog.enabled=true -c spark.eventLog.dir=s3a://<my-bucket> ``` - Manually tested against docker-based YARN dev cluster, by running `SparkPi`. Closes #30135 from sunchao/SPARK-29250. Authored-by: Chao Sun <sunchao@apple.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2021-06-16 13:28:07 -07:00
Sumeet Gajjar	864ff67746	[SPARK-35429][CORE] Remove commons-httpclient from Hadoop-3.2 profile due to EOL and CVEs ### What changes were proposed in this pull request? Remove commons-httpclient as a direct dependency for Hadoop-3.2 profile. Hadoop-2.7 profile distribution still has it, hadoop-client has a compile dependency on commons-httpclient, thus we cannot remove it for Hadoop-2.7 profile. ``` [INFO] +- org.apache.hadoop:hadoop-client:jar:2.7.4:compile [INFO] \| +- org.apache.hadoop:hadoop-common:jar:2.7.4:compile [INFO] \| \| +- commons-cli:commons-cli:jar:1.2:compile [INFO] \| \| +- xmlenc:xmlenc:jar:0.52:compile [INFO] \| \| +- commons-httpclient:commons-httpclient:jar:3.1:compile ``` ### Why are the changes needed? Spark is pulling in commons-httpclient as a dependency directly. commons-httpclient went EOL years ago and there are most likely CVEs not being reported against it, thus we should remove it. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Existing unittests - Checked the dependency tree before and after introducing the changes Before: ``` ./build/mvn dependency:tree -Phadoop-3.2 \| grep -i "commons-httpclient" Using `mvn` from path: /usr/bin/mvn [INFO] +- commons-httpclient:commons-httpclient:jar:3.1:compile [INFO] \| +- commons-httpclient:commons-httpclient:jar:3.1:provided ``` After ``` ./build/mvn dependency:tree \| grep -i "commons-httpclient" Using `mvn` from path: /Users/sumeet.gajjar/cloudera/upstream-spark/build/apache-maven-3.6.3/bin/mvn ``` P.S. Reopening this since [spark upgraded](`463daabd5a`) its `hive.version` to `2.3.9` which does not have a dependency on `commons-httpclient`. Closes #32912 from sumeetgajjar/SPARK-35429. Authored-by: Sumeet Gajjar <sumeetgajjar93@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2021-06-15 14:43:30 -07:00
Gengliang Wang	62be22e929	[SPARK-35694][INFRA][FOLLOWUP] Increase the default JVM stack size of SBT/Maven ### What changes were proposed in this pull request? In https://github.com/apache/spark/pull/32838, we set the default JVM stack size to 16M from 4M. However, there are still stack overflow errors in builds: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139672/console Let's update the value to 64M. ### Why are the changes needed? Make test builds stable. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Manually triggered test builds. Closes #32879 from gengliangwang/increaseStackAgain. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>	2021-06-11 18:51:07 +09:00
Yuming Wang	463daabd5a	[SPARK-34512][BUILD][SQL] Upgrade built-in Hive to 2.3.9 ### What changes were proposed in this pull request? This pr upgrades built-in Hive to 2.3.9. Hive 2.3.9 changes: - [HIVE-17155] - findConfFile() in HiveConf.java has some issues with the conf path - [HIVE-24797] - Disable validate default values when parsing Avro schemas - [HIVE-24608] - Switch back to get_table in HMS client for Hive 2.3.x - [HIVE-21200] - Vectorization: date column throwing java.lang.UnsupportedOperationException for parquet - [HIVE-21563] - Improve Table#getEmptyTable performance by disabling registerAllFunctionsOnce - [HIVE-19228] - Remove commons-httpclient 3.x usage ### Why are the changes needed? Fix regression caused by AVRO-2035. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Unit test. Closes #32750 from wangyum/SPARK-34512. Authored-by: Yuming Wang <yumwang@ebay.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-06-10 20:44:35 -07:00
Gengliang Wang	0b5683a4d5	[SPARK-35694][INFRA] Increase the default JVM stack size of SBT/Maven ### What changes were proposed in this pull request? The Jenkins SBT/Maven builds keep failing with stack overflow errors: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139542 We should increase the JVM stack size to 16MB. Also, https://github.com/apache/spark/pull/32521 set the stack size to 256MB for the Java 11 build, which might be too big since every thread allocates this memory for its stack. This PR also sets it to 16MB to make the config consistent. ### Why are the changes needed? Fix SBT/Maven build. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Jenkins and GA tests. Closes #32838 from gengliangwang/increaseSBTStackSize. Authored-by: Gengliang Wang <gengliang@apache.org> Signed-off-by: Gengliang Wang <gengliang@apache.org>	2021-06-09 19:36:29 +08:00
Dongjoon Hyun	6f2ffccb5e	[SPARK-35660][BUILD][K8S] Upgrade kubernetes-client to 5.4.1 ### What changes were proposed in this pull request? This PR aims to upgrade kubernetes-client to 5.4.1. ### Why are the changes needed? This will bring a few bug fixes. - https://github.com/fabric8io/kubernetes-client/releases/tag/v5.4.1 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. Closes #32798 from dongjoon-hyun/SPARK-35660. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-06-06 22:27:08 -07:00
Kousuke Saruta	510bde460a	[SPARK-35655][BUILD] Upgrade HtmlUnit and its related artifacts to 2.50 ### What changes were proposed in this pull request? This PR upgrades HtmlUnit and its related artifacts to `2.50`. ### Why are the changes needed? We currently use 2.40, but 2.41+ seems to include a bunch of bug fixes and improvements, especially for JavaScript and CSS. https://htmlunit.sourceforge.io/changes-report.html#a2.50.0 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? All the `UISeleniumSuite` tests which use HtmlUnit pass on my laptop. Closes #32789 from sarutak/upgrade-htmlunit250. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Gengliang Wang <gengliang@apache.org>	2021-06-05 17:45:44 +08:00
yangjie01	4a549f2de2	[SPARK-35574][BUILD] Add a compile arg to turn compilation warnings related to `procedure syntax` to compilation errors in Scala 2.13 ### What changes were proposed in this pull request? There are several PRs that fix compilation warnings related to `procedure syntax`, such as SPARK-29291, SPARK-33352 and SPARK-35526. To prevent similar problems from recurring, this PR adds a compile arg that converts `procedure syntax` related compilation warnings to compilation errors in Scala 2.13. ### Why are the changes needed? Prevent the recurrence of compilation warnings related to `procedure syntax is deprecated`. ### Does this PR introduce _any_ user-facing change? `procedure syntax` is no longer allowed in Spark code with Scala 2.13: a constructor definition should be `this(...) = { }`, not `this(...) { }`, and a method without an explicit return type should be defined as `def methodName(...): Unit = {}`, not `def methodName(...) {}`. ### How was this patch tested? - Pass the GitHub Action Scala 2.13 job - Manual test: Do some code change like: ``` Index: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala =================================================================== -67,7 +67,7 private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock) extends SparkListener with ThreadSafeRpcEndpoint with Logging { - def this(sc: SparkContext) = { + def this(sc: SparkContext) { this(sc, new SystemClock) } Index: core/src/main/scala/org/apache/spark/MapOutputTracker.scala =================================================================== -720,7 +720,7 } } - def registerMergeResult(shuffleId: Int, reduceId: Int, status: MergeStatus): Unit = { + def registerMergeResult(shuffleId: Int, reduceId: Int, status: MergeStatus) { shuffleStatuses(shuffleId).addMergeResult(reduceId, status) } ``` sbt with Scala 2.13 profile compile failed as follows: ``` [error] 
/home/runner/work/spark/spark/core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala:70:29: procedure syntax is deprecated for constructors: add `=`, as in method definition [error] def this(sc: SparkContext) { [error] ^ [error] /home/runner/work/spark/spark/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:723:79: procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `registerMergeResult`'s return type [error] def registerMergeResult(shuffleId: Int, reduceId: Int, status: MergeStatus) { [error] ^ [error] two errors found [error] (core / Compile / compileIncremental) Compilation failed [error] Total time: 136 s (02:16), completed May 31, 2021 10:06:50 AM Error: Process completed with exit code 1. ``` maven with Scala 2.13 profile compile failed as follows: ``` [ERROR] [Error] /Users/yangjie01/SourceCode/git/spark-mine/core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala:70: procedure syntax is deprecated for constructors: add `=`, as in method definition [ERROR] [Error] /Users/yangjie01/SourceCode/git/spark-mine/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:723: procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `registerMergeResult`'s return type [ERROR] two errors found ``` Closes #32710 from LuciferYang/SPARK-35574. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>	2021-06-03 13:52:04 +09:00
Dongjoon Hyun	1a55019b1f	[SPARK-31168][BUILD][FOLLOWUP] Update scala-2.12 profile ### What changes were proposed in this pull request? This PR is a follow-up of https://github.com/apache/spark/pull/32697 to update the missed part. After SPARK-34774, we have Scala 2.12 version in `scala-2.12` profile. ### Why are the changes needed? To be consistent. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs and manual. BEFORE ``` $ build/mvn help:evaluate -Pscala-2.12 -Dexpression=scala.version \| grep "^2.12" Using `mvn` from path: /usr/local/bin/mvn 2.12.10 ``` AFTER ``` $ build/mvn help:evaluate -Pscala-2.12 -Dexpression=scala.version \| grep "^2.12" Using `mvn` from path: /usr/local/bin/mvn 2.12.14 ``` Closes #32707 from dongjoon-hyun/SPARK-31168-2. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-30 21:27:24 -07:00
yangjie01	ff27264ae5	[SPARK-35550][BUILD] Upgrade Jackson to 2.12.3 ### What changes were proposed in this pull request? This PR upgrades Jackson to 2.12.3. Jackson Release 2.12.3: [https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.12.3](https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.12.3) ### Why are the changes needed? Upgrade to a new version to bring in potential bug fixes like [https://github.com/FasterXML/jackson-modules-java8/issues/207](https://github.com/FasterXML/jackson-modules-java8/issues/207). Also, Avro's master has been upgraded to Jackson 2.12.3. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass the Jenkins or GitHub Action Closes #32688 from LuciferYang/SPARK-35550. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>	2021-05-31 10:28:43 +09:00
Dongjoon Hyun	6c4b60f3b3	[SPARK-31168][BUILD] Upgrade Scala to 2.12.14 ### What changes were proposed in this pull request? This PR is the 4th try to upgrade Scala 2.12.x in order to see the feasibility. - https://github.com/apache/spark/pull/27929 (Upgrade Scala to 2.12.11, wangyum ) - https://github.com/apache/spark/pull/30940 (Upgrade Scala to 2.12.12, viirya ) - https://github.com/apache/spark/pull/31223 (Upgrade Scala to 2.12.13, dongjoon-hyun ) Note that Scala 2.12.14 has the following fix for the Apache Spark community. - Fix cyclic error in runtime reflection (protobuf), a regression that prevented Spark upgrading to 2.12.13 REQUIREMENTS: - [x] `silencer` library is released via https://github.com/ghik/silencer/pull/66 - [x] `genjavadoc` library is released via https://github.com/lightbend/genjavadoc/issues/282 ### Why are the changes needed? Apache Spark was stuck on 2.12.10 due to the regression in Scala 2.12.11/2.12.12/2.12.13. This will bring all the bug fixes. - https://github.com/scala/scala/releases/tag/v2.12.14 - https://github.com/scala/scala/releases/tag/v2.12.13 - https://github.com/scala/scala/releases/tag/v2.12.12 - https://github.com/scala/scala/releases/tag/v2.12.11 ### Does this PR introduce _any_ user-facing change? Yes, but this is a bug-fix version. ### How was this patch tested? Pass the CIs. Closes #32697 from dongjoon-hyun/SPARK-31168. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-30 16:08:13 -07:00
Kousuke Saruta	116a97e153	[SPARK-35501][SQL][TESTS] Add a feature for removing pulled container image for docker integration tests ### What changes were proposed in this pull request? This PR adds a feature for removing the pulled container image after each docker integration test finishes. This feature is enabled by the new property `spark.tes.docker.removePulledImage`. ### Why are the changes needed? For idempotency. I'm trying to add docker integration tests to GA in SPARK-35483 (#32631) but I noticed that `jdbc.OracleIntegrationSuite` consistently fails (https://github.com/sarutak/spark/runs/2646707235?check_suite_focus=true). I investigated and found that the host on GA runs short of storage capacity. ``` ORACLE PASSWORD FOR SYS AND SYSTEM: oracle The location '/opt/oracle' specified for database files has insufficient space. Database creation needs at least '4.5GB' disk space. Specify a different database file destination that has enough space in the configuration file '/etc/sysconfig/oracle-xe-18c.conf'. mv: cannot stat '/opt/oracle/product/18c/dbhomeXE/dbs/spfileXE.ora': No such file or directory mv: cannot stat '/opt/oracle/product/18c/dbhomeXE/dbs/orapwXE': No such file or directory ORACLE_HOME = [/home/oracle] ? ORACLE_BASE environment variable is not being set since this information is not available for the current user ID . You can set ORACLE_BASE manually if it is required. Resetting ORACLE_BASE to its previous value or ORACLE_HOME The Oracle base remains unchanged with value /opt/oracle ##################################### ########### E R R O R ############### DATABASE SETUP WAS NOT SUCCESSFUL! Please check output for further info! 
########### E R R O R ############### ##################################### The following output is now a tail of the alert.log: tail: cannot open '/opt/oracle/diag/rdbms///trace/alert.log' for reading: No such file or directory tail: no files remaining ``` With this feature, the pulled container image is removed, which preserves enough capacity for `jdbc.OracleIntegrationSuite` in GA. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? I confirmed the following things. * A container image which is absent in the local repository is removed after the test finishes if `spark.test.container.removePulledImage` is `true`. * A container image which is present in the local repository is not removed after the test finishes even if `spark.test.container.removePulledImage` is `true`. * A container image is not removed regardless of presence of the container image in the local repository even if `spark.test.container.removePulledImage` is `true`. Closes #32652 from sarutak/docker-image-rm. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>	2021-05-26 17:24:29 +09:00
Vinod KC	e3c6907c99	[SPARK-35490][BUILD] Update json4s to 3.7.0-M11 ### What changes were proposed in this pull request? This PR aims to upgrade json4s from 3.7.0-M5 to 3.7.0-M11 Note: json4s version greater than 3.7.0-M11 is not binary compatible with Spark third party jars ### Why are the changes needed? Multiple defect fixes and improvements like https://github.com/json4s/json4s/issues/750 https://github.com/json4s/json4s/issues/554 https://github.com/json4s/json4s/issues/715 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Ran with the existing UTs Closes #32636 from vinodkc/br_build_upgrade_json4s. Authored-by: Vinod KC <vinod.kc.in@gmail.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>	2021-05-26 11:10:14 +03:00
Vinod KC	4ba1db91f0	[SPARK-35513][BUILD] Update joda-time to 2.10.10 ### What changes were proposed in this pull request? This PR aims to upgrade joda-time from 2.10.5 to 2.10.10 ### Why are the changes needed? Improvement and bug fixes in joda-time https://www.joda.org/joda-time/changes-report.html#a2.10.10 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Ran with the existing UTs Closes #32661 from vinodkc/br_build_upgrade_joda_time. Authored-by: Vinod KC <vinod.kc.in@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-25 11:29:03 -07:00
Vinod KC	d5868ebc39	[SPARK-35492][BUILD] Upgrade httpcore to 4.4.14 ### What changes were proposed in this pull request? This PR aims to upgrade Apache HttpCore from 4.4.12 to 4.4.14. ### Why are the changes needed? Stability improvements in httpcore 4.4.14 - Bug fix: Non-blocking TLSv1.3 connections can end up in an infinite event spin when closed concurrently by the local and the remote endpoints. - HTTPCORE-647: Non-blocking connection terminated due to 'java.io.IOException: Broken pipe' can enter an infinite loop flushing buffered output data. - PR #201, HTTPCORE-634: Fix race condition in AbstractConnPool that can cause internal state corruption - HTTPCORE-612: DefaultConnectionReuseStrategy incorrectly used int to represent Content-Length value instead of long ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? With Jenkins Tests Closes #32638 from vinodkc/br_build_upgrade_httpcore. Authored-by: Vinod KC <vinod.kc.in@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-23 08:16:50 -07:00
Dongjoon Hyun	fa424ac2b8	[SPARK-35489][BUILD] Upgrade ORC to 1.6.8 ### What changes were proposed in this pull request? This PR aims to upgrade ORC to 1.6.8. ### Why are the changes needed? This will bring the latest bug fixes. - https://orc.apache.org/news/2021/05/21/ORC-1.6.8/ ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the existing CIs. Closes #32635 from dongjoon-hyun/SPARK-35489. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-22 10:35:40 -07:00
Vinod KC	003294ce1d	[SPARK-35488][BUILD] Upgrade ASM to 7.3.1 ### What changes were proposed in this pull request? This PR aims to upgrade ASM to 7.3.1. - https://issues.apache.org/jira/browse/XBEAN-323 - https://asm.ow2.io/versions.html ### Why are the changes needed? ASM 7.3.1 brings the following changes - new V15 constant - experimental support for PermittedSubtypes and RecordComponent - bug fixes - - 317885: SKIP_DEBUG now skips MethodParameters attributes ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Ran with the existing UTs Closes #32634 from vinodkc/br_build_upgrade_asm. Authored-by: Vinod KC <vinod.kc.in@gmail.com> Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>	2021-05-23 02:33:15 +09:00
Kousuke Saruta	6bd6e46aec	[SPARK-35487][BUILD] Upgrade dropwizard metrics to 4.2.0 ### What changes were proposed in this pull request? This PR upgrades Dropwizard metrics to 4.2.0. I also modified the corresponding links in `docs/monitoring.md`. ### Why are the changes needed? The latest version was released last week and it contains some improvements. https://github.com/dropwizard/metrics/releases/tag/v4.2.0 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Build succeeds and all the modified links are reachable. Closes #32628 from sarutak/upgrade-dropwizard-4.2.0. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-21 22:53:32 -07:00
Dongjoon Hyun	3757c1803d	[SPARK-35462][BUILD][K8S] Upgrade Kubernetes-client to 5.4.0 to support K8s 1.21 models ### What changes were proposed in this pull request? This PR aims to upgrade `kubernetes-client` from 5.3.1 to 5.4.0 to support K8s 1.21 models officially. ### Why are the changes needed? `kubernetes-client` 5.4.0 has `Kubernetes Model v1.21.0` - https://github.com/fabric8io/kubernetes-client/releases/tag/v5.4.0 ### Does this PR introduce _any_ user-facing change? No. This is a dev-only change. ### How was this patch tested? Pass the CIs including Jenkins K8s IT. - https://github.com/apache/spark/pull/32612#issuecomment-845456039 I tested K8s IT with the following versions. - minikube version: v1.20.0 - K8s Client Version: v1.21.0 - Server Version: v1.21.0 ``` KubernetesSuite: - Run SparkPi with no resources - Run SparkPi with a very long application name. - Use SparkLauncher.NO_RESOURCE - Run SparkPi with a master URL without a scheme. - Run SparkPi with an argument. - Run SparkPi with custom labels, annotations, and environment variables. - All pods have the same service account by default - Run extraJVMOptions check on driver - Run SparkRemoteFileTest using a remote data file - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j.properties - Run SparkPi with env and mount secrets. - Run PySpark on simple pi.py example - Run PySpark to test a pyfiles example - Run PySpark with memory customization - Run in client mode. 
- Start pod creation from template - Launcher client dependencies - SPARK-33615: Launcher client archives - SPARK-33748: Launcher python client respecting PYSPARK_PYTHON - SPARK-33748: Launcher python client respecting spark.pyspark.python and spark.pyspark.driver.python - Launcher python client dependencies using a zip file - Test basic decommissioning - Test basic decommissioning with shuffle cleanup - Test decommissioning with dynamic allocation & shuffle cleanups - Test decommissioning timeouts - Run SparkR on simple dataframe.R example Run completed in 17 minutes, 18 seconds. Total number of tests run: 26 Suites: completed 2, aborted 0 Tests: succeeded 26, failed 0, canceled 0, ignored 0, pending 0 All tests passed. ``` Closes #32612 from dongjoon-hyun/SPARK-35462. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-20 14:34:58 -07:00
Bo Zhang	e170e63955	[SPARK-35457][BUILD] Bump ANTLR runtime version to 4.8 ### What changes were proposed in this pull request? This PR changes the antlr4-runtime version from 4.8-1 to 4.8. ### Why are the changes needed? Version 4.8 is the official release version, with proper release notes (see https://github.com/antlr/antlr4/releases) and artifacts listed in https://www.antlr.org/download/index.html. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Will rely on tests in the PR. Closes #32603 from bozhang2820/antlr-4.8. Authored-by: Bo Zhang <bo.zhang@databricks.com> Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>	2021-05-20 17:24:40 +09:00
Kousuke Saruta	8c70c17545	[SPARK-35434][BUILD] Upgrade scalatestplus artifacts to 3.2.9.0 ### What changes were proposed in this pull request? This PR upgrades the scalatestplus artifacts and scalacheck. ### Why are the changes needed? scalatestplus artifacts Spark uses are two years old and these artifacts are currently renamed. So, let's follow up. Also, the latest releases seem to support Scala 3. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? GA passed on my repository. Closes #32581 from sarutak/upgrade-scalatestplus. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-18 09:33:11 -07:00
Dongjoon Hyun	dd5464976f	[SPARK-35394][K8S][BUILD] Move kubernetes-client.version to root pom file ### What changes were proposed in this pull request? This PR aims to unify two K8s version variables in two `pom.xml`s into one. `kubernetes-client.version` is correct because the artifact ID is `kubernetes-client`. ``` kubernetes.client.version (kubernetes/core module) kubernetes-client.version (kubernetes/integration-test module) ``` ### Why are the changes needed? Having two variables for the same value is confusing and inconvenient when we upgrade K8s versions. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. (The compilation test passes are enough.) Closes #32531 from dongjoon-hyun/SPARK-35394. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-13 00:40:53 -07:00
Ludovic Henry	b52d47a920	[SPARK-35295][ML] Replace fully com.github.fommil.netlib by dev.ludovic.netlib:2.0 ### What changes were proposed in this pull request? Bump to `dev.ludovic.netlib:2.0`, which provides JNI-based wrappers for BLAS, ARPACK, and LAPACK. These do not take dependencies on GPL or LGPL libraries, which allows out-of-the-box support for hardware acceleration when a native library is present (it is still up to the end user to install such a library on their system, e.g. OpenBLAS, Intel MKL, or libarpack2). ### Why are the changes needed? Great performance improvement for ML-related workloads on vanilla distributions of Spark. ### Does this PR introduce _any_ user-facing change? Users now take advantage of hardware acceleration as long as a native library is installed (like OpenBLAS, Intel MKL and libarpack2). ### How was this patch tested? Spark test-suite + dev.ludovic.netlib testsuite. #### JDK8: ``` [info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java8BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 220 226 6 454.9 2.2 1.0X [info] java 221 228 5 451.9 2.2 1.0X [info] native 209 215 5 478.7 2.1 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 121 125 3 823.3 1.2 1.0X [info] java 121 125 3 824.3 1.2 1.0X [info] native 101 105 3 988.4 1.0 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
----------------------------------------------------------------------------------------------- [info] f2j 212 219 6 470.9 2.1 1.0X [info] java 208 212 4 481.0 2.1 1.0X [info] native 209 215 5 478.5 2.1 1.0X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 114 119 3 878.9 1.1 1.0X [info] java 99 105 3 1011.4 1.0 1.2X [info] native 97 103 3 1026.7 1.0 1.2X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 111 2 925.9 1.1 1.0X [info] java 71 73 2 1414.9 0.7 1.5X [info] native 54 56 2 1847.0 0.5 2.0X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 2 1046.8 1.0 1.0X [info] java 47 48 1 2129.8 0.5 2.0X [info] native 29 30 1 3404.7 0.3 3.3X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 139 143 2 718.2 1.4 1.0X [info] java 46 47 1 2171.2 0.5 3.0X [info] native 44 46 2 2261.8 0.4 3.1X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 154 157 4 651.0 1.5 1.0X [info] java 40 42 1 2469.3 0.4 3.8X [info] native 26 27 1 3787.6 0.3 5.8X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 185 195 8 541.0 1.8 1.0X [info] java 186 196 7 538.5 1.9 1.0X [info] native 177 
187 7 564.1 1.8 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 98 102 3 1016.2 1.0 1.0X [info] java 98 102 3 1017.8 1.0 1.0X [info] native 87 91 3 1143.2 0.9 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 68 70 1 1474.7 0.7 1.0X [info] java 51 52 1 1973.0 0.5 1.3X [info] native 30 32 1 3298.8 0.3 2.2X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 99 2 1037.9 1.0 1.0X [info] java 50 51 1 1999.6 0.5 1.9X [info] native 30 31 1 3368.1 0.3 3.2X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 59 61 1 1688.7 0.6 1.0X [info] java 41 42 1 2461.9 0.4 1.5X [info] native 15 16 1 6593.0 0.2 3.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 90 92 1 1116.2 0.9 1.0X [info] java 39 40 1 2565.8 0.4 2.3X [info] native 15 16 1 6594.2 0.2 5.9X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 192 202 7 520.5 1.9 1.0X [info] java 203 214 7 491.9 2.0 0.9X [info] native 176 187 7 568.8 1.8 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
----------------------------------------------------------------------------------------------- [info] f2j 59 61 1 846.1 1.2 1.0X [info] java 38 39 1 1313.5 0.8 1.6X [info] native 24 27 1 2047.8 0.5 2.4X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 97 101 3 515.4 1.9 1.0X [info] java 97 101 2 515.1 1.9 1.0X [info] native 88 91 3 569.1 1.8 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 169 174 3 295.4 3.4 1.0X [info] java 169 174 3 295.4 3.4 1.0X [info] native 160 165 4 312.2 3.2 1.1X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 561 577 13 1782.3 0.6 1.0X [info] java 225 231 4 4446.2 0.2 2.5X [info] native 31 32 3 32473.1 0.0 18.2X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 570 584 9 1754.8 0.6 1.0X [info] java 224 230 4 4457.3 0.2 2.5X [info] native 31 32 1 32493.4 0.0 18.5X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 855 866 6 1169.2 0.9 1.0X [info] java 224 228 3 4466.9 0.2 3.8X [info] native 31 32 1 32395.5 0.0 27.7X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1328 1344 8 752.8 1.3 1.0X [info] java 224 
230 4 4458.9 0.2 5.9X [info] native 31 32 1 32201.8 0.0 42.8X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 534 541 5 1873.0 0.5 1.0X [info] java 220 224 3 4542.8 0.2 2.4X [info] native 15 16 1 66803.1 0.0 35.7X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 544 551 6 1839.6 0.5 1.0X [info] java 220 224 4 4538.2 0.2 2.5X [info] native 15 16 1 65589.9 0.0 35.7X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 833 845 21 1201.0 0.8 1.0X [info] java 220 224 3 4548.7 0.2 3.8X [info] native 15 16 1 66603.2 0.0 55.5X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 899 907 5 1112.9 0.9 1.0X [info] java 221 224 2 4531.6 0.2 4.1X [info] native 15 16 1 65944.9 0.0 59.3X ``` #### JDK11: ``` [info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java11BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 195 200 3 512.2 2.0 1.0X [info] java 197 202 3 507.0 2.0 1.0X [info] native 184 189 4 543.0 1.8 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) 
Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 112 3 921.8 1.1 1.0X [info] java 101 105 3 989.4 1.0 1.1X [info] native 87 91 3 1147.1 0.9 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 187 191 3 535.1 1.9 1.0X [info] java 182 188 3 548.8 1.8 1.0X [info] native 178 182 3 562.2 1.8 1.1X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 110 114 3 909.3 1.1 1.0X [info] java 86 93 4 1159.3 0.9 1.3X [info] native 86 90 3 1162.4 0.9 1.3X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 106 108 2 943.6 1.1 1.0X [info] java 70 71 2 1426.8 0.7 1.5X [info] native 54 56 2 1835.4 0.5 1.9X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 1 1047.1 1.0 1.0X [info] java 43 44 1 2331.9 0.4 2.2X [info] native 29 30 1 3392.1 0.3 3.2X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 114 115 2 880.7 1.1 1.0X [info] java 42 43 1 2398.1 0.4 2.7X [info] native 45 46 1 2233.3 0.4 2.5X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 140 143 2 714.6 1.4 1.0X [info] java 28 29 1 3531.0 0.3 4.9X 
[info] native 26 27 1 3820.0 0.3 5.3X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 156 166 7 641.3 1.6 1.0X [info] java 158 167 6 633.2 1.6 1.0X [info] native 150 160 7 664.8 1.5 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 85 88 2 1181.7 0.8 1.0X [info] java 85 88 2 1176.0 0.9 1.0X [info] native 75 78 2 1333.2 0.8 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 58 59 1 1731.1 0.6 1.0X [info] java 41 43 1 2415.5 0.4 1.4X [info] native 30 31 1 3293.9 0.3 1.9X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 94 96 1 1063.4 0.9 1.0X [info] java 41 42 1 2435.8 0.4 2.3X [info] native 30 30 1 3379.8 0.3 3.2X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 44 45 1 2278.9 0.4 1.0X [info] java 37 38 0 2686.8 0.4 1.2X [info] native 15 16 1 6555.4 0.2 2.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 88 89 1 1142.1 0.9 1.0X [info] java 33 34 1 3010.7 0.3 2.6X [info] native 15 16 1 6553.9 0.2 5.7X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
----------------------------------------------------------------------------------------------- [info] f2j 164 172 4 609.4 1.6 1.0X [info] java 163 172 5 612.6 1.6 1.0X [info] native 150 159 4 667.0 1.5 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 49 50 1 1029.4 1.0 1.0X [info] java 41 42 1 1209.4 0.8 1.2X [info] native 25 27 1 2029.2 0.5 2.0X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 80 85 3 622.2 1.6 1.0X [info] java 80 85 3 622.4 1.6 1.0X [info] native 75 79 3 668.7 1.5 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 137 142 3 364.1 2.7 1.0X [info] java 139 142 2 360.4 2.8 1.0X [info] native 131 135 3 380.4 2.6 1.0X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 517 525 5 1935.5 0.5 1.0X [info] java 213 216 3 4704.8 0.2 2.4X [info] native 31 31 1 32705.6 0.0 16.9X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 589 601 6 1698.6 0.6 1.0X [info] java 213 217 3 4693.3 0.2 2.8X [info] native 31 32 1 32498.9 0.0 19.1X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 851 865 6 1175.3 0.9 1.0X [info] java 212 216 3 
4717.0 0.2 4.0X [info] native 30 32 1 32903.0 0.0 28.0X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1301 1316 6 768.4 1.3 1.0X [info] java 212 216 2 4717.4 0.2 6.1X [info] native 31 32 1 32606.0 0.0 42.4X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 454 460 2 2203.0 0.5 1.0X [info] java 208 212 3 4803.8 0.2 2.2X [info] native 15 16 0 66586.0 0.0 30.2X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 529 536 4 1889.7 0.5 1.0X [info] java 208 212 3 4798.6 0.2 2.5X [info] native 15 16 1 66751.4 0.0 35.3X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 830 840 5 1205.1 0.8 1.0X [info] java 208 211 2 4814.1 0.2 4.0X [info] native 15 15 1 67676.4 0.0 56.2X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 894 907 7 1118.7 0.9 1.0X [info] java 208 211 3 4809.6 0.2 4.3X [info] native 15 16 1 66675.2 0.0 59.6X ``` #### JDK16: ``` [info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.VectorBLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative 
[info] ----------------------------------------------------------------------------------------------- [info] f2j 193 199 3 517.5 1.9 1.0X [info] java 181 186 4 553.2 1.8 1.1X [info] native 181 185 5 553.6 1.8 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 112 2 925.1 1.1 1.0X [info] java 88 91 3 1138.6 0.9 1.2X [info] native 87 91 3 1144.2 0.9 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 184 189 3 542.5 1.8 1.0X [info] java 181 185 3 552.8 1.8 1.0X [info] native 179 183 2 558.0 1.8 1.0X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 97 101 3 1031.6 1.0 1.0X [info] java 86 90 2 1163.7 0.9 1.1X [info] native 85 88 2 1182.9 0.8 1.1X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 107 109 2 932.4 1.1 1.0X [info] java 54 56 2 1846.7 0.5 2.0X [info] native 54 56 2 1846.7 0.5 2.0X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 1 1043.6 1.0 1.0X [info] java 29 30 1 3439.3 0.3 3.3X [info] native 29 30 1 3423.9 0.3 3.3X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 121 123 2 829.8 1.2 1.0X [info] java 32 32 1 3171.3 0.3 3.8X [info] 
native 45 46 1 2246.2 0.4 2.7X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 142 144 2 705.9 1.4 1.0X [info] java 15 16 1 6585.8 0.2 9.3X [info] native 26 27 1 3839.5 0.3 5.4X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 157 165 5 635.6 1.6 1.0X [info] java 151 159 5 664.0 1.5 1.0X [info] native 151 160 5 663.6 1.5 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 85 89 2 1172.3 0.9 1.0X [info] java 75 79 3 1337.3 0.7 1.1X [info] native 75 79 2 1335.5 0.7 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 58 59 1 1731.5 0.6 1.0X [info] java 28 29 1 3544.2 0.3 2.0X [info] native 30 31 1 3306.2 0.3 1.9X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 90 92 1 1108.3 0.9 1.0X [info] java 28 28 1 3622.5 0.3 3.3X [info] native 30 31 1 3381.3 0.3 3.1X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 44 45 1 2284.7 0.4 1.0X [info] java 14 15 1 7034.0 0.1 3.1X [info] native 15 16 1 6643.7 0.2 2.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
----------------------------------------------------------------------------------------------- [info] f2j 85 86 1 1177.4 0.8 1.0X [info] java 15 15 1 6886.1 0.1 5.8X [info] native 15 16 1 6560.1 0.2 5.6X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 164 173 6 608.1 1.6 1.0X [info] java 148 157 5 675.2 1.5 1.1X [info] native 152 160 5 659.9 1.5 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 61 63 1 815.4 1.2 1.0X [info] java 16 17 1 3104.3 0.3 3.8X [info] native 24 27 1 2071.9 0.5 2.5X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 81 85 2 616.4 1.6 1.0X [info] java 81 85 2 614.7 1.6 1.0X [info] native 75 78 2 669.5 1.5 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 138 141 3 362.7 2.8 1.0X [info] java 137 140 2 365.3 2.7 1.0X [info] native 131 134 2 382.9 2.6 1.1X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 525 544 8 1906.2 0.5 1.0X [info] java 61 68 3 16358.1 0.1 8.6X [info] native 31 32 1 32623.7 0.0 17.1X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 580 598 12 1724.5 0.6 1.0X [info] java 61 68 4 16302.5 0.1 9.5X 
[info] native 30 32 1 32962.8 0.0 19.1X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 829 838 4 1206.2 0.8 1.0X [info] java 61 69 3 16339.7 0.1 13.5X [info] native 30 31 1 33231.9 0.0 27.6X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1352 1363 5 739.6 1.4 1.0X [info] java 61 69 3 16347.0 0.1 22.1X [info] native 31 32 1 32740.3 0.0 44.3X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 482 493 7 2073.1 0.5 1.0X [info] java 35 38 2 28315.3 0.0 13.7X [info] native 15 15 1 67579.7 0.0 32.6X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 472 482 4 2119.0 0.5 1.0X [info] java 36 38 2 28138.1 0.0 13.3X [info] native 15 16 1 66616.5 0.0 31.4X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 823 830 5 1215.2 0.8 1.0X [info] java 35 38 2 28681.4 0.0 23.6X [info] native 15 15 1 67908.4 0.0 55.9X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 896 908 7 1115.8 0.9 1.0X [info] java 35 38 2 28402.0 0.0 25.5X [info] native 15 16 0 66691.2 0.0 59.8X ``` TODO: - [x] update documentation in `docs/` and `docs/ml-linalg-guide.md` refering 
`com.github.fommil.netlib` - [ ] merge https://github.com/luhenry/netlib/pull/1 with all feedback from this PR + remove references to snapshot repositories in `pom.xml` and `project/SparkBuild.scala`. Closes #32415 from luhenry/master. Authored-by: Ludovic Henry <git@ludovic.dev> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-05-12 08:59:36 -05:00
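The derived columns in the benchmark tables above can be cross-checked by hand. The figures are consistent with `Relative` being each row's `Rate(M/s)` divided by the `f2j` row's rate, and with `Per Row(ns)` being `1000 / Rate(M/s)`. The following is a minimal sketch under that assumption, not the benchmark harness's actual code. The class and method names are hypothetical, and the input numbers are the JDK16 `sgemm[T,T]` rows quoted above.

```java
import java.util.Locale;

public class RelativeColumnSketch {
    /** Speed-up of a case over the baseline, computed from their Rate(M/s) values. */
    static double relative(double baselineRate, double caseRate) {
        return caseRate / baselineRate;
    }

    /** Time per row in nanoseconds for a rate given in millions of rows per second. */
    static double perRowNanos(double rateMillionsPerSecond) {
        return 1000.0 / rateMillionsPerSecond;
    }

    /** Formats a speed-up the way the tables print it, e.g. "59.8X". */
    static String formatRelative(double relative) {
        return String.format(Locale.ROOT, "%.1fX", relative);
    }

    public static void main(String[] args) {
        // Rate(M/s) values copied from the JDK16 sgemm[T,T] table.
        double f2jRate = 1115.8;
        double javaRate = 28402.0;
        double nativeRate = 66691.2;

        System.out.println(formatRelative(relative(f2jRate, javaRate)));   // table reports 25.5X
        System.out.println(formatRelative(relative(f2jRate, nativeRate))); // table reports 59.8X
        System.out.println(String.format(Locale.ROOT, "%.1f", perRowNanos(f2jRate))); // table reports 0.9
    }
}
```

Rows that print `0.0` under `Per Row(ns)` are a rounding effect of the one-decimal format; the `Rate(M/s)` column is the more informative figure for those cases.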
Takeshi Yamamuro	101b0cc313	[SPARK-35253][SQL][BUILD] Bump up the janino version to v3.1.4 ### What changes were proposed in this pull request? This PR proposes to bump up the janino version from 3.0.16 to v3.1.4. The major changes of this upgrade are as follows: - Fixed issue #131: Janino 3.1.2 is 10x slower than 3.0.11: The Compiler's IClassLoader was initialized way too eagerly, thus lots of classes were loaded from the class path, which is very slow. - Improved the encoding of stack map frames according to JVMS11 4.7.4: Previously, only "full_frame"s were generated. - Fixed issue #107: Janino requires "org.codehaus.commons.compiler.io", but commons-compiler does not export this package - Fixed the promotion of the array access index expression (see JLS7 15.13 Array Access Expressions). For all the changes, please see the change log: http://janino-compiler.github.io/janino/changelog.html NOTE1: I've checked that there is no obvious performance regression. For all the data, see this link: https://docs.google.com/spreadsheets/d/1srxT9CioGQg1fLKM3Uo8z1sTzgCsMj4pg6JzpdcG6VU/edit?usp=sharing NOTE2: We upgraded janino to 3.1.2 (#27860) once before, but the commit was reverted in #29495 because of a correctness issue. Recently, #32374 checked whether Spark could move to v3.1.3, but a new bug was found there. These known issues have been fixed in v3.1.4 by the following PRs: - janino-compiler/janino#145 - janino-compiler/janino#146 ### Why are the changes needed? janino v3.0.X is no longer maintained. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? GA passed. Closes #32455 from maropu/janino_v3.1.4. Authored-by: Takeshi Yamamuro <yamamuro@apache.org> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-05-12 08:57:57 -05:00
Hyukjin Kwon	b59d5ab060	[SPARK-35372][BUILD] Increase stack size for Scala compilation in Maven build ### What changes were proposed in this pull request? This PR increases the stack size for Scala compilation in Maven build to fix the error: ``` java.lang.StackOverflowError scala.reflect.internal.Trees$UnderConstructionTransformer.transform(Trees.scala:1741) scala.reflect.internal.Trees$UnderConstructionTransformer.transform$(Trees.scala:1740) scala.tools.nsc.transform.ExplicitOuter$OuterPathTransformer.transform(ExplicitOuter.scala:289) scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:477) scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:330) scala.reflect.api.Trees$Transformer.$anonfun$transformStats$1(Trees.scala:2597) scala.reflect.api.Trees$Transformer.transformStats(Trees.scala:2595) scala.reflect.internal.Trees.itransform(Trees.scala:1404) scala.reflect.internal.Trees.itransform$(Trees.scala:1374) scala.reflect.internal.SymbolTable.itransform(SymbolTable.scala:28) scala.reflect.internal.SymbolTable.itransform(SymbolTable.scala:28) scala.reflect.api.Trees$Transformer.transform(Trees.scala:2563) scala.tools.nsc.transform.TypingTransformers$TypingTransformer.transform(TypingTransformers.scala:51) scala.tools.nsc.transform.ExplicitOuter$OuterPathTransformer.scala$reflect$internal$Trees$UnderConstructionTransformer$$super$transform(ExplicitOuter.scala:212) scala.reflect.internal.Trees$UnderConstructionTransformer.transform(Trees.scala:1745) scala.reflect.internal.Trees$UnderConstructionTransformer.transform$(Trees.scala:1740) scala.tools.nsc.transform.ExplicitOuter$OuterPathTransformer.transform(ExplicitOuter.scala:289) scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:477) scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:330) 
scala.reflect.internal.Trees.itransform(Trees.scala:1383) ``` See https://github.com/apache/spark/runs/2554067779 ### Why are the changes needed? To recover JDK 11 compilation ### Does this PR introduce _any_ user-facing change? No, dev-only. ### How was this patch tested? CI in this PR will test it out. Closes #32502 from HyukjinKwon/SPARK-35372. Authored-by: Hyukjin Kwon <gurwls223@apache.org> Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>	2021-05-12 02:20:28 +09:00
Kousuke Saruta	bb93547cdf	[SPARK-35326][BUILD] Upgrade Jersey to 2.34 ### What changes were proposed in this pull request? This PR upgrades Jersey to 2.34. ### Why are the changes needed? CVE-2021-28168, a local information disclosure vulnerability, has been reported (https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-28168). Spark 3.1.1, 3.0.2 and 3.2.0 use the affected version 2.30. ### Does this PR introduce _any_ user-facing change? It's not clear how large the impact is, but Spark uses an affected version of Jersey, so I think it's better to upgrade it just in case. ### How was this patch tested? CI. Closes #32453 from sarutak/upgrade-jersey. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-06 08:36:32 -07:00
William Hyun	ac8813e37c	[SPARK-35277][BUILD] Upgrade snappy to 1.1.8.4 ### What changes were proposed in this pull request? This PR aims to upgrade snappy to version 1.1.8.4. ### Why are the changes needed? This will bring the latest bug fixes and improvements. - https://github.com/xerial/snappy-java/blob/master/Milestone.md#snappy-java-1183-2021-01-20 - Make pure-java Snappy thread-safe - Improved SnappyFramedInput/OutputStream performance by using java.util.zip.CRC32C ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. Closes #32402 from williamhyun/snappy1184. Authored-by: William Hyun <william@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-04-29 21:26:16 -07:00
lipzhu	77e9152898	[SPARK-35255][BUILD] Automated formatting for Scala Code for Blank Lines ### What changes were proposed in this pull request? https://github.com/databricks/scala-style-guide#blanklines https://scalameta.org/scalafmt/docs/configuration.html#newlinestoplevelstatements ### How was this patch tested? Manually tested by modifying a few files and running ./dev/scalafmt then checking that ./dev/scalastyle still passed. Closes #32383 from lipzhu/SPARK-35255. Authored-by: lipzhu <lipzhu@ebay.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2021-04-30 11:45:58 +09:00
yangjie01	7b78e34417	[SPARK-35269][BUILD] Upgrade commons-lang3 to 3.12.0 ### What changes were proposed in this pull request? This PR aims to upgrade Apache commons-lang3 to 3.12.0. ### Why are the changes needed? This version brings the latest bug fixes, as listed here: - https://commons.apache.org/proper/commons-lang/changes-report.html#a3.12.0 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass the Jenkins or GitHub Actions checks. Closes #32393 from LuciferYang/lang3-to-312. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-04-29 09:27:28 -07:00
Ludovic Henry	5b77ebb57b	[SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib ### What changes were proposed in this pull request? Following https://github.com/apache/spark/pull/30810, I've continued looking for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate work done in the [`dev.ludovic.netlib`](https://github.com/luhenry/netlib/) Maven package. The `dev.ludovic.netlib` library wraps the original `com.github.fommil.netlib` library and focuses on accelerating the linear algebra routines used in Spark. When running the `org.apache.spark.ml.linalg.BLASBenchmark` benchmarking suite, I get the results at [1] on an Intel machine. Moreover, this library is thoroughly tested to return the exact same results as the reference implementation. Under the hood, it reimplements the necessary algorithms in pure autovectorization-friendly Java 8, and takes advantage of the Vector API and Foreign Linker API introduced in JDK 16 when available. A table summarising which version gets loaded in which case: ``` \| \| BLAS.nativeBLAS \| BLAS.javaBLAS \| \| --------------------- \| -------------------------------------------------- \| -------------------------------------------------- \| \| with -Pnetlib-lgpl \| 1. dev.ludovic.netlib.blas.NetlibNativeBLAS, a \| 1. dev.ludovic.netlib.blas.VectorizedBLAS \| \| \| wrapper for com.github.fommil:all \| (JDK16+, relies on the Vector API, requires \| \| \| 2. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, \| `--add-modules=jdk.incubator.vector` on JDK16) \| \| \| relies on the Foreign Linker API, requires \| 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) \| \| \| `--add-modules=jdk.incubator.foreign \| 3. dev.ludovic.netlib.blas.JavaBLAS \| \| \| -Dforeign.restricted=warn`) \| 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a \| \| \| 3. 
fails to load, falls back to BLAS.javaBLAS in \| wrapper for com.github.fommil:core \| \| \| org.apache.spark.ml.linalg.BLAS \| \| \| --------------------- \| -------------------------------------------------- \| -------------------------------------------------- \| \| without -Pnetlib-lgpl \| 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+, \| 1. dev.ludovic.netlib.blas.VectorizedBLAS \| \| \| relies on the Foreign Linker API, requires \| (JDK16+, relies on the Vector API, requires \| \| \| `--add-modules=jdk.incubator.foreign \| `--add-modules=jdk.incubator.vector` on JDK16) \| \| \| -Dforeign.restricted=warn`) \| 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+) \| \| \| 2. fails to load, falls back to BLAS.javaBLAS in \| 3. dev.ludovic.netlib.blas.JavaBLAS \| \| \| org.apache.spark.ml.linalg.BLAS \| 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a \| \| \| \| wrapper for com.github.fommil:core \| \| --------------------- \| -------------------------------------------------- \| -------------------------------------------------- \| ``` ### Why are the changes needed? Accelerates linear algebra operations when the pure-java fallback method is in use. Transparently falls back to native implementation (OpenBLAS, MKL) when available. ### Does this PR introduce _any_ user-facing change? No, all changes are transparent to the user. ### How was this patch tested? The `dev.ludovic.netlib` library has its own test suite [2]. It has also been validated by running the Spark test suite and benchmarking suite. 
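The priority order in the table above amounts to "try each candidate in turn and keep the first one that loads", with `BLAS.nativeBLAS` falling back to `BLAS.javaBLAS` when no native candidate loads. A minimal sketch of that selection pattern follows. It is an illustration only: `Backend`, `firstLoadable`, `named` and `unavailable` are hypothetical helpers, not part of the `dev.ludovic.netlib` API, and the scenario simulated is a JDK 11 build without `-Pnetlib-lgpl`.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

public class BlasFallbackSketch {
    /** Hypothetical stand-in for a BLAS backend; NOT the dev.ludovic.netlib API. */
    interface Backend {
        String name();
    }

    /** Returns the first candidate whose loader does not throw, in list order. */
    static Optional<Backend> firstLoadable(List<Supplier<Backend>> candidates) {
        for (Supplier<Backend> candidate : candidates) {
            try {
                return Optional.of(candidate.get());
            } catch (RuntimeException | LinkageError e) {
                // Backend unavailable (wrong JDK, missing native library, ...): try the next one.
            }
        }
        return Optional.empty();
    }

    /** A backend that loads successfully and only reports its name. */
    static Backend named(String name) {
        return () -> name;
    }

    /** Simulates a backend that fails to load on this JVM. */
    static Backend unavailable(String name) {
        throw new UnsupportedOperationException(name + " is not available");
    }

    /** Resolves both slots for the simulated JDK 11, no -Pnetlib-lgpl scenario. */
    static String resolve() {
        List<Supplier<Backend>> javaCandidates = List.of(
            () -> unavailable("VectorizedBLAS"), // assumed to need JDK16+ and the Vector API
            () -> named("Java11BLAS"),
            () -> named("JavaBLAS"),
            () -> named("NetlibF2jBLAS"));
        List<Supplier<Backend>> nativeCandidates = List.of(
            () -> unavailable("ForeignBLAS"));   // assumed to need JDK16+ and the Foreign Linker API

        Backend javaBlas = firstLoadable(javaCandidates)
            .orElseThrow(() -> new IllegalStateException("no Java BLAS backend"));
        // When no native backend loads, the native slot reuses the Java backend.
        Backend nativeBlas = firstLoadable(nativeCandidates).orElse(javaBlas);
        return "native=" + nativeBlas.name() + ", java=" + javaBlas.name();
    }

    public static void main(String[] args) {
        System.out.println(resolve());
    }
}
```

Catching `LinkageError` alongside `RuntimeException` is a design choice for the sketch: a missing native library typically surfaces as `UnsatisfiedLinkError`, which is not an `Exception`.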
[1] Results for `org.apache.spark.ml.linalg.BLASBenchmark`: #### JDK8: ``` [info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java8BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.Java8BLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 223 232 8 448.0 2.2 1.0X [info] java 221 228 7 453.0 2.2 1.0X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 122 128 4 821.2 1.2 1.0X [info] java 122 128 4 822.3 1.2 1.0X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 109 112 2 921.4 1.1 1.0X [info] java 70 74 3 1423.5 0.7 1.5X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 96 98 2 1046.1 1.0 1.0X [info] java 47 49 2 2121.7 0.5 2.0X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 184 195 8 544.3 1.8 1.0X [info] java 185 196 7 539.5 1.9 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
------------------------------------------------------------------------------------------------------------------------ [info] f2j 99 104 4 1011.9 1.0 1.0X [info] java 99 104 4 1010.4 1.0 1.0X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 947.2 1.1 1.0X [info] java 0 0 0 1584.8 0.6 1.7X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 867.4 1.2 1.0X [info] java 1 1 0 865.0 1.2 1.0X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 485.9 2.1 1.0X [info] java 1 1 0 486.8 2.1 1.0X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 1843.0 0.5 1.0X [info] java 0 0 0 2690.6 0.4 1.5X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 1214.7 0.8 1.0X [info] java 0 0 0 2536.8 0.4 2.1X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 1895.9 0.5 1.0X [info] java 0 0 0 2961.1 0.3 1.6X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) 
Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 1223.4 0.8 1.0X [info] java 0 0 0 3091.4 0.3 2.5X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 560 575 20 1787.1 0.6 1.0X [info] java 226 232 5 4432.4 0.2 2.5X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 570 586 23 1755.2 0.6 1.0X [info] java 227 232 4 4410.1 0.2 2.5X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 863 879 17 1158.4 0.9 1.0X [info] java 227 231 3 4407.9 0.2 3.8X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1282 1305 23 780.0 1.3 1.0X [info] java 227 232 4 4413.4 0.2 5.7X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 538 548 8 1858.6 0.5 1.0X [info] java 221 226 3 4521.1 0.2 2.4X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 549 558 10 1819.9 0.5 1.0X [info] java 222 229 7 4503.5 0.2 2.5X [info] 
[info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 838 852 12 1193.0 0.8 1.0X [info] java 222 229 5 4500.5 0.2 3.8X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 905 919 18 1104.8 0.9 1.0X [info] java 221 228 5 4521.3 0.2 4.1X ``` #### JDK11: ``` [info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java11BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.Java11BLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 195 204 10 512.7 2.0 1.0X [info] java 195 202 7 512.4 2.0 1.0X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 108 113 4 923.3 1.1 1.0X [info] java 102 107 4 984.4 1.0 1.1X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 107 110 3 938.1 1.1 1.0X [info] java 69 72 3 1447.1 0.7 1.5X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 96 98 2 
1046.5 1.0 1.0X [info] java 43 45 2 2317.1 0.4 2.2X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 155 168 8 644.2 1.6 1.0X [info] java 158 169 8 632.8 1.6 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 85 90 4 1178.1 0.8 1.0X [info] java 86 90 4 1167.7 0.9 1.0X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 0 0 0 1182.1 0.8 1.0X [info] java 0 0 0 1432.1 0.7 1.2X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 898.7 1.1 1.0X [info] java 1 1 0 891.5 1.1 1.0X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 495.4 2.0 1.0X [info] java 1 1 0 495.7 2.0 1.0X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 0 0 0 2271.6 0.4 1.0X [info] java 0 0 0 3648.1 0.3 1.6X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] 
f2j 1 1 0 1229.3 0.8 1.0X [info] java 0 0 0 2711.3 0.4 2.2X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 0 0 0 2677.5 0.4 1.0X [info] java 0 0 0 3288.2 0.3 1.2X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 1233.0 0.8 1.0X [info] java 0 0 0 2766.3 0.4 2.2X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 520 536 16 1923.6 0.5 1.0X [info] java 214 221 7 4669.5 0.2 2.4X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 593 612 17 1686.5 0.6 1.0X [info] java 215 219 3 4643.3 0.2 2.8X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 853 870 16 1172.8 0.9 1.0X [info] java 215 218 3 4659.7 0.2 4.0X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1350 1370 23 740.8 1.3 1.0X [info] java 215 219 4 4656.6 0.2 6.3X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
------------------------------------------------------------------------------------------------------------------------ [info] f2j 460 468 6 2173.2 0.5 1.0X [info] java 210 213 2 4752.7 0.2 2.2X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 535 544 8 1869.3 0.5 1.0X [info] java 210 215 5 4761.8 0.2 2.5X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 843 853 11 1186.8 0.8 1.0X [info] java 209 214 4 4793.4 0.2 4.0X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 891 904 15 1122.0 0.9 1.0X [info] java 209 214 4 4777.2 0.2 4.3X ``` #### JDK16: ``` [info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.VectorizedBLAS [info] nativeBLAS = dev.ludovic.netlib.blas.VectorizedBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 194 199 7 515.7 1.9 1.0X [info] java 181 186 3 551.1 1.8 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 109 115 4 915.0 1.1 1.0X [info] java 88 92 3 1138.8 0.9 1.2X [info] [info] ddot: Best 
Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 108 110 2 922.6 1.1 1.0X [info] java 54 56 2 1839.2 0.5 2.0X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 96 97 2 1046.1 1.0 1.0X [info] java 29 30 1 3393.4 0.3 3.2X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 156 165 5 643.0 1.6 1.0X [info] java 150 159 5 667.1 1.5 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 85 91 6 1171.0 0.9 1.0X [info] java 75 79 3 1340.6 0.7 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 917.0 1.1 1.0X [info] java 0 0 0 8147.2 0.1 8.9X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 859.3 1.2 1.0X [info] java 1 1 0 859.3 1.2 1.0X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 482.1 2.1 1.0X [info] java 1 1 0 482.6 2.1 1.0X [info] [info] 
dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 0 0 0 2214.2 0.5 1.0X [info] java 0 0 0 7975.8 0.1 3.6X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 1231.4 0.8 1.0X [info] java 0 0 0 8680.9 0.1 7.0X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 0 0 0 2684.3 0.4 1.0X [info] java 0 0 0 18527.1 0.1 6.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1 1 0 1235.4 0.8 1.0X [info] java 0 0 0 17347.9 0.1 14.0X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 530 552 18 1887.5 0.5 1.0X [info] java 58 64 3 17143.9 0.1 9.1X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 598 620 17 1671.1 0.6 1.0X [info] java 58 64 3 17196.6 0.1 10.3X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 834 847 14 1199.4 0.8 1.0X [info] 
java 57 63 4 17486.9 0.1 14.6X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 1338 1366 22 747.3 1.3 1.0X [info] java 58 63 3 17356.6 0.1 23.2X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 489 501 9 2045.5 0.5 1.0X [info] java 36 38 2 27721.9 0.0 13.6X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 478 488 9 2094.0 0.5 1.0X [info] java 36 38 2 27813.2 0.0 13.3X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 825 837 10 1211.6 0.8 1.0X [info] java 35 38 2 28433.1 0.0 23.5X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ------------------------------------------------------------------------------------------------------------------------ [info] f2j 900 918 15 1111.6 0.9 1.0X [info] java 36 38 2 28073.0 0.0 25.3X ``` [2] https://github.com/luhenry/netlib/tree/master/blas/src/test/java/dev/ludovic/netlib/blas Closes #32253 from luhenry/master. Authored-by: Ludovic Henry <git@ludovic.dev> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-04-27 14:00:59 -05:00
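The load order listed in the SPARK-35150 table amounts to a "first implementation that loads wins" chain. The following is a minimal Java sketch of that idea only. The class name `BlasFallbackSketch` and the helper `firstLoadable` are hypothetical names introduced for illustration, the candidate class names are copied from the table, and this is not presented as how `dev.ludovic.netlib` or Spark actually selects an implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class BlasFallbackSketch {

    // Hypothetical helper: returns the first class name in `candidates`
    // that can be loaded and initialized on the current JVM, if any.
    public static Optional<String> firstLoadable(List<String> candidates) {
        for (String name : candidates) {
            try {
                Class.forName(name);
                return Optional.of(name);
            } catch (ClassNotFoundException | LinkageError e) {
                // Missing from the classpath, or unusable on this JVM:
                // fall through and try the next candidate.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Order copied from the BLAS.javaBLAS column of the table.
        List<String> javaBlasOrder = Arrays.asList(
            "dev.ludovic.netlib.blas.VectorizedBLAS",
            "dev.ludovic.netlib.blas.Java11BLAS",
            "dev.ludovic.netlib.blas.JavaBLAS",
            "dev.ludovic.netlib.blas.NetlibF2jBLAS");

        System.out.println(
            firstLoadable(javaBlasOrder).orElse("no candidate on the classpath"));
    }
}
```

Catching `LinkageError` next to `ClassNotFoundException` lets a candidate that is present but fails to initialize (for example because a required module is not enabled) be skipped in the same way as one that is absent.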
Dongjoon Hyun	b108e7fff3	[SPARK-33913][SS] Upgrade Kafka to 2.8.0 ### What changes were proposed in this pull request? This PR aims to upgrade Kafka client to 2.8.0. Note that Kafka 2.8.0 uses ZSTD JNI 1.4.9-1 like Apache Spark 3.2.0. ### Why are the changes needed? This will bring the latest client-side improvement and bug fixes like the following examples. - KAFKA-10631 ProducerFencedException is not Handled on Offest Commit - KAFKA-10134 High CPU issue during rebalance in Kafka consumer after upgrading to 2.5 - KAFKA-12193 Re-resolve IPs when a client is disconnected - KAFKA-10090 Misleading warnings: The configuration was supplied but isn't a known config - KAFKA-9263 The new hw is added to incorrect log when ReplicaAlterLogDirsThread is replacing log - KAFKA-10607 Ensure the error counts contains the NONE - KAFKA-10458 Need a way to update quota for TokenBucket registered with Sensor - KAFKA-10503 MockProducer doesn't throw ClassCastException when no partition for topic RELEASE NOTE - https://downloads.apache.org/kafka/2.8.0/RELEASE_NOTES.html - https://downloads.apache.org/kafka/2.7.0/RELEASE_NOTES.html ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs with the existing tests because this is a dependency change. Closes #32325 from dongjoon-hyun/SPARK-33913. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: hyukjinkwon <gurwls223@apache.org>	2021-04-25 16:20:22 +09:00
Kousuke Saruta	44c13871d1	[SPARK-35210][BUILD] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue ### What changes were proposed in this pull request? This PR proposes to upgrade Jetty to 9.4.40. ### Why are the changes needed? SPARK-34988 (#32091) upgraded Jetty to 9.4.39 for CVE-2021-28165. But after the upgrade, Jetty 9.4.40 was released to fix the ERR_CONNECTION_RESET issue (https://github.com/eclipse/jetty.project/issues/6152). This issue seems to affect Jetty 9.4.39 when the POST method is used with SSL. For Spark, job submission using REST and ThriftServer with HTTPS protocol can be affected. ### Does this PR introduce _any_ user-facing change? No. No released version uses Jetty 9.4.39. ### How was this patch tested? CI. Closes #32318 from sarutak/upgrade-jetty-9.4.40. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>	2021-04-24 19:22:32 +09:00
yangjie01	c7e18ad223	[SPARK-35132][BUILD][CORE] Upgrade netty-all to 4.1.63.Final ### What changes were proposed in this pull request? Three CVEs have been found since netty 4.1.51.Final: - [CVE-2021-21409](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21409) - [CVE-2021-21295](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21295) - [CVE-2021-21290](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-21290) So the main change of this PR is to upgrade netty-all to 4.1.63.Final to avoid these potential risks. Another change cleans up deprecated API usage: [Tiny caches have been merged into small caches](https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L447-L455) (after [netty#10267](https://github.com/netty/netty/pull/10267)), so `PooledByteBufAllocator` should now be created with the [PooledByteBufAllocator(boolean, int, int, int, int, int, int, boolean, int)](https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L227-L239) constructor. ### Why are the changes needed? Upgrade netty-all to 4.1.63.Final to avoid CVE problems. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass the Jenkins or GitHub Actions. Closes #32227 from LuciferYang/SPARK-35132. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-04-20 18:28:43 -05:00
Kousuke Saruta	59c8131d06	[SPARK-34988][CORE] Upgrade Jetty for CVE-2021-28165 ### What changes were proposed in this pull request? This PR upgrades the version of Jetty to 9.4.39. ### Why are the changes needed? CVE-2021-28165 affects the version of Jetty that Spark uses and it seems to be a little bit serious. https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-28165 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing tests. Closes #32091 from sarutak/upgrade-jetty-9.4.39. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Max Gekk <max.gekk@gmail.com>	2021-04-08 13:56:55 +03:00
Yuming Wang	cbffc12f90	[SPARK-34542][BUILD] Upgrade Parquet to 1.12.0 ### What changes were proposed in this pull request? Parquet 1.12.0 New Feature - PARQUET-41 - Add bloom filters to parquet statistics - PARQUET-1373 - Encryption key management tools - PARQUET-1396 - Example of using EncryptionPropertiesFactory and DecryptionPropertiesFactory - PARQUET-1622 - Add BYTE_STREAM_SPLIT encoding - PARQUET-1784 - Column-wise configuration - PARQUET-1817 - Crypto Properties Factory - PARQUET-1854 - Properties-Driven Interface to Parquet Encryption Parquet 1.12.0 release notes: https://github.com/apache/parquet-mr/blob/apache-parquet-1.12.0/CHANGES.md ### Why are the changes needed? - Bloom filters to improve filter performance - ZSTD enhancement ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Existing unit test. Closes #31649 from wangyum/SPARK-34542. Lead-authored-by: Yuming Wang <yumwang@ebay.com> Co-authored-by: Yuming Wang <yumwang@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-03-27 07:56:29 -07:00
Ismaël Mejía	8a552bfc76	[SPARK-34778][BUILD] Upgrade to Avro 1.10.2 ### What changes were proposed in this pull request? Update the Avro version to 1.10.2 ### Why are the changes needed? To stay up to date with upstream and catch compatibility issues with zstd ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Unit tests Closes #31866 from iemejia/SPARK-27733-upgrade-avro-1.10.2. Authored-by: Ismaël Mejía <iemejia@gmail.com> Signed-off-by: Yuming Wang <yumwang@ebay.com>	2021-03-22 19:30:14 +08:00
Zhang, Xingchao	2888d1883e	[SPARK-34784][BUILD] Upgrade Jackson to 2.12.2 ### What changes were proposed in this pull request? This PR upgrades Jackson to 2.12.2. Jackson Release 2.12: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.12 ### Why are the changes needed? It makes the upgrade to Avro 1.10.2 easier; without it, the following error occurs. ``` [error] Caused by: sbt.ForkMain$ForkError: com.fasterxml.jackson.databind.JsonMappingException: Scala module 2.11.4 requires Jackson Databind version >= 2.11.0 and < 2.12.0 [error] at com.fasterxml.jackson.module.scala.JacksonModule.setupModule(JacksonModule.scala:61) [error] at com.fasterxml.jackson.module.scala.JacksonModule.setupModule$(JacksonModule.scala:46) [error] at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:17) ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Tested with Avro 1.10.2 and Parquet 1.12.0: https://github.com/apache/spark/runs/2157735537 Closes #31878 from xclyfe/SPARK-34784. Authored-by: Zhang, Xingchao <xingczhang@ebay.com> Signed-off-by: Yuming Wang <yumwang@ebay.com>	2021-03-21 15:36:38 +08:00
William Hyun	c799d049fc	[SPARK-34810][TEST] Update PostgreSQL test with the latest results ### What changes were proposed in this pull request? This PR aims to update `PostgresIntegrationSuite` with the latest results. ### Why are the changes needed? The latest PostgreSQL jar version is 42.2.19. Since 42.2.9, the test is broken because it returns `0.0` instead of `0.00`. - https://jdbc.postgresql.org/documentation/changelog.html#version_42.2.19 42.2.9 (2019-12-06) 42.2.10 (2020-01-30) 42.2.11 (2020-03-09) 42.2.12 (2020-03-31) 42.2.13 (2020-06-04) 42.2.14 (2020-06-10) 42.2.15 (2020-08-14) 42.2.16 (2020-08-20) 42.2.17 (2020-10-09) 42.2.18 (2020-10-15) 42.2.19 (2021-02-18) ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CI with the updated test cases. ``` build/sbt -Pdocker-integration-tests 'docker-integration-tests/testOnly org.apache.spark.sql.jdbc.PostgresIntegrationSuite' ``` Closes #31910 from williamhyun/pg. Authored-by: William Hyun <williamhyun3@gmail.com> Signed-off-by: Yuming Wang <yumwang@ebay.com>	2021-03-21 13:36:45 +08:00
yangjie01	2e836cdb59	[SPARK-34774][BUILD] Ensure change-scala-version.sh update scala.version in parent POM correctly ### What changes were proposed in this pull request? After SPARK-34507, executing the `change-scala-version.sh` script updates `scala.version` in the parent POM, but if we execute the following commands in order: ``` dev/change-scala-version.sh 2.13 dev/change-scala-version.sh 2.12 git status ``` a git diff like the following is generated: ``` diff --git a/pom.xml b/pom.xml index ddc4ce2f68..f43d8c8f78 100644 --- a/pom.xml +++ b/pom.xml @@ -162,7 +162,7 @@ <commons.math3.version>3.4.1</commons.math3.version> <commons.collections.version>3.2.2</commons.collections.version> - <scala.version>2.12.10</scala.version> + <scala.version>2.13.5</scala.version> <scala.binary.version>2.12</scala.binary.version> <scalatest-maven-plugin.version>2.0.0</scalatest-maven-plugin.version> <scalafmt.parameters>--test</scalafmt.parameters> ``` It seems the `scala.version` property was not updated correctly. So this PR adds an extra `scala.version` to the scala-2.12 profile to ensure change-scala-version.sh can update the public `scala.version` property correctly. ### Why are the changes needed? Bug fix. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Manual test. Execute the following commands in order: ``` dev/change-scala-version.sh 2.13 dev/change-scala-version.sh 2.12 git status ``` Before ``` diff --git a/pom.xml b/pom.xml index ddc4ce2f68..f43d8c8f78 100644 --- a/pom.xml +++ b/pom.xml @@ -162,7 +162,7 @@ <commons.math3.version>3.4.1</commons.math3.version> <commons.collections.version>3.2.2</commons.collections.version> - <scala.version>2.12.10</scala.version> + <scala.version>2.13.5</scala.version> <scala.binary.version>2.12</scala.binary.version> <scalatest-maven-plugin.version>2.0.0</scalatest-maven-plugin.version> <scalafmt.parameters>--test</scalafmt.parameters> ``` After No git diff. Closes #31865 from LuciferYang/SPARK-34774. 
Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-03-18 07:33:23 -05:00
Kousuke Saruta	c5cadfefdf	[SPARK-34762][BUILD] Fix the build failure with Scala 2.13 which is related to commons-cli ### What changes were proposed in this pull request? This PR fixes the build failure with Scala 2.13 which is related to `commons-cli`. Over the last few days, the build with Scala 2.13 on GA has kept failing with an error message like the following. ``` [error] /home/runner/work/spark/spark/sql/hive-thriftserver/src/main/java/org/apache/hive/service/server/HiveServer2.java:26:1: error: package org.apache.commons.cli does not exist [error] import org.apache.commons.cli.GnuParser; ``` The reason is that `mvn help` in `change-scala-version.sh` downloads the POM file of `commons-cli` but doesn't download the JAR file, which leads to the build failure. This PR also adds `commons-cli` to the dependencies explicitly because HiveThriftServer depends on it. ### Why are the changes needed? To fix the build failure with Scala 2.13. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? I confirmed that the build finishes successfully with Scala 2.13 on my laptop. ``` find ~/.m2 -name commons-cli -exec rm -rf {} \; find ~/.ivy2 -name commons-cli -exec rm -rf {} \; find ~/.cache/ -name commons-cli -exec rm -rf {} \; // For Linux find ~/Library/Caches -name commons-cli -exec rm -rf {} \; // For macOS dev/change-scala-version.sh 2.13 ./build/sbt -Pyarn -Pmesos -Pkubernetes -Phive -Phive-thriftserver -Phadoop-cloud -Pkinesis-asl -Pdocker-integration-tests -Pkubernetes-integration-tests -Pspark-ganglia-lgpl -Pscala-2.13 clean compile test:compile ``` Closes #31862 from sarutak/commons-cli. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2021-03-18 12:31:50 +09:00
Erik Krogen	4a6f5340ae	[SPARK-34752][BUILD] Bump Jetty to 9.4.37 to address CVE-2020-27223 ### What changes were proposed in this pull request? Upgrade Jetty version from `9.4.36.v20210114` to `9.4.37.v20210219`. ### Why are the changes needed? Current Jetty version is vulnerable to [CVE-2020-27223](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-27223), see [Veracode](https://www.sourceclear.com/vulnerability-database/security/denial-of-servicedos/java/sid-29523) for more details. ### Does this PR introduce _any_ user-facing change? No, minor Jetty version change. Release notes can be found [here](https://github.com/eclipse/jetty.project/releases/tag/jetty-9.4.37.v20210219). ### How was this patch tested? Will let GitHub run the unit tests. Closes #31846 from xkrogen/xkrogen-SPARK-34752-jetty-upgrade-cve. Authored-by: Erik Krogen <xkrogen@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2021-03-16 12:07:12 +09:00
Dongjoon Hyun	ba7e525a11	[SPARK-34670][BUILD] Upgrade ZSTD-JNI to 1.4.9-1 ### What changes were proposed in this pull request? This PR aims to upgrade ZSTD-JNI to 1.4.9-1. ### Why are the changes needed? ZStandard 1.4.9 and its corresponding JNI bring the following bug fixes and improvements. - https://github.com/facebook/zstd/releases/tag/v1.4.9 One notable improvement of ZStandard 1.4.9 is `2x faster Long Distance Mode`, but we are not using it yet. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs with the existing tests and there is no regression in ZStandardBenchmark. Closes #31784 from dongjoon-hyun/ZSTD-149. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-03-08 22:40:49 -08:00
Dongjoon Hyun	1f6089b165	[SPARK-34647][CORE] Use ZSTD JNI NoFinalizer classes and bump to 1.4.8-7 ### What changes were proposed in this pull request? This PR aims to use `ZstdInputStreamNoFinalizer` and `ZstdOutputStreamNoFinalizer` classes and upgrade ZSTD JNI to 1.4.8-7. ### Why are the changes needed? `1.4.8-7` makes `NoFinalizer` classes public again. This improves the performance. - `57d53a09d2` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the CIs. Closes #31762 from dongjoon-hyun/SPARK-ZSTD-NOFINALIZER. Lead-authored-by: Dongjoon Hyun <dhyun@apple.com> Co-authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-03-06 10:32:27 -08:00
Kousuke Saruta	33d1c16f53	[SPARK-34590][TESTS] Allow JDWP debug for tests ### What changes were proposed in this pull request? This PR proposes a new feature that allows developers to debug test code using JDWP with sbt and Maven. More specifically, this PR introduces the following profile options. * `jdwp-test-debug`: A profile which enables or disables JDWP debug * `test.jdwp.address`: An option which corresponds to the `address` option in JDWP * `test.jdwp.suspend`: An option which corresponds to the `suspend` option in JDWP * `test.jdwp.server`: An option which corresponds to the `server` option in JDWP * `test.debug.suite`: An option which controls whether to debug ScalaTest suites (Maven only) For `sbt`, this feature can be used like `build/sbt -Pjdwp-test-debug -Dtest.jdwp.address=localhost:9876 -Dtest.jdwp.suspend=y -Dtest.jdwp.server=y` and can be used for both JUnit tests and ScalaTest tests. For `Maven`, this feature can be used as follows: (For JUnit tests) `build/mvn -Pjdwp-test-debug -Dtest.jdwp.address=localhost:9876 -Dtest.jdwp.suspend=y -Dtest.jdwp.server=y` (For ScalaTest suites) `build/mvn -Pjdwp-test-debug -Dtest.debug.suite=true -Dtest.jdwp.address=localhost:9876 -Dtest.jdwp.suspend=y -Dtest.jdwp.server=y` (It might be useful to specify specific sub-modules like `-pl sql/core,sql/catalyst`). ### Why are the changes needed? It's useful to debug test code. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? I confirmed the following things. * `jdwp-test-debug` can switch JDWP enabled/disabled. * `test.jdwp.address` can change address and port. * `test.jdwp.suspend` can change whether the target debuggee suspends or not. * `test.jdwp.server` can change whether the JDWP debugger runs as a server or client. * ScalaTest suites can be debugged with Maven by setting `test.debug.suite` to `true`. Closes #31706 from sarutak/sbt-jdwp. Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-03-03 09:23:56 -08:00
Dongjoon Hyun	4818847e87	[SPARK-34578][SQL][TESTS][TEST-MAVEN] Refactor ORC encryption tests and ignore ORC shim loaded by old Hadoop library ### What changes were proposed in this pull request? 1. This PR aims to ignore ORC encryption tests when the ORC shim is loaded by an old Hadoop library by some other tests. The test coverage is preserved by Jenkins SBT runs and GitHub Action jobs. This PR only aims to recover Maven Jenkins jobs. 2. In addition, this PR simplifies SBT testing by refactoring the test config into `SparkBuild.scala/pom.xml` and removing `DedicatedJVMTest`. This will remove one GitHub Action job which was recently added for the `DedicatedJVMTest` tag. ### Why are the changes needed? Currently, the Maven test fails when it runs in batch mode because `HadoopShimsPre2_3$NullKeyProvider` is loaded. MVN COMMAND ``` $ mvn test -pl sql/core --am -Dtest=none -DwildcardSuites=org.apache.spark.sql.execution.datasources.orc.OrcV1QuerySuite,org.apache.spark.sql.execution.datasources.orc.OrcEncryptionSuite ``` BEFORE ``` - Write and read an encrypted table * FAILED * ... Cause: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1) (localhost executor driver): java.lang.IllegalArgumentException: Unknown key pii at org.apache.orc.impl.HadoopShimsPre2_3$NullKeyProvider.getCurrentKeyVersion(HadoopShimsPre2_3.java:71) at org.apache.orc.impl.WriterImpl.getKey(WriterImpl.java:871) ``` AFTER ``` OrcV1QuerySuite ... OrcEncryptionSuite: - Write and read an encrypted file !!! CANCELED !!! [] was empty org.apache.orc.impl.HadoopShimsPre2_3$NullKeyProvider1b705f65 doesn't has the test keys. ORC shim is created with old Hadoop libraries (OrcEncryptionSuite.scala:39) - Write and read an encrypted table !!! CANCELED !!! [] was empty org.apache.orc.impl.HadoopShimsPre2_3$NullKeyProvider22adeee1 doesn't has the test keys. ORC shim is created with old Hadoop libraries (OrcEncryptionSuite.scala:67) ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass the Jenkins Maven tests. For SBT command, - the test suite required a dedicated JVM (Before) - the test suite doesn't require a dedicated JVM (After) ``` $ build/sbt "sql/testOnly .OrcV1QuerySuite .OrcEncryptionSuite" ... [info] OrcV1QuerySuite ... [info] - SPARK-20728 Make ORCFileFormat configurable between sql/hive and sql/core (26 milliseconds) [info] OrcEncryptionSuite: [info] - Write and read an encrypted file (431 milliseconds) [info] - Write and read an encrypted table (359 milliseconds) [info] All tests passed. [info] Passed: Total 35, Failed 0, Errors 0, Passed 35 ``` Closes #31697 from dongjoon-hyun/SPARK-34578-TEST. Lead-authored-by: Dongjoon Hyun <dhyun@apple.com> Co-authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2021-03-02 16:52:27 +09:00
Yikun Jiang	85b50d4258	[SPARK-34539][BUILD][INFRA] Remove stand-alone version Zinc server ### What changes were proposed in this pull request? Clean up all Zinc standalone server code and related configuration. ### Why are the changes needed? ![image](https://user-images.githubusercontent.com/1736354/109154790-c1d3e580-77a9-11eb-8cde-835deed6e10e.png) - Zinc is the incremental compiler that speeds up compilation. - The scala-maven-plugin is the Maven plugin used by Spark; one of its functions is to integrate Zinc to enable incremental compilation. - Since Spark v3.0.0 ([SPARK-28759](https://issues.apache.org/jira/browse/SPARK-28759)), the scala-maven-plugin has been upgraded to v4.X, which means the Zinc v0.3.13 standalone server is no longer useful. However, we still download, install, and start the standalone Zinc server. We should remove all Zinc standalone server code and all related configuration. See more in [SPARK-34539](https://issues.apache.org/jira/projects/SPARK/issues/SPARK-34539) or the doc [Zinc standalone server is useless after scala-maven-plugin 4.x](https://docs.google.com/document/d/1u4kCHDx7KjVlHGerfmbcKSB0cZo6AD4cBdHSse-SBsM). ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Run any mvn build: ./build/mvn -DskipTests clean package -pl core You can see that incremental compilation still works: the "scala-maven-plugin:4.3.0:compile (scala-compile-first)" stage prints incremental compilation info, like: ``` [INFO] --- scala-maven-plugin:4.3.0:testCompile (scala-test-compile-first) spark-core_2.12 --- [INFO] Using incremental compilation using Mixed compile order [INFO] Compiler bridge file: /root/.sbt/1.0/zinc/org.scala-sbt/org.scala-sbt-compiler-bridge_2.12-1.3.1-bin_2.12.10__52.0-1.3.1_20191012T045515.jar [INFO] compiler plugin: BasicArtifact(com.github.ghik,silencer-plugin_2.12.10,1.6.0,null) [INFO] Compiling 303 Scala sources and 27 Java sources to /root/spark/core/target/scala-2.12/test-classes ... ``` Closes #31647 from Yikun/cleanup-zinc. Authored-by: Yikun Jiang <yikunkero@gmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-03-01 08:39:38 -06:00

978 commits