ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Gengliang Wang	5d45a415f3	Preparing Spark release v3.2.0-rc7	2021-10-06 11:45:26 +00:00
Gengliang Wang	4bd358474b	Preparing development version 3.2.1-SNAPSHOT	2021-09-28 10:53:42 +00:00
Gengliang Wang	dde73e2e1c	Preparing Spark release v3.2.0-rc6	2021-09-28 10:53:35 +00:00
Gengliang Wang	0c57bb8f7f	Preparing development version 3.2.1-SNAPSHOT	2021-09-27 08:24:50 +00:00
Gengliang Wang	49aea14c5a	Preparing Spark release v3.2.0-rc5	2021-09-27 08:24:44 +00:00
Gengliang Wang	2348cce37e	Preparing development version 3.2.1-SNAPSHOT	2021-09-26 12:28:46 +00:00
Gengliang Wang	2ed8c08c5b	Preparing Spark release v3.2.0-rc5	2021-09-26 12:28:40 +00:00
Gengliang Wang	da722d43cb	Preparing development version 3.2.1-SNAPSHOT	2021-09-24 10:03:23 +00:00
Gengliang Wang	9e35703211	Preparing Spark release v3.2.0-rc5	2021-09-24 10:03:16 +00:00
Gengliang Wang	0fb7127f85	Preparing development version 3.2.1-SNAPSHOT	2021-09-23 08:46:28 +00:00
Gengliang Wang	b609f2fe0c	Preparing Spark release v3.2.0-rc4	2021-09-23 08:46:22 +00:00
Gengliang Wang	b0249851f6	Preparing development version 3.2.1-SNAPSHOT	2021-09-18 11:30:12 +00:00
Gengliang Wang	96044e9735	Preparing Spark release v3.2.0-rc3	2021-09-18 11:30:06 +00:00
dgd-contributor	d71e2fd1d1	[SPARK-36685][ML][MLLIB] Fix wrong assert messages ### What changes were proposed in this pull request? Fix a wrong assert statement that resulted from a coding mistake. ### Why are the changes needed? The assert statement was wrong. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing tests Closes #33953 from dgd-contributor/SPARK-36685. Authored-by: dgd-contributor <dgd_contributor@viettel.com.vn> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org> (cherry picked from commit `9af0132516`) Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2021-09-11 14:39:55 -07:00
Gengliang Wang	1bad04d028	Preparing development version 3.2.1-SNAPSHOT	2021-08-31 17:04:14 +00:00
Gengliang Wang	03f5d23e96	Preparing Spark release v3.2.0-rc2	2021-08-31 17:04:08 +00:00
Gengliang Wang	69be513c5e	Preparing development version 3.2.1-SNAPSHOT	2021-08-20 12:40:47 +00:00
Gengliang Wang	6bb3523d8e	Preparing Spark release v3.2.0-rc1	2021-08-20 12:40:40 +00:00
Gengliang Wang	fafdc1482b	Revert "Preparing Spark release v3.2.0-rc1" This reverts commit `8e58fafb05`.	2021-08-20 20:07:02 +08:00
Gengliang Wang	c829ed53ff	Revert "Preparing development version 3.2.1-SNAPSHOT" This reverts commit `4f1d21571d`.	2021-08-20 20:07:01 +08:00
Gengliang Wang	4f1d21571d	Preparing development version 3.2.1-SNAPSHOT	2021-08-19 14:08:32 +00:00
Gengliang Wang	8e58fafb05	Preparing Spark release v3.2.0-rc1	2021-08-19 14:08:26 +00:00
Ruifeng Zheng	a66738823c	[SPARK-35678][ML][FOLLOWUP] softmax support offset and step ### What changes were proposed in this pull request? Make softmax support offset and step so that it can be used in ANN and NB. ### Why are the changes needed? To simplify the implementation. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing test suite Closes #32991 from zhengruifeng/softmax_support_offset_step. Authored-by: Ruifeng Zheng <ruifengz@foxmail.com> Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>	2021-06-23 21:03:18 -05:00
Hyukjin Kwon	ea907469bb	Revert "[SPARK-35678][ML][FOLLOWUP] softmax support offset and step" This reverts commit `fdf86fd6e7`.	2021-06-18 16:45:28 +09:00
Ruifeng Zheng	fdf86fd6e7	[SPARK-35678][ML][FOLLOWUP] softmax support offset and step ### What changes were proposed in this pull request? Use the newly implemented softmax function in NB. ### Why are the changes needed? To simplify the implementation. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing test suite Closes #32927 from zhengruifeng/softmax__followup. Authored-by: Ruifeng Zheng <ruifengz@foxmail.com> Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>	2021-06-17 22:46:36 -07:00
Ruifeng Zheng	5c96d643ee	[SPARK-35707][ML] optimize sparse GEMM by skipping bound checking ### What changes were proposed in this pull request? Sparse gemm uses the method `DenseMatrix.apply` to access the values, which can be optimized by skipping the bound check and the `isTransposed` check ``` override def apply(i: Int, j: Int): Double = values(index(i, j)) private[ml] def index(i: Int, j: Int): Int = { require(i >= 0 && i < numRows, s"Expected 0 <= i < $numRows, got i = $i.") require(j >= 0 && j < numCols, s"Expected 0 <= j < $numCols, got j = $j.") if (!isTransposed) i + numRows * j else j + numCols * i } ``` ### Why are the changes needed? To improve performance: about 15% faster in the designed case. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing test suite and an additional performance test. Closes #32857 from zhengruifeng/gemm_opt_index. Authored-by: Ruifeng Zheng <ruifengz@foxmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>	2021-06-16 08:57:27 +08:00
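The idea behind SPARK-35707 can be illustrated with a minimal sketch. The class `SimpleDense` and its two column-sum methods are hypothetical names invented for this example; they are not Spark's `DenseMatrix` and not the code from the pull request. The sketch only shows the general technique of validating once and hoisting the `isTransposed` branch out of an inner loop, assuming column-major storage when `isTransposed` is false, as in the `index` formula quoted above.

```scala
// Hypothetical, simplified stand-in for a dense matrix; not Spark's DenseMatrix.
final class SimpleDense(
    val numRows: Int,
    val numCols: Int,
    val values: Array[Double],
    val isTransposed: Boolean = false) {

  // Checked access: validates both indices and branches on the layout on every call.
  def apply(i: Int, j: Int): Double = {
    require(i >= 0 && i < numRows, s"Expected 0 <= i < $numRows, got i = $i.")
    require(j >= 0 && j < numCols, s"Expected 0 <= j < $numCols, got j = $j.")
    if (!isTransposed) values(i + numRows * j) else values(j + numCols * i)
  }

  // Sums one column through the checked accessor.
  def colSumChecked(j: Int): Double = {
    var s = 0.0
    var i = 0
    while (i < numRows) { s += apply(i, j); i += 1 }
    s
  }

  // Same sum, with the bound check done once and the layout branch hoisted
  // out of the loop, so the inner loop is a plain array read.
  def colSumUnchecked(j: Int): Double = {
    require(j >= 0 && j < numCols, s"Expected 0 <= j < $numCols, got j = $j.")
    var s = 0.0
    var i = 0
    if (!isTransposed) {
      val base = numRows * j
      while (i < numRows) { s += values(base + i); i += 1 }
    } else {
      while (i < numRows) { s += values(j + numCols * i); i += 1 }
    }
    s
  }
}
```

Both methods return the same result; the second simply does less work per element, which is the kind of saving the commit message describes.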
Ruifeng Zheng	2802ac321f	[SPARK-35666][ML] gemv skip array shape checking ### What changes were proposed in this pull request? In existing implementations, the vector/matrix commonly needs to be sliced or copied just to match shapes, which makes the logic complex and introduces the extra cost of slicing and copying. ### Why are the changes needed? 1, avoid slicing and copying due to shape checking; 2, simplify the usages; ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing test suites Closes #32805 from zhengruifeng/new_blas_func_for_agg. Authored-by: Ruifeng Zheng <ruifengz@foxmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>	2021-06-16 08:54:34 +08:00
Ruifeng Zheng	8c4b535baf	[SPARK-35678][ML] add a common softmax function ### What changes were proposed in this pull request? Add a softmax function in utils. ### Why are the changes needed? It can be used in multiple places. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Existing test suites Closes #32822 from zhengruifeng/add_softmax_func. Authored-by: Ruifeng Zheng <ruifengz@foxmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>	2021-06-15 10:33:57 +08:00
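For readers unfamiliar with the function this commit factors out, here is a minimal sketch of a shared softmax helper. The object name `SoftmaxSketch` and the in-place signature are assumptions made for illustration; the commit message does not show the actual signature added to Spark's utils. The sketch subtracts the maximum before exponentiating, a common way to avoid overflow.

```scala
// Hypothetical helper for illustration; not the function added by the commit.
object SoftmaxSketch {
  // In-place, numerically stable softmax over a non-empty array.
  def softmax(values: Array[Double]): Unit = {
    require(values.nonEmpty, "softmax needs at least one value")

    // Find the maximum so that every exponent below is <= 0.
    var max = values(0)
    var i = 1
    while (i < values.length) {
      if (values(i) > max) max = values(i)
      i += 1
    }

    // Exponentiate the shifted values and accumulate their sum.
    var sum = 0.0
    i = 0
    while (i < values.length) {
      val e = math.exp(values(i) - max)
      values(i) = e
      sum += e
      i += 1
    }

    // Normalize so the entries sum to 1.
    i = 0
    while (i < values.length) {
      values(i) /= sum
      i += 1
    }
  }
}
```

Keeping such a helper in one place lets several callers share the same overflow-safe logic instead of each carrying its own copy.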
yangjie01	09d039da56	[SPARK-35526][CORE][SQL][ML][MLLIB] Re-Cleanup `procedure syntax is deprecated` compilation warning in Scala 2.13 ### What changes were proposed in this pull request? After SPARK-29291 and SPARK-33352, there are still some compilation warnings about `procedure syntax is deprecated` as follows: ``` [WARNING] [Warn] /spark/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:723: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `registerMergeResult`'s return type [WARNING] [Warn] /spark/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:748: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `unregisterMergeResult`'s return type [WARNING] [Warn] /spark/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala:223: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `testSimpleSpillingForAllCodecs`'s return type [WARNING] [Warn] /spark/mllib-local/src/test/scala/org/apache/spark/ml/linalg/BLASBenchmark.scala:53: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `runBLASBenchmark`'s return type [WARNING] [Warn] /spark/sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala:110: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `assertEmptyRootPath`'s return type [WARNING] [Warn] /spark/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala:602: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `executeCTASWithNonEmptyLocation`'s return type ``` So the main change of this PR is to clean up these compilation warnings.
### Why are the changes needed? Eliminate compilation warnings in Scala 2.13 and this change should be compatible with Scala 2.12 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass the Jenkins or GitHub Action Closes #32669 from LuciferYang/re-clean-procedure-syntax. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2021-05-30 16:49:47 -07:00
Ludovic Henry	b52d47a920	[SPARK-35295][ML] Replace fully com.github.fommil.netlib by dev.ludovic.netlib:2.0 ### What changes were proposed in this pull request? Bump to `dev.ludovic.netlib:2.0` which provides JNI-based wrappers for BLAS, ARPACK, and LAPACK. These do not take dependencies on GPL or LGPL libraries, which allows out-of-the-box support for hardware acceleration when a native library is present (it is still up to the end user to install such a library on their system, like OpenBLAS, Intel MKL, and libarpack2). ### Why are the changes needed? Significant performance improvement for ML-related workloads on vanilla distributions of Spark. ### Does this PR introduce _any_ user-facing change? Users now take advantage of hardware acceleration as long as a native library is installed (like OpenBLAS, Intel MKL and libarpack2). ### How was this patch tested? Spark test-suite + dev.ludovic.netlib testsuite. #### JDK8: ``` [info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java8BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 220 226 6 454.9 2.2 1.0X [info] java 221 228 5 451.9 2.2 1.0X [info] native 209 215 5 478.7 2.1 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 121 125 3 823.3 1.2 1.0X [info] java 121 125 3 824.3 1.2 1.0X [info] native 101 105 3 988.4 1.0 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
----------------------------------------------------------------------------------------------- [info] f2j 212 219 6 470.9 2.1 1.0X [info] java 208 212 4 481.0 2.1 1.0X [info] native 209 215 5 478.5 2.1 1.0X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 114 119 3 878.9 1.1 1.0X [info] java 99 105 3 1011.4 1.0 1.2X [info] native 97 103 3 1026.7 1.0 1.2X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 111 2 925.9 1.1 1.0X [info] java 71 73 2 1414.9 0.7 1.5X [info] native 54 56 2 1847.0 0.5 2.0X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 2 1046.8 1.0 1.0X [info] java 47 48 1 2129.8 0.5 2.0X [info] native 29 30 1 3404.7 0.3 3.3X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 139 143 2 718.2 1.4 1.0X [info] java 46 47 1 2171.2 0.5 3.0X [info] native 44 46 2 2261.8 0.4 3.1X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 154 157 4 651.0 1.5 1.0X [info] java 40 42 1 2469.3 0.4 3.8X [info] native 26 27 1 3787.6 0.3 5.8X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 185 195 8 541.0 1.8 1.0X [info] java 186 196 7 538.5 1.9 1.0X [info] native 177 
187 7 564.1 1.8 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 98 102 3 1016.2 1.0 1.0X [info] java 98 102 3 1017.8 1.0 1.0X [info] native 87 91 3 1143.2 0.9 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 68 70 1 1474.7 0.7 1.0X [info] java 51 52 1 1973.0 0.5 1.3X [info] native 30 32 1 3298.8 0.3 2.2X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 99 2 1037.9 1.0 1.0X [info] java 50 51 1 1999.6 0.5 1.9X [info] native 30 31 1 3368.1 0.3 3.2X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 59 61 1 1688.7 0.6 1.0X [info] java 41 42 1 2461.9 0.4 1.5X [info] native 15 16 1 6593.0 0.2 3.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 90 92 1 1116.2 0.9 1.0X [info] java 39 40 1 2565.8 0.4 2.3X [info] native 15 16 1 6594.2 0.2 5.9X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 192 202 7 520.5 1.9 1.0X [info] java 203 214 7 491.9 2.0 0.9X [info] native 176 187 7 568.8 1.8 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
----------------------------------------------------------------------------------------------- [info] f2j 59 61 1 846.1 1.2 1.0X [info] java 38 39 1 1313.5 0.8 1.6X [info] native 24 27 1 2047.8 0.5 2.4X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 97 101 3 515.4 1.9 1.0X [info] java 97 101 2 515.1 1.9 1.0X [info] native 88 91 3 569.1 1.8 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 169 174 3 295.4 3.4 1.0X [info] java 169 174 3 295.4 3.4 1.0X [info] native 160 165 4 312.2 3.2 1.1X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 561 577 13 1782.3 0.6 1.0X [info] java 225 231 4 4446.2 0.2 2.5X [info] native 31 32 3 32473.1 0.0 18.2X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 570 584 9 1754.8 0.6 1.0X [info] java 224 230 4 4457.3 0.2 2.5X [info] native 31 32 1 32493.4 0.0 18.5X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 855 866 6 1169.2 0.9 1.0X [info] java 224 228 3 4466.9 0.2 3.8X [info] native 31 32 1 32395.5 0.0 27.7X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1328 1344 8 752.8 1.3 1.0X [info] java 224 
230 4 4458.9 0.2 5.9X [info] native 31 32 1 32201.8 0.0 42.8X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 534 541 5 1873.0 0.5 1.0X [info] java 220 224 3 4542.8 0.2 2.4X [info] native 15 16 1 66803.1 0.0 35.7X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 544 551 6 1839.6 0.5 1.0X [info] java 220 224 4 4538.2 0.2 2.5X [info] native 15 16 1 65589.9 0.0 35.7X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 833 845 21 1201.0 0.8 1.0X [info] java 220 224 3 4548.7 0.2 3.8X [info] native 15 16 1 66603.2 0.0 55.5X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 899 907 5 1112.9 0.9 1.0X [info] java 221 224 2 4531.6 0.2 4.1X [info] native 15 16 1 65944.9 0.0 59.3X ``` #### JDK11: ``` [info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.Java11BLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 195 200 3 512.2 2.0 1.0X [info] java 197 202 3 507.0 2.0 1.0X [info] native 184 189 4 543.0 1.8 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) 
Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 112 3 921.8 1.1 1.0X [info] java 101 105 3 989.4 1.0 1.1X [info] native 87 91 3 1147.1 0.9 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 187 191 3 535.1 1.9 1.0X [info] java 182 188 3 548.8 1.8 1.0X [info] native 178 182 3 562.2 1.8 1.1X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 110 114 3 909.3 1.1 1.0X [info] java 86 93 4 1159.3 0.9 1.3X [info] native 86 90 3 1162.4 0.9 1.3X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 106 108 2 943.6 1.1 1.0X [info] java 70 71 2 1426.8 0.7 1.5X [info] native 54 56 2 1835.4 0.5 1.9X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 1 1047.1 1.0 1.0X [info] java 43 44 1 2331.9 0.4 2.2X [info] native 29 30 1 3392.1 0.3 3.2X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 114 115 2 880.7 1.1 1.0X [info] java 42 43 1 2398.1 0.4 2.7X [info] native 45 46 1 2233.3 0.4 2.5X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 140 143 2 714.6 1.4 1.0X [info] java 28 29 1 3531.0 0.3 4.9X 
[info] native 26 27 1 3820.0 0.3 5.3X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 156 166 7 641.3 1.6 1.0X [info] java 158 167 6 633.2 1.6 1.0X [info] native 150 160 7 664.8 1.5 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 85 88 2 1181.7 0.8 1.0X [info] java 85 88 2 1176.0 0.9 1.0X [info] native 75 78 2 1333.2 0.8 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 58 59 1 1731.1 0.6 1.0X [info] java 41 43 1 2415.5 0.4 1.4X [info] native 30 31 1 3293.9 0.3 1.9X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 94 96 1 1063.4 0.9 1.0X [info] java 41 42 1 2435.8 0.4 2.3X [info] native 30 30 1 3379.8 0.3 3.2X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 44 45 1 2278.9 0.4 1.0X [info] java 37 38 0 2686.8 0.4 1.2X [info] native 15 16 1 6555.4 0.2 2.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 88 89 1 1142.1 0.9 1.0X [info] java 33 34 1 3010.7 0.3 2.6X [info] native 15 16 1 6553.9 0.2 5.7X [info] [info] dger: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
----------------------------------------------------------------------------------------------- [info] f2j 164 172 4 609.4 1.6 1.0X [info] java 163 172 5 612.6 1.6 1.0X [info] native 150 159 4 667.0 1.5 1.1X [info] [info] dspmv[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 49 50 1 1029.4 1.0 1.0X [info] java 41 42 1 1209.4 0.8 1.2X [info] native 25 27 1 2029.2 0.5 2.0X [info] [info] dspr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 80 85 3 622.2 1.6 1.0X [info] java 80 85 3 622.4 1.6 1.0X [info] native 75 79 3 668.7 1.5 1.1X [info] [info] dsyr[U]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 137 142 3 364.1 2.7 1.0X [info] java 139 142 2 360.4 2.8 1.0X [info] native 131 135 3 380.4 2.6 1.0X [info] [info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 517 525 5 1935.5 0.5 1.0X [info] java 213 216 3 4704.8 0.2 2.4X [info] native 31 31 1 32705.6 0.0 16.9X [info] [info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 589 601 6 1698.6 0.6 1.0X [info] java 213 217 3 4693.3 0.2 2.8X [info] native 31 32 1 32498.9 0.0 19.1X [info] [info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 851 865 6 1175.3 0.9 1.0X [info] java 212 216 3 
4717.0 0.2 4.0X [info] native 30 32 1 32903.0 0.0 28.0X [info] [info] dgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 1301 1316 6 768.4 1.3 1.0X [info] java 212 216 2 4717.4 0.2 6.1X [info] native 31 32 1 32606.0 0.0 42.4X [info] [info] sgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 454 460 2 2203.0 0.5 1.0X [info] java 208 212 3 4803.8 0.2 2.2X [info] native 15 16 0 66586.0 0.0 30.2X [info] [info] sgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 529 536 4 1889.7 0.5 1.0X [info] java 208 212 3 4798.6 0.2 2.5X [info] native 15 16 1 66751.4 0.0 35.3X [info] [info] sgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 830 840 5 1205.1 0.8 1.0X [info] java 208 211 2 4814.1 0.2 4.0X [info] native 15 15 1 67676.4 0.0 56.2X [info] [info] sgemm[T,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 894 907 7 1118.7 0.9 1.0X [info] java 208 211 3 4809.6 0.2 4.3X [info] native 15 16 1 66675.2 0.0 59.6X ``` #### JDK16: ``` [info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic [info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz [info] [info] f2jBLAS = dev.ludovic.netlib.blas.F2jBLAS [info] javaBLAS = dev.ludovic.netlib.blas.VectorBLAS [info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS [info] [info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative 
[info] ----------------------------------------------------------------------------------------------- [info] f2j 193 199 3 517.5 1.9 1.0X [info] java 181 186 4 553.2 1.8 1.1X [info] native 181 185 5 553.6 1.8 1.1X [info] [info] saxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 108 112 2 925.1 1.1 1.0X [info] java 88 91 3 1138.6 0.9 1.2X [info] native 87 91 3 1144.2 0.9 1.2X [info] [info] dcopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 184 189 3 542.5 1.8 1.0X [info] java 181 185 3 552.8 1.8 1.0X [info] native 179 183 2 558.0 1.8 1.0X [info] [info] scopy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 97 101 3 1031.6 1.0 1.0X [info] java 86 90 2 1163.7 0.9 1.1X [info] native 85 88 2 1182.9 0.8 1.1X [info] [info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 107 109 2 932.4 1.1 1.0X [info] java 54 56 2 1846.7 0.5 2.0X [info] native 54 56 2 1846.7 0.5 2.0X [info] [info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 96 97 1 1043.6 1.0 1.0X [info] java 29 30 1 3439.3 0.3 3.3X [info] native 29 30 1 3423.9 0.3 3.3X [info] [info] dnrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 121 123 2 829.8 1.2 1.0X [info] java 32 32 1 3171.3 0.3 3.8X [info] 
native 45 46 1 2246.2 0.4 2.7X [info] [info] snrm2: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 142 144 2 705.9 1.4 1.0X [info] java 15 16 1 6585.8 0.2 9.3X [info] native 26 27 1 3839.5 0.3 5.4X [info] [info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 157 165 5 635.6 1.6 1.0X [info] java 151 159 5 664.0 1.5 1.0X [info] native 151 160 5 663.6 1.5 1.0X [info] [info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 85 89 2 1172.3 0.9 1.0X [info] java 75 79 3 1337.3 0.7 1.1X [info] native 75 79 2 1335.5 0.7 1.1X [info] [info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 58 59 1 1731.5 0.6 1.0X [info] java 28 29 1 3544.2 0.3 2.0X [info] native 30 31 1 3306.2 0.3 1.9X [info] [info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 90 92 1 1108.3 0.9 1.0X [info] java 28 28 1 3622.5 0.3 3.3X [info] native 30 31 1 3381.3 0.3 3.1X [info] [info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] ----------------------------------------------------------------------------------------------- [info] f2j 44 45 1 2284.7 0.4 1.0X [info] java 14 15 1 7034.0 0.1 3.1X [info] native 15 16 1 6643.7 0.2 2.9X [info] [info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative [info] 
-----------------------------------------------------------------------------------------------
[info] f2j                         85             86           1       1177.4           0.8       1.0X
[info] java                        15             15           1       6886.1           0.1       5.8X
[info] native                      15             16           1       6560.1           0.2       5.6X
[info]
[info] dger:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        164            173           6        608.1           1.6       1.0X
[info] java                       148            157           5        675.2           1.5       1.1X
[info] native                     152            160           5        659.9           1.5       1.1X
[info]
[info] dspmv[U]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         61             63           1        815.4           1.2       1.0X
[info] java                        16             17           1       3104.3           0.3       3.8X
[info] native                      24             27           1       2071.9           0.5       2.5X
[info]
[info] dspr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         81             85           2        616.4           1.6       1.0X
[info] java                        81             85           2        614.7           1.6       1.0X
[info] native                      75             78           2        669.5           1.5       1.1X
[info]
[info] dsyr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        138            141           3        362.7           2.8       1.0X
[info] java                       137            140           2        365.3           2.7       1.0X
[info] native                     131            134           2        382.9           2.6       1.1X
[info]
[info] dgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        525            544           8       1906.2           0.5       1.0X
[info] java                        61             68           3      16358.1           0.1       8.6X
[info] native                      31             32           1      32623.7           0.0      17.1X
[info]
[info] dgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        580            598          12       1724.5           0.6       1.0X
[info] java                        61             68           4      16302.5           0.1       9.5X
[info] native                      30             32           1      32962.8           0.0      19.1X
[info]
[info] dgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        829            838           4       1206.2           0.8       1.0X
[info] java                        61             69           3      16339.7           0.1      13.5X
[info] native                      30             31           1      33231.9           0.0      27.6X
[info]
[info] dgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                       1352           1363           5        739.6           1.4       1.0X
[info] java                        61             69           3      16347.0           0.1      22.1X
[info] native                      31             32           1      32740.3           0.0      44.3X
[info]
[info] sgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        482            493           7       2073.1           0.5       1.0X
[info] java                        35             38           2      28315.3           0.0      13.7X
[info] native                      15             15           1      67579.7           0.0      32.6X
[info]
[info] sgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        472            482           4       2119.0           0.5       1.0X
[info] java                        36             38           2      28138.1           0.0      13.3X
[info] native                      15             16           1      66616.5           0.0      31.4X
[info]
[info] sgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        823            830           5       1215.2           0.8       1.0X
[info] java                        35             38           2      28681.4           0.0      23.6X
[info] native                      15             15           1      67908.4           0.0      55.9X
[info]
[info] sgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        896            908           7       1115.8           0.9       1.0X
[info] java                        35             38           2      28402.0           0.0      25.5X
[info] native                      15             16           0      66691.2           0.0      59.8X
```
TODO:
- [x] update documentation in `docs/` and `docs/ml-linalg-guide.md` referring to `com.github.fommil.netlib`
- [ ] merge https://github.com/luhenry/netlib/pull/1 with all feedback from this PR + remove references to snapshot repositories in `pom.xml` and `project/SparkBuild.scala`.
Closes #32415 from luhenry/master. Authored-by: Ludovic Henry <git@ludovic.dev> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-05-12 08:59:36 -05:00
byungsoo	94bbca3e55	[SPARK-35306][MLLIB][TESTS] Add benchmark results for BLASBenchmark created by GitHub Actions machines ### What changes were proposed in this pull request? This PR adds benchmark results for `BLASBenchmark` created by GitHub Actions machines. Benchmark result files are added for both JDK 8 (`BLASBenchmark-result.txt`) and 11 (`BLASBenchmark-jdk11-result.txt`) in `{SPARK_HOME}/mllib-local/benchmarks/`. ### Why are the changes needed? In [SPARK-34950](https://issues.apache.org/jira/browse/SPARK-34950), benchmark results were updated to the ones created by Github Actions machines. As benchmark results for `BLASBenchmark` (added at [SPARK-33882](https://issues.apache.org/jira/browse/SPARK-33882) and [SPARK-35150](https://issues.apache.org/jira/browse/SPARK-35150)) are not currently available at the repository, this PR adds them. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? The benchmark results were obtained by running tests with GitHub Actions workflow in my forked repository. You can refer to the test results and output files from the link below. - https://github.com/byungsoo-oh/spark/actions/runs/809900377 - https://github.com/byungsoo-oh/spark/actions/runs/810084610 Closes #32435 from byungsoo-oh/SPARK-35306. Authored-by: byungsoo <byungsoo@byungsoo-pc.tn.corp.samsungelectronics.net> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2021-05-07 11:08:10 +09:00
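For readers of the `BLASBenchmark` result tables in these commits, the derived columns appear to follow from the best time and the number of rows processed per run. The sketch below is an illustration under that assumption, not Spark's benchmark code; the class `BenchmarkColumns` and its method names are hypothetical. Because the printed times are rounded to whole milliseconds, recomputing `Relative` from the printed integers only approximates the printed value.

```java
/** Hypothetical helper that recomputes the derived columns of one benchmark row. */
final class BenchmarkColumns {
    private BenchmarkColumns() {}

    /** Rate(M/s): millions of rows per second, from the row count and best time in ms. */
    static double rateMillionsPerSec(long rows, double bestMs) {
        // rows / (bestMs / 1000 s) / 1e6 simplifies to rows / (bestMs * 1000)
        return rows / (bestMs * 1000.0);
    }

    /** Per Row(ns): nanoseconds spent per row, the inverse of the rate. */
    static double perRowNs(double rateMillionsPerSec) {
        return 1000.0 / rateMillionsPerSec;
    }

    /** Relative: speed-up over the baseline case (the first row, here f2j). */
    static double relative(double baselineBestMs, double bestMs) {
        return baselineBestMs / bestMs;
    }
}
```

For example, 100 million rows in a best time of 100 ms gives a rate of 1000 M/s and 1 ns per row.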
Ludovic Henry	5b77ebb57b	[SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib ### What changes were proposed in this pull request? Following https://github.com/apache/spark/pull/30810, I've continued looking for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate work done in the [`dev.ludovic.netlib`](https://github.com/luhenry/netlib/) Maven package. The `dev.ludovic.netlib` library wraps the original `com.github.fommil.netlib` library and focuses on accelerating the linear algebra routines in use in Spark. When running the `org.apache.spark.ml.linalg.BLASBenchmark` benchmarking suite, I get the results at [1] on an Intel machine. Moreover, this library is thoroughly tested to return the exact same results as the reference implementation. Under the hood, it reimplements the necessary algorithms in pure autovectorization-friendly Java 8, and also takes advantage of the Vector API and Foreign Linker API introduced in JDK 16 when available. A table summarising which version gets loaded in which case:
```
|                       | BLAS.nativeBLAS                                    | BLAS.javaBLAS                                      |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| with -Pnetlib-lgpl    | 1. dev.ludovic.netlib.blas.NetlibNativeBLAS, a     | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
|                       |    wrapper for com.github.fommil:all               |    (JDK16+, relies on the Vector API, requires     |
|                       | 2. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    |    `--add-modules=jdk.incubator.vector` on JDK16)  |
|                       |    relies on the Foreign Linker API, requires      | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
|                       |    `--add-modules=jdk.incubator.foreign            | 3. dev.ludovic.netlib.blas.JavaBLAS                |
|                       |    -Dforeign.restricted=warn`)                     | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
|                       | 3. fails to load, falls back to BLAS.javaBLAS in   |    wrapper for com.github.fommil:core              |
|                       |    org.apache.spark.ml.linalg.BLAS                 |                                                    |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
|                       |    relies on the Foreign Linker API, requires      |    (JDK16+, relies on the Vector API, requires     |
|                       |    `--add-modules=jdk.incubator.foreign            |    `--add-modules=jdk.incubator.vector` on JDK16)  |
|                       |    -Dforeign.restricted=warn`)                     | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
|                       | 2. fails to load, falls back to BLAS.javaBLAS in   | 3. dev.ludovic.netlib.blas.JavaBLAS                |
|                       |    org.apache.spark.ml.linalg.BLAS                 | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
|                       |                                                    |    wrapper for com.github.fommil:core              |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
```
### Why are the changes needed? Accelerates linear algebra operations when the pure-java fallback method is in use. Transparently falls back to a native implementation (OpenBLAS, MKL) when available. ### Does this PR introduce _any_ user-facing change? No, all changes are transparent to the user. ### How was this patch tested? The `dev.ludovic.netlib` library has its own test suite [2]. It has also been validated by running the Spark test suite and benchmarking suite. 
[1] Results for `org.apache.spark.ml.linalg.BLASBenchmark`:
#### JDK8:
```
[info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz
[info]
[info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS = dev.ludovic.netlib.blas.Java8BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.Java8BLAS
[info]
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 223            232           8        448.0           2.2       1.0X
[info] java                                                221            228           7        453.0           2.2       1.0X
[info]
[info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 122            128           4        821.2           1.2       1.0X
[info] java                                                122            128           4        822.3           1.2       1.0X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 109            112           2        921.4           1.1       1.0X
[info] java                                                 70             74           3       1423.5           0.7       1.5X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  96             98           2       1046.1           1.0       1.0X
[info] java                                                 47             49           2       2121.7           0.5       2.0X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 184            195           8        544.3           1.8       1.0X
[info] java                                                185            196           7        539.5           1.9       1.0X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  99            104           4       1011.9           1.0       1.0X
[info] java                                                 99            104           4       1010.4           1.0       1.0X
[info]
[info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        947.2           1.1       1.0X
[info] java                                                  0              0           0       1584.8           0.6       1.7X
[info]
[info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        867.4           1.2       1.0X
[info] java                                                  1              1           0        865.0           1.2       1.0X
[info]
[info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        485.9           2.1       1.0X
[info] java                                                  1              1           0        486.8           2.1       1.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1843.0           0.5       1.0X
[info] java                                                  0              0           0       2690.6           0.4       1.5X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1214.7           0.8       1.0X
[info] java                                                  0              0           0       2536.8           0.4       2.1X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1895.9           0.5       1.0X
[info] java                                                  0              0           0       2961.1           0.3       1.6X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1223.4           0.8       1.0X
[info] java                                                  0              0           0       3091.4           0.3       2.5X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 560            575          20       1787.1           0.6       1.0X
[info] java                                                226            232           5       4432.4           0.2       2.5X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 570            586          23       1755.2           0.6       1.0X
[info] java                                                227            232           4       4410.1           0.2       2.5X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 863            879          17       1158.4           0.9       1.0X
[info] java                                                227            231           3       4407.9           0.2       3.8X
[info]
[info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                1282           1305          23        780.0           1.3       1.0X
[info] java                                                227            232           4       4413.4           0.2       5.7X
[info]
[info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 538            548           8       1858.6           0.5       1.0X
[info] java                                                221            226           3       4521.1           0.2       2.4X
[info]
[info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 549            558          10       1819.9           0.5       1.0X
[info] java                                                222            229           7       4503.5           0.2       2.5X
[info]
[info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 838            852          12       1193.0           0.8       1.0X
[info] java                                                222            229           5       4500.5           0.2       3.8X
[info]
[info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 905            919          18       1104.8           0.9       1.0X
[info] java                                                221            228           5       4521.3           0.2       4.1X
```
#### JDK11:
```
[info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz
[info]
[info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS = dev.ludovic.netlib.blas.Java11BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.Java11BLAS
[info]
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 195            204          10        512.7           2.0       1.0X
[info] java                                                195            202           7        512.4           2.0       1.0X
[info]
[info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 108            113           4        923.3           1.1       1.0X
[info] java                                                102            107           4        984.4           1.0       1.1X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 107            110           3        938.1           1.1       1.0X
[info] java                                                 69             72           3       1447.1           0.7       1.5X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  96             98           2       1046.5           1.0       1.0X
[info] java                                                 43             45           2       2317.1           0.4       2.2X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 155            168           8        644.2           1.6       1.0X
[info] java                                                158            169           8        632.8           1.6       1.0X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  85             90           4       1178.1           0.8       1.0X
[info] java                                                 86             90           4       1167.7           0.9       1.0X
[info]
[info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       1182.1           0.8       1.0X
[info] java                                                  0              0           0       1432.1           0.7       1.2X
[info]
[info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        898.7           1.1       1.0X
[info] java                                                  1              1           0        891.5           1.1       1.0X
[info]
[info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        495.4           2.0       1.0X
[info] java                                                  1              1           0        495.7           2.0       1.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2271.6           0.4       1.0X
[info] java                                                  0              0           0       3648.1           0.3       1.6X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1229.3           0.8       1.0X
[info] java                                                  0              0           0       2711.3           0.4       2.2X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2677.5           0.4       1.0X
[info] java                                                  0              0           0       3288.2           0.3       1.2X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1233.0           0.8       1.0X
[info] java                                                  0              0           0       2766.3           0.4       2.2X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 520            536          16       1923.6           0.5       1.0X
[info] java                                                214            221           7       4669.5           0.2       2.4X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 593            612          17       1686.5           0.6       1.0X
[info] java                                                215            219           3       4643.3           0.2       2.8X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 853            870          16       1172.8           0.9       1.0X
[info] java                                                215            218           3       4659.7           0.2       4.0X
[info]
[info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                1350           1370          23        740.8           1.3       1.0X
[info] java                                                215            219           4       4656.6           0.2       6.3X
[info]
[info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 460            468           6       2173.2           0.5       1.0X
[info] java                                                210            213           2       4752.7           0.2       2.2X
[info]
[info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 535            544           8       1869.3           0.5       1.0X
[info] java                                                210            215           5       4761.8           0.2       2.5X
[info]
[info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 843            853          11       1186.8           0.8       1.0X
[info] java                                                209            214           4       4793.4           0.2       4.0X
[info]
[info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 891            904          15       1122.0           0.9       1.0X
[info] java                                                209            214           4       4777.2           0.2       4.3X
```
#### JDK16:
```
[info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU 3.80GHz
[info]
[info] f2jBLAS = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS = dev.ludovic.netlib.blas.VectorizedBLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.VectorizedBLAS
[info]
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 194            199           7        515.7           1.9       1.0X
[info] java                                                181            186           3        551.1           1.8       1.1X
[info]
[info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 109            115           4        915.0           1.1       1.0X
[info] java                                                 88             92           3       1138.8           0.9       1.2X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 108            110           2        922.6           1.1       1.0X
[info] java                                                 54             56           2       1839.2           0.5       2.0X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  96             97           2       1046.1           1.0       1.0X
[info] java                                                 29             30           1       3393.4           0.3       3.2X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 156            165           5        643.0           1.6       1.0X
[info] java                                                150            159           5        667.1           1.5       1.0X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  85             91           6       1171.0           0.9       1.0X
[info] java                                                 75             79           3       1340.6           0.7       1.1X
[info]
[info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        917.0           1.1       1.0X
[info] java                                                  0              0           0       8147.2           0.1       8.9X
[info]
[info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        859.3           1.2       1.0X
[info] java                                                  1              1           0        859.3           1.2       1.0X
[info]
[info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        482.1           2.1       1.0X
[info] java                                                  1              1           0        482.6           2.1       1.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2214.2           0.5       1.0X
[info] java                                                  0              0           0       7975.8           0.1       3.6X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1231.4           0.8       1.0X
[info] java                                                  0              0           0       8680.9           0.1       7.0X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2684.3           0.4       1.0X
[info] java                                                  0              0           0      18527.1           0.1       6.9X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1235.4           0.8       1.0X
[info] java                                                  0              0           0      17347.9           0.1      14.0X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 530            552          18       1887.5           0.5       1.0X
[info] java                                                 58             64           3      17143.9           0.1       9.1X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 598            620          17       1671.1           0.6       1.0X
[info] java                                                 58             64           3      17196.6           0.1      10.3X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 834            847          14       1199.4           0.8       1.0X
[info] java                                                 57             63           4      17486.9           0.1      14.6X
[info]
[info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                1338           1366          22        747.3           1.3       1.0X
[info] java                                                 58             63           3      17356.6           0.1      23.2X
[info]
[info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 489            501           9       2045.5           0.5       1.0X
[info] java                                                 36             38           2      27721.9           0.0      13.6X
[info]
[info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 478            488           9       2094.0           0.5       1.0X
[info] java                                                 36             38           2      27813.2           0.0      13.3X
[info]
[info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 825            837          10       1211.6           0.8       1.0X
[info] java                                                 35             38           2      28433.1           0.0      23.5X
[info]
[info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 900            918          15       1111.6           0.9       1.0X
[info] java                                                 36             38           2      28073.0           0.0      25.3X
```
[2] https://github.com/luhenry/netlib/tree/master/blas/src/test/java/dev/ludovic/netlib/blas Closes #32253 from luhenry/master. Authored-by: Ludovic Henry <git@ludovic.dev> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-04-27 14:00:59 -05:00
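The loading order in the SPARK-35150 message (try the most capable implementation first, fall back to the next when it fails to load) can be sketched as a generic "first available" loader. This is a minimal illustration, not the code in `dev.ludovic.netlib` or Spark; `FallbackLoader` and `firstAvailable` are hypothetical names, and the candidates are plain suppliers rather than real BLAS classes.

```java
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of an ordered fallback chain between implementations. */
final class FallbackLoader {
    private FallbackLoader() {}

    /**
     * Tries each candidate factory in order and returns the first instance that loads.
     * A candidate "fails to load" when its factory throws, for example because a
     * JDK module or a native library is missing.
     */
    static <T> T firstAvailable(List<Supplier<T>> candidates) {
        Throwable lastFailure = null;
        for (Supplier<T> candidate : candidates) {
            try {
                T impl = candidate.get();
                if (impl != null) {
                    return impl;
                }
            } catch (RuntimeException | LinkageError e) {
                // UnsatisfiedLinkError and NoClassDefFoundError are LinkageErrors.
                lastFailure = e;
            }
        }
        throw new IllegalStateException("no implementation could be loaded", lastFailure);
    }
}
```

With candidates ordered as in one column of the table, a failing native or vectorized entry is skipped and the next pure-Java entry is returned.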
Ludovic Henry	9244066ca6	[SPARK-33882][ML] Add a vectorized BLAS implementation ### What changes were proposed in this pull request? This patch introduces a VectorizedBLAS class which implements hardware-accelerated BLAS operations. This feature is hidden behind the "vectorized" profile that you can enable by passing "-Pvectorized" to sbt or maven. The Vector API was introduced in JDK 16. Following discussion on the mailing list, this API is introduced transparently and needs to be enabled explicitly. ### Why are the changes needed? Whenever a native BLAS implementation isn't available on the system, Spark automatically falls back onto a Java implementation. With the recent release of the Vector API in the OpenJDK [1], we can use hardware acceleration for such operations. This change was also discussed on the mailing list. [2] ### Does this PR introduce _any_ user-facing change? It introduces a build-time profile called `vectorized`. You can pass it to sbt and mvn with `-Pvectorized`. There is no change for the end user of Spark and it should only impact Spark developers. It is also disabled by default. ### How was this patch tested? It passes `build/sbt mllib-local/test` with and without `-Pvectorized` with JDK 16. This patch also introduces benchmarks for BLAS. 
The benchmark results are as follows:
```
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  37             37           0        271.5           3.7       1.0X
[info] vector                                               24             25           4        416.1           2.4       1.5X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  70             70           0        143.2           7.0       1.0X
[info] vector                                               35             35           2        288.7           3.5       2.0X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  50             51           1        199.8           5.0       1.0X
[info] vector                                               15             15           0        648.7           1.5       3.2X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  34             34           0        295.6           3.4       1.0X
[info] vector                                               19             19           0        531.2           1.9       1.8X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  25             25           1        399.0           2.5       1.0X
[info] vector                                                8              9           1       1177.3           0.8       3.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  27             27           0          0.0       26651.5       1.0X
[info] vector                                               21             21           0          0.0       20646.3       1.3X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  36             36           0          0.0       35501.4       1.0X
[info] vector                                               22             22           0          0.0       21930.3       1.6X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  20             20           0          0.0       20283.3       1.0X
[info] vector                                                9              9           0          0.1        8657.7       2.3X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  30             30           0          0.0       29845.8       1.0X
[info] vector                                               10             10           1          0.1        9695.4       3.1X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 182            182           0          0.5        1820.0       1.0X
[info] vector                                              160            160           1          0.6        1597.6       1.1X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 211            211           1          0.5        2106.2       1.0X
[info] vector                                              156            157           0          0.6        1564.4       1.3X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 276            276           0          0.4        2757.8       1.0X
[info] vector                                              137            137           0          0.7        1365.1       2.0X
```
/cc srowen xkrogen [1] https://openjdk.java.net/jeps/338 [2] https://mail-archives.apache.org/mod_mbox/spark-dev/202012.mbox/%3cDM5PR2101MB11106162BB3AF32AD29C6C79B0C69DM5PR2101MB1110.namprd21.prod.outlook.com%3e Closes #30810 from luhenry/master. Lead-authored-by: Ludovic Henry <luhenry@microsoft.com> Co-authored-by: Ludovic Henry <git@ludovic.dev> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-04-14 11:36:58 -05:00
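As a rough illustration of why processing several elements per loop iteration can speed up a routine such as `ddot`, here is a plain-Java dot product with four independent accumulators. This deliberately swaps in a scalar technique: it is not the Vector API code from the patch and not the library's implementation, and `DotSketch` is a hypothetical name. Summing in a different order than a strictly sequential loop can change the last bits of a floating-point result, so a real implementation needs to check its results against the reference.

```java
/** Hypothetical sketch: dot product with four independent partial sums. */
final class DotSketch {
    private DotSketch() {}

    /** Returns the sum of x[i] * y[i] for i in [0, n). */
    static double ddot(int n, double[] x, double[] y) {
        double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
        int i = 0;
        // Main loop: four independent multiply-adds per iteration, which a JIT
        // compiler can schedule in parallel or map onto SIMD lanes.
        for (; i + 3 < n; i += 4) {
            s0 += x[i] * y[i];
            s1 += x[i + 1] * y[i + 1];
            s2 += x[i + 2] * y[i + 2];
            s3 += x[i + 3] * y[i + 3];
        }
        double sum = (s0 + s1) + (s2 + s3);
        // Tail loop for the remaining n % 4 elements.
        for (; i < n; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }
}
```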
Ruifeng Zheng	47da944f59	[SPARK-34470][ML] VectorSlicer utilize ordering if possible ### What changes were proposed in this pull request? 1, add a new param `sorted` in `slice`; 2, in `VectorSlicer`, set `sorted = true` if input indices are ordered. ### Why are the changes needed? The input indices of VectorSlicer are probably ordered. VectorSlicer should use this attribute if possible. I did a simple test, and `sorted = true` may be about 70% faster than the existing `slice`. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? added testsuite Closes #31588 from zhengruifeng/vector_slice_for_sorted_indices. Authored-by: Ruifeng Zheng <ruifengz@foxmail.com> Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>	2021-03-22 09:46:53 +08:00
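To illustrate why ordered indices help in SPARK-34470 above, here is a minimal sketch of slicing a sparse vector with a two-cursor merge. It is not Spark's `slice` implementation: the object `SortedSliceSketch`, the method `sliceSorted`, and the array-based vector representation are hypothetical, and both index arrays are assumed to be strictly increasing.

```scala
// Hypothetical sketch, not Spark's code: slice a sparse vector when the
// selected positions are already sorted.
object SortedSliceSketch {
  /** `indices`/`values` hold the non-zero entries of a sparse vector, ordered by index.
    * `selected` lists the positions to keep, assumed strictly increasing.
    * Returns the non-zero entries of the slice, re-indexed relative to `selected`. */
  def sliceSorted(
      indices: Array[Int],
      values: Array[Double],
      selected: Array[Int]): (Array[Int], Array[Double]) = {
    val outIdx = Array.newBuilder[Int]
    val outVal = Array.newBuilder[Double]
    var i = 0 // cursor into the vector's non-zero entries
    var j = 0 // cursor into the selected positions
    while (i < indices.length && j < selected.length) {
      if (indices(i) == selected(j)) {
        outIdx += j // position j of the slice is non-zero
        outVal += values(i)
        i += 1
        j += 1
      } else if (indices(i) < selected(j)) {
        i += 1 // this non-zero entry is not selected
      } else {
        j += 1 // this selected position holds a zero
      }
    }
    (outIdx.result(), outVal.result())
  }
}
```

Because both arrays are ordered, a single pass is enough and no lookup table from position to output slot has to be built, which is one plausible source of the speedup the commit reports.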
yangjie01	8b1ba233f1	[SPARK-34068][CORE][SQL][MLLIB][GRAPHX] Remove redundant collection conversion ### What changes were proposed in this pull request? There are some redundant collection conversion can be removed, for version compatibility, clean up these with Scala-2.13 profile. ### Why are the changes needed? Remove redundant collection conversion ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? - Pass the Jenkins or GitHub Action - Manual test `core`, `graphx`, `mllib`, `mllib-local`, `sql`, `yarn`,`kafka-0-10` in Scala 2.13 passed Closes #31125 from LuciferYang/SPARK-34068. Authored-by: yangjie01 <yangjie01@baidu.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2021-01-13 18:07:02 -06:00
Dongjoon Hyun	de9818f043	[SPARK-33662][BUILD] Setting version to 3.2.0-SNAPSHOT ### What changes were proposed in this pull request? This PR aims to update `master` branch version to 3.2.0-SNAPSHOT. ### Why are the changes needed? Start to prepare Apache Spark 3.2.0. ### Does this PR introduce _any_ user-facing change? N/A. ### How was this patch tested? Pass the CIs. Closes #30606 from dongjoon-hyun/SPARK-3.2. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2020-12-04 14:10:42 -08:00
Dongjoon Hyun	3ce4ab545b	[SPARK-33513][BUILD] Upgrade to Scala 2.13.4 to improve exhaustivity ### What changes were proposed in this pull request? This PR aims the followings. 1. Upgrade from Scala 2.13.3 to 2.13.4 for Apache Spark 3.1 2. Fix exhaustivity issues in both Scala 2.12/2.13 (Scala 2.13.4 requires this for compilation.) 3. Enforce the improved exhaustive check by using the existing Scala 2.13 GitHub Action compilation job. ### Why are the changes needed? Scala 2.13.4 is a maintenance release for 2.13 line and improves JDK 15 support. - https://github.com/scala/scala/releases/tag/v2.13.4 Also, it improves exhaustivity check. - https://github.com/scala/scala/pull/9140 (Check exhaustivity of pattern matches with "if" guards and custom extractors) - https://github.com/scala/scala/pull/9147 (Check all bindings exhaustively, e.g. tuples components) ### Does this PR introduce _any_ user-facing change? Yep. Although it's a maintenance version change, it's a Scala version change. ### How was this patch tested? Pass the CIs and do the manual testing. - Scala 2.12 CI jobs(GitHub Action/Jenkins UT/Jenkins K8s IT) to check the validity of code change. - Scala 2.13 Compilation job to check the compilation Closes #30455 from dongjoon-hyun/SCALA_3.13. Authored-by: Dongjoon Hyun <dongjoon@apache.org> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2020-11-23 16:28:43 -08:00
zhengruifeng	432afac07e	[SPARK-32907][ML] adaptively blockify instances - revert blockify gmm ### What changes were proposed in this pull request? revert blockify gmm ### Why are the changes needed? WeichenXu123 and I thought we should use memory size instead of number of rows to blockify instances; then if a buffer's size is large and determined by number of rows, we should discard it. In GMM, we found that the pre-allocated memory may be too large and should be discarded: ``` transient private lazy val auxiliaryPDFMat = DenseMatrix.zeros(blockSize, numFeatures) ``` We had some offline discussion and thought it is better to revert blockify GMM. ### Does this PR introduce _any_ user-facing change? blockSize added in master branch will be removed ### How was this patch tested? existing testsuites Closes #29782 from zhengruifeng/unblockify_gmm. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: zhengruifeng <ruifengz@foxmail.com>	2020-09-23 15:54:56 +08:00
Kousuke Saruta	b121f0d459	[SPARK-32873][BUILD] Fix code which causes error when build with sbt and Scala 2.13 ### What changes were proposed in this pull request? This PR fix code which causes error when build with sbt and Scala 2.13 like as follows. ``` [error] [warn] /home/kou/work/oss/spark-scala-2.13/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala:251: method with a single empty parameter list overrides method without any parameter list [error] [warn] override def hasNext(): Boolean = requestOffset < part.untilOffset [error] [warn] [error] [warn] /home/kou/work/oss/spark-scala-2.13/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala:294: method with a single empty parameter list overrides method without any parameter list [error] [warn] override def hasNext(): Boolean = okNext ``` More specifically, what this PR fixes are * Methods which has an empty parameter list and overrides an method which has no parameter list. ``` override def hasNext(): Boolean = okNext ``` * Methods which has no parameter list and overrides an method which has an empty parameter list. ``` override def next: (Int, Double) = { ``` * Infix operator expression that the operator wraps. ``` 3L * math.min(k, numFeatures) * math.min(k, numFeatures) 3L * math.min(k, numFeatures) * math.min(k, numFeatures) + + math.max(math.max(k, numFeatures), 4L * math.min(k, numFeatures) math.max(math.max(k, numFeatures), 4L * math.min(k, numFeatures) * * math.min(k, numFeatures) + 4L * math.min(k, numFeatures)) ``` ### Why are the changes needed? For building Spark with sbt and Scala 2.13. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? After this change and #29742 applied, compile passed with the following command. ``` build/sbt -Pscala-2.13 -Phive -Phive-thriftserver -Pyarn -Pkubernetes compile test:compile ``` Closes #29745 from sarutak/fix-code-for-sbt-and-spark-2.13. 
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2020-09-14 15:34:58 +09:00
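As a small illustration of the patterns fixed in SPARK-32873 above, the sketch below overrides `Iterator` methods with parameter lists that match the declarations (`hasNext` has none, `next()` has an empty one) and wraps an infix expression with the operator at the end of the line. The class `Countdown` and the object `WrapSketch` are hypothetical names, not code from the PR.

```scala
// Hypothetical example, not taken from the PR.
class Countdown(start: Int) extends Iterator[Int] {
  private var n = start

  // Iterator declares `hasNext` without a parameter list, so the override has none.
  override def hasNext: Boolean = n > 0

  // Iterator declares `next()` with an empty parameter list, so the override keeps it.
  override def next(): Int = {
    val current = n
    n -= 1
    current
  }
}

object WrapSketch {
  val k = 3L

  // The operator stays at the end of the first line; in Scala 2 a line that
  // begins with `+` would be parsed as a separate statement.
  val total: Long = 3L * k +
    4L * k
}
```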
Sean Owen	be2eca22e9	[SPARK-32398][TESTS][CORE][STREAMING][SQL][ML] Update to scalatest 3.2.0 for Scala 2.13.3+ ### What changes were proposed in this pull request? Updates to scalatest 3.2.0. Though it looks large, it is 99% changes to the new location of scalatest classes. ### Why are the changes needed? 3.2.0+ has a fix that is required for Scala 2.13.3+ compatibility. ### Does this PR introduce _any_ user-facing change? No, only affects tests. ### How was this patch tested? Existing tests. Closes #29196 from srowen/SPARK-32398. Authored-by: Sean Owen <srowen@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>	2020-07-23 16:20:17 -07:00
zhengruifeng	e7fa778dc7	[SPARK-30699][ML][PYSPARK] GMM blockify input vectors ### What changes were proposed in this pull request? 1, add new param blockSize; 2, if blockSize==1, keep original behavior, code path trainOnRows; 3, if blockSize>1, standardize and stack input vectors to blocks (like ALS/MLP), code path trainOnBlocks ### Why are the changes needed? performance gain on dense dataset HIGGS: 1, save about 45% RAM; 2, 3X faster with openBLAS ### Does this PR introduce any user-facing change? add a new expert param `blockSize` ### How was this patch tested? added testsuites Closes #27473 from zhengruifeng/blockify_gmm. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: zhengruifeng <ruifengz@foxmail.com>	2020-05-12 12:54:03 +08:00
zhengruifeng	052ff49acd	[SPARK-30659][ML][PYSPARK] LogisticRegression blockify input vectors ### What changes were proposed in this pull request? 1, reorg the `fit` method in LR to several blocks (`createModel`, `createBounds`, `createOptimizer`, `createInitCoefWithInterceptMatrix`); 2, add new param blockSize; 3, if blockSize==1, keep original behavior, code path `trainOnRows`; 4, if blockSize>1, standardize and stack input vectors to blocks (like ALS/MLP), code path `trainOnBlocks` ### Why are the changes needed? On dense dataset `epsilon_normalized.t`: 1, reduce RAM to persist training dataset; (save about 40% RAM) 2, use Level-2 BLAS routines; (4x ~ 5x faster) ### Does this PR introduce _any_ user-facing change? Yes, a new param is added ### How was this patch tested? existing and added testsuites Closes #28458 from zhengruifeng/blockify_lor_II. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: zhengruifeng <ruifengz@foxmail.com>	2020-05-07 10:07:24 +08:00
zhengruifeng	ebdf41dd69	[SPARK-30642][ML][PYSPARK] LinearSVC blockify input vectors ### What changes were proposed in this pull request? 1, add new param `blockSize`; 2, add a new class InstanceBlock; 3, if `blockSize==1`, keep original behavior; if `blockSize>1`, stack input vectors to blocks (like ALS/MLP); 4, if `blockSize>1`, standardize the input outside of optimization procedure; ### Why are the changes needed? 1, reduce RAM to persist training dataset; (save about 40% RAM) 2, use Level-2 BLAS routines; (4x ~ 5x faster on dataset `epsilon`) ### Does this PR introduce any user-facing change? Yes, a new param is added ### How was this patch tested? existing and added testsuites Closes #28349 from zhengruifeng/blockify_svc_II. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: zhengruifeng <ruifengz@foxmail.com>	2020-05-06 10:06:23 +08:00
zhengruifeng	0ede08bcb2	[SPARK-31007][ML] KMeans optimization based on triangle-inequality ### What changes were proposed in this pull request? apply Lemma 1 in [Using the Triangle Inequality to Accelerate K-Means](https://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf): > Let x be a point, and let b and c be centers. If d(b,c)>=2d(x,b) then d(x,c) >= d(x,b); It can be directly applied in EuclideanDistance, but not in CosineDistance. However, for CosineDistance we can luckily get a variant in the space of radian/angle. ### Why are the changes needed? It helps improve the performance of prediction and (mostly) training ### Does this PR introduce any user-facing change? No ### How was this patch tested? existing testsuites Closes #27758 from zhengruifeng/km_triangle. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2020-04-24 11:24:15 -05:00
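Lemma 1 quoted in SPARK-31007 above follows from the triangle inequality: d(b,c) <= d(x,b) + d(x,c), so d(x,c) >= d(b,c) - d(x,b) >= 2d(x,b) - d(x,b) = d(x,b). The sketch below shows how that bound can skip distance computations when searching for the nearest center under Euclidean distance. It is an illustration only, not Spark's KMeans code; `TrianglePruningSketch`, `dist`, `pairwise`, and `nearest` are hypothetical names.

```scala
// Hypothetical sketch of Lemma 1 pruning with Euclidean distance; not Spark's code.
object TrianglePruningSketch {
  def dist(a: Array[Double], b: Array[Double]): Double = {
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      val d = a(i) - b(i)
      sum += d * d
      i += 1
    }
    math.sqrt(sum)
  }

  /** Pairwise distances between centers, computed once and reused for every point. */
  def pairwise(centers: Array[Array[Double]]): Array[Array[Double]] =
    centers.map(a => centers.map(b => dist(a, b)))

  /** Returns (index of the nearest center, number of point-to-center distances computed). */
  def nearest(
      x: Array[Double],
      centers: Array[Array[Double]],
      centerDist: Array[Array[Double]]): (Int, Int) = {
    var best = 0
    var bestDist = dist(x, centers(0))
    var computed = 1
    var c = 1
    while (c < centers.length) {
      // Lemma 1: if d(best, c) >= 2 * d(x, best), then d(x, c) >= d(x, best),
      // so center c cannot be strictly closer and its distance is skipped.
      if (centerDist(best)(c) < 2 * bestDist) {
        val d = dist(x, centers(c))
        computed += 1
        if (d < bestDist) {
          best = c
          bestDist = d
        }
      }
      c += 1
    }
    (best, computed)
  }
}
```

With centers (0,0), (10,0), (0,10) and the point (1,0), only the distance to the first center is computed; the other two are pruned because they lie at distance 10 >= 2 * 1 from it.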
yan ma	fae981e5f3	[SPARK-30773][ML] Support NativeBlas for level-1 routines ### What changes were proposed in this pull request? Change BLAS for part of the level-1 routines (axpy, dot, scal(double, denseVector)) from the java implementation to NativeBLAS when vector size>256 ### Why are the changes needed? In the current ML BLAS.scala, all level-1 routines are fixed to use the java implementation. But NativeBLAS (Intel MKL, OpenBLAS) can bring up to 11X performance improvement, based on a performance test which makes direct calls to these methods. We should provide a way to allow users to take advantage of NativeBLAS for level-1 routines. Here we do it by switching these methods from f2jBLAS to NativeBLAS. ### Does this PR introduce any user-facing change? Yes, methods axpy, dot, scal in level-1 routines will switch to NativeBLAS when the input has more than nativeL1Threshold (fixed value 256) elements and will fall back to f2jBLAS if native BLAS is not properly configured in the system. ### How was this patch tested? Perf test with direct calls to level-1 routines Closes #27546 from yma11/SPARK-30773. Lead-authored-by: yan ma <yan.ma@intel.com> Co-authored-by: Ma Yan <yan.ma@intel.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2020-03-20 10:32:58 -05:00
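The size-based switch described in SPARK-30773 above can be pictured with the following sketch. It only models the dispatch rule (more than 256 elements goes to the native path, otherwise the Java path); both implementations here are plain Scala stand-ins, and `BlasDispatchSketch`, `Level1Blas`, `select`, and the `nativeAvailable` flag are hypothetical rather than Spark's API.

```scala
// Hypothetical sketch of threshold-based dispatch; both back ends are pure-Scala stand-ins.
object BlasDispatchSketch {
  trait Level1Blas {
    def name: String
    def dot(x: Array[Double], y: Array[Double]): Double
  }

  private def plainDot(x: Array[Double], y: Array[Double]): Double = {
    var sum = 0.0
    var i = 0
    while (i < x.length) {
      sum += x(i) * y(i)
      i += 1
    }
    sum
  }

  // Stand-in for the Java (f2j) implementation.
  object JavaBlas extends Level1Blas {
    val name = "java"
    def dot(x: Array[Double], y: Array[Double]): Double = plainDot(x, y)
  }

  // Stand-in for a native library binding; a real one would call into MKL or OpenBLAS.
  object NativeBlasStandIn extends Level1Blas {
    val name = "native"
    def dot(x: Array[Double], y: Array[Double]): Double = plainDot(x, y)
  }

  // The commit message gives 256 as the fixed threshold.
  val nativeL1Threshold = 256

  /** Use the native path only for vectors with more than `nativeL1Threshold` elements,
    * and only when a native library is available (modelled here as a boolean flag). */
  def select(n: Int, nativeAvailable: Boolean): Level1Blas =
    if (nativeAvailable && n > nativeL1Threshold) NativeBlasStandIn else JavaBlas

  def dot(x: Array[Double], y: Array[Double], nativeAvailable: Boolean): Double =
    select(x.length, nativeAvailable).dot(x, y)
}
```

A threshold like this is a common way to avoid paying native-call overhead on small vectors, where a plain loop tends to be fast enough.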
zhengruifeng	111e9038d8	[SPARK-30770][ML] avoid vector conversion in GMM.transform ### What changes were proposed in this pull request? Current impl needs to convert ml.Vector to breeze.Vector, which can be skipped. ### Why are the changes needed? avoid unnecessary vector conversions ### Does this PR introduce any user-facing change? No ### How was this patch tested? existing testsuites Closes #27519 from zhengruifeng/gmm_transform_opt. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: zhengruifeng <ruifengz@foxmail.com>	2020-03-04 11:02:27 +08:00
Wu, Xiaochang	ac122762f5	[SPARK-30813][ML] Fix Matrices.sprand comments ### What changes were proposed in this pull request? Fix mistakes in comments ### Why are the changes needed? There are mistakes in comments ### Does this PR introduce any user-facing change? No ### How was this patch tested? N/A Closes #27564 from xwu99/fix-mllib-sprand-comment. Authored-by: Wu, Xiaochang <xiaochang.wu@intel.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2020-03-02 08:56:17 -06:00
gatorsmile	28b8713036	[SPARK-30950][BUILD] Setting version to 3.1.0-SNAPSHOT ### What changes were proposed in this pull request? This patch is to bump the master branch version to 3.1.0-SNAPSHOT. ### Why are the changes needed? N/A ### Does this PR introduce any user-facing change? N/A ### How was this patch tested? N/A Closes #27698 from gatorsmile/updateVersion. Authored-by: gatorsmile <gatorsmile@gmail.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>	2020-02-25 19:44:31 -08:00
zhengruifeng	e086a78706	[MINOR][ML] ML cleanup ### What changes were proposed in this pull request? 1, remove unused imports and variables; 2, use `.iterator` instead of `.view` to avoid IDEA warnings; 3, remove resolved _TODO_ ### Why are the changes needed? cleanup ### Does this PR introduce any user-facing change? No ### How was this patch tested? existing testsuites Closes #27600 from zhengruifeng/nits. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2020-02-25 12:32:12 -06:00
zhengruifeng	8ebbf85a85	[SPARK-30772][ML][SQL] avoid tuple assignment because it will circumvent the transient tag ### What changes were proposed in this pull request? it is said in [LeastSquaresAggregator](`12e1bbaddb/mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/LeastSquaresAggregator.scala (L188)`) that : > // do not use tuple assignment above because it will circumvent the transient tag I then check this issue with Scala 2.13.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_241) ### Why are the changes needed? avoid tuple assignment because it will circumvent the transient tag ### Does this PR introduce any user-facing change? No ### How was this patch tested? existing testsuites Closes #27523 from zhengruifeng/avoid_tuple_assign_to_transient. Authored-by: zhengruifeng <ruifengz@foxmail.com> Signed-off-by: Sean Owen <srowen@gmail.com>	2020-02-16 10:01:49 -06:00


118 commits