Commit graph

96 commits

Author SHA1 Message Date
Ruifeng Zheng a66738823c [SPARK-35678][ML][FOLLOWUP] softmax support offset and step
### What changes were proposed in this pull request?
softmax support offset and step, then we can use it in ANN and NB

### Why are the changes needed?
to simplify impl

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
existing testsuite

Closes #32991 from zhengruifeng/softmax_support_offset_step.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
2021-06-23 21:03:18 -05:00
Hyukjin Kwon ea907469bb Revert "[SPARK-35678][ML][FOLLOWUP] softmax support offset and step"
This reverts commit fdf86fd6e7.
2021-06-18 16:45:28 +09:00
Ruifeng Zheng fdf86fd6e7 [SPARK-35678][ML][FOLLOWUP] softmax support offset and step
### What changes were proposed in this pull request?
use newly impled softmax function in NB

### Why are the changes needed?
to simplify impl

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
existing testsuite

Closes #32927 from zhengruifeng/softmax__followup.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
2021-06-17 22:46:36 -07:00
Ruifeng Zheng 5c96d643ee [SPARK-35707][ML] optimize sparse GEMM by skipping bound checking
### What changes were proposed in this pull request?
Sparse gemm use mothod `DenseMatrix.apply` to access the values, which can be optimized by skipping checking the bound and `isTransposed`

```
  override def apply(i: Int, j: Int): Double = values(index(i, j))

  private[ml] def index(i: Int, j: Int): Int = {
    require(i >= 0 && i < numRows, s"Expected 0 <= i < $numRows, got i = $i.")
    require(j >= 0 && j < numCols, s"Expected 0 <= j < $numCols, got j = $j.")
    if (!isTransposed) i + numRows * j else j + numCols * i
  }

```

### Why are the changes needed?
to improve performance, about 15% faster in the designed case

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
existing testsuite and additional performance test

Closes #32857 from zhengruifeng/gemm_opt_index.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
2021-06-16 08:57:27 +08:00
Ruifeng Zheng 2802ac321f [SPARK-35666][ML] gemv skip array shape checking
### What changes were proposed in this pull request?
In existing impls, it is common case that the vector/matrix need to be sliced/copied just due to shape match.
which makes the logic complex and introduce extra costing of slicing & copying.

### Why are the changes needed?
1, avoid slicing and copying due to shape checking;
2, simpify the usages;

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
existing testsuites

Closes #32805 from zhengruifeng/new_blas_func_for_agg.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
2021-06-16 08:54:34 +08:00
Ruifeng Zheng 8c4b535baf [SPARK-35678][ML] add a common softmax function
### What changes were proposed in this pull request?
add softmax function in utils

### Why are the changes needed?
it can be used in multi places

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
existing testsuites

Closes #32822 from zhengruifeng/add_softmax_func.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
2021-06-15 10:33:57 +08:00
yangjie01 09d039da56 [SPARK-35526][CORE][SQL][ML][MLLIB] Re-Cleanup procedure syntax is deprecated compilation warning in Scala 2.13
### What changes were proposed in this pull request?
After SPARK-29291 and SPARK-33352, there are still some compilation warnings about `procedure syntax is deprecated` as follows:

```
[WARNING] [Warn] /spark/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:723: [deprecation   | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `registerMergeResult`'s return type
[WARNING] [Warn] /spark/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:748: [deprecation   | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `unregisterMergeResult`'s return type
[WARNING] [Warn] /spark/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala:223: [deprecation   | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `testSimpleSpillingForAllCodecs`'s return type
[WARNING] [Warn] /spark/mllib-local/src/test/scala/org/apache/spark/ml/linalg/BLASBenchmark.scala:53: [deprecation   | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `runBLASBenchmark`'s return type
[WARNING] [Warn] /spark/sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala:110: [deprecation   | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `assertEmptyRootPath`'s return type
[WARNING] [Warn] /spark/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala:602: [deprecation   | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `executeCTASWithNonEmptyLocation`'s return type
```

So the main change of this pr is cleanup these compilation warnings.

### Why are the changes needed?
Eliminate compilation warnings in Scala 2.13 and this change should be compatible with Scala 2.12

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Pass the Jenkins or GitHub Action

Closes #32669 from LuciferYang/re-clean-procedure-syntax.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2021-05-30 16:49:47 -07:00
Ludovic Henry b52d47a920 [SPARK-35295][ML] Replace fully com.github.fommil.netlib by dev.ludovic.netlib:2.0
### What changes were proposed in this pull request?

Bump to `dev.ludovic.netlib:2.0` which provides JNI-based wrappers for BLAS, ARPACK, and LAPACK. Theseare not taking dependencies on GPL or LGPL libraries, allowing to provide out-of-the-box support for hardware acceleration when a native library is present (this is still up to the end-user to install such library on their system, like OpenBLAS, Intel MKL, and libarpack2).

### Why are the changes needed?

Great performance improvement for ML-related workload on vanilla-distributions of Spark.

### Does this PR introduce _any_ user-facing change?

Users now take advantage of hardware acceleration as long as a native library is installed (like OpenBLAS, Intel MKL and libarpack2).

### How was this patch tested?

Spark test-suite + dev.ludovic.netlib testsuite.

#### JDK8:
```
[info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU  3.80GHz
[info]
[info] f2jBLAS    = dev.ludovic.netlib.blas.F2jBLAS
[info] javaBLAS   = dev.ludovic.netlib.blas.Java8BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS
[info]
[info] daxpy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        220            226           6        454.9           2.2       1.0X
[info] java                       221            228           5        451.9           2.2       1.0X
[info] native                     209            215           5        478.7           2.1       1.1X
[info]
[info] saxpy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        121            125           3        823.3           1.2       1.0X
[info] java                       121            125           3        824.3           1.2       1.0X
[info] native                     101            105           3        988.4           1.0       1.2X
[info]
[info] dcopy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        212            219           6        470.9           2.1       1.0X
[info] java                       208            212           4        481.0           2.1       1.0X
[info] native                     209            215           5        478.5           2.1       1.0X
[info]
[info] scopy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        114            119           3        878.9           1.1       1.0X
[info] java                        99            105           3       1011.4           1.0       1.2X
[info] native                      97            103           3       1026.7           1.0       1.2X
[info]
[info] ddot:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        108            111           2        925.9           1.1       1.0X
[info] java                        71             73           2       1414.9           0.7       1.5X
[info] native                      54             56           2       1847.0           0.5       2.0X
[info]
[info] sdot:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         96             97           2       1046.8           1.0       1.0X
[info] java                        47             48           1       2129.8           0.5       2.0X
[info] native                      29             30           1       3404.7           0.3       3.3X
[info]
[info] dnrm2:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        139            143           2        718.2           1.4       1.0X
[info] java                        46             47           1       2171.2           0.5       3.0X
[info] native                      44             46           2       2261.8           0.4       3.1X
[info]
[info] snrm2:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        154            157           4        651.0           1.5       1.0X
[info] java                        40             42           1       2469.3           0.4       3.8X
[info] native                      26             27           1       3787.6           0.3       5.8X
[info]
[info] dscal:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        185            195           8        541.0           1.8       1.0X
[info] java                       186            196           7        538.5           1.9       1.0X
[info] native                     177            187           7        564.1           1.8       1.0X
[info]
[info] sscal:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         98            102           3       1016.2           1.0       1.0X
[info] java                        98            102           3       1017.8           1.0       1.0X
[info] native                      87             91           3       1143.2           0.9       1.1X
[info]
[info] dgemv[N]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         68             70           1       1474.7           0.7       1.0X
[info] java                        51             52           1       1973.0           0.5       1.3X
[info] native                      30             32           1       3298.8           0.3       2.2X
[info]
[info] dgemv[T]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         96             99           2       1037.9           1.0       1.0X
[info] java                        50             51           1       1999.6           0.5       1.9X
[info] native                      30             31           1       3368.1           0.3       3.2X
[info]
[info] sgemv[N]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         59             61           1       1688.7           0.6       1.0X
[info] java                        41             42           1       2461.9           0.4       1.5X
[info] native                      15             16           1       6593.0           0.2       3.9X
[info]
[info] sgemv[T]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         90             92           1       1116.2           0.9       1.0X
[info] java                        39             40           1       2565.8           0.4       2.3X
[info] native                      15             16           1       6594.2           0.2       5.9X
[info]
[info] dger:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        192            202           7        520.5           1.9       1.0X
[info] java                       203            214           7        491.9           2.0       0.9X
[info] native                     176            187           7        568.8           1.8       1.1X
[info]
[info] dspmv[U]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         59             61           1        846.1           1.2       1.0X
[info] java                        38             39           1       1313.5           0.8       1.6X
[info] native                      24             27           1       2047.8           0.5       2.4X
[info]
[info] dspr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         97            101           3        515.4           1.9       1.0X
[info] java                        97            101           2        515.1           1.9       1.0X
[info] native                      88             91           3        569.1           1.8       1.1X
[info]
[info] dsyr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        169            174           3        295.4           3.4       1.0X
[info] java                       169            174           3        295.4           3.4       1.0X
[info] native                     160            165           4        312.2           3.2       1.1X
[info]
[info] dgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        561            577          13       1782.3           0.6       1.0X
[info] java                       225            231           4       4446.2           0.2       2.5X
[info] native                      31             32           3      32473.1           0.0      18.2X
[info]
[info] dgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        570            584           9       1754.8           0.6       1.0X
[info] java                       224            230           4       4457.3           0.2       2.5X
[info] native                      31             32           1      32493.4           0.0      18.5X
[info]
[info] dgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        855            866           6       1169.2           0.9       1.0X
[info] java                       224            228           3       4466.9           0.2       3.8X
[info] native                      31             32           1      32395.5           0.0      27.7X
[info]
[info] dgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                       1328           1344           8        752.8           1.3       1.0X
[info] java                       224            230           4       4458.9           0.2       5.9X
[info] native                      31             32           1      32201.8           0.0      42.8X
[info]
[info] sgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        534            541           5       1873.0           0.5       1.0X
[info] java                       220            224           3       4542.8           0.2       2.4X
[info] native                      15             16           1      66803.1           0.0      35.7X
[info]
[info] sgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        544            551           6       1839.6           0.5       1.0X
[info] java                       220            224           4       4538.2           0.2       2.5X
[info] native                      15             16           1      65589.9           0.0      35.7X
[info]
[info] sgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        833            845          21       1201.0           0.8       1.0X
[info] java                       220            224           3       4548.7           0.2       3.8X
[info] native                      15             16           1      66603.2           0.0      55.5X
[info]
[info] sgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        899            907           5       1112.9           0.9       1.0X
[info] java                       221            224           2       4531.6           0.2       4.1X
[info] native                      15             16           1      65944.9           0.0      59.3X
```

#### JDK11:
```
[info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU  3.80GHz
[info]
[info] f2jBLAS    = dev.ludovic.netlib.blas.F2jBLAS
[info] javaBLAS   = dev.ludovic.netlib.blas.Java11BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS
[info]
[info] daxpy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        195            200           3        512.2           2.0       1.0X
[info] java                       197            202           3        507.0           2.0       1.0X
[info] native                     184            189           4        543.0           1.8       1.1X
[info]
[info] saxpy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        108            112           3        921.8           1.1       1.0X
[info] java                       101            105           3        989.4           1.0       1.1X
[info] native                      87             91           3       1147.1           0.9       1.2X
[info]
[info] dcopy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        187            191           3        535.1           1.9       1.0X
[info] java                       182            188           3        548.8           1.8       1.0X
[info] native                     178            182           3        562.2           1.8       1.1X
[info]
[info] scopy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        110            114           3        909.3           1.1       1.0X
[info] java                        86             93           4       1159.3           0.9       1.3X
[info] native                      86             90           3       1162.4           0.9       1.3X
[info]
[info] ddot:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        106            108           2        943.6           1.1       1.0X
[info] java                        70             71           2       1426.8           0.7       1.5X
[info] native                      54             56           2       1835.4           0.5       1.9X
[info]
[info] sdot:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         96             97           1       1047.1           1.0       1.0X
[info] java                        43             44           1       2331.9           0.4       2.2X
[info] native                      29             30           1       3392.1           0.3       3.2X
[info]
[info] dnrm2:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        114            115           2        880.7           1.1       1.0X
[info] java                        42             43           1       2398.1           0.4       2.7X
[info] native                      45             46           1       2233.3           0.4       2.5X
[info]
[info] snrm2:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        140            143           2        714.6           1.4       1.0X
[info] java                        28             29           1       3531.0           0.3       4.9X
[info] native                      26             27           1       3820.0           0.3       5.3X
[info]
[info] dscal:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        156            166           7        641.3           1.6       1.0X
[info] java                       158            167           6        633.2           1.6       1.0X
[info] native                     150            160           7        664.8           1.5       1.0X
[info]
[info] sscal:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         85             88           2       1181.7           0.8       1.0X
[info] java                        85             88           2       1176.0           0.9       1.0X
[info] native                      75             78           2       1333.2           0.8       1.1X
[info]
[info] dgemv[N]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         58             59           1       1731.1           0.6       1.0X
[info] java                        41             43           1       2415.5           0.4       1.4X
[info] native                      30             31           1       3293.9           0.3       1.9X
[info]
[info] dgemv[T]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         94             96           1       1063.4           0.9       1.0X
[info] java                        41             42           1       2435.8           0.4       2.3X
[info] native                      30             30           1       3379.8           0.3       3.2X
[info]
[info] sgemv[N]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         44             45           1       2278.9           0.4       1.0X
[info] java                        37             38           0       2686.8           0.4       1.2X
[info] native                      15             16           1       6555.4           0.2       2.9X
[info]
[info] sgemv[T]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         88             89           1       1142.1           0.9       1.0X
[info] java                        33             34           1       3010.7           0.3       2.6X
[info] native                      15             16           1       6553.9           0.2       5.7X
[info]
[info] dger:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        164            172           4        609.4           1.6       1.0X
[info] java                       163            172           5        612.6           1.6       1.0X
[info] native                     150            159           4        667.0           1.5       1.1X
[info]
[info] dspmv[U]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         49             50           1       1029.4           1.0       1.0X
[info] java                        41             42           1       1209.4           0.8       1.2X
[info] native                      25             27           1       2029.2           0.5       2.0X
[info]
[info] dspr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         80             85           3        622.2           1.6       1.0X
[info] java                        80             85           3        622.4           1.6       1.0X
[info] native                      75             79           3        668.7           1.5       1.1X
[info]
[info] dsyr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        137            142           3        364.1           2.7       1.0X
[info] java                       139            142           2        360.4           2.8       1.0X
[info] native                     131            135           3        380.4           2.6       1.0X
[info]
[info] dgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        517            525           5       1935.5           0.5       1.0X
[info] java                       213            216           3       4704.8           0.2       2.4X
[info] native                      31             31           1      32705.6           0.0      16.9X
[info]
[info] dgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        589            601           6       1698.6           0.6       1.0X
[info] java                       213            217           3       4693.3           0.2       2.8X
[info] native                      31             32           1      32498.9           0.0      19.1X
[info]
[info] dgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        851            865           6       1175.3           0.9       1.0X
[info] java                       212            216           3       4717.0           0.2       4.0X
[info] native                      30             32           1      32903.0           0.0      28.0X
[info]
[info] dgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                       1301           1316           6        768.4           1.3       1.0X
[info] java                       212            216           2       4717.4           0.2       6.1X
[info] native                      31             32           1      32606.0           0.0      42.4X
[info]
[info] sgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        454            460           2       2203.0           0.5       1.0X
[info] java                       208            212           3       4803.8           0.2       2.2X
[info] native                      15             16           0      66586.0           0.0      30.2X
[info]
[info] sgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        529            536           4       1889.7           0.5       1.0X
[info] java                       208            212           3       4798.6           0.2       2.5X
[info] native                      15             16           1      66751.4           0.0      35.3X
[info]
[info] sgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        830            840           5       1205.1           0.8       1.0X
[info] java                       208            211           2       4814.1           0.2       4.0X
[info] native                      15             15           1      67676.4           0.0      56.2X
[info]
[info] sgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        894            907           7       1118.7           0.9       1.0X
[info] java                       208            211           3       4809.6           0.2       4.3X
[info] native                      15             16           1      66675.2           0.0      59.6X
```

#### JDK16:
```
[info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU  3.80GHz
[info]
[info] f2jBLAS    = dev.ludovic.netlib.blas.F2jBLAS
[info] javaBLAS   = dev.ludovic.netlib.blas.VectorBLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.JNIBLAS
[info]
[info] daxpy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        193            199           3        517.5           1.9       1.0X
[info] java                       181            186           4        553.2           1.8       1.1X
[info] native                     181            185           5        553.6           1.8       1.1X
[info]
[info] saxpy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        108            112           2        925.1           1.1       1.0X
[info] java                        88             91           3       1138.6           0.9       1.2X
[info] native                      87             91           3       1144.2           0.9       1.2X
[info]
[info] dcopy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        184            189           3        542.5           1.8       1.0X
[info] java                       181            185           3        552.8           1.8       1.0X
[info] native                     179            183           2        558.0           1.8       1.0X
[info]
[info] scopy:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         97            101           3       1031.6           1.0       1.0X
[info] java                        86             90           2       1163.7           0.9       1.1X
[info] native                      85             88           2       1182.9           0.8       1.1X
[info]
[info] ddot:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        107            109           2        932.4           1.1       1.0X
[info] java                        54             56           2       1846.7           0.5       2.0X
[info] native                      54             56           2       1846.7           0.5       2.0X
[info]
[info] sdot:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         96             97           1       1043.6           1.0       1.0X
[info] java                        29             30           1       3439.3           0.3       3.3X
[info] native                      29             30           1       3423.9           0.3       3.3X
[info]
[info] dnrm2:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        121            123           2        829.8           1.2       1.0X
[info] java                        32             32           1       3171.3           0.3       3.8X
[info] native                      45             46           1       2246.2           0.4       2.7X
[info]
[info] snrm2:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        142            144           2        705.9           1.4       1.0X
[info] java                        15             16           1       6585.8           0.2       9.3X
[info] native                      26             27           1       3839.5           0.3       5.4X
[info]
[info] dscal:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        157            165           5        635.6           1.6       1.0X
[info] java                       151            159           5        664.0           1.5       1.0X
[info] native                     151            160           5        663.6           1.5       1.0X
[info]
[info] sscal:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         85             89           2       1172.3           0.9       1.0X
[info] java                        75             79           3       1337.3           0.7       1.1X
[info] native                      75             79           2       1335.5           0.7       1.1X
[info]
[info] dgemv[N]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         58             59           1       1731.5           0.6       1.0X
[info] java                        28             29           1       3544.2           0.3       2.0X
[info] native                      30             31           1       3306.2           0.3       1.9X
[info]
[info] dgemv[T]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         90             92           1       1108.3           0.9       1.0X
[info] java                        28             28           1       3622.5           0.3       3.3X
[info] native                      30             31           1       3381.3           0.3       3.1X
[info]
[info] sgemv[N]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         44             45           1       2284.7           0.4       1.0X
[info] java                        14             15           1       7034.0           0.1       3.1X
[info] native                      15             16           1       6643.7           0.2       2.9X
[info]
[info] sgemv[T]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         85             86           1       1177.4           0.8       1.0X
[info] java                        15             15           1       6886.1           0.1       5.8X
[info] native                      15             16           1       6560.1           0.2       5.6X
[info]
[info] dger:            Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        164            173           6        608.1           1.6       1.0X
[info] java                       148            157           5        675.2           1.5       1.1X
[info] native                     152            160           5        659.9           1.5       1.1X
[info]
[info] dspmv[U]:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         61             63           1        815.4           1.2       1.0X
[info] java                        16             17           1       3104.3           0.3       3.8X
[info] native                      24             27           1       2071.9           0.5       2.5X
[info]
[info] dspr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                         81             85           2        616.4           1.6       1.0X
[info] java                        81             85           2        614.7           1.6       1.0X
[info] native                      75             78           2        669.5           1.5       1.1X
[info]
[info] dsyr[U]:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        138            141           3        362.7           2.8       1.0X
[info] java                       137            140           2        365.3           2.7       1.0X
[info] native                     131            134           2        382.9           2.6       1.1X
[info]
[info] dgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        525            544           8       1906.2           0.5       1.0X
[info] java                        61             68           3      16358.1           0.1       8.6X
[info] native                      31             32           1      32623.7           0.0      17.1X
[info]
[info] dgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        580            598          12       1724.5           0.6       1.0X
[info] java                        61             68           4      16302.5           0.1       9.5X
[info] native                      30             32           1      32962.8           0.0      19.1X
[info]
[info] dgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        829            838           4       1206.2           0.8       1.0X
[info] java                        61             69           3      16339.7           0.1      13.5X
[info] native                      30             31           1      33231.9           0.0      27.6X
[info]
[info] dgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                       1352           1363           5        739.6           1.4       1.0X
[info] java                        61             69           3      16347.0           0.1      22.1X
[info] native                      31             32           1      32740.3           0.0      44.3X
[info]
[info] sgemm[N,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        482            493           7       2073.1           0.5       1.0X
[info] java                        35             38           2      28315.3           0.0      13.7X
[info] native                      15             15           1      67579.7           0.0      32.6X
[info]
[info] sgemm[N,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        472            482           4       2119.0           0.5       1.0X
[info] java                        36             38           2      28138.1           0.0      13.3X
[info] native                      15             16           1      66616.5           0.0      31.4X
[info]
[info] sgemm[T,N]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        823            830           5       1215.2           0.8       1.0X
[info] java                        35             38           2      28681.4           0.0      23.6X
[info] native                      15             15           1      67908.4           0.0      55.9X
[info]
[info] sgemm[T,T]:      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] -----------------------------------------------------------------------------------------------
[info] f2j                        896            908           7       1115.8           0.9       1.0X
[info] java                        35             38           2      28402.0           0.0      25.5X
[info] native                      15             16           0      66691.2           0.0      59.8X
```

TODO:
- [x] update documentation in `docs/` and `docs/ml-linalg-guide.md` refering `com.github.fommil.netlib`
- [ ] merge https://github.com/luhenry/netlib/pull/1 with all feedback from this PR + remove references to snapshot repositories in `pom.xml` and `project/SparkBuild.scala`.

Closes #32415 from luhenry/master.

Authored-by: Ludovic Henry <git@ludovic.dev>
Signed-off-by: Sean Owen <srowen@gmail.com>
2021-05-12 08:59:36 -05:00
byungsoo 94bbca3e55 [SPARK-35306][MLLIB][TESTS] Add benchmark results for BLASBenchmark created by GitHub Actions machines
### What changes were proposed in this pull request?
This PR adds benchmark results for `BLASBenchmark` created by GitHub Actions machines.
Benchmark result files are added for both JDK 8 (`BLASBenchmark-result.txt`) and 11 (`BLASBenchmark-jdk11-result.txt`) in `{SPARK_HOME}/mllib-local/benchmarks/`.

### Why are the changes needed?
In [SPARK-34950](https://issues.apache.org/jira/browse/SPARK-34950), benchmark results were updated to the ones created by Github Actions machines.
As benchmark results for `BLASBenchmark` (added at [SPARK-33882](https://issues.apache.org/jira/browse/SPARK-33882) and [SPARK-35150](https://issues.apache.org/jira/browse/SPARK-35150)) are not currently available at the repository, this PR adds them.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
The benchmark results were obtained by running tests with GitHub Actions workflow in my forked repository.
You can refer to the test results and output files from the link below.
- https://github.com/byungsoo-oh/spark/actions/runs/809900377
- https://github.com/byungsoo-oh/spark/actions/runs/810084610

Closes #32435 from byungsoo-oh/SPARK-35306.

Authored-by: byungsoo <byungsoo@byungsoo-pc.tn.corp.samsungelectronics.net>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2021-05-07 11:08:10 +09:00
Ludovic Henry 5b77ebb57b [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib
### What changes were proposed in this pull request?

Following https://github.com/apache/spark/pull/30810, I've continued looking for ways to accelerate the usage of BLAS in Spark. With this PR, I integrate work done in the [`dev.ludovic.netlib`](https://github.com/luhenry/netlib/) Maven package.

The `dev.ludovic.netlib` library wraps the original `com.github.fommil.netlib` library and focus on accelerating the linear algebra routines in use in Spark. When running the `org.apache.spark.ml.linalg.BLASBenchmark` benchmarking suite, I get the results at [1] on an Intel machine. Moreover, this library is thoroughly tested to return the exact same results as the reference implementation.

Under the hood, it reimplements the necessary algorithms in pure autovectorization-friendly Java 8, as well as takes advantage of the Vector API and Foreign Linker API introduced in JDK 16 when available.

A table summarising which version gets loaded in which case:

```
|                       | BLAS.nativeBLAS                                    | BLAS.javaBLAS                                      |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| with -Pnetlib-lgpl    | 1. dev.ludovic.netlib.blas.NetlibNativeBLAS, a     | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
|                       |     wrapper for com.github.fommil:all              |    (JDK16+, relies on the Vector API, requires     |
|                       | 2. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    |     `--add-modules=jdk.incubator.vector` on JDK16) |
|                       |    relies on the Foreign Linker API, requires      | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
|                       |    `--add-modules=jdk.incubator.foreign            | 3. dev.ludovic.netlib.blas.JavaBLAS                |
|                       |     -Dforeign.restricted=warn`)                    | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
|                       | 3. fails to load, falls back to BLAS.javaBLAS in   |     wrapper for com.github.fommil:core             |
|                       |     org.apache.spark.ml.linalg.BLAS                |                                                    |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
| without -Pnetlib-lgpl | 1. dev.ludovic.netlib.blas.ForeignBLAS (JDK16+,    | 1. dev.ludovic.netlib.blas.VectorizedBLAS          |
|                       |    relies on the Foreign Linker API, requires      |    (JDK16+, relies on the Vector API, requires     |
|                       |    `--add-modules=jdk.incubator.foreign            |     `--add-modules=jdk.incubator.vector` on JDK16) |
|                       |     -Dforeign.restricted=warn`)                    | 2. dev.ludovic.netlib.blas.Java11BLAS (JDK11+)     |
|                       | 2. fails to load, falls back to BLAS.javaBLAS in   | 3. dev.ludovic.netlib.blas.JavaBLAS                |
|                       |     org.apache.spark.ml.linalg.BLAS                | 4. dev.ludovic.netlib.blas.NetlibF2jBLAS, a        |
|                       |                                                    |     wrapper for com.github.fommil:core             |
| --------------------- | -------------------------------------------------- | -------------------------------------------------- |
```

### Why are the changes needed?

Accelerates linear algebra operations when the pure-java fallback method is in use. Transparently falls back to native implementation (OpenBLAS, MKL) when available.

### Does this PR introduce _any_ user-facing change?

No, all changes are transparent to the user.

### How was this patch tested?

The `dev.ludovic.netlib` library has its own test suite [2]. It has also been validated by running the Spark test suite and benchmarking suite.

[1] Results for `org.apache.spark.ml.linalg.BLASBenchmark`:
#### JDK8:
```
[info] OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU  3.80GHz
[info]
[info] f2jBLAS    = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS   = dev.ludovic.netlib.blas.Java8BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.Java8BLAS
[info]
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 223            232           8        448.0           2.2       1.0X
[info] java                                                221            228           7        453.0           2.2       1.0X
[info]
[info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 122            128           4        821.2           1.2       1.0X
[info] java                                                122            128           4        822.3           1.2       1.0X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 109            112           2        921.4           1.1       1.0X
[info] java                                                 70             74           3       1423.5           0.7       1.5X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  96             98           2       1046.1           1.0       1.0X
[info] java                                                 47             49           2       2121.7           0.5       2.0X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 184            195           8        544.3           1.8       1.0X
[info] java                                                185            196           7        539.5           1.9       1.0X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  99            104           4       1011.9           1.0       1.0X
[info] java                                                 99            104           4       1010.4           1.0       1.0X
[info]
[info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        947.2           1.1       1.0X
[info] java                                                  0              0           0       1584.8           0.6       1.7X
[info]
[info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        867.4           1.2       1.0X
[info] java                                                  1              1           0        865.0           1.2       1.0X
[info]
[info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        485.9           2.1       1.0X
[info] java                                                  1              1           0        486.8           2.1       1.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1843.0           0.5       1.0X
[info] java                                                  0              0           0       2690.6           0.4       1.5X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1214.7           0.8       1.0X
[info] java                                                  0              0           0       2536.8           0.4       2.1X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1895.9           0.5       1.0X
[info] java                                                  0              0           0       2961.1           0.3       1.6X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1223.4           0.8       1.0X
[info] java                                                  0              0           0       3091.4           0.3       2.5X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 560            575          20       1787.1           0.6       1.0X
[info] java                                                226            232           5       4432.4           0.2       2.5X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 570            586          23       1755.2           0.6       1.0X
[info] java                                                227            232           4       4410.1           0.2       2.5X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 863            879          17       1158.4           0.9       1.0X
[info] java                                                227            231           3       4407.9           0.2       3.8X
[info]
[info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                1282           1305          23        780.0           1.3       1.0X
[info] java                                                227            232           4       4413.4           0.2       5.7X
[info]
[info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 538            548           8       1858.6           0.5       1.0X
[info] java                                                221            226           3       4521.1           0.2       2.4X
[info]
[info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 549            558          10       1819.9           0.5       1.0X
[info] java                                                222            229           7       4503.5           0.2       2.5X
[info]
[info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 838            852          12       1193.0           0.8       1.0X
[info] java                                                222            229           5       4500.5           0.2       3.8X
[info]
[info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 905            919          18       1104.8           0.9       1.0X
[info] java                                                221            228           5       4521.3           0.2       4.1X
```

#### JDK11:
```
[info] OpenJDK 64-Bit Server VM 11.0.11+9-LTS on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU  3.80GHz
[info]
[info] f2jBLAS    = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS   = dev.ludovic.netlib.blas.Java11BLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.Java11BLAS
[info]
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 195            204          10        512.7           2.0       1.0X
[info] java                                                195            202           7        512.4           2.0       1.0X
[info]
[info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 108            113           4        923.3           1.1       1.0X
[info] java                                                102            107           4        984.4           1.0       1.1X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 107            110           3        938.1           1.1       1.0X
[info] java                                                 69             72           3       1447.1           0.7       1.5X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  96             98           2       1046.5           1.0       1.0X
[info] java                                                 43             45           2       2317.1           0.4       2.2X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 155            168           8        644.2           1.6       1.0X
[info] java                                                158            169           8        632.8           1.6       1.0X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  85             90           4       1178.1           0.8       1.0X
[info] java                                                 86             90           4       1167.7           0.9       1.0X
[info]
[info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       1182.1           0.8       1.0X
[info] java                                                  0              0           0       1432.1           0.7       1.2X
[info]
[info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        898.7           1.1       1.0X
[info] java                                                  1              1           0        891.5           1.1       1.0X
[info]
[info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        495.4           2.0       1.0X
[info] java                                                  1              1           0        495.7           2.0       1.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2271.6           0.4       1.0X
[info] java                                                  0              0           0       3648.1           0.3       1.6X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1229.3           0.8       1.0X
[info] java                                                  0              0           0       2711.3           0.4       2.2X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2677.5           0.4       1.0X
[info] java                                                  0              0           0       3288.2           0.3       1.2X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1233.0           0.8       1.0X
[info] java                                                  0              0           0       2766.3           0.4       2.2X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 520            536          16       1923.6           0.5       1.0X
[info] java                                                214            221           7       4669.5           0.2       2.4X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 593            612          17       1686.5           0.6       1.0X
[info] java                                                215            219           3       4643.3           0.2       2.8X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 853            870          16       1172.8           0.9       1.0X
[info] java                                                215            218           3       4659.7           0.2       4.0X
[info]
[info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                1350           1370          23        740.8           1.3       1.0X
[info] java                                                215            219           4       4656.6           0.2       6.3X
[info]
[info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 460            468           6       2173.2           0.5       1.0X
[info] java                                                210            213           2       4752.7           0.2       2.2X
[info]
[info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 535            544           8       1869.3           0.5       1.0X
[info] java                                                210            215           5       4761.8           0.2       2.5X
[info]
[info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 843            853          11       1186.8           0.8       1.0X
[info] java                                                209            214           4       4793.4           0.2       4.0X
[info]
[info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 891            904          15       1122.0           0.9       1.0X
[info] java                                                209            214           4       4777.2           0.2       4.3X
```

#### JDK16:
```
[info] OpenJDK 64-Bit Server VM 16+36 on Linux 5.8.0-50-generic
[info] Intel(R) Xeon(R) E-2276G CPU  3.80GHz
[info]
[info] f2jBLAS    = dev.ludovic.netlib.blas.NetlibF2jBLAS
[info] javaBLAS   = dev.ludovic.netlib.blas.VectorizedBLAS
[info] nativeBLAS = dev.ludovic.netlib.blas.VectorizedBLAS
[info]
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 194            199           7        515.7           1.9       1.0X
[info] java                                                181            186           3        551.1           1.8       1.1X
[info]
[info] saxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 109            115           4        915.0           1.1       1.0X
[info] java                                                 88             92           3       1138.8           0.9       1.2X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 108            110           2        922.6           1.1       1.0X
[info] java                                                 54             56           2       1839.2           0.5       2.0X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  96             97           2       1046.1           1.0       1.0X
[info] java                                                 29             30           1       3393.4           0.3       3.2X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 156            165           5        643.0           1.6       1.0X
[info] java                                                150            159           5        667.1           1.5       1.0X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  85             91           6       1171.0           0.9       1.0X
[info] java                                                 75             79           3       1340.6           0.7       1.1X
[info]
[info] dspmv[U]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        917.0           1.1       1.0X
[info] java                                                  0              0           0       8147.2           0.1       8.9X
[info]
[info] dspr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        859.3           1.2       1.0X
[info] java                                                  1              1           0        859.3           1.2       1.0X
[info]
[info] dsyr[U]:                                  Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0        482.1           2.1       1.0X
[info] java                                                  1              1           0        482.6           2.1       1.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2214.2           0.5       1.0X
[info] java                                                  0              0           0       7975.8           0.1       3.6X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1231.4           0.8       1.0X
[info] java                                                  0              0           0       8680.9           0.1       7.0X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   0              0           0       2684.3           0.4       1.0X
[info] java                                                  0              0           0      18527.1           0.1       6.9X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                   1              1           0       1235.4           0.8       1.0X
[info] java                                                  0              0           0      17347.9           0.1      14.0X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 530            552          18       1887.5           0.5       1.0X
[info] java                                                 58             64           3      17143.9           0.1       9.1X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 598            620          17       1671.1           0.6       1.0X
[info] java                                                 58             64           3      17196.6           0.1      10.3X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 834            847          14       1199.4           0.8       1.0X
[info] java                                                 57             63           4      17486.9           0.1      14.6X
[info]
[info] dgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                1338           1366          22        747.3           1.3       1.0X
[info] java                                                 58             63           3      17356.6           0.1      23.2X
[info]
[info] sgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 489            501           9       2045.5           0.5       1.0X
[info] java                                                 36             38           2      27721.9           0.0      13.6X
[info]
[info] sgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 478            488           9       2094.0           0.5       1.0X
[info] java                                                 36             38           2      27813.2           0.0      13.3X
[info]
[info] sgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 825            837          10       1211.6           0.8       1.0X
[info] java                                                 35             38           2      28433.1           0.0      23.5X
[info]
[info] sgemm[T,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 900            918          15       1111.6           0.9       1.0X
[info] java                                                 36             38           2      28073.0           0.0      25.3X
```

[2] https://github.com/luhenry/netlib/tree/master/blas/src/test/java/dev/ludovic/netlib/blas

Closes #32253 from luhenry/master.

Authored-by: Ludovic Henry <git@ludovic.dev>
Signed-off-by: Sean Owen <srowen@gmail.com>
2021-04-27 14:00:59 -05:00
Ludovic Henry 9244066ca6 [SPARK-33882][ML] Add a vectorized BLAS implementation
### What changes were proposed in this pull request?

This patch introduces a VectorizedBLAS class which implements such hardware-accelerated BLAS operations. This feature is hidden behind the "vectorized" profile that you can enable by passing "-Pvectorized" to sbt or maven.

The Vector API has been introduced in JDK 16. Following discussion on the mailing list, this API is introduced transparently and needs to be enabled explicitely.

### Why are the changes needed?

Whenever a native BLAS implementation isn't available on the system, Spark automatically falls back onto a Java implementation. With the recent release of the Vector API in the OpenJDK [1], we can use hardware acceleration for such operations.

This change was also discussed on the mailing list. [2]

### Does this PR introduce _any_ user-facing change?

It introduces a build-time profile called `vectorized`. You can pass it to sbt and mvn with `-Pvectorized`. There is no change to the end-user of Spark and it should only impact Spark developpers. It is also disabled by default.

### How was this patch tested?

It passes `build/sbt mllib-local/test` with and without `-Pvectorized` with JDK 16. This patch also introduces benchmarks for BLAS.

The benchmark results are as follows:

```
[info] daxpy:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  37             37           0        271.5           3.7       1.0X
[info] vector                                               24             25           4        416.1           2.4       1.5X
[info]
[info] ddot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  70             70           0        143.2           7.0       1.0X
[info] vector                                               35             35           2        288.7           3.5       2.0X
[info]
[info] sdot:                                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  50             51           1        199.8           5.0       1.0X
[info] vector                                               15             15           0        648.7           1.5       3.2X
[info]
[info] dscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  34             34           0        295.6           3.4       1.0X
[info] vector                                               19             19           0        531.2           1.9       1.8X
[info]
[info] sscal:                                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  25             25           1        399.0           2.5       1.0X
[info] vector                                                8              9           1       1177.3           0.8       3.0X
[info]
[info] dgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  27             27           0          0.0       26651.5       1.0X
[info] vector                                               21             21           0          0.0       20646.3       1.3X
[info]
[info] dgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  36             36           0          0.0       35501.4       1.0X
[info] vector                                               22             22           0          0.0       21930.3       1.6X
[info]
[info] sgemv[N]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  20             20           0          0.0       20283.3       1.0X
[info] vector                                                9              9           0          0.1        8657.7       2.3X
[info]
[info] sgemv[T]:                                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                  30             30           0          0.0       29845.8       1.0X
[info] vector                                               10             10           1          0.1        9695.4       3.1X
[info]
[info] dgemm[N,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 182            182           0          0.5        1820.0       1.0X
[info] vector                                              160            160           1          0.6        1597.6       1.1X
[info]
[info] dgemm[N,T]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 211            211           1          0.5        2106.2       1.0X
[info] vector                                              156            157           0          0.6        1564.4       1.3X
[info]
[info] dgemm[T,N]:                               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j                                                 276            276           0          0.4        2757.8       1.0X
[info] vector                                              137            137           0          0.7        1365.1       2.0X
```

/cc srowen xkrogen

[1] https://openjdk.java.net/jeps/338
[2] https://mail-archives.apache.org/mod_mbox/spark-dev/202012.mbox/%3cDM5PR2101MB11106162BB3AF32AD29C6C79B0C69DM5PR2101MB1110.namprd21.prod.outlook.com%3e

Closes #30810 from luhenry/master.

Lead-authored-by: Ludovic Henry <luhenry@microsoft.com>
Co-authored-by: Ludovic Henry <git@ludovic.dev>
Signed-off-by: Sean Owen <srowen@gmail.com>
2021-04-14 11:36:58 -05:00
Ruifeng Zheng 47da944f59 [SPARK-34470][ML] VectorSlicer utilize ordering if possible
### What changes were proposed in this pull request?
1, add a new param `sorted` in `slice`;
2, in `VectorSlicer`, set `sorted = true` if input indices are ordered.

### Why are the changes needed?
The input indices of VectorSlicer are probably ordered.
VectorSlicer should use this attribute if possible.

I did a simple test and `sorted = true` maybe about 70% faster than existing `slice`

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
added testsuite

Closes #31588 from zhengruifeng/vector_slice_for_sorted_indices.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
2021-03-22 09:46:53 +08:00
yangjie01 8b1ba233f1 [SPARK-34068][CORE][SQL][MLLIB][GRAPHX] Remove redundant collection conversion
### What changes were proposed in this pull request?
There are some redundant collection conversion can be removed, for version compatibility, clean up these with Scala-2.13 profile.

### Why are the changes needed?
Remove redundant collection conversion

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Pass the Jenkins or GitHub  Action
- Manual test `core`, `graphx`, `mllib`, `mllib-local`, `sql`, `yarn`,`kafka-0-10` in Scala 2.13 passed

Closes #31125 from LuciferYang/SPARK-34068.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2021-01-13 18:07:02 -06:00
Dongjoon Hyun de9818f043
[SPARK-33662][BUILD] Setting version to 3.2.0-SNAPSHOT
### What changes were proposed in this pull request?

This PR aims to update `master` branch version to 3.2.0-SNAPSHOT.

### Why are the changes needed?

Start to prepare Apache Spark 3.2.0.

### Does this PR introduce _any_ user-facing change?

N/A.

### How was this patch tested?

Pass the CIs.

Closes #30606 from dongjoon-hyun/SPARK-3.2.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-12-04 14:10:42 -08:00
Dongjoon Hyun 3ce4ab545b
[SPARK-33513][BUILD] Upgrade to Scala 2.13.4 to improve exhaustivity
### What changes were proposed in this pull request?

This PR aims the followings.
1. Upgrade from Scala 2.13.3 to 2.13.4 for Apache Spark 3.1
2. Fix exhaustivity issues in both Scala 2.12/2.13 (Scala 2.13.4 requires this for compilation.)
3. Enforce the improved exhaustive check by using the existing Scala 2.13 GitHub Action compilation job.

### Why are the changes needed?

Scala 2.13.4 is a maintenance release for 2.13 line and improves JDK 15 support.
- https://github.com/scala/scala/releases/tag/v2.13.4

Also, it improves exhaustivity check.
- https://github.com/scala/scala/pull/9140 (Check exhaustivity of pattern matches with "if" guards and custom extractors)
- https://github.com/scala/scala/pull/9147 (Check all bindings exhaustively, e.g. tuples components)

### Does this PR introduce _any_ user-facing change?

Yep. Although it's a maintenance version change, it's a Scala version change.

### How was this patch tested?

Pass the CIs and do the manual testing.
- Scala 2.12 CI jobs(GitHub Action/Jenkins UT/Jenkins K8s IT) to check the validity of code change.
- Scala 2.13 Compilation job to check the compilation

Closes #30455 from dongjoon-hyun/SCALA_3.13.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-11-23 16:28:43 -08:00
zhengruifeng 432afac07e [SPARK-32907][ML] adaptively blockify instances - revert blockify gmm
### What changes were proposed in this pull request?
revert blockify gmm

### Why are the changes needed?
WeichenXu123  and I thought we should use memory size instead of number of rows to blockify instance; then if a buffer's size is large and determined by number of rows, we should discard it.
In GMM, we found that the pre-allocated memory maybe too large and should be discarded:
```
transient private lazy val auxiliaryPDFMat = DenseMatrix.zeros(blockSize, numFeatures)
```
We had some offline discuss and thought it is better to revert blockify GMM.

### Does this PR introduce _any_ user-facing change?
blockSize added in master branch will be removed

### How was this patch tested?
existing testsuites

Closes #29782 from zhengruifeng/unblockify_gmm.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2020-09-23 15:54:56 +08:00
Kousuke Saruta b121f0d459 [SPARK-32873][BUILD] Fix code which causes error when build with sbt and Scala 2.13
### What changes were proposed in this pull request?

This PR fix code which causes error when build with sbt and Scala 2.13 like as follows.
```
[error] [warn] /home/kou/work/oss/spark-scala-2.13/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala:251: method with a single empty parameter list overrides method without any parameter list
[error] [warn]   override def hasNext(): Boolean = requestOffset < part.untilOffset
[error] [warn]
[error] [warn] /home/kou/work/oss/spark-scala-2.13/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala:294: method with a single empty parameter list overrides method without any parameter list
[error] [warn]   override def hasNext(): Boolean = okNext
```

More specifically, what this PR fixes are

* Methods which has an empty parameter list and overrides an method which has no parameter list.
```
override def hasNext(): Boolean = okNext
```

* Methods which has no parameter list and overrides an method which has an empty parameter list.
```
      override def next: (Int, Double) = {
```

* Infix operator expression that the operator wraps.
```
    3L * math.min(k, numFeatures) * math.min(k, numFeatures)
    3L * math.min(k, numFeatures) * math.min(k, numFeatures) +
    + math.max(math.max(k, numFeatures), 4L * math.min(k, numFeatures)
      math.max(math.max(k, numFeatures), 4L * math.min(k, numFeatures) *
    * math.min(k, numFeatures) + 4L * math.min(k, numFeatures))
```

### Why are the changes needed?

For building Spark with sbt and Scala 2.13.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

After this change and #29742 applied, compile passed with the following command.
```
build/sbt -Pscala-2.13  -Phive -Phive-thriftserver -Pyarn -Pkubernetes compile test:compile
```

Closes #29745 from sarutak/fix-code-for-sbt-and-spark-2.13.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-09-14 15:34:58 +09:00
Sean Owen be2eca22e9 [SPARK-32398][TESTS][CORE][STREAMING][SQL][ML] Update to scalatest 3.2.0 for Scala 2.13.3+
### What changes were proposed in this pull request?

Updates to scalatest 3.2.0. Though it looks large, it is 99% changes to the new location of scalatest classes.

### Why are the changes needed?

3.2.0+ has a fix that is required for Scala 2.13.3+ compatibility.

### Does this PR introduce _any_ user-facing change?

No, only affects tests.

### How was this patch tested?

Existing tests.

Closes #29196 from srowen/SPARK-32398.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-07-23 16:20:17 -07:00
zhengruifeng e7fa778dc7 [SPARK-30699][ML][PYSPARK] GMM blockify input vectors
### What changes were proposed in this pull request?
1, add new param blockSize;
2, if blockSize==1, keep original behavior, code path trainOnRows;
3, if blockSize>1, standardize and stack input vectors to blocks (like ALS/MLP), code path trainOnBlocks

### Why are the changes needed?
performance gain on dense dataset HIGGS:
1, save about 45% RAM;
2, 3X faster with openBLAS

### Does this PR introduce any user-facing change?
add a new expert param `blockSize`

### How was this patch tested?
added testsuites

Closes #27473 from zhengruifeng/blockify_gmm.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2020-05-12 12:54:03 +08:00
zhengruifeng 052ff49acd [SPARK-30659][ML][PYSPARK] LogisticRegression blockify input vectors
### What changes were proposed in this pull request?
1, reorg the `fit` method in LR to several blocks (`createModel`, `createBounds`, `createOptimizer`, `createInitCoefWithInterceptMatrix`);
2, add new param blockSize;
3, if blockSize==1, keep original behavior, code path `trainOnRows`;
4, if blockSize>1, standardize and stack input vectors to blocks (like ALS/MLP), code path `trainOnBlocks`

### Why are the changes needed?
On dense dataset `epsilon_normalized.t`:
1, reduce RAM to persist traing dataset; (save about 40% RAM)
2, use Level-2 BLAS routines; (4x ~ 5x faster)

### Does this PR introduce _any_ user-facing change?
Yes, a new param is added

### How was this patch tested?
existing and added testsuites

Closes #28458 from zhengruifeng/blockify_lor_II.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2020-05-07 10:07:24 +08:00
zhengruifeng ebdf41dd69 [SPARK-30642][ML][PYSPARK] LinearSVC blockify input vectors
### What changes were proposed in this pull request?
1, add new param `blockSize`;
2, add a new class InstanceBlock;
3, **if `blockSize==1`, keep original behavior; if `blockSize>1`, stack input vectors to blocks (like ALS/MLP);**
4, if `blockSize>1`, standardize the input outside of optimization procedure;

### Why are the changes needed?
1, reduce RAM to persist traing dataset; (save about 40% RAM)
2, use Level-2 BLAS routines; (4x ~ 5x faster on dataset `epsilon`)

### Does this PR introduce any user-facing change?
Yes, a new param is added

### How was this patch tested?
existing and added testsuites

Closes #28349 from zhengruifeng/blockify_svc_II.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2020-05-06 10:06:23 +08:00
zhengruifeng 0ede08bcb2 [SPARK-31007][ML] KMeans optimization based on triangle-inequality
### What changes were proposed in this pull request?
apply Lemma 1 in [Using the Triangle Inequality to Accelerate K-Means](https://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf):

> Let x be a point, and let b and c be centers. If d(b,c)>=2d(x,b) then d(x,c) >= d(x,b);

It can be directly applied in EuclideanDistance, but not in CosineDistance.
However,  for CosineDistance we can luckily get a variant in the space of radian/angle.

### Why are the changes needed?
It help improving the performance of prediction and training (mostly)

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
existing testsuites

Closes #27758 from zhengruifeng/km_triangle.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2020-04-24 11:24:15 -05:00
yan ma fae981e5f3 [SPARK-30773][ML] Support NativeBlas for level-1 routines
### What changes were proposed in this pull request?
Change BLAS for part of level-1 routines(axpy, dot, scal(double, denseVector)) from java implementation to NativeBLAS when vector size>256

### Why are the changes needed?
In current ML BLAS.scala, all level-1 routines are fixed to use java
implementation. But NativeBLAS(intel MKL, OpenBLAS) can bring up to 11X
performance improvement based on performance test which apply direct
calls against these methods. We should provide a way to allow user take
advantage of NativeBLAS for level-1 routines. Here we do it through
switching to NativeBLAS for these methods from f2jBLAS.

### Does this PR introduce any user-facing change?
 Yes, methods axpy, dot, scal in level-1 routines will switch to NativeBLAS when it has more than nativeL1Threshold(fixed value 256) elements and will fallback to f2jBLAS if native BLAS is not properly configured in system.

### How was this patch tested?
Perf test direct calls level-1 routines

Closes #27546 from yma11/SPARK-30773.

Lead-authored-by: yan ma <yan.ma@intel.com>
Co-authored-by: Ma Yan <yan.ma@intel.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2020-03-20 10:32:58 -05:00
zhengruifeng 111e9038d8 [SPARK-30770][ML] avoid vector conversion in GMM.transform
### What changes were proposed in this pull request?
Current impl needs to convert ml.Vector to breeze.Vector, which can be skipped.

### Why are the changes needed?
avoid unnecessary vector conversions

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
existing testsuites

Closes #27519 from zhengruifeng/gmm_transform_opt.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2020-03-04 11:02:27 +08:00
Wu, Xiaochang ac122762f5 [SPARK-30813][ML] Fix Matrices.sprand comments
### What changes were proposed in this pull request?
Fix mistakes in comments

### Why are the changes needed?
There are mistakes in comments

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
N/A

Closes #27564 from xwu99/fix-mllib-sprand-comment.

Authored-by: Wu, Xiaochang <xiaochang.wu@intel.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2020-03-02 08:56:17 -06:00
gatorsmile 28b8713036 [SPARK-30950][BUILD] Setting version to 3.1.0-SNAPSHOT
### What changes were proposed in this pull request?
This patch is to bump the master branch version to 3.1.0-SNAPSHOT.

### Why are the changes needed?
N/A

### Does this PR introduce any user-facing change?
N/A

### How was this patch tested?
N/A

Closes #27698 from gatorsmile/updateVersion.

Authored-by: gatorsmile <gatorsmile@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-02-25 19:44:31 -08:00
zhengruifeng e086a78706 [MINOR][ML] ML cleanup
### What changes were proposed in this pull request?
1, remove used imports and variables;
2, use `.iterator` instead of `.view` to avoid IDEA warnings;
3, remove resolved _TODO_

### Why are the changes needed?
cleanup

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
existing testsuites

Closes #27600 from zhengruifeng/nits.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2020-02-25 12:32:12 -06:00
zhengruifeng 8ebbf85a85 [SPARK-30772][ML][SQL] avoid tuple assignment because it will circumvent the transient tag
### What changes were proposed in this pull request?
it is said in [LeastSquaresAggregator](12e1bbaddb/mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/LeastSquaresAggregator.scala (L188)) that :

> // do not use tuple assignment above because it will circumvent the transient tag

I then check this issue with Scala 2.13.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_241)

### Why are the changes needed?
avoid tuple assignment because it will circumvent the transient tag

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
existing testsuites

Closes #27523 from zhengruifeng/avoid_tuple_assign_to_transient.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2020-02-16 10:01:49 -06:00
zhengruifeng 12e1bbaddb Revert "[SPARK-30642][SPARK-30659][SPARK-30660][SPARK-30662]"
### What changes were proposed in this pull request?
Revert
#27360
#27396
#27374
#27389

### Why are the changes needed?
BLAS need more performace tests, specially on sparse datasets.
Perfermance test of LogisticRegression (https://github.com/apache/spark/pull/27374) on sparse dataset shows that blockify vectors to matrices and use BLAS will cause performance regression.
LinearSVC and LinearRegression were also updated in the same way as LogisticRegression, so we need to revert them to make sure no regression.

### Does this PR introduce any user-facing change?
remove newly added param blockSize

### How was this patch tested?
reverted testsuites

Closes #27487 from zhengruifeng/revert_blockify_ii.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2020-02-08 08:46:16 +08:00
zhengruifeng 96d27274f5 [SPARK-30642][ML][PYSPARK] LinearSVC blockify input vectors
### What changes were proposed in this pull request?
1, stack input vectors to blocks (like ALS/MLP);
2, add new param `blockSize`;
3, add a new class `InstanceBlock`
4, standardize the input outside of optimization procedure;

### Why are the changes needed?
1, reduce RAM to persist traing dataset; (save ~40% in test)
2, use Level-2 BLAS routines; (12% ~ 28% faster, without native BLAS)

### Does this PR introduce any user-facing change?
a new param `blockSize`

### How was this patch tested?
existing and updated testsuites

Closes #27360 from zhengruifeng/blockify_svc.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2020-01-28 20:55:21 +08:00
zhengruifeng 23a49aff27 [SPARK-30329][ML] add iterator/foreach methods for Vectors
### What changes were proposed in this pull request?
1, add new foreach-like methods: foreach/foreachNonZero
2, add iterator: iterator/activeIterator/nonZeroIterator

### Why are the changes needed?
see the [ticke](https://issues.apache.org/jira/browse/SPARK-30329) for details
foreach/foreachNonZero: for both convenience and performace (SparseVector.foreach should be faster than current traversal method)
iterator/activeIterator/nonZeroIterator: add the three iterators, so that we can futuremore add/change some impls based on those iterators for both ml and mllib sides, to avoid vector conversions.

### Does this PR introduce any user-facing change?
Yes, new methods are added

### How was this patch tested?
added testsuites

Closes #26982 from zhengruifeng/vector_iter.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
2019-12-31 15:52:17 +08:00
Yuming Wang 696288f623 [INFRA] Reverts commit 56dcd79 and c216ef1
### What changes were proposed in this pull request?
1. Revert "Preparing development version 3.0.1-SNAPSHOT": 56dcd79

2. Revert "Preparing Spark release v3.0.0-preview2-rc2": c216ef1

### Why are the changes needed?
Shouldn't change master.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
manual test:
https://github.com/apache/spark/compare/5de5e46..wangyum:revert-master

Closes #26915 from wangyum/revert-master.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Yuming Wang <wgyumg@gmail.com>
2019-12-16 19:57:44 -07:00
Yuming Wang 56dcd79992 Preparing development version 3.0.1-SNAPSHOT 2019-12-17 01:57:27 +00:00
Yuming Wang c216ef1d03 Preparing Spark release v3.0.0-preview2-rc2 2019-12-17 01:57:21 +00:00
Sean Owen 33f53cb2d5 [SPARK-30195][SQL][CORE][ML] Change some function, import definitions to work with stricter compiler in Scala 2.13
### What changes were proposed in this pull request?

See https://issues.apache.org/jira/browse/SPARK-30195 for the background; I won't repeat it here. This is sort of a grab-bag of related issues.

### Why are the changes needed?

To cross-compile with Scala 2.13 later.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests for 2.12. I've been manually checking that this actually resolves the compile problems in 2.13 separately.

Closes #26826 from srowen/SPARK-30195.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-11 12:33:58 -08:00
Xingbo Jiang 8207c835b4 Revert "Prepare Spark release v3.0.0-preview-rc2"
This reverts commit 007c873ae3.
2019-10-30 17:45:44 -07:00
Xingbo Jiang 007c873ae3 Prepare Spark release v3.0.0-preview-rc2
### What changes were proposed in this pull request?

To push the built jars to maven release repository, we need to remove the 'SNAPSHOT' tag from the version name.

Made the following changes in this PR:
* Update all the `3.0.0-SNAPSHOT` version name to `3.0.0-preview`
* Update the sparkR version number check logic to allow jvm version like `3.0.0-preview`

**Please note those changes were generated by the release script in the past, but this time since we manually add tags on master branch, we need to manually apply those changes too.**

We shall revert the changes after 3.0.0-preview release passed.

### Why are the changes needed?

To make the maven release repository to accept the built jars.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A
2019-10-30 17:42:59 -07:00
Xingbo Jiang b33a58c0c6 Revert "Prepare Spark release v3.0.0-preview-rc1"
This reverts commit 5eddbb5f1d.
2019-10-28 22:32:34 -07:00
Xingbo Jiang 5eddbb5f1d Prepare Spark release v3.0.0-preview-rc1
### What changes were proposed in this pull request?

To push the built jars to maven release repository, we need to remove the 'SNAPSHOT' tag from the version name.

Made the following changes in this PR:
* Update all the `3.0.0-SNAPSHOT` version name to `3.0.0-preview`
* Update the PySpark version from `3.0.0.dev0` to `3.0.0`

**Please note those changes were generated by the release script in the past, but this time since we manually add tags on master branch, we need to manually apply those changes too.**

We shall revert the changes after 3.0.0-preview release passed.

### Why are the changes needed?

To make the maven release repository to accept the built jars.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A

Closes #26243 from jiangxb1987/3.0.0-preview-prepare.

Lead-authored-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-10-28 22:31:29 -07:00
Dongjoon Hyun bd031c2173 [SPARK-29307][BUILD][TESTS] Remove scalatest deprecation warnings
### What changes were proposed in this pull request?

This PR aims to remove `scalatest` deprecation warnings with the following changes.
- `org.scalatest.mockito.MockitoSugar` -> `org.scalatestplus.mockito.MockitoSugar`
- `org.scalatest.selenium.WebBrowser` -> `org.scalatestplus.selenium.WebBrowser`
- `org.scalatest.prop.Checkers` -> `org.scalatestplus.scalacheck.Checkers`
- `org.scalatest.prop.GeneratorDrivenPropertyChecks` -> `org.scalatestplus.scalacheck.ScalaCheckDrivenPropertyChecks`

### Why are the changes needed?

According to the Jenkins logs, there are 118 warnings about this.
```
 grep "is deprecated" ~/consoleText | grep scalatest | wc -l
     118
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

After Jenkins passes, we need to check the Jenkins log.

Closes #25982 from dongjoon-hyun/SPARK-29307.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-30 21:00:11 -07:00
Sean Owen e1ea806b30 [SPARK-29291][CORE][SQL][STREAMING][MLLIB] Change procedure-like declaration to function + Unit for 2.13
### What changes were proposed in this pull request?

Scala 2.13 emits a deprecation warning for procedure-like declarations:

```
def foo() {
 ...
```

This is equivalent to the following, so should be changed to avoid a warning:

```
def foo(): Unit = {
  ...
```

### Why are the changes needed?

It will avoid about a thousand compiler warnings when we start to support Scala 2.13. I wanted to make the change in 3.0 as there are less likely to be back-ports from 3.0 to 2.4 than 3.1 to 3.0, for example, minimizing that downside to touching so many files.

Unfortunately, that makes this quite a big change.

### Does this PR introduce any user-facing change?

No behavior change at all.

### How was this patch tested?

Existing tests.

Closes #25968 from srowen/SPARK-29291.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-30 10:03:23 -07:00
Patrick Pisciuneri c7c6b642dc [SPARK-29121][ML][MLLIB] Support for dot product operation on Vector(s)
### What changes were proposed in this pull request?

Support for dot product with:
- `ml.linalg.Vector`
- `ml.linalg.Vectors`
- `mllib.linalg.Vector`
- `mllib.linalg.Vectors`

### Why are the changes needed?

Dot product is useful for feature engineering and scoring.  BLAS routines are already there, just a wrapper is needed.

### Does this PR introduce any user-facing change?

No user facing changes, just some new functionality.

### How was this patch tested?

Tests were written and added to the appropriate `VectorSuites` classes.  They can be quickly run with:

```
sbt "mllib-local/testOnly org.apache.spark.ml.linalg.VectorsSuite"
sbt "mllib/testOnly org.apache.spark.mllib.linalg.VectorsSuite"
```

Closes #25818 from phpisciuneri/SPARK-29121.

Authored-by: Patrick Pisciuneri <phpisciuneri@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-21 14:26:54 -05:00
zhengruifeng a3bbc371cb [SPARK-28421][ML] SparseVector.apply performance optimization
## What changes were proposed in this pull request?
optimize the `SparseVector.apply` by avoiding internal conversion
Since the speed up is significant (2.5X ~ 5X), and this method is widely used in ml, I suggest back porting.

| size|  nnz | apply(old) | apply2(new impl) | apply3(new impl with extra range check)|
|------|----------|------------|----------|----------|
|10000000|100|75294|12208|18682|
|10000000|10000|75616|23132|32932|
|10000000|1000000|92949|42529|48821|

## How was this patch tested?
existing tests

using following code to test performance (here the new impl is named `apply2`, and another impl with extra range check is named `apply3`):
```
import scala.util.Random
import org.apache.spark.ml.linalg._

val size = 10000000
for (nnz <- Seq(100, 10000, 1000000)) {
	val rng = new Random(123)
	val indices = Array.fill(nnz + nnz)(rng.nextInt.abs % size).distinct.take(nnz).sorted
	val values = Array.fill(nnz)(rng.nextDouble)
	val vec = Vectors.sparse(size, indices, values).toSparse

	val tic1 = System.currentTimeMillis;
	(0 until 100).foreach{ round => var i = 0; var sum = 0.0; while(i < size) {sum+=vec(i); i+=1} };
	val toc1 = System.currentTimeMillis;

	val tic2 = System.currentTimeMillis;
	(0 until 100).foreach{ round => var i = 0; var sum = 0.0; while(i < size) {sum+=vec.apply2(i); i+=1} };
	val toc2 = System.currentTimeMillis;

	val tic3 = System.currentTimeMillis;
	(0 until 100).foreach{ round => var i = 0; var sum = 0.0; while(i < size) {sum+=vec.apply3(i); i+=1} };
	val toc3 = System.currentTimeMillis;

	println((size, nnz, toc1 - tic1, toc2 - tic2, toc3 - tic3))
}
```

Closes #25178 from zhengruifeng/sparse_vec_apply.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-07-23 20:20:22 -05:00
Ilya Matiach b2d36f65db [SPARK-19591][ML][MLLIB] Add sample weights to decision trees
This is updated PR https://github.com/apache/spark/pull/16722 to latest master

## What changes were proposed in this pull request?

This patch adds support for sample weights to DecisionTreeRegressor and DecisionTreeClassifier.

Note: This patch does not add support for sample weights to RandomForest. As discussed in the JIRA, we would like to add sample weights into the bagging process. This patch is large enough as is, and there are some additional considerations to be made for random forests. Since the machinery introduced here needs to be present regardless, I have opted to leave random forests for a follow up pr.
## How was this patch tested?

The algorithms are tested to ensure that:
    1. Arbitrary scaling of constant weights has no effect
    2. Outliers with small weights do not affect the learned model
    3. Oversampling and weighting are equivalent

Unit tests are also added to test other smaller components.
## Summary of changes

   - Impurity aggregators now store weighted sufficient statistics. They also store a raw count, however, since this is needed to use minInstancesPerNode.

   - Impurity aggregators now also hold the raw count.

   - This patch maintains the meaning of minInstancesPerNode, in that the parameter still corresponds to raw, unweighted counts. It also adds a new parameter minWeightFractionPerNode which requires that nodes must contain at least minWeightFractionPerNode * weightedNumExamples total weight.

   - This patch modifies findSplitsForContinuousFeatures to use weighted sums. Unit tests are added.

   - TreePoint is modified to hold a sample weight

   - BaggedPoint is modified from:
``` Scala
private[spark] class BaggedPoint[Datum](val datum: Datum, val subsampleWeights: Array[Double]) extends Serializable
```
to
``` Scala
private[spark] class BaggedPoint[Datum](
    val datum: Datum,
    val subsampleCounts: Array[Int],
    val sampleWeight: Double) extends Serializable
```
We do not simply multiply the counts by the weight and store that because we need the raw counts and the weight in order to use both minInstancesPerNode and minWeightPerNode

**Note**: many of the changed files are due simply to using Instance instead of LabeledPoint

Closes #21632 from imatiach-msft/ilmat/sample-weights.

Authored-by: Ilya Matiach <ilmat@microsoft.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-01-24 18:20:28 -07:00
zhengruifeng 91405b3b6e [SPARK-22450][WIP][CORE][MLLIB][FOLLOWUP] Safely register MultivariateGaussian
## What changes were proposed in this pull request?
register following classes in Kryo:
"org.apache.spark.ml.stat.distribution.MultivariateGaussian",
"org.apache.spark.mllib.stat.distribution.MultivariateGaussian"

## How was this patch tested?
added tests

Due to existing module dependency, I can not import spark-core in mllib-local's testsuits, so I do not add testsuite in `org.apache.spark.ml.stat.distribution.MultivariateGaussianSuite`.
And I notice that class `ClusterStats` in `ClusteringEvaluator` is registered in a different way, should it be modified to keep in line with others in ML? srowen

Closes #22974 from zhengruifeng/kryo_MultivariateGaussian.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2018-11-15 09:22:31 -06:00
DB Tsai ad853c5678
[SPARK-25956] Make Scala 2.12 as default Scala version in Spark 3.0
## What changes were proposed in this pull request?

This PR makes Spark's default Scala version as 2.12, and Scala 2.11 will be the alternative version. This implies that Scala 2.12 will be used by our CI builds including pull request builds.

We'll update the Jenkins to include a new compile-only jobs for Scala 2.11 to ensure the code can be still compiled with Scala 2.11.

## How was this patch tested?

existing tests

Closes #22967 from dbtsai/scala2.12.

Authored-by: DB Tsai <d_tsai@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2018-11-14 16:22:23 -08:00
Yuanjian Li 2977e2312d [SPARK-25986][BUILD] Add rules to ban throw Errors in application code
## What changes were proposed in this pull request?

Add scala and java lint check rules to ban the usage of `throw new xxxErrors` and fix up all exists instance followed by https://github.com/apache/spark/pull/22989#issuecomment-437939830. See more details in https://github.com/apache/spark/pull/22969.

## How was this patch tested?

Local test with lint-scala and lint-java.

Closes #22989 from xuanyuanking/SPARK-25986.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2018-11-14 13:05:18 -08:00
Sean Owen 722369ee55 [SPARK-24421][BUILD][CORE] Accessing sun.misc.Cleaner in JDK11
…. Other related changes to get JDK 11 working, to test

## What changes were proposed in this pull request?

- Access `sun.misc.Cleaner` (Java 8) and `jdk.internal.ref.Cleaner` (JDK 9+) by reflection (note: the latter only works if illegal reflective access is allowed)
- Access `sun.misc.Unsafe.invokeCleaner` in Java 9+ instead of `sun.misc.Cleaner` (Java 8)

In order to test anything on JDK 11, I also fixed a few small things, which I include here:

- Fix minor JDK 11 compile issues
- Update scala plugin, Jetty for JDK 11, to facilitate tests too

This doesn't mean JDK 11 tests all pass now, but lots do. Note also that the JDK 9+ solution for the Cleaner has a big caveat.

## How was this patch tested?

Existing tests. Manually tested JDK 11 build and tests, and tests covering this change appear to pass. All Java 8 tests should still pass, but this change alone does not achieve full JDK 11 compatibility.

Closes #22993 from srowen/SPARK-24421.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2018-11-14 12:52:54 -08:00
gatorsmile 9bf397c0e4 [SPARK-25592] Setting version to 3.0.0-SNAPSHOT
## What changes were proposed in this pull request?

This patch is to bump the master branch version to 3.0.0-SNAPSHOT.

## How was this patch tested?
N/A

Closes #22606 from gatorsmile/bump3.0.

Authored-by: gatorsmile <gatorsmile@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
2018-10-02 08:48:24 -07:00
gatorsmile bb2f069cf2 [SPARK-25436] Bump master branch version to 2.5.0-SNAPSHOT
## What changes were proposed in this pull request?
In the dev list, we can still discuss whether the next version is 2.5.0 or 3.0.0. Let us first bump the master branch version to `2.5.0-SNAPSHOT`.

## How was this patch tested?
N/A

Closes #22426 from gatorsmile/bumpVersionMaster.

Authored-by: gatorsmile <gatorsmile@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
2018-09-15 16:24:02 -07:00