### What changes were proposed in this pull request?
Fix an incorrect assert statement that was introduced by a coding mistake.
### Why are the changes needed?
The assert statement was wrong.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests
Closes #33953 from dgd-contributor/SPARK-36685.
Authored-by: dgd-contributor <dgd_contributor@viettel.com.vn>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 9af0132516)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Make softmax support an offset and a step, so that it can be used in ANN and NB.
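As a rough illustration of what an offset/step-aware softmax could look like, here is a minimal sketch. The object name `SoftmaxSketch` and the signature `softmax(array, n, offset, step)` are assumptions made for this example and are not taken from the patch.

```scala
object SoftmaxSketch {
  /**
   * In-place softmax over the n entries array(offset), array(offset + step), ...
   * The signature is hypothetical, chosen only for this illustration.
   */
  def softmax(array: Array[Double], n: Int, offset: Int, step: Int): Unit = {
    require(n > 0 && step > 0 && offset >= 0)
    require(offset + (n - 1) * step < array.length)

    // Pass 1: find the maximum so that exp() below cannot overflow.
    var max = Double.NegativeInfinity
    var k = 0
    while (k < n) {
      max = math.max(max, array(offset + k * step))
      k += 1
    }

    // Pass 2: exponentiate the shifted entries and accumulate their sum.
    var sum = 0.0
    k = 0
    while (k < n) {
      val e = math.exp(array(offset + k * step) - max)
      array(offset + k * step) = e
      sum += e
      k += 1
    }

    // Pass 3: normalize so that the selected entries sum to 1.
    k = 0
    while (k < n) {
      array(offset + k * step) /= sum
      k += 1
    }
  }
}
```

With a flat buffer that stores one score vector per row, `offset` selects the row start and `step = 1` walks it; with an interleaved layout, a larger `step` picks every `step`-th entry without first copying a slice.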
### Why are the changes needed?
To simplify the implementation.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
existing testsuite
Closes #32991 from zhengruifeng/softmax_support_offset_step.
Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
### What changes were proposed in this pull request?
Use the newly implemented softmax function in NB.
### Why are the changes needed?
To simplify the implementation.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
existing testsuite
Closes #32927 from zhengruifeng/softmax__followup.
Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Huaxin Gao <huaxin_gao@apple.com>
### What changes were proposed in this pull request?
Sparse gemm uses the method `DenseMatrix.apply` to access the values, which can be optimized by skipping the bounds check and the `isTransposed` check:
```
override def apply(i: Int, j: Int): Double = values(index(i, j))

private[ml] def index(i: Int, j: Int): Int = {
  require(i >= 0 && i < numRows, s"Expected 0 <= i < $numRows, got i = $i.")
  require(j >= 0 && j < numCols, s"Expected 0 <= j < $numCols, got j = $j.")
  if (!isTransposed) i + numRows * j else j + numCols * i
}
```
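To make the idea concrete, the sketch below contrasts checked element access with a loop that resolves the layout once and then indexes the backing array directly. `TinyDense` and `sumColumn` are hypothetical names invented for this example; the actual change in the sparse gemm code may be structured differently.

```scala
final class TinyDense(
    val numRows: Int,
    val numCols: Int,
    val values: Array[Double],
    val isTransposed: Boolean) {

  // Checked access, mirroring the snippet above: two bounds checks and a
  // layout branch on every single call.
  def apply(i: Int, j: Int): Double = {
    require(i >= 0 && i < numRows, s"Expected 0 <= i < $numRows, got i = $i.")
    require(j >= 0 && j < numCols, s"Expected 0 <= j < $numCols, got j = $j.")
    values(if (!isTransposed) i + numRows * j else j + numCols * i)
  }

  // Sum of column j. The column index is validated once, the layout is
  // resolved once, and the inner loop touches `values` directly.
  def sumColumn(j: Int): Double = {
    require(j >= 0 && j < numCols, s"Expected 0 <= j < $numCols, got j = $j.")
    val (start, stride) = if (!isTransposed) (numRows * j, 1) else (j, numCols)
    var sum = 0.0
    var i = 0
    while (i < numRows) {
      sum += values(start + i * stride)
      i += 1
    }
    sum
  }
}
```

The per-element work in `sumColumn` is one multiply-add for the index and one array read, which is the kind of saving the description attributes to skipping the checks.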
### Why are the changes needed?
To improve performance: about 15% faster in the designed test case.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
existing testsuite and additional performance test
Closes #32857 from zhengruifeng/gemm_opt_index.
Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
In the existing implementations, a vector or matrix often needs to be sliced or copied just to make the shapes match, which complicates the logic and adds the extra cost of slicing and copying.
### Why are the changes needed?
1, avoid slicing and copying due to shape checking;
2, simplify the usages;
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
existing testsuites
Closes #32805 from zhengruifeng/new_blas_func_for_agg.
Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
Add a softmax function in utils.
### Why are the changes needed?
It can be used in multiple places.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
existing testsuites
Closes #32822 from zhengruifeng/add_softmax_func.
Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
After SPARK-29291 and SPARK-33352, there are still some compilation warnings about `procedure syntax is deprecated` as follows:
```
[WARNING] [Warn] /spark/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:723: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `registerMergeResult`'s return type
[WARNING] [Warn] /spark/core/src/main/scala/org/apache/spark/MapOutputTracker.scala:748: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `unregisterMergeResult`'s return type
[WARNING] [Warn] /spark/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala:223: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `testSimpleSpillingForAllCodecs`'s return type
[WARNING] [Warn] /spark/mllib-local/src/test/scala/org/apache/spark/ml/linalg/BLASBenchmark.scala:53: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `runBLASBenchmark`'s return type
[WARNING] [Warn] /spark/sql/core/src/main/scala/org/apache/spark/sql/execution/command/DataWritingCommand.scala:110: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `assertEmptyRootPath`'s return type
[WARNING] [Warn] /spark/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala:602: [deprecation | origin= | version=2.13.0] procedure syntax is deprecated: instead, add `: Unit =` to explicitly declare `executeCTASWithNonEmptyLocation`'s return type
```
So the main change of this PR is to clean up these compilation warnings.
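For readers unfamiliar with the warning, the sketch below shows the shape of the fix on a made-up method. `ProcedureSyntaxSketch`, `increment`, and `current` are hypothetical names and are not among the methods listed in the warnings above.

```scala
object ProcedureSyntaxSketch {
  private var count = 0

  // Procedure syntax, which triggers the Scala 2.13 deprecation warning:
  //   def increment() { count += 1 }

  // Explicit form suggested by the warning: declare the `Unit` return type
  // and add `=` before the body. This form is also valid in Scala 2.12.
  def increment(): Unit = { count += 1 }

  def current: Int = count
}
```

The behavior is unchanged; only the declaration syntax differs.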
### Why are the changes needed?
Eliminate compilation warnings in Scala 2.13. This change should also be compatible with Scala 2.12.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Pass the Jenkins or GitHub Action
Closes #32669 from LuciferYang/re-clean-procedure-syntax.
Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR adds benchmark results for `BLASBenchmark` created by GitHub Actions machines.
Benchmark result files are added for both JDK 8 (`BLASBenchmark-result.txt`) and 11 (`BLASBenchmark-jdk11-result.txt`) in `{SPARK_HOME}/mllib-local/benchmarks/`.
### Why are the changes needed?
In [SPARK-34950](https://issues.apache.org/jira/browse/SPARK-34950), benchmark results were updated to the ones created by GitHub Actions machines.
As benchmark results for `BLASBenchmark` (added in [SPARK-33882](https://issues.apache.org/jira/browse/SPARK-33882) and [SPARK-35150](https://issues.apache.org/jira/browse/SPARK-35150)) are not currently available in the repository, this PR adds them.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
The benchmark results were obtained by running tests with GitHub Actions workflow in my forked repository.
You can refer to the test results and output files from the link below.
- https://github.com/byungsoo-oh/spark/actions/runs/809900377
- https://github.com/byungsoo-oh/spark/actions/runs/810084610

Closes #32435 from byungsoo-oh/SPARK-35306.
Authored-by: byungsoo <byungsoo@byungsoo-pc.tn.corp.samsungelectronics.net>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This patch introduces a VectorizedBLAS class which implements hardware-accelerated BLAS operations. This feature is hidden behind the "vectorized" profile that you can enable by passing "-Pvectorized" to sbt or maven.
The Vector API has been introduced in JDK 16. Following discussion on the mailing list, this API is introduced transparently and needs to be enabled explicitly.
### Why are the changes needed?
Whenever a native BLAS implementation isn't available on the system, Spark automatically falls back onto a Java implementation. With the recent release of the Vector API in the OpenJDK [1], we can use hardware acceleration for such operations.
This change was also discussed on the mailing list. [2]
### Does this PR introduce _any_ user-facing change?
It introduces a build-time profile called `vectorized`. You can pass it to sbt and mvn with `-Pvectorized`. There is no change for end users of Spark; it should only impact Spark developers. It is also disabled by default.
### How was this patch tested?
It passes `build/sbt mllib-local/test` with and without `-Pvectorized` with JDK 16. This patch also introduces benchmarks for BLAS.
The benchmark results are as follows:
```
[info] daxpy: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 37 37 0 271.5 3.7 1.0X
[info] vector 24 25 4 416.1 2.4 1.5X
[info]
[info] ddot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 70 70 0 143.2 7.0 1.0X
[info] vector 35 35 2 288.7 3.5 2.0X
[info]
[info] sdot: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 50 51 1 199.8 5.0 1.0X
[info] vector 15 15 0 648.7 1.5 3.2X
[info]
[info] dscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 34 34 0 295.6 3.4 1.0X
[info] vector 19 19 0 531.2 1.9 1.8X
[info]
[info] sscal: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 25 25 1 399.0 2.5 1.0X
[info] vector 8 9 1 1177.3 0.8 3.0X
[info]
[info] dgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 27 27 0 0.0 26651.5 1.0X
[info] vector 21 21 0 0.0 20646.3 1.3X
[info]
[info] dgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 36 36 0 0.0 35501.4 1.0X
[info] vector 22 22 0 0.0 21930.3 1.6X
[info]
[info] sgemv[N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 20 20 0 0.0 20283.3 1.0X
[info] vector 9 9 0 0.1 8657.7 2.3X
[info]
[info] sgemv[T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 30 30 0 0.0 29845.8 1.0X
[info] vector 10 10 1 0.1 9695.4 3.1X
[info]
[info] dgemm[N,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 182 182 0 0.5 1820.0 1.0X
[info] vector 160 160 1 0.6 1597.6 1.1X
[info]
[info] dgemm[N,T]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 211 211 1 0.5 2106.2 1.0X
[info] vector 156 157 0 0.6 1564.4 1.3X
[info]
[info] dgemm[T,N]: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
[info] ------------------------------------------------------------------------------------------------------------------------
[info] f2j 276 276 0 0.4 2757.8 1.0X
[info] vector 137 137 0 0.7 1365.1 2.0X
```
/cc srowen xkrogen
[1] https://openjdk.java.net/jeps/338
[2] https://mail-archives.apache.org/mod_mbox/spark-dev/202012.mbox/%3cDM5PR2101MB11106162BB3AF32AD29C6C79B0C69DM5PR2101MB1110.namprd21.prod.outlook.com%3e
Closes #30810 from luhenry/master.
Lead-authored-by: Ludovic Henry <luhenry@microsoft.com>
Co-authored-by: Ludovic Henry <git@ludovic.dev>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
1, add a new param `sorted` in `slice`;
2, in `VectorSlicer`, set `sorted = true` if input indices are ordered.
### Why are the changes needed?
The input indices of VectorSlicer are probably ordered.
VectorSlicer should use this attribute if possible.
In a simple test, `sorted = true` was about 70% faster than the existing `slice`.
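The following sketch shows why ordered indices can help when slicing a sparse vector: a single two-pointer merge can replace a lookup per selected index. The object `SliceSketch`, its signature, and the binary-search fallback are assumptions for illustration (the existing `slice` may work differently), and the sorted path assumes strictly increasing selected indices.

```scala
import scala.collection.mutable.ArrayBuffer

object SliceSketch {
  /**
   * Slices a sparse vector given as parallel (indices, values) arrays.
   * Position s in `selected` becomes index s in the result.
   * Hypothetical helper, not the method added by the patch.
   */
  def slice(
      indices: Array[Int],
      values: Array[Double],
      selected: Array[Int],
      sorted: Boolean): (Array[Int], Array[Double]) = {
    val outIdx = ArrayBuffer.empty[Int]
    val outVal = ArrayBuffer.empty[Double]
    if (sorted) {
      // Both arrays are ascending, so one linear merge finds all matches.
      var k = 0
      var s = 0
      while (k < indices.length && s < selected.length) {
        if (indices(k) == selected(s)) {
          outIdx += s
          outVal += values(k)
          k += 1
          s += 1
        } else if (indices(k) < selected(s)) {
          k += 1
        } else {
          s += 1
        }
      }
    } else {
      // Arbitrary order: look up each selected index separately.
      var s = 0
      while (s < selected.length) {
        val k = java.util.Arrays.binarySearch(indices, selected(s))
        if (k >= 0) {
          outIdx += s
          outVal += values(k)
        }
        s += 1
      }
    }
    (outIdx.toArray, outVal.toArray)
  }
}
```

On ordered input both branches return the same result, so a caller only needs to detect ordering once and pass the flag.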
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
added testsuite
Closes #31588 from zhengruifeng/vector_slice_for_sorted_indices.
Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
Some redundant collection conversions can be removed. For version compatibility, this PR cleans them up under the Scala 2.13 profile.
### Why are the changes needed?
Remove redundant collection conversions.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- Pass the Jenkins or GitHub Action
- Manual test `core`, `graphx`, `mllib`, `mllib-local`, `sql`, `yarn`,`kafka-0-10` in Scala 2.13 passed
Closes #31125 from LuciferYang/SPARK-34068.
Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
This PR aims to update `master` branch version to 3.2.0-SNAPSHOT.
### Why are the changes needed?
Start to prepare Apache Spark 3.2.0.
### Does this PR introduce _any_ user-facing change?
N/A.
### How was this patch tested?
Pass the CIs.
Closes #30606 from dongjoon-hyun/SPARK-3.2.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This PR aims to do the following.
1. Upgrade from Scala 2.13.3 to 2.13.4 for Apache Spark 3.1
2. Fix exhaustivity issues in both Scala 2.12/2.13 (Scala 2.13.4 requires this for compilation.)
3. Enforce the improved exhaustive check by using the existing Scala 2.13 GitHub Action compilation job.
### Why are the changes needed?
Scala 2.13.4 is a maintenance release for 2.13 line and improves JDK 15 support.
- https://github.com/scala/scala/releases/tag/v2.13.4
Also, it improves exhaustivity check.
- https://github.com/scala/scala/pull/9140 (Check exhaustivity of pattern matches with "if" guards and custom extractors)
- https://github.com/scala/scala/pull/9147 (Check all bindings exhaustively, e.g. tuples components)
### Does this PR introduce _any_ user-facing change?
Yep. Although it's a maintenance version change, it's a Scala version change.
### How was this patch tested?
Pass the CIs and do the manual testing.
- Scala 2.12 CI jobs(GitHub Action/Jenkins UT/Jenkins K8s IT) to check the validity of code change.
- Scala 2.13 Compilation job to check the compilation
Closes #30455 from dongjoon-hyun/SCALA_3.13.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Revert blockify GMM.
### Why are the changes needed?
WeichenXu123 and I think we should blockify instances by memory size instead of by number of rows; any buffer whose size is large and determined by the number of rows should then be discarded.
In GMM, we found that the pre-allocated memory may be too large and should be discarded:
```
transient private lazy val auxiliaryPDFMat = DenseMatrix.zeros(blockSize, numFeatures)
```
After some offline discussion, we concluded that it is better to revert blockify GMM.
### Does this PR introduce _any_ user-facing change?
`blockSize`, which was added in the master branch, will be removed.
### How was this patch tested?
existing testsuites
Closes #29782 from zhengruifeng/unblockify_gmm.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
This PR fixes code that causes errors when building with sbt and Scala 2.13, as follows.
```
[error] [warn] /home/kou/work/oss/spark-scala-2.13/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala:251: method with a single empty parameter list overrides method without any parameter list
[error] [warn] override def hasNext(): Boolean = requestOffset < part.untilOffset
[error] [warn]
[error] [warn] /home/kou/work/oss/spark-scala-2.13/external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala:294: method with a single empty parameter list overrides method without any parameter list
[error] [warn] override def hasNext(): Boolean = okNext
```
More specifically, this PR fixes the following:
* Methods that have an empty parameter list and override a method that has no parameter list.
```
override def hasNext(): Boolean = okNext
```
* Methods that have no parameter list and override a method that has an empty parameter list.
```
override def next: (Int, Double) = {
```
* Infix operator expressions that wrap across lines at the operator.
```
3L * math.min(k, numFeatures) * math.min(k, numFeatures)
3L * math.min(k, numFeatures) * math.min(k, numFeatures) +
+ math.max(math.max(k, numFeatures), 4L * math.min(k, numFeatures)
math.max(math.max(k, numFeatures), 4L * math.min(k, numFeatures) *
* math.min(k, numFeatures) + 4L * math.min(k, numFeatures))
```
### Why are the changes needed?
For building Spark with sbt and Scala 2.13.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
With this change and #29742 applied, compilation passes with the following command.
```
build/sbt -Pscala-2.13 -Phive -Phive-thriftserver -Pyarn -Pkubernetes compile test:compile
```
Closes #29745 from sarutak/fix-code-for-sbt-and-spark-2.13.
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
Updates to scalatest 3.2.0. Though it looks large, it is 99% changes to the new location of scalatest classes.
### Why are the changes needed?
3.2.0+ has a fix that is required for Scala 2.13.3+ compatibility.
### Does this PR introduce _any_ user-facing change?
No, only affects tests.
### How was this patch tested?
Existing tests.
Closes #29196 from srowen/SPARK-32398.
Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
1, add new param blockSize;
2, if blockSize==1, keep original behavior, code path trainOnRows;
3, if blockSize>1, standardize and stack input vectors to blocks (like ALS/MLP), code path trainOnBlocks
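As a rough picture of what stacking input vectors into blocks means, the sketch below groups dense rows into row-major blocks of at most `blockSize` rows. `BlockifySketch` and the `(numRows, flatValues)` block representation are simplifications assumed for this example; they are not the classes used in the patch, and standardization is left out.

```scala
object BlockifySketch {
  /**
   * Stacks dense feature rows into blocks. Each block is returned as
   * (number of rows in the block, row-major flattened values).
   * Hypothetical helper for illustration only.
   */
  def blockify(
      rows: Iterator[Array[Double]],
      blockSize: Int,
      numFeatures: Int): Iterator[(Int, Array[Double])] = {
    require(blockSize > 0, s"blockSize must be positive, got $blockSize.")
    rows.grouped(blockSize).map { group =>
      val flat = new Array[Double](group.length * numFeatures)
      var r = 0
      group.foreach { row =>
        require(row.length == numFeatures)
        // Copy row r into its slot of the flattened block.
        System.arraycopy(row, 0, flat, r * numFeatures, numFeatures)
        r += 1
      }
      (group.length, flat)
    }
  }
}
```

Once rows are stacked like this, a whole block can be processed with one matrix-level call instead of one vector-level call per row, which is where a native BLAS can contribute.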
### Why are the changes needed?
performance gain on dense dataset HIGGS:
1, save about 45% RAM;
2, 3X faster with openBLAS
### Does this PR introduce any user-facing change?
add a new expert param `blockSize`
### How was this patch tested?
added testsuites
Closes #27473 from zhengruifeng/blockify_gmm.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
1, reorganize the `fit` method in LR into several blocks (`createModel`, `createBounds`, `createOptimizer`, `createInitCoefWithInterceptMatrix`);
2, add new param blockSize;
3, if blockSize==1, keep original behavior, code path `trainOnRows`;
4, if blockSize>1, standardize and stack input vectors to blocks (like ALS/MLP), code path `trainOnBlocks`
### Why are the changes needed?
On dense dataset `epsilon_normalized.t`:
1, reduce RAM to persist training dataset; (save about 40% RAM)
2, use Level-2 BLAS routines; (4x ~ 5x faster)
### Does this PR introduce _any_ user-facing change?
Yes, a new param is added
### How was this patch tested?
existing and added testsuites
Closes #28458 from zhengruifeng/blockify_lor_II.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
1, add new param `blockSize`;
2, add a new class InstanceBlock;
3, **if `blockSize==1`, keep original behavior; if `blockSize>1`, stack input vectors to blocks (like ALS/MLP);**
4, if `blockSize>1`, standardize the input outside of optimization procedure;
### Why are the changes needed?
1, reduce RAM to persist training dataset; (save about 40% RAM)
2, use Level-2 BLAS routines; (4x ~ 5x faster on dataset `epsilon`)
### Does this PR introduce any user-facing change?
Yes, a new param is added
### How was this patch tested?
existing and added testsuites
Closes #28349 from zhengruifeng/blockify_svc_II.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
Apply Lemma 1 in [Using the Triangle Inequality to Accelerate K-Means](https://www.aaai.org/Papers/ICML/2003/ICML03-022.pdf):
> Let x be a point, and let b and c be centers. If d(b,c)>=2d(x,b) then d(x,c) >= d(x,b);
It can be directly applied in EuclideanDistance, but not in CosineDistance.
However, for CosineDistance we can derive a variant in the space of radians/angles.
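The lemma can be sketched for the Euclidean case as follows: while searching for the closest center, any candidate `c` with `d(best, c) >= 2 * d(x, best)` is skipped without computing `d(x, c)`. `TriangleSketch`, `closest`, and the precomputed `centerDist` matrix are names assumed for this illustration, not the patch's API.

```scala
object TriangleSketch {
  def dist(a: Array[Double], b: Array[Double]): Double = {
    var s = 0.0
    var i = 0
    while (i < a.length) {
      val d = a(i) - b(i)
      s += d * d
      i += 1
    }
    math.sqrt(s)
  }

  /**
   * Returns (index of the closest center, number of point-to-center
   * distance evaluations). `centerDist(b)(c)` must hold d(centers(b), centers(c)).
   */
  def closest(
      x: Array[Double],
      centers: Array[Array[Double]],
      centerDist: Array[Array[Double]]): (Int, Int) = {
    var best = 0
    var bestDist = dist(x, centers(0))
    var evals = 1
    var c = 1
    while (c < centers.length) {
      // Lemma 1: if d(best, c) >= 2 d(x, best), then d(x, c) >= d(x, best),
      // so center c cannot be strictly closer and is skipped.
      if (centerDist(best)(c) < 2 * bestDist) {
        val d = dist(x, centers(c))
        evals += 1
        if (d < bestDist) {
          best = c
          bestDist = d
        }
      }
      c += 1
    }
    (best, evals)
  }
}
```

The pairwise center distances are computed once per iteration and reused for every point, so each skipped candidate saves a full distance computation.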
### Why are the changes needed?
It helps improve the performance of prediction and, mostly, training.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
existing testsuites
Closes #27758 from zhengruifeng/km_triangle.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
Switch some level-1 BLAS routines (axpy, dot, scal(double, denseVector)) from the Java implementation to NativeBLAS when the vector size is greater than 256.
### Why are the changes needed?
In the current ML BLAS.scala, all level-1 routines are fixed to use the Java
implementation. But NativeBLAS (Intel MKL, OpenBLAS) can bring up to 11X
performance improvement, based on performance tests that call these
methods directly. We should provide a way for users to take
advantage of NativeBLAS for level-1 routines. Here we do it by
switching these methods from f2jBLAS to NativeBLAS.
### Does this PR introduce any user-facing change?
Yes, the level-1 methods axpy, dot, and scal will switch to NativeBLAS when the vector has more than nativeL1Threshold (fixed value 256) elements, and will fall back to f2jBLAS if native BLAS is not properly configured on the system.
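The dispatch rule described above can be sketched as a size check plus a fallback. Everything here except the threshold value 256 is an assumption for illustration: `Level1DispatchSketch`, the `Dot` function type, and modelling an unavailable native backend as `None` are not how the patch itself is written.

```scala
object Level1DispatchSketch {
  // Threshold taken from the description above.
  val nativeL1Threshold: Int = 256

  type Dot = (Array[Double], Array[Double]) => Double

  // Pure-JVM dot product, standing in for the f2jBLAS path.
  val javaDot: Dot = (x, y) => {
    var s = 0.0
    var i = 0
    while (i < x.length) {
      s += x(i) * y(i)
      i += 1
    }
    s
  }

  // Use the native routine only for vectors above the threshold and only
  // when a native backend is available; otherwise fall back to the JVM path.
  def chooseDot(n: Int, nativeDot: Option[Dot]): Dot =
    nativeDot match {
      case Some(native) if n > nativeL1Threshold => native
      case _ => javaDot
    }
}
```

The threshold exists because a native call has a fixed overhead that short vectors may not amortize.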
### How was this patch tested?
Performance tests that directly call the level-1 routines.
Closes #27546 from yma11/SPARK-30773.
Lead-authored-by: yan ma <yan.ma@intel.com>
Co-authored-by: Ma Yan <yan.ma@intel.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
The current implementation converts ml.Vector to breeze.Vector, which can be skipped.
### Why are the changes needed?
avoid unnecessary vector conversions
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
existing testsuites
Closes #27519 from zhengruifeng/gmm_transform_opt.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
### What changes were proposed in this pull request?
Fix mistakes in comments
### Why are the changes needed?
There are mistakes in comments
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
N/A
Closes #27564 from xwu99/fix-mllib-sprand-comment.
Authored-by: Wu, Xiaochang <xiaochang.wu@intel.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
This patch is to bump the master branch version to 3.1.0-SNAPSHOT.
### Why are the changes needed?
N/A
### Does this PR introduce any user-facing change?
N/A
### How was this patch tested?
N/A
Closes #27698 from gatorsmile/updateVersion.
Authored-by: gatorsmile <gatorsmile@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
1, remove unused imports and variables;
2, use `.iterator` instead of `.view` to avoid IDEA warnings;
3, remove resolved _TODO_
### Why are the changes needed?
cleanup
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
existing testsuites
Closes #27600 from zhengruifeng/nits.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
A comment in [LeastSquaresAggregator](12e1bbaddb/mllib/src/main/scala/org/apache/spark/ml/optim/aggregator/LeastSquaresAggregator.scala (L188)) says:
> // do not use tuple assignment above because it will circumvent the transient tag
I then checked this issue with Scala 2.13.1 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_241).
### Why are the changes needed?
Avoid tuple assignment, because it will circumvent the transient tag.
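A minimal sketch of the pattern the quoted comment recommends: keep the pair in a single field that is explicitly marked `@transient`, and derive each component in its own `@transient` field instead of destructuring with a tuple assignment. `TransientSketch`, `Aggregator`, and the computation inside are made up for this example.

```scala
object TransientSketch {
  class Aggregator(coefficients: Array[Double]) extends Serializable {
    // Avoided form, per the quoted comment:
    //   @transient private lazy val (scaled, offset) = ...

    // Keep the pair in one explicitly transient field ...
    @transient private lazy val scaledAndOffset: (Array[Double], Double) = {
      val scaled = coefficients.map(_ * 2.0)
      (scaled, scaled.sum)
    }

    // ... and expose each component through its own transient field.
    @transient lazy val scaled: Array[Double] = scaledAndOffset._1
    @transient lazy val offset: Double = scaledAndOffset._2
  }
}
```

Each derived value is then recomputed lazily from `coefficients` where it is needed, rather than being carried along when the aggregator is serialized.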
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
existing testsuites
Closes #27523 from zhengruifeng/avoid_tuple_assign_to_transient.
Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>