spark-instrumented-optimizer/mllib
Yuhao Yang 2f82c841fa [SPARK-5186] [MLLIB] Vector.equals and Vector.hashCode are very inefficient
JIRA Issue: https://issues.apache.org/jira/browse/SPARK-5186

Currently, SparseVector uses the equals method inherited from Vector, which allocates a full-size array even for a sparse vector. This pull request adds a specialized equals that improves both time and space usage.

1. The implementation stays consistent with the original behavior. In particular, it preserves equality comparison between SparseVector and DenseVector.
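
The idea described above can be illustrated with a minimal sketch. This is not the code from the pull request: it is written in Python rather than Scala, and the names `nonzero_entries`, `dense_as_sparse`, and `vectors_equal` are hypothetical. It assumes each vector is given as a size plus parallel index and value sequences, with indices in ascending order. Comparing only the nonzero entries avoids building a full-size array, treats an explicitly stored 0 the same as an absent entry, and lets a dense vector be compared against a sparse one.

```python
def nonzero_entries(indices, values):
    """Yield (index, value) pairs, skipping explicitly stored zeros."""
    for i, v in zip(indices, values):
        if v != 0.0:
            yield i, v


def dense_as_sparse(values):
    """View a dense value list as (indices, values) without copying it."""
    return range(len(values)), values


def vectors_equal(size_a, idx_a, val_a, size_b, idx_b, val_b):
    """Compare two vectors by walking their nonzero entries in lockstep.

    Assumes indices are sorted in ascending order (an assumption of this
    sketch). No full-size array is allocated for either side.
    """
    if size_a != size_b:
        return False
    it_a = nonzero_entries(idx_a, val_a)
    it_b = nonzero_entries(idx_b, val_b)
    done = object()  # sentinel marking an exhausted iterator
    while True:
        a = next(it_a, done)
        b = next(it_b, done)
        if a is done or b is done:
            # Equal only if both sides ran out at the same time.
            return a is b
        if a != b:
            return False


# Usage: a sparse vector equals a dense vector with the same contents.
d_idx, d_val = dense_as_sparse([0.0, 2.0, 0.0, 5.0])
same = vectors_equal(4, [1, 3], [2.0, 5.0], 4, d_idx, d_val)
```

The cost of the comparison is proportional to the number of stored entries rather than to the vector size, which is where the time and space savings for sparse inputs would come from.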

Author: Yuhao Yang <hhbyyh@gmail.com>
Author: Yuhao Yang <yuhao@yuhaodevbox.sh.intel.com>

Closes #3997 from hhbyyh/master and squashes the following commits:

0d9d130 [Yuhao Yang] function name change and ut update
93f0d46 [Yuhao Yang] unify sparse vs dense vectors
985e160 [Yuhao Yang] improve locality for equals
bdf8789 [Yuhao Yang] improve equals and rewrite hashCode for Vector
a6952c3 [Yuhao Yang] fix scala style for comments
50abef3 [Yuhao Yang] fix ut for sparse vector with explicit 0
f41b135 [Yuhao Yang] iterative equals for sparse vector
5741144 [Yuhao Yang] Specialized equals for SparseVector
2015-01-20 15:20:20 -08:00
..
src [SPARK-5186] [MLLIB] Vector.equals and Vector.hashCode are very inefficient 2015-01-20 15:20:20 -08:00
pom.xml [SPARK-4048] Enhance and extend hadoop-provided profile. 2015-01-08 17:15:13 -08:00