19f882ce1b
### What changes were proposed in this pull request?

When initializing factors in ALS, we should use `mapPartitions` instead of the current `map`, so that the existing partitioning of the RDD of `InBlock` is preserved. The RDD of `InBlock` is already partitioned by src block id, and initializing factors does not change that partitioning.

### Why are the changes needed?

This patch reduces unnecessary shuffling after the factors are initialized.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

It should not change existing tests. It should pass the added test that verifies the shuffle dependency of the factor RDDs.

Closes #25639 from viirya/fix-als-partition.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <liangchi@uber.com>

| Name |
|---|
| benchmarks |
| src |
| pom.xml |
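
The partitioning argument in the commit message can be illustrated with a minimal sketch. This is not the ALS code from the patch: the object name `PreservePartitioningSketch`, the helper `compare`, and the toy `(blockId, Array[Int])` pairs standing in for `InBlock` are all assumptions made for illustration. It relies only on the public RDD API: `map` discards the partitioner of a pair RDD, while `mapPartitions` with `preservesPartitioning = true` keeps it, so a later operation keyed on the same partitioner does not need to shuffle.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

// Hypothetical sketch; names and data are illustrative, not taken from ALS.
object PreservePartitioningSketch {

  // Returns whether the original partitioner survives
  // (after map, after mapPartitions with preservesPartitioning = true).
  def compare(sc: SparkContext): (Boolean, Boolean) = {
    val partitioner = new HashPartitioner(4)

    // Toy stand-in for the RDD of InBlock, keyed and partitioned by block id.
    val blocks = sc.parallelize(0 until 4)
      .map(id => (id, Array.fill(3)(id)))
      .partitionBy(partitioner)

    // map may change keys, so Spark drops the partitioner.
    val viaMap = blocks.map { case (id, ids) => (id, ids.map(_ => 0.0f)) }

    // Keys are left untouched, so we can tell Spark the partitioning still holds.
    val viaMapPartitions = blocks.mapPartitions(
      iter => iter.map { case (id, ids) => (id, ids.map(_ => 0.0f)) },
      preservesPartitioning = true)

    (viaMap.partitioner.contains(partitioner),
      viaMapPartitions.partitioner.contains(partitioner))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("preserve-partitioning")
    val sc = new SparkContext(conf)
    try {
      val (afterMap, afterMapPartitions) = compare(sc)
      println(s"partitioner kept after map: $afterMap")
      println(s"partitioner kept after mapPartitions: $afterMapPartitions")
    } finally {
      sc.stop()
    }
  }
}
```

Setting `preservesPartitioning = true` is only safe when the function leaves the keys unchanged, which is the case the commit message describes: each block keeps its src block id and only its payload is replaced by the initialized factors.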