402375b59e
### What changes were proposed in this pull request?

Overload the methods `PageRank.runWithOptions` and `PageRank.runWithOptionsWithPreviousPageRank` (so as not to break any user-facing signature) with a `normalized` parameter that describes "whether or not to normalize the rank sum".

### Why are the changes needed?

https://issues.apache.org/jira/browse/SPARK-35357

When a graph contains a non-negligible proportion of sinks, algorithms based on incremental updates of ranks can get a **precision gain for free** if they are allowed to manipulate non-normalized ranks.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

By adding a unit test that verifies that (even when dealing with a graph containing a sink) we end up with the same result for both of these scenarios:

a)
- Run **6 iterations** of PageRank in a row using `PageRank.runWithOptions` with **normalization enabled**

b)
- Run **2 iterations** using `PageRank.runWithOptions` with **normalization disabled**
- Resume from `preRankGraph1` and run **2 more iterations** using `PageRank.runWithOptionsWithPreviousPageRank` with **normalization disabled**
- Finally, resume from `preRankGraph2` and run **2 more iterations** using `PageRank.runWithOptionsWithPreviousPageRank` with **normalization enabled**

Closes #32485 from bonnal-enzo/make-pagerank-normalization-optional.

Authored-by: Enzo Bonnal <enzobonnal@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
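The equivalence checked by the unit test can be illustrated with a small, Spark-free model. The sketch below is not the GraphX implementation: the object name, the four-vertex graph, and the helpers `step`, `normalize`, and `run` are all hypothetical, and it assumes a simplified update rule (`resetProb + (1 - resetProb) * incoming`) with normalization rescaling the ranks so they sum to the number of vertices. Under those assumptions, running 6 iterations and normalizing once should match chaining 2 + 2 + 2 iterations with normalization applied only at the last stage, while normalizing after every stage gives a different result.

```scala
object PageRankNormalizationSketch {
  // Hypothetical toy graph: directed edges (src -> dst). Vertex 3 is a sink.
  val edges: Seq[(Int, Int)] = Seq(0 -> 1, 1 -> 2, 2 -> 0, 2 -> 3)
  val numVertices = 4
  val resetProb = 0.15

  private val outDegree: Map[Int, Int] =
    edges.groupBy(_._1).map { case (src, es) => src -> es.size }

  /** One simplified update: each vertex receives rank / outDegree from its in-neighbours. */
  def step(ranks: Vector[Double]): Vector[Double] = {
    val incoming = Array.fill(numVertices)(0.0)
    for ((src, dst) <- edges) incoming(dst) += ranks(src) / outDegree(src)
    incoming.map(sum => resetProb + (1.0 - resetProb) * sum).toVector
  }

  /** Rescale the ranks so that they sum to the number of vertices (assumed convention). */
  def normalize(ranks: Vector[Double]): Vector[Double] = {
    val total = ranks.sum
    ranks.map(_ * numVertices / total)
  }

  /** Run numIter updates from `initial`, normalizing only once at the end if requested. */
  def run(initial: Vector[Double], numIter: Int, normalized: Boolean): Vector[Double] = {
    val ranks = (1 to numIter).foldLeft(initial)((r, _) => step(r))
    if (normalized) normalize(ranks) else ranks
  }

  def maxDiff(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (x, y) => math.abs(x - y) }.max

  val start: Vector[Double] = Vector.fill(numVertices)(1.0)

  // Scenario a): 6 iterations in a row, normalization enabled.
  val oneShot: Vector[Double] = run(start, 6, normalized = true)

  // Scenario b): 2 + 2 + 2 iterations, normalizing only at the last stage.
  val pre1: Vector[Double] = run(start, 2, normalized = false)
  val pre2: Vector[Double] = run(pre1, 2, normalized = false)
  val chained: Vector[Double] = run(pre2, 2, normalized = true)

  // Contrast: normalizing after every stage rescales the intermediate ranks.
  val renormalizedEachStage: Vector[Double] =
    run(run(run(start, 2, normalized = true), 2, normalized = true), 2, normalized = true)

  def main(args: Array[String]): Unit = {
    println(s"one shot vs chained (normalize last):  ${maxDiff(oneShot, chained)}")
    println(s"one shot vs normalized at every stage: ${maxDiff(oneShot, renormalizedEachStage)}")
  }
}
```

In this model the update is affine rather than linear because of the constant `resetProb` term, so rescaling intermediate ranks changes the outcome once a sink has drained some rank mass. That is why the chained run has to keep the intermediate ranks non-normalized to reproduce the single run.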