spark-instrumented-optimizer/graphx
Yves Raimond 1fec3ce4e1 [SPARK-11496][GRAPHX] Parallel implementation of personalized pagerank
(Updated version of [PR-9457](https://github.com/apache/spark/pull/9457), rebased on latest Spark master, and using mllib-local).

This implements a parallel version of personalized pagerank, which runs all propagations for a list of source vertices in parallel.
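
As a rough illustration of the idea (a minimal plain-Python sketch, not the GraphX code in this commit), each vertex can carry a sparse map from source index to rank, so one pass over the edges advances the propagation for every source at once. The function names, the dict-based sparse representation, and the update rule (`reset_prob` injected only at the source vertex) are assumptions made for this example.

```python
def personalized_pagerank(edges, num_vertices, source, num_iter=10, reset_prob=0.15):
    """Single-source baseline: one rank value per vertex (hypothetical helper)."""
    out_deg = [0] * num_vertices
    for u, _ in edges:
        out_deg[u] += 1
    rank = [0.0] * num_vertices
    rank[source] = 1.0
    for _ in range(num_iter):
        incoming = [0.0] * num_vertices
        for u, v in edges:
            incoming[v] += rank[u] / out_deg[u]
        rank = [
            (reset_prob if v == source else 0.0) + (1.0 - reset_prob) * incoming[v]
            for v in range(num_vertices)
        ]
    return rank


def parallel_personalized_pagerank(edges, num_vertices, sources, num_iter=10, reset_prob=0.15):
    """All sources in one propagation: each vertex holds {source_index: rank}.

    The dict stands in for a sparse vector; sources that never reach a vertex
    take up no space there.
    """
    out_deg = [0] * num_vertices
    for u, _ in edges:
        out_deg[u] += 1
    ranks = [dict() for _ in range(num_vertices)]
    for i, s in enumerate(sources):
        ranks[s][i] = 1.0
    for _ in range(num_iter):
        # One sweep over the edges sends the whole sparse vector along each edge.
        incoming = [dict() for _ in range(num_vertices)]
        for u, v in edges:
            for i, r in ranks[u].items():
                incoming[v][i] = incoming[v].get(i, 0.0) + r / out_deg[u]
        new_ranks = [
            {i: (1.0 - reset_prob) * msg for i, msg in incoming[v].items()}
            for v in range(num_vertices)
        ]
        # The reset mass for source i is injected only at that source vertex.
        for i, s in enumerate(sources):
            new_ranks[s][i] = new_ranks[s].get(i, 0.0) + reset_prob
        ranks = new_ranks
    return ranks
```

In this sketch, calling `parallel_personalized_pagerank(edges, n, sources)` gives the same per-source ranks as calling `personalized_pagerank` once per source, but it walks the edge list once per iteration instead of once per iteration per source. The per-vertex map also adds bookkeeping that a single scalar rank does not need, which is the kind of overhead the single-source benchmark below attributes to SparseVector.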

I ran a few benchmarks on the full [DBpedia](http://dbpedia.org/) graph. For a single source node, the existing implementation is twice as fast as the parallel one (because of the SparseVector overhead). For 10 source nodes, however, the parallel implementation is four times as fast, and the gap widens as the number of source nodes increases.

![image](https://cloud.githubusercontent.com/assets/2491/10927702/dd82e4fa-8256-11e5-89a8-4799b407f502.png)

Author: Yves Raimond <yraimond@netflix.com>

Closes #14998 from moustaki/parallel-ppr.
2016-09-10 00:15:59 -07:00
src [SPARK-11496][GRAPHX] Parallel implementation of personalized pagerank 2016-09-10 00:15:59 -07:00
pom.xml [SPARK-11496][GRAPHX] Parallel implementation of personalized pagerank 2016-09-10 00:15:59 -07:00