f377431a57
Add single pseudo-eigenvector PIC
Including documentations and updated pom.xml with the following codes:
mllib/src/main/scala/org/apache/spark/mllib/clustering/PIClustering.scala
mllib/src/test/scala/org/apache/spark/mllib/clustering/PIClusteringSuite.scala

Author: sboeschhuawei <stephen.boesch@huawei.com>
Author: Fan Jiang <fanjiang.sc@huawei.com>
Author: Jiang Fan <fjiang6@gmail.com>
Author: Stephen Boesch <stephen.boesch@huawei.com>
Author: Xiangrui Meng <meng@databricks.com>

Closes #4254 from fjiang6/PIC and squashes the following commits:

4550850 [sboeschhuawei] Removed pic test data
f292f31 [Stephen Boesch] Merge pull request #44 from mengxr/SPARK-4259
4b78aaf [Xiangrui Meng] refactor PIC
24fbf52 [sboeschhuawei] Updated API to be similar to KMeans plus other changes requested by Xiangrui on the PR
c12dfc8 [sboeschhuawei] Removed examples files and added pic_data.txt. Revamped testcases yet to come
92d4752 [sboeschhuawei] Move the Guassian/ Affinity matrix calcs out of PIC. Presently in the test suite
7ebd149 [sboeschhuawei] Incorporate Xiangrui's first set of PR comments except restructure PIC.run to take Graph but do not remove Gaussian
121e4d5 [sboeschhuawei] Remove unused testing data files
1c3a62e [sboeschhuawei] removed matplot.py and reordered all private methods to bottom of PIC
218a49d [sboeschhuawei] Applied Xiangrui's comments - especially removing RDD/PICLinalg classes and making noncritical methods private
43ab10b [sboeschhuawei] Change last two println's to log4j logger
88aacc8 [sboeschhuawei] Add assert to testcase on cluster sizes
24f438e [sboeschhuawei] fixed incorrect markdown in clustering doc
060e6bf [sboeschhuawei] Added link to PIC doc from the main clustering md doc
be659e3 [sboeschhuawei] Added mllib specific log4j
90e7fa4 [sboeschhuawei] Converted from custom Linalg routines to Breeze: added JavaDoc comments; added Markdown documentation
bea48ea [sboeschhuawei] Converted custom Linear Algebra datatypes/routines to use Breeze.
b29c0db [Fan Jiang] Update PIClustering.scala
ace9749 [Fan Jiang] Update PIClustering.scala
a112f38 [sboeschhuawei] Added graphx main and test jars as dependencies to mllib/pom.xml
f656c34 [sboeschhuawei] Added iris dataset
b7dbcbe [sboeschhuawei] Added axes and combined into single plot for matplotlib
a2b1e57 [sboeschhuawei] Revert inadvertent update to KMeans
9294263 [sboeschhuawei] Added visualization/plotting of input/output data
e5df2b8 [sboeschhuawei] First end to end working PIC
0700335 [sboeschhuawei] First end to end working version: but has bad performance issue
32a90dc [sboeschhuawei] Update circles test data values
0ef163f [sboeschhuawei] Added ConcentricCircles data generation and KMeans clustering
3fd5bc8 [sboeschhuawei] PIClustering is running in new branch (up to the pseudo-eigenvector convergence step)
d5aae20 [Jiang Fan] Adding Power Iteration Clustering and Suite test
a3c5fbe [Jiang Fan] Adding Power Iteration Clustering

Name | ||
---|---|---|
cluster-overview.png | ||
cluster-overview.pptx | ||
edge-cut.png | ||
edge_cut_vs_vertex_cut.png | ||
glyphicons-halflings-white.png | ||
glyphicons-halflings.png | ||
graph_parallel.png | ||
graphx_figures.pptx | ||
graphx_logo.png | ||
graphx_performance_comparison.png | ||
incubator-logo.png | ||
java-sm.png | ||
ml-Pipeline.png | ||
ml-PipelineModel.png | ||
ml-Pipelines.pptx | ||
PIClusteringFiveCirclesInputsAndOutputs.png | ||
property_graph.png | ||
python-sm.png | ||
scala-sm.png | ||
spark-logo-77x40px-hd.png | ||
spark-logo-77x50px-hd.png | ||
spark-logo-100x40px.png | ||
spark-logo-hd.png | ||
streaming-arch.png | ||
streaming-dstream-ops.png | ||
streaming-dstream-window.png | ||
streaming-dstream.png | ||
streaming-figures.pptx | ||
streaming-flow.png | ||
streaming-kinesis-arch.png | ||
triplet.png | ||
vertex-cut.png | ||
vertex_routing_edge_tables.png |