Commit graph

209 commits

Author SHA1 Message Date
Dan Crankshaw c430d2e21d Added bitset to kryo register 2013-10-31 01:01:59 -07:00
Dan Crankshaw 37b4afbbf9 Merge branch 'cleanup' 2013-10-30 23:17:50 -07:00
Joseph E. Gonzalez a3ce484a2c Adding additional type constraints to VertexSetRDD to help diagnose issues with recent benchmarks. 2013-10-30 21:02:21 -07:00
Joseph E. Gonzalez 09ea661bbb removing completely unnecessary map operation. 2013-10-30 20:07:26 -07:00
Joseph E. Gonzalez 003f8a505d Removing potential additional shuffle dependency where an already partitioned RDD[(Vid, VD)] is repartitioned. 2013-10-30 20:06:54 -07:00
Joseph E. Gonzalez d513addb77 added lineage tracking code 2013-10-30 20:05:29 -07:00
Joseph E. Gonzalez a4b8ddf417 removing unused commented code 2013-10-30 16:07:05 -07:00
Dan Crankshaw a0c86c3689 Merge pull request #38 from jegonzal/Documentation
Improving Documentation
2013-10-30 15:34:39 -07:00
Dan Crankshaw e1099f4d89 Fixed issue with canonical edge partitioner. 2013-10-30 15:03:21 -07:00
Joey 06adf636c5 Merge pull request #33 from kellrott/master
Fixing graph/pom.xml
2013-10-29 16:43:46 -07:00
Joseph E. Gonzalez 38ec0baf5c fixing a typo in the VertexSetRDD docs 2013-10-29 16:27:55 -07:00
Joseph E. Gonzalez d8c8256e52 merging upstream changes 2013-10-29 16:23:26 -07:00
Joseph E. Gonzalez 08c7b040d6 Documented the VertexSetRDD 2013-10-29 15:03:13 -07:00
Joseph E. Gonzalez ede329336d Fixing a scaladoc bug in graph generators. 2013-10-29 14:50:12 -07:00
Joseph E. Gonzalez 15958ca65a Reindenting documentation. 2013-10-29 14:01:24 -07:00
Joseph E. Gonzalez d316cad9b1 Documented Graph.apply functions. 2013-10-29 13:58:04 -07:00
Joseph E. Gonzalez 19da8820fc Minor modifications to documentation. 2013-10-29 11:06:06 -07:00
Joseph E. Gonzalez 77626d1507 Adding collect neighbors and documenting GraphOps. 2013-10-29 11:05:42 -07:00
Joseph E. Gonzalez 942de98433 Making suggested changes. 2013-10-29 10:19:49 -07:00
Joseph E. Gonzalez d6a902f309 Finished updating connected components to use a Pregel-like abstraction and created a series of tests in the AnalyticsSuite. 2013-10-28 11:52:26 -07:00
Joseph E. Gonzalez a2287ae138 Implementing connected components on top of pregel like abstraction. 2013-10-27 10:42:11 -07:00
Joseph E. Gonzalez 6a0fbc0374 Updating the GraphLab API to match the changes made to the Pregel API. 2013-10-26 15:44:19 -07:00
Joseph E. Gonzalez 08024c938c Adding more documentation to the Pregel API as well as additional functionality including the ability to specify the edge direction along which messages are computed. 2013-10-26 15:42:51 -07:00
Joseph E. Gonzalez 00e73833cc Fixing a bug in reverse edge direction. 2013-10-26 15:10:30 -07:00
Kyle Ellrott 8236d5dcc4 More changes to the graph/pom.xml to make it match the other subprojects 2013-10-25 15:52:44 -07:00
Kyle Ellrott d39ac2eb40 Merge https://github.com/amplab/graphx 2013-10-25 13:16:05 -07:00
Kyle Ellrott 59ec6b85d0 Merge branch 'master' of https://github.com/amplab/graphx 2013-10-24 10:29:24 -07:00
Joseph E. Gonzalez c30624dcbb Adding dynamic pregel, fixing bugs in PageRank, and adding basic analytics unit tests. 2013-10-23 00:25:45 -07:00
Joseph E. Gonzalez 0bd92ed8d0 Fixing a bug in pregel where the initial vertex-program results are lost. 2013-10-22 19:10:51 -07:00
Joseph E. Gonzalez be8269af07 Merge branch 'VertexSetRDD_Tests' into AnalyticsCleanup 2013-10-22 15:03:49 -07:00
Joseph E. Gonzalez e3eb03d5b5 Starting analytics test suite. 2013-10-22 15:03:16 -07:00
Joseph E. Gonzalez ba5c75692a Updating analytics to reflect changes in the pregel interface and moving degree information into the edge attribute. 2013-10-22 15:03:00 -07:00
Joseph E. Gonzalez 46b195253e Adding some additional graph generators to support unit testing of the analytics package. 2013-10-22 15:01:49 -07:00
Joseph E. Gonzalez 14a3329a11 Changing the Pregel interface slightly to better support type inference. 2013-10-22 15:01:20 -07:00
Kyle Ellrott 73bf8587e2 Fixing graph/pom.xml 2013-10-21 15:13:31 -07:00
Joseph E. Gonzalez ebdbedc3e9 Documenting VertexSetRDD and added some testing code for VertexSetRDD 2013-10-19 01:26:08 -07:00
Joseph E. Gonzalez dbc8c9868a Fixing bug in VertexSetRDD that breaks Graph tests. 2013-10-18 23:44:06 -07:00
Reynold Xin 9cf43cfeb7 Merge pull request #28 from jegonzal/VertexSetRDD
Refactoring IndexedRDD to VertexSetRDD.
2013-10-18 22:07:21 -07:00
Ankur Dave 2d3603930e Add a unit test for GraphOps.joinVertices 2013-10-18 19:46:13 -07:00
Ankur Dave d15db10831 Add a unit test for Graph.mapEdges 2013-10-18 19:46:13 -07:00
Ankur Dave d429f015c0 Update GraphSuite aggregateNeighbors test 2013-10-18 19:46:13 -07:00
Joseph E. Gonzalez 5d01ebca3c Specializing IndexedRDD as VertexSetRDD.
1) This allows the index map to be optimized for Vids
2) This makes the code more readable
3) The Graph API can now return VertexSetRDDs from operations that produce results for vertices
2013-10-18 19:03:59 -07:00
Joseph E. Gonzalez bb58aa5330 Added some stub code to address the case where a vertex could occur multiple times in the vertex table or where a vertex in the edge list may not appear in the vertex table.
Moving IndexedRDD into the graphx source tree and removing dependencies in /core.
2013-10-18 18:15:32 -07:00
Ankur Dave 36a902e52d Revert accidental removal of code in 3a40a5e 2013-10-18 16:19:40 -07:00
Dan Crankshaw 3a40a5eb30 Added some documentation. 2013-10-18 15:11:21 -07:00
Joseph E. Gonzalez 3f3d28c73f Switching from Seq to IndexedSeq 2013-10-17 19:55:36 -07:00
Joseph E. Gonzalez 9a03c5fe28 This commit accomplishes three goals:
1) Further simplification of the IndexedRDD operations (eliminating some)
2) Aggressive reuse of HashMaps
3) Pipelining join operations within indexedrdd
2013-10-17 19:01:48 -07:00
Ankur Dave bf19aac2b7 Use ArrayBuilder instead of ArrayBuffer
ArrayBuilder is specialized for holding primitive VD types.
2013-10-17 13:19:00 -07:00
Ankur Dave 2282d27cf1 Cache msgsByPartition 2013-10-16 23:56:15 -07:00
Ankur Dave bc234bf0e1 Split vTableReplicated into two RDDs
Previously, (vTableReplicated: IndexedRDD[Pid, VertexHashMap[VD]])
stored one hashmap per partition, taking Vid directly to VD.

To take advantage of rxin's new hashmaps (see
rxin/incubator-spark@32a79d6d13), this
commit splits that data structure into two RDDs:

(vTableReplicationMap: IndexedRDD[Pid, VertexIdToIndexMap]) stores a map
per partition from vertex ID to the index where that vertex's attribute
is stored. This index refers to an array in the same partition in
vTableReplicatedValues.

(vTableReplicatedValues: IndexedRDD[Pid, Array[VD]]) stores the vertex
data and is arranged as described above.
2013-10-16 19:22:23 -07:00
Ankur Dave af8e461841 Set serialization properties in GraphSuite 2013-10-16 19:21:24 -07:00
Joseph E. Gonzalez 57ac9073ae Introducing unique indexedrdd and adding numerous specialized joins 2013-10-16 04:08:22 -07:00
Joseph E. Gonzalez 59700c0c2a switched to a more efficient implementation of reduce by key 2013-10-16 00:18:37 -07:00
Joseph E. Gonzalez 9058f261fe Addressing issue where statistics are not computed correctly 2013-10-15 17:39:09 -07:00
Joseph E. Gonzalez 194bb03d16 Resolved closure capture issues by addressing capture through implicit variables. 2013-10-15 15:10:41 -07:00
Joseph E. Gonzalez 7241cf1632 Updating unit tests. 2013-10-15 14:18:03 -07:00
Joseph E. Gonzalez 345e1e94cc Still trying to resolve issues with capture. 2013-10-15 14:01:38 -07:00
Joseph E. Gonzalez b64337ec40 Trying to resolve issues with closure capture. 2013-10-15 13:02:17 -07:00
Joseph E. Gonzalez e7d0320000 More refactoring and documenting, including renaming data to attr for vertex and edge data and eliminating the vertex type. 2013-10-15 02:20:06 -07:00
Joseph E. Gonzalez 67bb39c54b Removing extraneous code 2013-10-14 18:49:05 -07:00
Joseph E. Gonzalez bff223454a trying to address issues with GraphImpl being caught in closures. 2013-10-13 22:27:10 -07:00
Joseph E. Gonzalez 637b67da56 merging changes from upstream benchmarking branch 2013-10-13 19:54:09 -07:00
Joseph E. Gonzalez 494472a6cc Integrated IndexedRDD into graph design. 2013-10-13 19:42:32 -07:00
Dan Crankshaw 1a961dd1f2 Fixed connected components CL params 2013-10-12 01:47:38 +00:00
Dan Crankshaw 1e5535cfcf Added connected components back 2013-10-11 16:38:52 -07:00
Dan Crankshaw 543a54dffa Tried to fix some indenting 2013-10-11 16:07:49 -07:00
Dan Crankshaw c4a23f95c3 Updated code so benchmarks actually run. 2013-10-11 22:57:43 +00:00
Joseph E. Gonzalez fa2f87ca63 added replication and balance reporting 2013-10-10 14:48:40 -07:00
Joseph E. Gonzalez 5f756fb63f added support for random vertex cuts 2013-10-10 14:10:47 -07:00
Joseph E. Gonzalez 8dfac4ea8f added support for random vertex cuts 2013-10-10 14:09:01 -07:00
Dan Crankshaw 9929e7b9a5 Merge branch 'benchmarks' of github.com:amplab/graphx 2013-10-10 13:36:51 -07:00
Dan Crankshaw 4b46d519db Merge pull request #17 from amplab/product2
product 2 change
2013-10-10 13:35:36 -07:00
Reynold Xin 5218e46178 Updated Kryo registration. 2013-10-07 11:48:50 -07:00
Reynold Xin 4f916f5302 Created a MessageToPartition class to send messages without saving the partition id. 2013-10-07 11:31:00 -07:00
Dan Crankshaw 2a8f3db94d Fixed groupEdgeTriplets - it now passes a basic unit test.
The problem was with the way the EdgeTripletRDD iterator worked. Calling
toList on it returned the last value repeatedly. Fixed by overriding
toList in the iterator.
2013-10-06 19:52:40 -07:00
Dan Crankshaw 0d3ea36fd8 Added a groupEdges and a groupEdgeTriplets method. For some reason the groupEdgeTriplets method isn't properly iterating through the set of edges and thus is returning the wrong result. groupEdges seems to be working. 2013-10-06 18:34:23 -07:00
Dan Crankshaw 6cb21ce889 groupEdges() now compiles. Still need some unit tests 2013-10-06 15:33:35 -07:00
Dan Crankshaw 730a3156d3 Added initial groupEdges code. Still a prototype, I haven't figured out quite how it should all work yet. 2013-10-05 19:44:28 -07:00
Dan Crankshaw bfedbee13a Edge partitioner now partitions by canonical edge so all edges between two vertices (in either direction) will be sent to same machine. 2013-10-05 16:04:57 -07:00
Dan Crankshaw e096cbe90e Added 2D canonical edge partitioner 2013-10-05 15:20:15 -07:00
Dan Crankshaw da3e123afb Removed some comments 2013-10-03 18:11:35 -07:00
Dan Crankshaw 1ee60d3b34 Fixed bug in sampleLogNormal 2013-10-03 17:46:37 -07:00
Dan Crankshaw 27b442dc06 Fixed annotation import 2013-10-03 10:29:00 -07:00
Dan Crankshaw 8edd499eff Added rmat graph generator 2013-10-03 10:21:34 -07:00
Dan Crankshaw 3c3cc1508b Added initial implementation of lognormal graph generator. Haven't tested it yet. 2013-09-28 16:00:44 -07:00
Ankur Dave bf05dc7e78 Add a unit test for aggregateNeighbors 2013-09-19 23:45:15 -07:00
Ankur Dave 7cadeffdf4 Merge branch 'master' into aggregateNeighbors-returns-graph 2013-09-19 23:14:26 -07:00
Ankur Dave f08e520f4c Initialize sc in GraphSuite to avoid NullPointerException 2013-09-19 23:12:24 -07:00
Ankur Dave f02d5c8c53 Fix typo in aggregateNeighbors docs 2013-09-19 23:06:37 -07:00
Ankur Dave d3cbde0085 Import appropriate Spark core classes 2013-09-19 19:29:58 -07:00
Ankur Dave c278907bf0 Move BytecodeUtils to the right package 2013-09-19 19:28:22 -07:00
Ankur Dave 4c694bd705 Move IndexedRDD and GraphSuite to org.apache.spark 2013-09-19 19:13:07 -07:00
Ankur Dave 4e967af6af Return Graph from default aggregateNeighbors also 2013-09-18 16:18:33 -07:00
Ankur Dave b04f1a4019 Implement aggregateNeighbors returning Graph 2013-09-18 16:18:33 -07:00
Ankur Dave 9ff783599b Return Graph from aggregateNeighbors; update callers
This commit only affects the Graph API, not GraphImpl.
2013-09-18 16:18:33 -07:00
Joseph E. Gonzalez 55696e2584 GraphX now builds with all merged changes. 2013-09-17 22:42:12 -07:00
Joseph E. Gonzalez 5ccb60d467 Working on graph test suite 2013-08-11 14:49:22 -07:00
Joseph E. Gonzalez ddf126edad added subgraph 2013-08-06 17:48:04 -07:00
Joseph E. Gonzalez b454314e07 Added 2d partitioning 2013-08-06 15:14:13 -07:00
Joseph E. Gonzalez 7ae83f6ef4 Switching to Long vids instead of integers. This required a surprising number of changes since the fastutil library function names include the type (e.g., getLong() instead of just get()) 2013-08-06 14:05:54 -07:00