Commit graph

124 commits

Author SHA1 Message Date
Joseph E. Gonzalez ebdbedc3e9 Documenting VertexSetRDD and added some testing code for VertexSetRDD 2013-10-19 01:26:08 -07:00
Joseph E. Gonzalez dbc8c9868a Fixing bug in VertexSetRDD that breaks Graph tests. 2013-10-18 23:44:06 -07:00
Reynold Xin 9cf43cfeb7 Merge pull request #28 from jegonzal/VertexSetRDD
Refactoring IndexedRDD to VertexSetRDD.
2013-10-18 22:07:21 -07:00
Ankur Dave 2d3603930e Add a unit test for GraphOps.joinVertices 2013-10-18 19:46:13 -07:00
Ankur Dave d15db10831 Add a unit test for Graph.mapEdges 2013-10-18 19:46:13 -07:00
Ankur Dave d429f015c0 Update GraphSuite aggregateNeighbors test 2013-10-18 19:46:13 -07:00
Joseph E. Gonzalez 5d01ebca3c Specializing IndexedRDD as VertexSetRDD.
1) This allows the index map to be optimized for Vids
2) This makes the code more readable
3) The Graph API can now return VertexSetRDDs from operations that produce results for vertices
2013-10-18 19:03:59 -07:00
Joseph E. Gonzalez bb58aa5330 Added some stub code to address the case where a vertex could occur multiple times in the vertex table or where a vertex in the edge list may not appear in the vertex table.
Moving IndexedRDD into the graphx source tree and removing dependencies in /core.
2013-10-18 18:15:32 -07:00
Ankur Dave 36a902e52d Revert accidental removal of code in 3a40a5e 2013-10-18 16:19:40 -07:00
Dan Crankshaw 3a40a5eb30 Added some documentation. 2013-10-18 15:11:21 -07:00
Joseph E. Gonzalez 3f3d28c73f Switching from Seq to IndexedSeq 2013-10-17 19:55:36 -07:00
Joseph E. Gonzalez 9a03c5fe28 This commit accomplishes three goals:
1) Further simplification of the IndexedRDD operations (eliminating some)
2) Aggressive reuse of HashMaps
3) Pipelining join operations within indexedrdd
2013-10-17 19:01:48 -07:00
Ankur Dave bf19aac2b7 Use ArrayBuilder instead of ArrayBuffer
ArrayBuilder is specialized for holding primitive VD types.
2013-10-17 13:19:00 -07:00
Ankur Dave 2282d27cf1 Cache msgsByPartition 2013-10-16 23:56:15 -07:00
Ankur Dave bc234bf0e1 Split vTableReplicated into two RDDs
Previously, (vTableReplicated: IndexedRDD[Pid, VertexHashMap[VD]])
stored one hashmap per partition, taking Vid directly to VD.

To take advantage of rxin's new hashmaps (see
rxin/incubator-spark@32a79d6d13), this
commit splits that data structure into two RDDs:

(vTableReplicationMap: IndexedRDD[Pid, VertexIdToIndexMap]) stores a map
per partition from vertex ID to the index where that vertex's attribute
is stored. This index refers to an array in the same partition in
vTableReplicatedValues.

(vTableReplicatedValues: IndexedRDD[Pid, Array[VD]]) stores the vertex
data and is arranged as described above.
2013-10-16 19:22:23 -07:00
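
The split described in commit bc234bf0e1 can be pictured with a minimal single-partition sketch. It is not the GraphX implementation: it uses plain Scala collections instead of RDDs and rxin's hash maps, and the names `SplitVertexTable`, `split`, and `lookup` are hypothetical, introduced only for illustration.

```scala
import scala.reflect.ClassTag

// Hypothetical sketch: one partition's vertex table split into
// (a) a map from vertex ID to array index and (b) a dense array of attributes.
object SplitVertexTable {
  type Vid = Long // assumption: vertex IDs are Longs

  // Turn a Vid -> attribute map into an index map plus a values array.
  def split[VD: ClassTag](table: Map[Vid, VD]): (Map[Vid, Int], Array[VD]) = {
    val entries = table.toArray
    // Position i in `entries` becomes the index stored for that vertex ID.
    val index = entries.iterator.map(_._1).zipWithIndex.toMap
    // The attribute at position i belongs to the vertex whose index is i.
    val values = entries.map(_._2)
    (index, values)
  }

  // Two-step lookup: vertex ID -> index -> attribute.
  def lookup[VD](index: Map[Vid, Int], values: Array[VD], vid: Vid): Option[VD] =
    index.get(vid).map(i => values(i))
}
```

Keeping the attributes in a separate array means the index map can be reused while only the values array is replaced, which is presumably part of the motivation for the split.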
Ankur Dave af8e461841 Set serialization properties in GraphSuite 2013-10-16 19:21:24 -07:00
Joseph E. Gonzalez 57ac9073ae Introducing unique indexedrdd and adding numerous specialized joins 2013-10-16 04:08:22 -07:00
Joseph E. Gonzalez 59700c0c2a switched to more efficient implementation of reduce by key 2013-10-16 00:18:37 -07:00
Joseph E. Gonzalez 9058f261fe Addressing issue where statistics are not computed correctly 2013-10-15 17:39:09 -07:00
Joseph E. Gonzalez 194bb03d16 Resolved closure capture issues by addressing capture through implicit variables. 2013-10-15 15:10:41 -07:00
Joseph E. Gonzalez 7241cf1632 Updating unit tests. 2013-10-15 14:18:03 -07:00
Joseph E. Gonzalez 345e1e94cc Still trying to resolve issues with capture. 2013-10-15 14:01:38 -07:00
Joseph E. Gonzalez b64337ec40 Trying to resolve issues with closure capture. 2013-10-15 13:02:17 -07:00
Joseph E. Gonzalez e7d0320000 More refactoring and documenting, including renaming data to attr for vertex and edge data and eliminating the vertex type. 2013-10-15 02:20:06 -07:00
Joseph E. Gonzalez 67bb39c54b Removing extraneous code 2013-10-14 18:49:05 -07:00
Joseph E. Gonzalez bff223454a trying to address issues with GraphImpl being caught in closures. 2013-10-13 22:27:10 -07:00
Joseph E. Gonzalez 637b67da56 merging changes from upstream benchmarking branch 2013-10-13 19:54:09 -07:00
Joseph E. Gonzalez 494472a6cc Integrated IndexedRDD into graph design. 2013-10-13 19:42:32 -07:00
Dan Crankshaw 1a961dd1f2 Fixed connected components CL params 2013-10-12 01:47:38 +00:00
Dan Crankshaw 1e5535cfcf Added connected components back 2013-10-11 16:38:52 -07:00
Dan Crankshaw 543a54dffa Tried to fix some indenting 2013-10-11 16:07:49 -07:00
Dan Crankshaw c4a23f95c3 Updated code so benchmarks actually run. 2013-10-11 22:57:43 +00:00
Joseph E. Gonzalez fa2f87ca63 added replication and balance reporting 2013-10-10 14:48:40 -07:00
Joseph E. Gonzalez 5f756fb63f added support for random vertex cuts 2013-10-10 14:10:47 -07:00
Joseph E. Gonzalez 8dfac4ea8f added support for random vertex cuts 2013-10-10 14:09:01 -07:00
Dan Crankshaw 9929e7b9a5 Merge branch 'benchmarks' of github.com:amplab/graphx 2013-10-10 13:36:51 -07:00
Dan Crankshaw 4b46d519db Merge pull request #17 from amplab/product2
product 2 change
2013-10-10 13:35:36 -07:00
Reynold Xin 5218e46178 Updated Kryo registration. 2013-10-07 11:48:50 -07:00
Reynold Xin 4f916f5302 Created a MessageToPartition class to send messages without saving the partition id. 2013-10-07 11:31:00 -07:00
Dan Crankshaw 2a8f3db94d Fixed groupEdgeTriplets - it now passes a basic unit test.
The problem was with the way the EdgeTripletRDD iterator worked. Calling
toList on it returned the last value repeatedly. Fixed by overriding
toList in the iterator.
2013-10-06 19:52:40 -07:00
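
The symptom in commit 2a8f3db94d (toList returning the last value repeatedly) is typical of an iterator that reuses a single mutable object for every element. The sketch below reproduces that pattern and one way to fix it by copying on materialisation. It assumes that object reuse was the cause, which the commit message does not state, and the classes `Triplet` and `ReusingIterator` are hypothetical stand-ins, not the actual EdgeTripletRDD code.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical mutable element, standing in for an edge triplet.
final class Triplet(var src: Long, var dst: Long)

// Iterator that overwrites and returns the same Triplet on every call to next().
class ReusingIterator(edges: Array[(Long, Long)]) extends Iterator[Triplet] {
  private val shared = new Triplet(0L, 0L)
  private var pos = 0

  def hasNext: Boolean = pos < edges.length

  def next(): Triplet = {
    val (s, d) = edges(pos)
    pos += 1
    shared.src = s
    shared.dst = d
    shared // same object each time, so stored references all alias it
  }

  // Fix: copy each element while building the list.
  override def toList: List[Triplet] = {
    val out = List.newBuilder[Triplet]
    while (hasNext) {
      val t = next()
      out += new Triplet(t.src, t.dst)
    }
    out.result()
  }
}

object ReuseDemo {
  // Collects references without copying, which exposes the aliasing problem.
  def drainWithoutCopy(it: Iterator[Triplet]): List[Triplet] = {
    val buf = ListBuffer.empty[Triplet]
    while (it.hasNext) buf += it.next()
    buf.toList
  }
}
```

With `drainWithoutCopy`, every list entry points at the one shared object and therefore shows the last edge; the overridden `toList` returns distinct copies.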
Dan Crankshaw 0d3ea36fd8 Added a groupEdges and a groupEdgeTriplets method. For some reason the groupEdgeTriplets method isn't properly iterating through the set of edges and thus is returning the wrong result. groupEdges seems to be working. 2013-10-06 18:34:23 -07:00
Dan Crankshaw 6cb21ce889 groupEdges() now compiles. Still need some unit tests 2013-10-06 15:33:35 -07:00
Dan Crankshaw 730a3156d3 Added initial groupEdges code. Still a prototype, I haven't figured out quite how it should all work yet. 2013-10-05 19:44:28 -07:00
Dan Crankshaw bfedbee13a Edge partitioner now partitions by canonical edge so all edges between two vertices (in either direction) will be sent to same machine. 2013-10-05 16:04:57 -07:00
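
The idea behind canonical edge partitioning in commit bfedbee13a can be sketched as follows: order each edge's endpoints before hashing, so that (a, b) and (b, a) map to the same partition. This is an illustrative 1D version under assumed names (`CanonicalEdgePartition`, `canonical`, `partitionOf`), not the partitioner added in the commit.

```scala
// Hypothetical sketch of partitioning by canonical (direction-independent) edge.
object CanonicalEdgePartition {
  type Vid = Long // assumption: vertex IDs are Longs

  // Put the smaller vertex ID first so both directions share one key.
  def canonical(src: Vid, dst: Vid): (Vid, Vid) =
    if (src <= dst) (src, dst) else (dst, src)

  // Hash the canonical pair and fold it into the range [0, numParts).
  def partitionOf(src: Vid, dst: Vid, numParts: Int): Int = {
    val h = canonical(src, dst).hashCode
    ((h % numParts) + numParts) % numParts // keep the result non-negative
  }
}
```

Because the key ignores direction, all edges between a given pair of vertices land in the same partition regardless of which endpoint is the source.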
Dan Crankshaw e096cbe90e Added 2D canonical edge partitioner 2013-10-05 15:20:15 -07:00
Dan Crankshaw da3e123afb Removed some comments 2013-10-03 18:11:35 -07:00
Dan Crankshaw 1ee60d3b34 Fixed bug in sampleLogNormal 2013-10-03 17:46:37 -07:00
Dan Crankshaw 27b442dc06 Fixed annotation import 2013-10-03 10:29:00 -07:00
Dan Crankshaw 8edd499eff Added rmat graph generator 2013-10-03 10:21:34 -07:00
Dan Crankshaw 3c3cc1508b Added initial implementation of lognormal graph generator. Haven't tested it yet. 2013-09-28 16:00:44 -07:00