ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Dan Crankshaw	958d7213a5	Merge branch 'master' of https://github.com/amplab/graphx	2013-11-13 23:31:14 +00:00
Reynold Xin	a81fcb749d	Merge pull request #68 from jegonzal/BitSetSetUntilBug Addressing bug in BitSet.setUntil(ind)	2013-11-13 10:41:01 -08:00
Joseph E. Gonzalez	f0ef75c7a4	Addressing bug in BitSet.setUntil(ind) where if invoked with a multiple of 64 could lead to an index out of bounds error.	2013-11-13 10:35:23 -08:00
Dan Crankshaw	d19f2e8f3e	Removed slaves from git	2013-11-12 05:21:34 +00:00
Joey	143c01dbd6	Update README.md Changing image references to master branch.	2013-11-11 19:37:16 -08:00
Reynold Xin	2e8d45032d	Merge pull request #63 from jegonzal/VertexSetCleanup Cleanup of VertexSetRDD	2013-11-11 17:34:09 -08:00
Joseph E. Gonzalez	577092080c	Cleaning up documentation of VertexSetRDD.scala	2013-11-11 17:29:22 -08:00
Reynold Xin	b8e294a21b	Merge pull request #61 from ankurdave/pid2vid Shuffle replicated vertex attributes efficiently in columnar format	2013-11-11 16:25:42 -08:00
Reynold Xin	3d7277ccbe	Merge pull request #55 from ankurdave/aggregateNeighbors-variants Specialize mapReduceTriplets for accessing subsets of vertex attributes	2013-11-11 15:49:28 -08:00
Ankur Dave	bee1015620	Handle ClassNotFoundException from ByteCodeUtils ByteCodeUtils.invokedMethod(), which we use in mapReduceTriplets, throws a ClassNotFoundException when called with a closure defined in the console. This commit catches the exception and conservatively assumes the closure references all edge attributes.	2013-11-10 23:00:37 -08:00
Dan Crankshaw	60db25bded	Fixed merge conflicts.	2013-11-10 15:45:55 -08:00
Ankur Dave	d1ff1b7222	Build pid2vid structures only once, in Vid2Pid	2013-11-10 14:47:39 -08:00
Ankur Dave	502c511711	Use pid2vid for creating VTableReplicatedValues	2013-11-10 14:36:14 -08:00
Ankur Dave	53d24a973e	Fix typo	2013-11-10 14:24:38 -08:00
Ankur Dave	aa24b0bbe8	Add test for mapReduceTriplets in GraphSuite	2013-11-10 14:24:38 -08:00
Ankur Dave	bf4e45e685	Factor out VTableReplicatedValues	2013-11-10 14:24:38 -08:00
Ankur Dave	cdbd19bbee	Create all versions of vid2pid ahead of time	2013-11-10 14:10:23 -08:00
Ankur Dave	27e4355d61	Test no vertex attribute replication	2013-11-10 14:04:12 -08:00
Ankur Dave	80abc28078	Optimize mrTriplets for source-attr-only mapF using bytecode inspection	2013-11-10 14:04:12 -08:00
Joey	1a06f707e3	Merge pull request #60 from amplab/rxin Looks good to me.	2013-11-10 10:54:44 -08:00
Reynold Xin	0e813cd483	Fix the hanging bug.	2013-11-09 23:29:37 -08:00
Reynold Xin	f6c946206a	Merge pull request #58 from jegonzal/KryoMessages Kryo messages	2013-11-09 16:14:45 -08:00
Joseph E. Gonzalez	6083e4350f	Adding unit tests to reproduce error.	2013-11-08 15:39:30 -08:00
Joseph E. Gonzalez	161784d0e6	Fixing tests	2013-11-07 20:40:21 -08:00
Joseph E. Gonzalez	e523f0d2fb	merged and debugged	2013-11-07 20:19:49 -08:00
Joseph E. Gonzalez	908e606473	Additional optimizations	2013-11-07 19:47:30 -08:00
Reynold Xin	bac7be30cd	Made more specialized messages.	2013-11-07 19:39:48 -08:00
Reynold Xin	64ad3b18d9	Merge branch 'master' into rxin Conflicts: graph/src/main/scala/org/apache/spark/graph/impl/GraphImpl.scala	2013-11-07 19:23:42 -08:00
Reynold Xin	2406bf33e4	Use custom serializer for aggregation messages when the data type is int/double.	2013-11-07 19:18:58 -08:00
Ankur Dave	6ee05be1c8	Merge pull request #49 from jegonzal/graphxshell GraphX Console with Logo Text	2013-11-07 19:12:41 -08:00
Ankur Dave	a9f96b54e4	Merge pull request #56 from jegonzal/PregelAPIChanges Changing Pregel API to use mapReduceTriplets instead of aggregateNeighbors	2013-11-07 18:56:56 -08:00
Joseph E. Gonzalez	e9308e0e75	Changing Pregel API to operate directly on edge triplets in SendMessage rather than (Vid, EdgeTriplet) pairs.	2013-11-07 18:04:06 -08:00
Reynold Xin	5907137d11	Merge pull request #54 from amplab/rxin Converted for loops to while loops in EdgePartition.	2013-11-07 16:58:31 -08:00
Reynold Xin	6fadff2b92	Converted for loops to while loops in EdgePartition.	2013-11-07 16:54:33 -08:00
Reynold Xin	edf41647f4	Merge pull request #53 from amplab/rxin Added GraphX to classpath.	2013-11-07 16:22:43 -08:00
Reynold Xin	95f1f5315e	Added GraphX to classpath.	2013-11-07 16:22:05 -08:00
Reynold Xin	c379e10455	Merge pull request #51 from jegonzal/VertexSetRDD Reverting to Array based (materialized) output in VertexSetRDD	2013-11-07 16:01:47 -08:00
Dan Crankshaw	384befb208	Merge branch 'master' of github.com:amplab/graphx	2013-11-06 19:50:55 -08:00
Joseph E. Gonzalez	8ac15e8e43	Merge branch 'master' of https://github.com/amplab/graphx into graphxshell	2013-11-05 01:37:12 -08:00
Joseph E. Gonzalez	3e504938c2	merging upstream changes	2013-11-05 01:36:48 -08:00
Joey	ca44b5134a	Merge pull request #50 from amplab/mergemerge Merge Spark master into graphx	2013-11-05 01:32:55 -08:00
Joseph E. Gonzalez	2dc9ec2387	Reverting to Array based (materialized) output of all VertexSetRDD operations.	2013-11-05 01:15:12 -08:00
Reynold Xin	551a43fd3d	Merge branch 'master' of github.com:apache/incubator-spark into mergemerge Conflicts: README.md core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala	2013-11-04 21:02:36 -08:00
Joseph E. Gonzalez	3c37928fab	This commit adds a new graphx-shell which is essentially the same as the spark shell but with GraphX packages automatically imported and with Kryo serialization enabled for GraphX types. In addition the graphx-shell has a nifty new logo. To make these changes minimally invasive in the SparkILoop.scala I added some additional environment variables: SPARK_BANNER_TEXT: If set this string is displayed instead of the spark logo SPARK_SHELL_INIT_BLOCK: if set this expression is evaluated in the spark shell after the spark context is created.	2013-11-04 20:10:15 -08:00
Reynold Xin	7a26104ab7	Merge pull request #130 from aarondav/shuffle Memory-optimized shuffle file consolidation Reduces overhead of each shuffle block for consolidation from >300 bytes to 8 bytes (1 primitive Long). Verified via profiler testing with 1 mil shuffle blocks, net overhead was ~8,400,000 bytes. Despite the memory-optimized implementation incurring extra CPU overhead, the runtime of the shuffle phase in this test was only around 2% slower, while the reduce phase was 40% faster, when compared to not using any shuffle file consolidation. This is accomplished by replacing the map from ShuffleBlockId to FileSegment (i.e., block id to where it's located), which had high overhead due to being a gigantic, timestamped, concurrent map with a more space-efficient structure. Namely, the following are introduced (I have omitted the word "Shuffle" from some names for clarity): ShuffleFile - there is one ShuffleFile per consolidated shuffle file on disk. We store an array of offsets into the physical shuffle file for each ShuffleMapTask that wrote into the file. This is sufficient to reconstruct FileSegments for mappers that are in the file. FileGroup - contains a set of ShuffleFiles, one per reducer, that a MapTask can use to write its output. There is one FileGroup created per _concurrent_ MapTask. The FileGroup contains an array of the mapIds that have been written to all files in the group. The positions of elements in this array map directly onto the positions in each ShuffleFile's offsets array. In order to locate the FileSegment associated with a BlockId, we have another structure which maps each reducer to the set of ShuffleFiles that were created for it. (There will be as many ShuffleFiles per reducer as there are FileGroups.) To look up a given ShuffleBlockId (shuffleId, reducerId, mapId), we thus search through all ShuffleFiles associated with that reducer.
As a time optimization, we ensure that FileGroups are only reused for MapTasks with monotonically increasing mapIds. This allows us to perform a binary search to locate a mapId inside a group, and also enables potential future optimization (based on the usual monotonic access order).	2013-11-04 17:54:06 -08:00
Aaron Davidson	1ba11b1c6a	Minor cleanup in ShuffleBlockManager	2013-11-04 17:16:41 -08:00
Aaron Davidson	6201e5e249	Refactor ShuffleBlockManager to reduce public interface - ShuffleBlocks has been removed and replaced by ShuffleWriterGroup. - ShuffleWriterGroup no longer contains a reference to a ShuffleFileGroup. - ShuffleFile has been removed and its contents are now within ShuffleFileGroup. - ShuffleBlockManager.forShuffle has been replaced by a more stateful forMapTask.	2013-11-04 09:41:04 -08:00
Aaron Davidson	b0cf19fe3c	Add javadoc and remove unused code	2013-11-03 22:16:58 -08:00
Aaron Davidson	39d93ed4b9	Clean up test files properly For some reason, even calling java.nio.Files.createTempDirectory().getFile.deleteOnExit() does not delete the directory on exit. Guava's analogous function seems to work, however.	2013-11-03 21:52:59 -08:00
Aaron Davidson	a0bb569a81	use OpenHashMap, remove monotonicity requirement, fix failure bug	2013-11-03 21:34:56 -08:00

4760 commits