ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Reynold Xin	7a26104ab7	Merge pull request #130 from aarondav/shuffle Memory-optimized shuffle file consolidation Reduces overhead of each shuffle block for consolidation from >300 bytes to 8 bytes (1 primitive Long). Verified via profiler testing with 1 mil shuffle blocks, net overhead was ~8,400,000 bytes. Despite the memory-optimized implementation incurring extra CPU overhead, the runtime of the shuffle phase in this test was only around 2% slower, while the reduce phase was 40% faster, when compared to not using any shuffle file consolidation. This is accomplished by replacing the map from ShuffleBlockId to FileSegment (i.e., block id to where it's located), which had high overhead due to being a gigantic, timestamped, concurrent map with a more space-efficient structure. Namely, the following are introduced (I have omitted the word "Shuffle" from some names for clarity): ShuffleFile - there is one ShuffleFile per consolidated shuffle file on disk. We store an array of offsets into the physical shuffle file for each ShuffleMapTask that wrote into the file. This is sufficient to reconstruct FileSegments for mappers that are in the file. FileGroup - contains a set of ShuffleFiles, one per reducer, that a MapTask can use to write its output. There is one FileGroup created per _concurrent_ MapTask. The FileGroup contains an array of the mapIds that have been written to all files in the group. The positions of elements in this array map directly onto the positions in each ShuffleFile's offsets array. In order to locate the FileSegment associated with a BlockId, we have another structure which maps each reducer to the set of ShuffleFiles that were created for it. (There will be as many ShuffleFiles per reducer as there are FileGroups.) To lookup a given ShuffleBlockId (shuffleId, reducerId, mapId), we thus search through all ShuffleFiles associated with that reducer. As a time optimization, we ensure that FileGroups are only reused for MapTasks with monotonically increasing mapIds. This allows us to perform a binary search to locate a mapId inside a group, and also enables potential future optimization (based on the usual monotonic access order).	2013-11-04 17:54:06 -08:00
Aaron Davidson	1ba11b1c6a	Minor cleanup in ShuffleBlockManager	2013-11-04 17:16:41 -08:00
tgravescs	a35472e1dd	Allow spark on yarn to be run from HDFS. Allows the spark.jar, app.jar, and log4j.properties to be put into hdfs.	2013-11-04 16:16:28 -06:00
Aaron Davidson	6201e5e249	Refactor ShuffleBlockManager to reduce public interface - ShuffleBlocks has been removed and replaced by ShuffleWriterGroup. - ShuffleWriterGroup no longer contains a reference to a ShuffleFileGroup. - ShuffleFile has been removed and its contents are now within ShuffleFileGroup. - ShuffleBlockManager.forShuffle has been replaced by a more stateful forMapTask.	2013-11-04 09:41:04 -08:00
Aaron Davidson	b0cf19fe3c	Add javadoc and remove unused code	2013-11-03 22:16:58 -08:00
Aaron Davidson	39d93ed4b9	Clean up test files properly For some reason, even calling java.nio.Files.createTempDirectory().getFile.deleteOnExit() does not delete the directory on exit. Guava's analagous function seems to work, however.	2013-11-03 21:52:59 -08:00
Aaron Davidson	a0bb569a81	use OpenHashMap, remove monotonicity requirement, fix failure bug	2013-11-03 21:34:56 -08:00
Aaron Davidson	8703898d3f	Address Reynold's comments	2013-11-03 21:34:44 -08:00
Aaron Davidson	3ca52309f2	Fix test breakage	2013-11-03 21:34:44 -08:00
Aaron Davidson	1592adfa25	Add documentation and address other comments	2013-11-03 21:34:44 -08:00
Aaron Davidson	7d44dec9bd	Fix weird bug with specialized PrimitiveVector	2013-11-03 21:34:43 -08:00
Aaron Davidson	7453f31181	Address minor comments	2013-11-03 21:34:43 -08:00
Aaron Davidson	84991a1b91	Memory-optimized shuffle file consolidation Overhead of each shuffle block for consolidation has been reduced from >300 bytes to 8 bytes (1 primitive Long). Verified via profiler testing with 1 mil shuffle blocks, net overhead was ~8,400,000 bytes. Despite the memory-optimized implementation incurring extra CPU overhead, the runtime of the shuffle phase in this test was only around 2% slower, while the reduce phase was 40% faster, when compared to not using any shuffle file consolidation.	2013-11-03 21:34:13 -08:00
Reynold Xin	b5dc3393a5	Merge pull request #70 from rxin/hash1 Fast, memory-efficient hash set, hash table implementations optimized for primitive data types. This pull request adds two hash table implementations optimized for primitive data types. For primitive types, the new hash tables are much faster than the current Spark AppendOnlyMap (3X faster - note that the current AppendOnlyMap is already much better than the Java map) while uses much less space (1/4 of the space). Details: This PR first adds a open hash set implementation (OpenHashSet) optimized for primitive types (using Scala's specialization feature). This OpenHashSet is designed to serve as building blocks for more advanced structures. It is currently used to build the following two hash tables, but can be used in the future to build multi-valued hash tables as well (GraphX has this use case). Note that there are some peculiarities in the code for working around some Scala compiler bugs. Building on top of OpenHashSet, this PR adds two different hash tables implementations: 1. OpenHashSet: for nullable keys, optional specialization for primitive values 2. PrimitiveKeyOpenHashMap: for primitive keys that are not nullable, and optional specialization for primitive values I tested the update speed of these two implementations using the changeValue function (which is what Aggregator and cogroup would use). Runtime relative to AppendOnlyMap for inserting 10 million items: Int to Int: ~30% java.lang.Integer to java.lang.Integer: ~100% Int to java.lang.Integer: ~50% java.lang.Integer to Int: ~85%	2013-11-03 20:43:15 -08:00
Reynold Xin	eb5f8a3f97	Code review feedback.	2013-11-03 18:11:44 -08:00
Reynold Xin	1e9543b567	Fixed a bug that uses twice amount of memory for the primitive arrays due to a scala compiler bug. Also addressed Matei's code review comment.	2013-11-02 23:19:01 -07:00
Reynold Xin	da6bb0aedd	Merge branch 'master' into hash1	2013-11-02 22:45:15 -07:00
Reynold Xin	41ead7a745	Merge pull request #133 from Mistobaan/link_fix update default github	2013-11-02 14:41:50 -07:00
Reynold Xin	d407c0732a	Merge pull request #134 from rxin/readme Fixed a typo in Hadoop version in README.	2013-11-02 14:36:37 -07:00
Reynold Xin	895747bb05	Fixed a typo in Hadoop version in README.	2013-11-02 12:58:44 -07:00
Fabrizio (Misto) Milo	4b5d61f31f	update default github	2013-11-01 18:41:49 -07:00
Reynold Xin	e7c7b804b5	Merge pull request #132 from Mistobaan/doc_fix fix persistent-hdfs	2013-11-01 17:58:10 -07:00
Fabrizio (Misto) Milo	3f89354c45	fix persistent-hdfs	2013-11-01 17:47:37 -07:00
Matei Zaharia	d6d11c2edb	Merge pull request #129 from velvia/2013-11/document-local-uris Document & finish support for local: URIs Review all the supported URI schemes for addJar / addFile to the Cluster Overview page. Add support for local: URI to addFile.	2013-11-01 15:40:33 -07:00
Evan Chan	f3679fd494	Add local: URI support to addFile as well	2013-11-01 11:08:03 -07:00
Evan Chan	e54a37fe15	Document all the URIs for addJar/addFile	2013-11-01 10:58:11 -07:00
Kay Ousterhout	fb64828b0b	Cleaned up imports and fixed test bug	2013-10-31 23:42:56 -07:00
Matei Zaharia	8f1098a3f0	Merge pull request #117 from stephenh/avoid_concurrent_modification_exception Handle ConcurrentModificationExceptions in SparkContext init. System.getProperties.toMap will fail-fast when concurrently modified, and it seems like some other thread started by SparkContext does a System.setProperty during it's initialization. Handle this by just looping on ConcurrentModificationException, which seems the safest, since the non-fail-fast methods (Hastable.entrySet) have undefined behavior under concurrent modification.	2013-10-30 20:11:48 -07:00
Kay Ousterhout	a124658e53	Fixed most issues with unit tests	2013-10-30 19:29:38 -07:00
Kay Ousterhout	5e91495f5c	Deduplicate Local and Cluster schedulers. The code in LocalScheduler/LocalTaskSetManager was nearly identical to the code in ClusterScheduler/ClusterTaskSetManager. The redundancy made making updating the schedulers unnecessarily painful and error- prone. This commit combines the two into a single TaskScheduler/ TaskSetManager.	2013-10-30 18:48:34 -07:00
Matei Zaharia	dc9ce16f6b	Merge pull request #126 from kayousterhout/local_fix Fixed incorrect log message in local scheduler This change is especially relevant at the moment, because some users are seeing this failure, and the log message is misleading/incorrect (because for the tests, the max failures is set to 0, not 4)	2013-10-30 17:01:56 -07:00
Matei Zaharia	33de11c51d	Merge pull request #124 from tgravescs/sparkHadoopUtilFix Pull SparkHadoopUtil out of SparkEnv (jira SPARK-886) Having the logic to initialize the correct SparkHadoopUtil in SparkEnv prevents it from being used until after the SparkContext is initialized. This causes issues like https://spark-project.atlassian.net/browse/SPARK-886. It also makes it hard to use in singleton objects. For instance I want to use it in the security code.	2013-10-30 16:58:27 -07:00
Kay Ousterhout	ff038eb4e0	Fixed incorrect log message in local scheduler	2013-10-30 15:27:23 -07:00
Matei Zaharia	618c1f6cf3	Merge pull request #125 from velvia/2013-10/local-jar-uri Add support for local:// URI scheme for addJars() This PR adds support for a new URI scheme for SparkContext.addJars(): `local://file/path`. The local scheme indicates that the `/file/path` exists on every worker node. The reason for its existence is for big library JARs, which would be really expensive to serve using the standard HTTP fileserver distribution method, especially for big clusters. Today the only inexpensive method (assuming such a file is on every host, via say NFS, rsync, etc.) of doing this is to add the JAR to the SPARK_CLASSPATH, but we want a method where the user does not need to modify the Spark configuration. I would add something to the docs, but it's not obvious where to add it. Oh, and it would be great if this could be merged in time for 0.8.1.	2013-10-30 12:03:44 -07:00
Stephen Haberman	09f3b677cb	Avoid match errors when filtering for spark.hadoop settings.	2013-10-30 12:29:39 -05:00
tgravescs	f231aaa24c	move the hadoopJobMetadata back into SparkEnv	2013-10-30 11:46:12 -05:00
Evan Chan	de0285556a	Add support for local:// URI scheme for addJars() This indicates that a jar is available locally on each worker node.	2013-10-30 09:41:35 -07:00
tgravescs	54d9c6f253	Merge remote-tracking branch 'upstream/master' into sparkHadoopUtilFix	2013-10-30 10:41:21 -05:00
Matei Zaharia	745dc42908	Merge pull request #118 from JoshRosen/blockinfo-memory-usage Reduce the memory footprint of BlockInfo objects This pull request reduces the memory footprint of all BlockInfo objects and makes additional optimizations for shuffle blocks. For all BlockInfo objects, these changes remove two boolean fields and one Object field. For shuffle blocks, we additionally remove an Object field and a boolean field. When storing tens of thousands of these objects, this may add up to significant memory savings. A ShuffleBlockInfo now only needs to wrap a single long. This was motivated by a [report of high blockInfo memory usage during shuffles](https://mail-archives.apache.org/mod_mbox/incubator-spark-user/201310.mbox/%3C20131026134353.202b2b9b%40sh9%3E). I haven't run benchmarks to measure the exact memory savings. /cc @aarondav	2013-10-29 23:47:10 -07:00
tgravescs	e5e0ebdb11	fix sparkhdfs lr test	2013-10-29 20:12:45 -05:00
Josh Rosen	cb9c8a922f	Extract BlockInfo classes from BlockManager. This saves space, since the inner classes needed to keep a reference to the enclosing BlockManager.	2013-10-29 18:06:51 -07:00
Stephen Haberman	3a388c320c	Use Properties.clone() instead.	2013-10-29 19:20:40 -05:00
Josh Rosen	846b1cf5ab	Store fewer BlockInfo fields for shuffle blocks.	2013-10-29 15:14:29 -07:00
tgravescs	eeb5f64c67	Remove SparkHadoopUtil stuff from SparkEnv	2013-10-29 17:12:16 -05:00
Reynold Xin	f0e23a023c	Merge pull request #119 from soulmachine/master A little revise for the document	2013-10-29 01:41:44 -04:00
soulmachine	a197137fde	A little revise for the document	2013-10-29 00:28:56 +08:00
Josh Rosen	2d7cf6a271	Restructure BlockInfo fields to reduce memory use.	2013-10-27 23:01:03 -07:00
Matei Zaharia	aec9bf9060	Merge pull request #112 from kayousterhout/ui_task_attempt_id Display both task ID and task attempt ID in UI, and rename taskId to taskAttemptId Previously only the task attempt ID was shown in the UI; this was confusing because the job can be shown as complete while there are tasks still running. Showing the task ID in addition to the attempt ID makes it clear which tasks are redundant. This commit also renames taskId to taskAttemptId in TaskInfo and in the local/cluster schedulers. This identifier was used to uniquely identify attempts, not tasks, so the current naming was confusing. The new naming is also more consistent with map reduce.	2013-10-27 19:32:00 -07:00
Reynold Xin	d4df4749a8	Merge pull request #115 from aarondav/shuffle-fix Eliminate extra memory usage when shuffle file consolidation is disabled Otherwise, we see SPARK-946 even when shuffle file consolidation is disabled. Fixing SPARK-946 is still forthcoming.	2013-10-27 22:11:21 -04:00
Stephen Haberman	a6ae2b4832	Handle ConcurrentModificationExceptions in SparkContext init. System.getProperties.toMap will fail-fast when concurrently modified, and it seems like some other thread started by SparkContext does a System.setProperty during it's initialization. Handle this by just looping on ConcurrentModificationException, which seems the safest, since the non-fail-fast methods (Hastable.entrySet) have undefined behavior under concurrent modification.	2013-10-27 14:08:32 -05:00

1 2 3 4 5 ...

4526 commits