ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Ankur Dave	11dd35c28b	Clean up GraphGenerators	2014-01-10 15:23:32 -08:00
Ankur Dave	9e48af6dba	Remove unused HashUtils class	2014-01-10 15:22:57 -08:00
Ankur Dave	b437ed62a8	graph -> graphx in pom.xml	2014-01-10 15:22:31 -08:00
Andrew Or	e4c51d2113	Address Patrick's and Reynold's comments Aside from trivial formatting changes, use nulls instead of Options for DiskMapIterator, and add documentation for spark.shuffle.externalSorting and spark.shuffle.memoryFraction. Also, set spark.shuffle.memoryFraction to 0.3, and spark.storage.memoryFraction = 0.6.	2014-01-10 15:09:51 -08:00
RongGu	94776f753f	fix a type error in comment lines	2014-01-11 05:43:56 +08:00
Thomas Graves	7cef8435d7	Merge pull request #371 from tgravescs/yarn_client_addjar_misc_fixes Yarn client addjar and misc fixes Fix the addJar functionality in yarn-client mode, add support for the other options supported in yarn-standalone mode, set the application type on yarn in hadoop 2.X, add documentation, change heartbeat interval to be same code as the yarn-standalone so it doesn't take so long to get containers and exit.	2014-01-10 15:34:15 -06:00
Ankur Dave	7bda997785	Improve docs for PartitionStrategy	2014-01-10 13:00:28 -08:00
Patrick Wendell	7b58f116e5	Merge pull request #384 from pwendell/debug-logs Make DEBUG-level logs consumable. Removes two things that caused issues with the debug logs: (a) Internal polling in the DAGScheduler was polluting the logs. (b) The Scala REPL logs were really noisy.	2014-01-10 12:47:46 -08:00
Ankur Dave	eb4b46f8d1	Improve docs for GraphOps	2014-01-10 12:46:00 -08:00
Shivaram Venkataraman	7c4e6e1bf1	Add i2 instance types to Spark EC2.	2014-01-10 12:44:55 -08:00
Ankur Dave	9454fa1f6c	Remove duplicate method in GraphLoader and improve docs	2014-01-10 12:37:20 -08:00
Ankur Dave	37611e57f6	Improve docs for EdgeRDD, EdgeTriplet, and GraphLab	2014-01-10 12:37:03 -08:00
Ankur Dave	eee9bc0958	Remove commented-out perf files	2014-01-10 12:36:15 -08:00
Ankur Dave	c39ec3017f	Remove some commented code	2014-01-10 12:17:17 -08:00
Tathagata Das	e4bb845238	Updated docs based on Patrick's comments in PR 383.	2014-01-10 12:17:09 -08:00
Ankur Dave	5fcd2a61b4	Finish cleaning up Graph docs	2014-01-10 12:17:04 -08:00
Ankur Dave	4c114a7556	Start cleaning up Scaladocs in Graph and EdgeRDD	2014-01-10 11:37:54 -08:00
Ankur Dave	3eb83191cb	Generate GraphX docs	2014-01-10 11:37:28 -08:00
Ankur Dave	6bd9a78e78	Add back Bagel links to docs, but mark them superseded	2014-01-10 11:37:10 -08:00
Ankur Dave	cfc10c74a3	Remove EdgeTriplet.{src,dst}Stale, which were unused	2014-01-10 10:43:23 -08:00
Ankur Dave	bf50e8c6cd	Remove commented code from Analytics	2014-01-10 10:37:04 -08:00
Ankur Dave	1b2aad918c	Update graphx/pom.xml to mirror mllib/pom.xml	2014-01-10 10:34:40 -08:00
Patrick Wendell	e9ed2d9e82	Make DEBUG-level logs consumable. Removes two things that caused issues with the debug logs: (a) Internal polling in the DAGScheduler was polluting the logs. (b) The Scala REPL logs were really noisy.	2014-01-10 10:33:24 -08:00
Ankur Dave	23d2995116	Merge pull request #1 from jegonzal/graphx ProgrammingGuide	2014-01-10 10:20:02 -08:00
Tathagata Das	2213a5a47f	Merge branch 'driver-test' of github.com:tdas/incubator-spark into driver-test	2014-01-10 05:06:22 -08:00
Tathagata Das	740730a179	Fixed conf/slaves and updated docs.	2014-01-10 05:06:15 -08:00
Tathagata Das	4f609f7901	Removed spark.hostPort and other setting from SparkConf before saving to checkpoint.	2014-01-10 12:58:07 +00:00
Tathagata Das	d7ec73ac76	Merge branch 'driver-test' of github.com:tdas/incubator-spark into driver-test	2014-01-10 11:44:17 +00:00
Tathagata Das	9d3d9c8251	Refactored graph checkpoint file reading and writing code to make it cleaner and easily debuggable.	2014-01-10 11:44:02 +00:00
Ankur Dave	729277ebc4	Undo `8b6b8ac87f` Getting unpersist right in GraphLab is tricky.	2014-01-10 01:53:28 -08:00
Ankur Dave	4cc550909a	graph -> graphx in log4j.properties	2014-01-10 00:59:59 -08:00
Joseph E. Gonzalez	b1eeefb401	WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide.	2014-01-10 00:39:08 -08:00
Ankur Dave	ba511f890e	Avoid recomputation by caching all multiply-used RDDs	2014-01-10 00:35:02 -08:00
Ankur Dave	8b6b8ac87f	Unpersist previous iterations in GraphLab	2014-01-10 00:34:08 -08:00
Matei Zaharia	669ba4caa9	Fix default TTL for metadata cleaner It seems to have been set to 3500 in a previous commit for debugging, but it should be off by default	2014-01-10 00:21:36 -08:00
Pillis	8d021b42bc	SPARK-961. Add a Vector.random() method - update 1	2014-01-10 00:07:36 -08:00
Matei Zaharia	0ebc97305a	Merge pull request #375 from mateiz/option-fix Fix bug added when we changed AppDescription.maxCores to an Option The Scala compiler warned about this -- we were comparing an Option against an integer now.	2014-01-09 23:58:49 -08:00
Patrick Wendell	dd03cea02a	Merge pull request #378 from pwendell/consolidate_on Enable shuffle consolidation by default. Bump this to being enabled for 0.9.0.	2014-01-09 23:38:03 -08:00
Ankur Dave	2578332f97	Add Graph.unpersistVertices()	2014-01-09 23:34:35 -08:00
Ankur Dave	8ae108f6c4	Unpersist previous iterations in Pregel	2014-01-09 23:25:35 -08:00
Reza Zadeh	21c8a54c08	Merge remote-tracking branch 'upstream/master' into sparsesvd Conflicts: docs/mllib-guide.md	2014-01-09 22:45:32 -08:00
Patrick Wendell	460f655cc6	Enable shuffle consolidation by default. Bump this to being enabled for 0.9.0.	2014-01-09 22:42:50 -08:00
Reza Zadeh	cf5bd4ab2e	fix example	2014-01-09 22:39:41 -08:00
Patrick Wendell	997c830e0b	Merge pull request #363 from pwendell/streaming-logs Set default logging to WARN for Spark streaming examples. This programmatically sets the log level to WARN by default for streaming tests. If the user has already specified a log4j.properties file, the user's file will take precedence over this default.	2014-01-09 22:22:20 -08:00
Andrew Or	372a533a6c	Fix wonky imports from merge	2014-01-09 21:47:49 -08:00
Ankur Dave	210f2dd84f	graph -> graphx in bin/compute-classpath.sh	2014-01-09 21:47:40 -08:00
Andrew Or	aa5002bb96	Defensively allocate memory from global pool This is an alternative to the existing approach, which evenly distributes the collective shuffle memory among all running tasks. In the new approach, each thread requests a chunk of memory whenever its map is about to multiplicatively grow. If there is sufficient memory in the global pool, the thread allocates it and grows its map. Otherwise, it spills. A danger with the previous approach is that a new task may quickly fill up its map before old tasks finish spilling, potentially causing an OOM. This approach prevents this scenario as it favors existing tasks over new tasks; any thread that may step over the boundary of other threads defensively backs off and starts spilling. Testing through spark-perf reveals: (1) When no spills have occurred, the performance of external sorting using this memory management approach is essentially the same as without external sorting. (2) When one or more spills have occurred, the performance of external sorting is a small multiple (3x) worse	2014-01-09 21:43:58 -08:00
Andrew Or	d76e1f90a8	Merge github.com:apache/incubator-spark Conflicts: core/src/main/scala/org/apache/spark/SparkEnv.scala streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java	2014-01-09 21:38:48 -08:00
Ankur Dave	b7c92dded3	Add implicit algorithm methods for Graph; remove standalone PageRank	2014-01-09 20:44:28 -08:00
Patrick Wendell	7b748b83a1	Minor clean-up	2014-01-09 20:42:48 -08:00
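
Note on commit aa5002bb96 ("Defensively allocate memory from global pool"): the message describes each thread requesting memory from a shared pool before its map grows, and spilling when the request cannot be met. The following is a minimal Python sketch of that idea only. The class and method names (ShuffleMemoryPool, SpillableMap, grow_or_spill) and the doubling growth policy are assumptions made for illustration; this is not Spark's actual implementation or API.

```python
import threading


class ShuffleMemoryPool:
    """Hypothetical global pool shared by all task threads (illustrative only)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self._lock = threading.Lock()

    def try_acquire(self, nbytes):
        # Grant the request only if it still fits in the pool.
        with self._lock:
            if self.used + nbytes <= self.capacity:
                self.used += nbytes
                return True
            return False

    def release(self, nbytes):
        # Return previously acquired memory to the pool.
        with self._lock:
            self.used -= nbytes


class SpillableMap:
    """Hypothetical per-task map that either doubles its reservation or spills."""

    def __init__(self, pool, initial_bytes):
        self.pool = pool
        self.initial = initial_bytes
        self.reserved = 0
        self.spills = 0

    def grow_or_spill(self):
        # Assumed growth policy: doubling, so the extra memory needed
        # equals the current reservation (or the initial size when empty).
        extra = self.reserved if self.reserved > 0 else self.initial
        if self.pool.try_acquire(extra):
            self.reserved += extra
            return "grew"
        # Not enough memory left: back off, spill, and hand memory back.
        self.pool.release(self.reserved)
        self.reserved = 0
        self.spills += 1
        return "spilled"
```

In this sketch a task that already holds memory keeps it, while a newer task whose request would exceed the pool spills instead of growing, which mirrors the "favors existing tasks over new tasks" behaviour the commit message describes.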

7129 commits