ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Ankur Dave	971f824014	Revert unnecessary changes to core While benchmarking, we accidentally committed some unnecessary changes to core such as adding logging. These changes make it more difficult to merge from Spark upstream, so this commit reverts them.	2013-10-18 16:07:38 -07:00
Joseph E. Gonzalez	1856b37e9d	Merge branch 'master' of https://github.com/apache/incubator-spark into indexedrdd_graphx	2013-10-18 12:21:19 -07:00
Joseph E. Gonzalez	3f3d28c73f	Switching from Seq to IndexedSeq	2013-10-17 19:55:36 -07:00
Joseph E. Gonzalez	9a03c5fe28	This commit accomplishes three goals: 1) Further simplification of the IndexedRDD operations (eliminating some) 2) Aggressive reuse of HashMaps 3) Pipelining join operations within indexedrdd	2013-10-17 19:01:48 -07:00
Kay Ousterhout	809f547633	Fixed unit tests	2013-10-16 23:16:12 -07:00
Kay Ousterhout	ec512583ab	Removed TaskSchedulerListener interface. The interface was used only by the DAG scheduler (so it wasn't necessary to define the additional interface), and the naming makes it very confusing when reading the code (because "listener" was used to describe the DAG scheduler, rather than SparkListeners, which implement a nearly-identical interface but serve a different function).	2013-10-16 16:57:42 -07:00
Joseph E. Gonzalez	57ac9073ae	Introducing unique indexedrdd and adding numerous specialized joins	2013-10-16 04:08:22 -07:00
Joseph E. Gonzalez	59700c0c2a	switched to more efficient implementation of reduce by key	2013-10-16 00:18:37 -07:00
Joseph E. Gonzalez	80e4ec3278	IndexedRDD now only supports unique keys	2013-10-16 00:16:44 -07:00
Matei Zaharia	4e46fde818	Merge pull request #62 from harveyfeng/master Make TaskContext's stageId publicly accessible.	2013-10-15 23:14:27 -07:00
Harvey Feng	65b46236e7	Proper formatting for SparkHadoopWriter class extensions.	2013-10-15 21:51:52 -07:00
Matei Zaharia	6dbd2208ff	Merge pull request #34 from kayousterhout/rename Renamed StandaloneX to CoarseGrainedX. (as suggested by @rxin here https://github.com/apache/incubator-spark/pull/14) The previous names were confusing because the components weren't just used in Standalone mode. The scheduler used for Standalone mode is called SparkDeploySchedulerBackend, so referring to the base class as StandaloneSchedulerBackend was misleading.	2013-10-15 19:02:57 -07:00
Joseph E. Gonzalez	3cb6dffce0	adding indexed reduce by key	2013-10-15 18:55:06 -07:00
Harvey Feng	c4c76e37a7	Fix line length > 100 chars in SparkHadoopWriter	2013-10-15 18:35:59 -07:00
Harvey Feng	5b8083fee5	Make TaskContext's stageId publicly accessible.	2013-10-15 18:06:37 -07:00
Joseph E. Gonzalez	1b22eef744	Merge branch 'master' of https://github.com/apache/incubator-spark into indexedrdd_graphx	2013-10-15 16:15:19 -07:00
Kay Ousterhout	f95a2be045	Fixed build error after merging in master	2013-10-15 14:51:37 -07:00
Kay Ousterhout	acc7638f7c	Merge remote branch 'upstream/master' into rename	2013-10-15 14:43:56 -07:00
Kay Ousterhout	707ad8cc4f	Unified daemon thread pools	2013-10-15 14:23:43 -07:00
Reynold Xin	f41feb7b33	Bump up logging level to warning for failed tasks.	2013-10-14 23:35:32 -07:00
Joseph E. Gonzalez	6a13d02319	Merging changes for IndexedRDD branch	2013-10-14 23:30:36 -07:00
Joseph E. Gonzalez	6700ccd7d5	Introducing indexedrdd The rest of indexed rdd	2013-10-14 23:30:35 -07:00
Joseph E. Gonzalez	4755f42d78	moving indexedrdd to the correct location	2013-10-14 23:13:27 -07:00
Joseph E. Gonzalez	ef7c369092	merged with upstream changes	2013-10-14 22:56:42 -07:00
Reynold Xin	9cd8786e4a	Merge branch 'master' of github.com:apache/incubator-spark into kill Conflicts: core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala	2013-10-14 21:51:30 -07:00
Joseph E. Gonzalez	bf059691f0	Adding a few extra comments.	2013-10-14 19:59:11 -07:00
Joseph E. Gonzalez	11a44d0ec9	Introducing indexedrdd The rest of indexed rdd	2013-10-14 19:46:42 -07:00
Reynold Xin	3b11f43e36	Merge pull request #57 from aarondav/bid Refactor BlockId into an actual type Converts all of our BlockId strings into actual BlockId types. Here are some advantages of doing this now: + Type safety + Code clarity - it's now obvious what the key of a shuffle or rdd block is, for instance. Additionally, appearing in tuple/map type signatures is a big readability bonus. A Seq[(String, BlockStatus)] is not very clear. Further, we can now use more Scala features, like matching on BlockId types. + Explicit usage - we can now formally tell where various BlockIds are being used (without doing string searches); this makes updating current BlockIds a much clearer process, and compiler-supported. (I'm looking at you, shuffle file consolidation.) + It will only get harder to make this change as time goes on. Downside is, of course, that this is a very invasive change touching a lot of different files, which will inevitably lead to merge conflicts for many.	2013-10-14 14:20:01 -07:00
Aaron Davidson	4a45019fb0	Address Matei's comments	2013-10-14 00:24:17 -07:00
Joseph E. Gonzalez	637b67da56	merging changes from upstream benchmarking branch	2013-10-13 19:54:09 -07:00
Joseph E. Gonzalez	494472a6cc	Integrated IndexedRDD into graph design.	2013-10-13 19:42:32 -07:00
Aaron Davidson	da896115ec	Change BlockId filename to name + rest of Patrick's comments	2013-10-13 11:15:02 -07:00
Aaron Davidson	d60352283c	Add unit test and address rest of Reynold's comments	2013-10-12 22:45:15 -07:00
Aaron Davidson	a395911138	Refactor BlockId into an actual type This is an unfortunately invasive change which converts all of our BlockId strings into actual BlockId types. Here are some advantages of doing this now: + Type safety + Code clarity - it's now obvious what the key of a shuffle or rdd block is, for instance. Additionally, appearing in tuple/map type signatures is a big readability bonus. A Seq[(String, BlockStatus)] is not very clear. Further, we can now use more Scala features, like matching on BlockId types. + Explicit usage - we can now formally tell where various BlockIds are being used (without doing string searches); this makes updating current BlockIds a much clearer process, and compiler-supported. (I'm looking at you, shuffle file consolidation.) + It will only get harder to make this change as time goes on. Since this touches a lot of files, it'd be best to either get this patch in quickly or throw it on the ground to avoid too many secondary merge conflicts.	2013-10-12 22:44:57 -07:00
Reynold Xin	99796904ae	Merge pull request #52 from harveyfeng/hadoop-closure Add an optional closure parameter to HadoopRDD instantiation to use when creating local JobConfs. Having HadoopRDD accept this optional closure eliminates the need for the HadoopFileRDD added earlier. It makes the HadoopRDD more general, in that the caller can specify any JobConf initialization flow.	2013-10-12 21:23:26 -07:00
Harvey Feng	6c32aab87d	Remove the new HadoopRDD constructor from SparkContext API, plus some minor style changes.	2013-10-12 21:02:08 -07:00
Reynold Xin	88866ea9c9	Fixed PairRDDFunctionsSuite after removing InterruptibleRDD.	2013-10-12 20:05:23 -07:00
Reynold Xin	6b288b75d4	Job cancellation: address Matei's code review feedback.	2013-10-12 15:53:31 -07:00
Reynold Xin	ab0940f0c2	Job cancellation: addressed code review feedback round 2 from Kay.	2013-10-11 18:15:04 -07:00
Reynold Xin	97ffebbe87	Fixed dagscheduler suite because of a logging message change.	2013-10-11 16:18:22 -07:00
Reynold Xin	a61cf40ab9	Job cancellation: addressed code review feedback from Kay.	2013-10-11 15:58:14 -07:00
Dan Crankshaw	c4a23f95c3	Updated code so benchmarks actually run.	2013-10-11 22:57:43 +00:00
Matei Zaharia	fb25f32300	Merge pull request #53 from witgo/master Add a zookeeper compile dependency to fix build in maven	2013-10-11 15:44:43 -07:00
Matei Zaharia	d6ead47809	Merge pull request #32 from mridulm/master Address review comments, move to incubator spark Also includes a small fix to speculative execution. <edit> Continued from https://github.com/mesos/spark/pull/914 </edit>	2013-10-11 15:43:01 -07:00
Reynold Xin	e2047d3927	Making takeAsync and collectAsync deterministic.	2013-10-11 13:04:45 -07:00
Reynold Xin	09f7609254	Properly handle interrupted exception in FutureAction.	2013-10-11 11:20:15 -07:00
LiGuoqiang	fc60c412ab	Add a zookeeper compile dependency to fix build in maven	2013-10-11 16:31:47 +08:00
Reynold Xin	42fb1df694	Merge branch 'master' of github.com:apache/incubator-spark into kill Conflicts: core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala	2013-10-10 23:48:05 -07:00
Reynold Xin	d9e724e756	Fixed the broken local scheduler test.	2013-10-10 23:08:13 -07:00
Reynold Xin	37397b73ba	Added comprehensive tests for job cancellation in a variety of environments (local vs cluster, fifo vs fair).	2013-10-10 22:57:43 -07:00

2266 commits