ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Matei Zaharia	aa8ccec315	Abort jobs if a task fails more than a limited number of times	2010-10-15 15:57:26 -07:00
Matei Zaharia	57a778426c	Updated guava to version r07	2010-10-15 15:55:58 -07:00
Matei Zaharia	31b5b8b4a6	A couple of improvements to ReplSuite: - Use collect instead of toArray - Disable the "running on Mesos" test when MESOS_HOME is not set	2010-10-15 15:37:14 -07:00
Matei Zaharia	28d6f23196	Made locality scheduling constant-time and added support for changing CPU and memory requested per task.	2010-10-15 15:36:40 -07:00
Mosharaf Chowdhury	a4c0281902	sendObject now takes parameters instead of relying on class variables.	2010-10-14 15:36:23 -07:00
Mosharaf Chowdhury	a137ca75da	Got rid of dualMode.	2010-10-13 17:01:00 -07:00
Mosharaf Chowdhury	38194e5731	- Changed guidePort to GuideInfo that now contains the hostAddress as well as the port. This will allow anyone other than the master to be a guide. - The GuideInfo object now contains the constants related to tracker response.	2010-10-13 16:26:18 -07:00
Mosharaf Chowdhury	8690be8f5a	Cleared up some formatting. Branching out from here to work on BT.	2010-10-13 11:40:03 -07:00
Mosharaf Chowdhury	0d67bc1cee	multi-tracker branch now compiles and runs; but it crashes right before the end. The same problem is seen also in the master branch (in the ChainedStreaming implementation)	2010-10-12 15:39:53 -07:00
Mosharaf Chowdhury	4fdd48295b	Added mesos.jar. Still not working. Major changes required.	2010-10-12 13:10:31 -07:00
Mosharaf Chowdhury	e73a5f3491	Now compiles with Scala 2.8.0, but doesn't run with nexus.jar Must update it to use mesos.jar	2010-10-12 13:05:32 -07:00
Mosharaf Chowdhury	ad7a9c5a36	Minor cleanup in Broadcast.scala. Changed BroadcastTest.scala to have multiple broadcasts.	2010-10-12 12:55:43 -07:00
Matei Zaharia	a9098ad5d4	Moved Job and SimpleJob to new files	2010-10-07 18:27:26 -07:00
Matei Zaharia	a5155206a1	Merge branch 'master' into matei-scheduling	2010-10-07 17:18:32 -07:00
Matei Zaharia	630a982b88	Added a getId method to split to force classes to specify a unique ID for each split. This replaces the previous method of calling split.toString, which would produce different results for the same split each time it is deserialized (because the default implementation returns the Java object's address).	2010-10-07 17:17:07 -07:00
Matei Zaharia	4d9c2aee98	Merge branch 'master' into matei-scheduling	2010-10-07 16:19:53 -07:00
Justin Ma	f9671b086b	got rid of unnecessary line	2010-10-07 14:41:10 -07:00
Justin Ma	4cbca25f49	Merge branch 'master' into jtma-accumulator	2010-10-07 14:39:54 -07:00
Justin Ma	b3517614d8	Added toString() methods to UnionSplit, SeededSplit and CartesianSplit to ensure that the proper keys will be generated when they are cached.	2010-10-07 14:38:25 -07:00
Matei Zaharia	0195ee5ed8	Merge branch 'master' into matei-scheduling	2010-10-05 14:26:20 -07:00
Matei Zaharia	a41ca20375	Added splitWords function in Utils	2010-10-04 12:01:05 -07:00
Matei Zaharia	9f20b6b433	Added reduceByKey operation for RDDs containing pairs	2010-10-03 20:28:20 -07:00
Matei Zaharia	a826294c3a	Merge branch 'master' into matei-scheduling	2010-10-03 13:28:06 -07:00
Matei Zaharia	aef9e5b98c	Renamed ParallelOperation to Job	2010-10-03 13:28:01 -07:00
root	34eccedbf5	Fixed a rather bad bug in HDFS files that has been in for a while: caching was not working because Split objects did not have a consistent toString value	2010-10-03 05:06:06 +00:00
Matei Zaharia	b6debf5da1	Merge branch 'matei-logging'	2010-09-29 10:59:01 -07:00
Matei Zaharia	f50b23b825	Increase default locality wait to 3s. Fixes #20.	2010-09-29 10:04:00 -07:00
Matei Zaharia	a7c0e2a7c3	Made task-finished log messages slightly nicer	2010-09-29 00:22:11 -07:00
Matei Zaharia	40f69140b6	Made spark-executor output slightly nicer	2010-09-29 00:22:09 -07:00
Matei Zaharia	0d28bdcefd	A couple of minor fixes: - Don't include trailing $'s in class names of Scala objects - Report errors using logError instead of printStackTrace	2010-09-29 00:10:46 -07:00
Matei Zaharia	0fa70a6770	Updated log4j.properties to ignore jetty messages below WARN level	2010-09-28 23:58:19 -07:00
Matei Zaharia	7090dea44b	Changed printlns to log statements and fixed a bug in run that was causing it to fail on a Mesos cluster	2010-09-28 23:54:29 -07:00
Matei Zaharia	516248aa66	Added log4j.properties	2010-09-28 23:22:39 -07:00
Matei Zaharia	332c8b8c22	Removed Hadoop's SLF4J jars	2010-09-28 23:16:28 -07:00
Matei Zaharia	db623defbe	Added Logging trait	2010-09-28 23:12:23 -07:00
Matei Zaharia	c7d233b911	Added log4j jars and paths	2010-09-28 23:08:01 -07:00
Matei Zaharia	e5e9edeeb3	Merge branch 'http-repl-class-serving'	2010-09-28 22:43:04 -07:00
Matei Zaharia	e068f21e01	More work on HTTP class loading	2010-09-28 22:32:38 -07:00
Matei Zaharia	7ef3a20a0c	Modified the interpreter to serve classes to the executors using a Jetty HTTP server instead of a shared (NFS) file system.	2010-09-28 17:55:11 -07:00
Justin Ma	b749f0e209	fixed typo in printing which task is already finished	2010-09-28 17:28:54 -07:00
Justin Ma	b7ce592bec	changes to accumulator to add objects in-place.	2010-09-25 14:37:25 -07:00
Justin Ma	366c09c47b	Let's use future instead of actors	2010-09-13 15:30:22 -07:00
Justin Ma	0896fd6219	Added fork()/join() operations for SparkContext, as well as corresponding changes to MesosScheduler to support multiple ParallelOperations.	2010-09-12 09:01:44 -07:00
Justin Ma	6f0d2c1cbc	round robin scheduling of tasks has been added	2010-09-07 14:03:59 -07:00
Justin Ma	e9ffe6caab	now adding the Split object.	2010-09-01 13:31:06 -07:00
Justin Ma	7a9ff1cc9a	- Got rid of 'Split' type parameter in RDD - Added SampledRDD, SplitRDD and CartesianRDD - Made Split a class rather than a type parameter - Added numCores() to Scheduler to help set default level of parallelism	2010-08-31 12:08:09 -07:00
Justin Ma	ea8c2785dd	now we have sampling with replacement (at least on a per-split basis)	2010-08-18 15:59:35 -07:00
Justin Ma	156bccbe23	HdfsFile.scala: added a try/catch block to exit gracefully for corrupted gzip files MesosScheduler.scala: formatted the slaveOffer() output to include the serialized task size RDD.scala: added support for aggregating RDDs on a per-split basis (aggregateSplit()) as well as for sampling without replacement (sample())	2010-08-18 15:25:57 -07:00
Matei Zaharia	75b2ca10c3	Removed HOD from included Hadoop because it was making the project count as Python on GitHub :|.	2010-08-16 23:16:35 -07:00
Matei Zaharia	1cbffaae6f	Modified Scala interpreter to have it avoid computing string versions of all results when :silent is enabled, so that it is easier to work with large arrays in Spark. (The string version of an array of numbers might not fit in memory even though the array itself does.)	2010-08-15 18:33:27 -07:00

247 commits