Commit graph

664 commits

Author SHA1 Message Date
Matei Zaharia 97eee50825 Fixes a nasty bug that could happen when tasks fail, because calling
wait() with a timeout of 0 on a Java object means "wait forever".
2012-03-01 13:43:17 -08:00
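A note on the pitfall this fixes: on any Java object, wait(0) does not mean "return immediately" but "wait with no timeout", i.e. forever. A minimal Scala sketch of the guard (a hypothetical helper, not the actual patch):

    def waitWithTimeout(lock: AnyRef, timeoutMs: Long): Unit = lock.synchronized {
      // Object.wait(0) blocks indefinitely, so only pass positive timeouts through
      if (timeoutMs > 0) lock.wait(timeoutMs)
    }
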
Matei Zaharia 1e10df0a46 Merge pull request #111 from alupher/master
Adding sorting to RDDs
2012-02-24 15:50:14 -08:00
Antonio 0d93d95bcf Removed unnecessary import 2012-02-21 19:57:12 -08:00
Antonio 2990298f71 Added sorting testing suite 2012-02-21 19:54:21 -08:00
Matei Zaharia aa04f87cd2 Added support for parallel execution of jobs in DAGScheduler. 2012-02-19 22:50:23 -08:00
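To see what this enables, here is a sketch of two threads submitting independent jobs to one shared SparkContext, which the DAGScheduler can now run in parallel (package and constructor follow this era of Spark; treat the details as assumptions):

    import spark.SparkContext

    object ParallelJobs {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local", "ParallelJobs")
        val threads = (1 to 2).map { _ =>
          new Thread {
            // each thread submits its own job to the shared context
            override def run(): Unit = println(sc.parallelize(1 to 100).reduce(_ + _))
          }
        }
        threads.foreach(_.start())
        threads.foreach(_.join())
      }
    }
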
Antonio 620798161b Added fixes to sorting 2012-02-13 00:07:39 -08:00
Matei Zaharia 2587ce1690 Fixed a deadlock that occurred with MesosScheduler due to an earlier
synchronization change
2012-02-11 21:22:45 -08:00
Antonio e93f622665 Added sorting by key for pair RDDs 2012-02-11 00:56:28 -08:00
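A usage sketch of the new operation (sc is an existing SparkContext; the exact signature at this point in the history is an assumption):

    val pairs = sc.parallelize(Seq(("b", 2), ("c", 3), ("a", 1)))
    // sortByKey is provided for RDDs of key-value pairs; ascending by default
    println(pairs.sortByKey().collect().mkString(", "))
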
Matei Zaharia 98f008b721 Formatting fixes 2012-02-10 10:52:03 -08:00
Matei Zaharia 7660a8b12f Merge branch 'formatting'
Conflicts:
	core/src/main/scala/spark/DAGScheduler.scala
	core/src/main/scala/spark/SimpleShuffleFetcher.scala
	core/src/main/scala/spark/SparkContext.scala
2012-02-10 10:42:14 -08:00
haoyuan 194c42ab79 Code format. 2012-02-10 08:19:53 -08:00
Matei Zaharia 8f5ed51234 Delete Spark's temporary directories when the JVM exits. 2012-02-09 22:58:24 -08:00
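The usual JVM mechanism for this is a shutdown hook; a simplified sketch (directory name hypothetical, not Spark's actual cleanup code):

    import java.io.File

    def deleteRecursively(file: File): Unit = {
      // remove children before the directory itself
      if (file.isDirectory) file.listFiles().foreach(deleteRecursively)
      file.delete()
    }

    val tempDir = new File(System.getProperty("java.io.tmpdir"), "spark-example")
    Runtime.getRuntime.addShutdownHook(new Thread {
      override def run(): Unit = deleteRecursively(tempDir)
    })
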
Matei Zaharia c0a0df3285 Made the default cache BoundedMemoryCache, and reduced its default size 2012-02-09 22:32:02 -08:00
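To illustrate the bounding idea in miniature: an access-ordered LinkedHashMap that evicts its least-recently-used entry past a cap. This bounds entry count, whereas a memory-bounded cache like the one named here would presumably track estimated sizes instead:

    import java.util.{LinkedHashMap, Map => JMap}

    // LRU cache capped at maxEntries; eviction happens on insert
    class BoundedCache[K, V](maxEntries: Int)
        extends LinkedHashMap[K, V](16, 0.75f, /* accessOrder = */ true) {
      override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean =
        size() > maxEntries
    }
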
Matei Zaharia a766780f4c Added some tests for multithreaded access to Spark. 2012-02-09 22:27:53 -08:00
Matei Zaharia 0e93891d3d Replaced LocalFileShuffle with a non-singleton ShuffleManager class
and made DAGScheduler automatically set SparkEnv.
2012-02-09 22:14:56 -08:00
haoyuan 445e0bb1b5 Format the code a bit more. 2012-02-09 15:50:26 -08:00
haoyuan 651932e703 Format the code to the coding style agreed on by Matei/TD/Haoyuan 2012-02-09 13:26:23 -08:00
Matei Zaharia e02dc83a5b IO optimizations 2012-02-06 20:40:39 -08:00
Matei Zaharia c40e766368 Use java.util.HashMap in shuffles 2012-02-06 19:20:25 -08:00
Matei Zaharia d6ec664b48 Add dependency on fastutil and update Guava 2012-02-06 15:37:27 -08:00
Matei Zaharia b267175ab5 Synchronization fix in case SparkContext is used from multiple threads. 2012-02-06 14:28:18 -08:00
haoyuan b72d93a0da Test commit 2012-02-06 09:58:06 -08:00
Matei Zaharia 43a3335090 Simplifying test 2012-02-05 22:46:51 -08:00
Matei Zaharia 7449ecfb7e Merge branch 'master' of github.com:mesos/spark 2012-01-31 00:33:24 -08:00
Matei Zaharia 100e800782 Some fixes to the examples (mostly to use functional API) 2012-01-31 00:33:18 -08:00
Matei Zaharia 72d2489b6d Merge pull request #108 from patelh/master
Added immutable map registration in kryo serializer
2012-01-30 16:31:12 -08:00
Hiral Patel b47952342e Add registration of immutable maps to the Kryo serializer 2012-01-26 15:24:20 -08:00
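For context, Kryo only serializes classes it has been told about (or does so inefficiently), and Scala's immutable maps hide several concrete implementation classes. A standalone sketch of the registration, outside Spark's serializer plumbing:

    import com.esotericsoftware.kryo.Kryo

    val kryo = new Kryo()
    // register both the Map trait and a concrete class Scala instantiates (e.g. Map.Map1)
    kryo.register(classOf[scala.collection.immutable.Map[_, _]])
    kryo.register(Map("k" -> "v").getClass)
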
Matei Zaharia fabcc82528 Merge pull request #103 from edisontung/master
Made improvements to takeSample. Also changed SparkLocalKMeans to SparkKMeans
2012-01-13 19:20:03 -08:00
Matei Zaharia fd5581a0d3 Fixed a failure recovery bug and added some tests for fault recovery. 2012-01-13 19:17:27 -08:00
Matei Zaharia eb05154b7a Fixed a failure recovery bug and added some tests for fault recovery. 2012-01-13 19:08:25 -08:00
Edison Tung 1ecc221f84 Fixed bugs
I've fixed the bugs detailed in the diff. One of the bugs was already
fixed in my local copy (I forgot to commit it).
2012-01-09 11:59:52 -08:00
Matei Zaharia e269f6f7ea Register RDDs with the MapOutputTracker even if they have no partitions.
Fixes #105.
2012-01-05 15:59:20 -05:00
Matei Zaharia 5fd101d79e Add dependency on Akka and Netty 2011-12-15 13:21:14 +01:00
Matei Zaharia 3034fc0d91 Merge commit 'ad4ebff42c1b738746b2b9ecfbb041b6d06e3e16' 2011-12-14 18:19:43 +01:00
Matei Zaharia 6a650cbbdf Make Spark port default to 7077 so that it's not an ephemeral port that might be taken 2011-12-14 18:18:22 +01:00
Matei Zaharia 735843a049 Merge remote-tracking branch 'origin/charles-newhadoop' 2011-12-02 21:59:30 -08:00
Charles Reiss 66f05f383e Add new Hadoop API reading support. 2011-12-01 14:02:10 -08:00
Charles Reiss 02d43e6986 Add new Hadoop API writing support. 2011-12-01 14:01:28 -08:00
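The "new" API here is org.apache.hadoop.mapreduce, as opposed to the older org.apache.hadoop.mapred. A reading sketch in the shape of Spark's later public API (the method name and signature at this commit are an assumption):

    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

    // read a text file through the new Hadoop InputFormat API; sc is a SparkContext
    val lines = sc.newAPIHadoopFile("hdfs://namenode/path/file.txt",
      classOf[TextInputFormat], classOf[LongWritable], classOf[Text])
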
Matei Zaharia 72c4839c5f Fixed LocalFileLR to deal with a change in Scala IO sources
(you can no longer iterate over a Source multiple times).
2011-12-01 13:52:12 -08:00
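The change behind the fix: a scala.io.Source is a one-shot iterator, so a second traversal silently yields nothing. Materializing the lines first restores multi-pass code (file name hypothetical):

    import scala.io.Source

    val lines = Source.fromFile("lr_data.txt").getLines().toArray
    // both passes work because `lines` is an Array, not a live Source
    val total = lines.length
    val nonEmpty = lines.count(_.nonEmpty)
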
Edison Tung 42f8847a21 Revert de01b6deaaee1b43321e0aac330f4a98c0ea61c6^..HEAD 2011-12-01 13:43:25 -08:00
Edison Tung de01b6deaa Fixed bug in RDD
Math.min takes 2 args, not 1. This was not committed earlier for some
reason
2011-12-01 13:34:37 -08:00
Edison Tung e1c814be4c Renamed SparkLocalKMeans to SparkKMeans 2011-12-01 13:34:03 -08:00
Matei Zaharia 22b8fcf632 Added fold() and aggregate() operations that reuse an object to
merge results into rather than requiring a new object allocation
for each element merged. Fixes #95.
2011-11-30 11:37:47 -08:00
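A usage sketch of aggregate() with a mutable accumulator, the allocation-saving pattern this commit describes (sc is a SparkContext; that merge functions may modify and return their first argument is assumed here):

    import scala.collection.mutable.ArrayBuffer

    val nums = sc.parallelize(1 to 10)
    val buf = nums.aggregate(ArrayBuffer[Int]())(
      (acc, x) => { acc += x; acc },   // fold each element into the reused buffer
      (a, b)   => { a ++= b; a })      // merge partition buffers, reusing the first
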
Matei Zaharia 09dd58b3a7 Send SPARK_JAVA_OPTS to slave nodes. 2011-11-30 11:34:58 -08:00
Edison Tung a3bc012af8 added takeSamples method
The takeSamples method takes a specified number of samples from the RDD and
returns them as an array.
2011-11-21 16:38:44 -08:00
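A usage sketch (this commit calls the method takeSamples, while pull request #103 above refers to takeSample; the argument list shown is an assumption):

    // draw 5 elements without replacement into a local array; sc is a SparkContext
    val sample = sc.parallelize(1 to 100).takeSample(false, 5, 42)
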
Edison Tung 3b9d9de583 Added KMeans examples
LocalKMeans runs locally with a randomly generated dataset.
SparkLocalKMeans takes an input file and runs KMeans on it.
2011-11-21 16:37:58 -08:00
Ankur Dave ad4ebff42c Deduplicate exceptions when printing them
The first time they appear, exceptions are printed in full, including
a stack trace. After that, they are printed in abbreviated form. They
are periodically reprinted in full; the reprint interval defaults to 5
seconds and is configurable using the property
spark.logging.exceptionPrintInterval.
2011-11-14 01:54:53 +00:00
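A stripped-down sketch of the deduplication mechanism (the property name comes from the commit message; millisecond units and everything else here are assumptions, not the executor's actual code):

    import scala.collection.mutable

    val intervalMs =
      System.getProperty("spark.logging.exceptionPrintInterval", "5000").toLong
    val lastFullPrint = mutable.Map[String, Long]()

    def logException(e: Throwable): Unit = {
      val key = e.toString
      val now = System.currentTimeMillis
      if (now - lastFullPrint.getOrElse(key, 0L) >= intervalMs) {
        e.printStackTrace()           // first sighting or interval elapsed: full trace
        lastFullPrint(key) = now
      } else {
        println("(repeated) " + key)  // abbreviated form in between
      }
    }
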
Ankur Dave 35b6358a7c Report errors in tasks to the driver via a Mesos status update
When a task throws an exception, the Spark executor previously just
logged it to a local file on the slave and exited. This commit causes
Spark to also report the exception back to the driver using a Mesos
status update, so the user doesn't have to look through a log file on
the slave.

Here's what the reporting currently looks like:

    # ./run spark.examples.ExceptionHandlingTest master@203.0.113.1:5050
    [...]
    11/10/26 21:04:13 INFO spark.SimpleJob: Lost TID 1 (task 0:1)
    11/10/26 21:04:13 INFO spark.SimpleJob: Loss was due to java.lang.Exception: Testing exception handling
    [...]
    11/10/26 21:04:16 INFO spark.SparkContext: Job finished in 5.988547328 s
2011-11-14 01:54:53 +00:00
Matei Zaharia 07532021fe Bug fix: reject offers that we didn't find any tasks for 2011-11-08 23:05:54 -08:00
Matei Zaharia 13f6900ee6 Merge branch 'master' of github.com:mesos/spark 2011-11-08 21:46:03 -08:00