ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Matei Zaharia	c7af538ac1	Some fixes to sorting for when the RDD has fewer elements than the number of partitions we ask to partition it into. Also, removed a test that was taking way too long to run.	2012-03-17 13:08:36 -07:00
Matei Zaharia	a099a63a8a	Initial work to make Spark compile with Mesos 0.9 and Hadoop 1.0	2012-03-17 12:31:34 -07:00
Matei Zaharia	a5e2b6a6bd	Merge pull request #112 from cengle/master: Changed HadoopRDD to get key and value containers from the RecordReader instead of through reflection	2012-03-06 13:38:32 -08:00
Matei Zaharia	97eee50825	Fixes a nasty bug that could happen when tasks fail, because calling wait() with a timeout of 0 on a Java object means "wait forever".	2012-03-01 13:43:17 -08:00
Cliff Engle	dd68cb6099	Get key and value container from RecordReader	2012-02-29 16:33:23 -08:00
Matei Zaharia	1e10df0a46	Merge pull request #111 from alupher/master: Adding sorting to RDDs	2012-02-24 15:50:14 -08:00
Antonio	0d93d95bcf	Removed unnecessary import	2012-02-21 19:57:12 -08:00
Antonio	2990298f71	Added sorting testing suite	2012-02-21 19:54:21 -08:00
Matei Zaharia	aa04f87cd2	Added support for parallel execution of jobs in DAGScheduler.	2012-02-19 22:50:23 -08:00
Antonio	620798161b	Added fixes to sorting	2012-02-13 00:07:39 -08:00
Matei Zaharia	2587ce1690	Fixed a deadlock that occurred with MesosScheduler due to an earlier synchronization change	2012-02-11 21:22:45 -08:00
Antonio	e93f622665	Added sorting by key for pair RDDs	2012-02-11 00:56:28 -08:00
Matei Zaharia	98f008b721	Formatting fixes	2012-02-10 10:52:03 -08:00
Matei Zaharia	7660a8b12f	Merge branch 'formatting' Conflicts: core/src/main/scala/spark/DAGScheduler.scala core/src/main/scala/spark/SimpleShuffleFetcher.scala core/src/main/scala/spark/SparkContext.scala	2012-02-10 10:42:14 -08:00
haoyuan	194c42ab79	Code format.	2012-02-10 08:19:53 -08:00
Matei Zaharia	8f5ed51234	Delete Spark's temporary directories when the JVM exits.	2012-02-09 22:58:24 -08:00
Matei Zaharia	c0a0df3285	Made the default cache BoundedMemoryCache, and reduced its default size	2012-02-09 22:32:02 -08:00
Matei Zaharia	a766780f4c	Added some tests for multithreaded access to Spark.	2012-02-09 22:27:53 -08:00
Matei Zaharia	0e93891d3d	Replaced LocalFileShuffle with a non-singleton ShuffleManager class and made DAGScheduler automatically set SparkEnv.	2012-02-09 22:14:56 -08:00
haoyuan	445e0bb1b5	Format the code a bit more.	2012-02-09 15:50:26 -08:00
haoyuan	651932e703	Format the code as coding style agreed by Matei/TD/Haoyuan	2012-02-09 13:26:23 -08:00
Matei Zaharia	e02dc83a5b	IO optimizations	2012-02-06 20:40:39 -08:00
Matei Zaharia	c40e766368	Use java.util.HashMap in shuffles	2012-02-06 19:20:25 -08:00
Matei Zaharia	d6ec664b48	Add dependency on fastutil and update Guava	2012-02-06 15:37:27 -08:00
Matei Zaharia	b267175ab5	Synchronization fix in case SparkContext is used from multiple threads.	2012-02-06 14:28:18 -08:00
haoyuan	b72d93a0da	Test commit	2012-02-06 09:58:06 -08:00
Matei Zaharia	43a3335090	Simplifying test	2012-02-05 22:46:51 -08:00
Matei Zaharia	7449ecfb7e	Merge branch 'master' of github.com:mesos/spark	2012-01-31 00:33:24 -08:00
Matei Zaharia	100e800782	Some fixes to the examples (mostly to use functional API)	2012-01-31 00:33:18 -08:00
Matei Zaharia	72d2489b6d	Merge pull request #108 from patelh/master: Added immutable map registration in kryo serializer	2012-01-30 16:31:12 -08:00
Hiral Patel	b47952342e	Add register immutable map to kryo serializer	2012-01-26 15:24:20 -08:00
Matei Zaharia	fabcc82528	Merge pull request #103 from edisontung/master: Made improvements to takeSample. Also changed SparkLocalKMeans to SparkKMeans	2012-01-13 19:20:03 -08:00
Matei Zaharia	fd5581a0d3	Fixed a failure recovery bug and added some tests for fault recovery.	2012-01-13 19:17:27 -08:00
Matei Zaharia	eb05154b7a	Fixed a failure recovery bug and added some tests for fault recovery.	2012-01-13 19:08:25 -08:00
Edison Tung	1ecc221f84	Fixed bugs. I've fixed the bugs detailed in the diff. One of the bugs was already fixed on the local file (forgot to commit).	2012-01-09 11:59:52 -08:00
Matei Zaharia	e269f6f7ea	Register RDDs with the MapOutputTracker even if they have no partitions. Fixes #105.	2012-01-05 15:59:20 -05:00
Matei Zaharia	5fd101d79e	Add dependency on Akka and Netty	2011-12-15 13:21:14 +01:00
Matei Zaharia	3034fc0d91	Merge commit 'ad4ebff42c1b738746b2b9ecfbb041b6d06e3e16'	2011-12-14 18:19:43 +01:00
Matei Zaharia	6a650cbbdf	Make Spark port default to 7077 so that it's not an ephemeral port that might be taken	2011-12-14 18:18:22 +01:00
Matei Zaharia	735843a049	Merge remote-tracking branch 'origin/charles-newhadoop'	2011-12-02 21:59:30 -08:00
Charles Reiss	66f05f383e	Add new Hadoop API reading support.	2011-12-01 14:02:10 -08:00
Charles Reiss	02d43e6986	Add new Hadoop API writing support.	2011-12-01 14:01:28 -08:00
Matei Zaharia	72c4839c5f	Fixed LocalFileLR to deal with a change in Scala IO sources (you can no longer iterate over a Source multiple times).	2011-12-01 13:52:12 -08:00
Edison Tung	42f8847a21	Revert de01b6deaaee1b43321e0aac330f4a98c0ea61c6^..HEAD	2011-12-01 13:43:25 -08:00
Edison Tung	de01b6deaa	Fixed bug in RDD. Math.min takes 2 args, not 1. This was not committed earlier for some reason.	2011-12-01 13:34:37 -08:00
Edison Tung	e1c814be4c	Renamed SparkLocalKMeans to SparkKMeans	2011-12-01 13:34:03 -08:00
Matei Zaharia	22b8fcf632	Added fold() and aggregate() operations that reuse an object to merge results into rather than requiring a new object allocation for each element merged. Fixes #95.	2011-11-30 11:37:47 -08:00
Matei Zaharia	09dd58b3a7	Send SPARK_JAVA_OPTS to slave nodes.	2011-11-30 11:34:58 -08:00
Edison Tung	a3bc012af8	Added takeSamples method. The takeSamples method takes a specified number of samples from the RDD and outputs them in an array.	2011-11-21 16:38:44 -08:00
Edison Tung	3b9d9de583	Added KMeans examples. LocalKMeans runs locally with a randomly generated dataset. SparkLocalKMeans takes an input file and runs KMeans on it.	2011-11-21 16:37:58 -08:00

868 commits