Matei Zaharia
b267175ab5
Synchronization fix in case SparkContext is used from multiple threads.
2012-02-06 14:28:18 -08:00
haoyuan
b72d93a0da
Test commit
2012-02-06 09:58:06 -08:00
Matei Zaharia
43a3335090
Simplifying test
2012-02-05 22:46:51 -08:00
Matei Zaharia
7449ecfb7e
Merge branch 'master' of github.com:mesos/spark
2012-01-31 00:33:24 -08:00
Matei Zaharia
100e800782
Some fixes to the examples (mostly to use functional API)
2012-01-31 00:33:18 -08:00
Matei Zaharia
72d2489b6d
Merge pull request #108 from patelh/master
...
Added immutable map registration in kryo serializer
2012-01-30 16:31:12 -08:00
Hiral Patel
b47952342e
Add register immutable map to kryo serializer
2012-01-26 15:24:20 -08:00
Matei Zaharia
fabcc82528
Merge pull request #103 from edisontung/master
...
Made improvements to takeSample. Also changed SparkLocalKMeans to SparkKMeans
2012-01-13 19:20:03 -08:00
Matei Zaharia
fd5581a0d3
Fixed a failure recovery bug and added some tests for fault recovery.
2012-01-13 19:17:27 -08:00
Matei Zaharia
eb05154b7a
Fixed a failure recovery bug and added some tests for fault recovery.
2012-01-13 19:08:25 -08:00
Edison Tung
1ecc221f84
Fixed bugs
...
I've fixed the bugs detailed in the diff. One of them was already
fixed in my local copy (I had forgotten to commit it).
2012-01-09 11:59:52 -08:00
Matei Zaharia
e269f6f7ea
Register RDDs with the MapOutputTracker even if they have no partitions.
...
Fixes #105.
2012-01-05 15:59:20 -05:00
Matei Zaharia
5fd101d79e
Add dependency on Akka and Netty
2011-12-15 13:21:14 +01:00
Matei Zaharia
3034fc0d91
Merge commit 'ad4ebff42c1b738746b2b9ecfbb041b6d06e3e16'
2011-12-14 18:19:43 +01:00
Matei Zaharia
6a650cbbdf
Make Spark port default to 7077 so that it's not an ephemeral port that might be taken
2011-12-14 18:18:22 +01:00
Matei Zaharia
735843a049
Merge remote-tracking branch 'origin/charles-newhadoop'
2011-12-02 21:59:30 -08:00
Charles Reiss
66f05f383e
Add new Hadoop API reading support.
2011-12-01 14:02:10 -08:00
Charles Reiss
02d43e6986
Add new Hadoop API writing support.
2011-12-01 14:01:28 -08:00
Matei Zaharia
72c4839c5f
Fixed LocalFileLR to deal with a change in Scala IO sources
...
(you can no longer iterate over a Source multiple times).
2011-12-01 13:52:12 -08:00
Edison Tung
42f8847a21
Revert de01b6deaaee1b43321e0aac330f4a98c0ea61c6^..HEAD
2011-12-01 13:43:25 -08:00
Edison Tung
de01b6deaa
Fixed bug in RDD
...
Math.min takes 2 args, not 1. This was not committed earlier for some
reason.
2011-12-01 13:34:37 -08:00
Edison Tung
e1c814be4c
Renamed SparkLocalKMeans to SparkKMeans
2011-12-01 13:34:03 -08:00
Matei Zaharia
22b8fcf632
Added fold() and aggregate() operations that reuse an object to
...
merge results into rather than requiring a new object allocation
for each element merged. Fixes #95.
2011-11-30 11:37:47 -08:00
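The reuse-an-object idea behind this commit can be sketched outside Spark with a mutable accumulator; `SumCount` and `meanOf` below are illustrative names, not Spark's API:

```scala
// A mutable accumulator: the fold merges each element into the same object
// instead of allocating a fresh (sum, count) pair per element merged.
final class SumCount(var sum: Double = 0.0, var count: Long = 0L) {
  def add(x: Double): SumCount = { sum += x; count += 1; this }
  def merge(other: SumCount): SumCount = {
    sum += other.sum; count += other.count; this
  }
}

// Compute a mean by folding every element into one accumulator
// (a single "partition" here; combOp would merge per-partition results).
def meanOf(xs: Seq[Double]): Double = {
  val acc = xs.foldLeft(new SumCount())((a, x) => a.add(x))
  acc.sum / acc.count
}
```

In Spark's version the same accumulator-reuse applies per partition, with a combine step merging the per-partition objects.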
Matei Zaharia
09dd58b3a7
Send SPARK_JAVA_OPTS to slave nodes.
2011-11-30 11:34:58 -08:00
Edison Tung
a3bc012af8
added takeSamples method
...
The takeSample method takes a specified number of samples from the RDD
and returns them in an array.
2011-11-21 16:38:44 -08:00
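As a single-machine sketch of what a takeSample-style method must do (draw k elements uniformly without scanning twice), here is reservoir sampling; `takeSampleLocal` is a made-up name for illustration, not the RDD method:

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

// Reservoir sampling: keep the first k elements, then replace reservoir
// slots with decreasing probability so every input element is equally
// likely to end up in the final sample.
def takeSampleLocal[T](xs: Seq[T], k: Int, rng: Random): Seq[T] = {
  val reservoir = ArrayBuffer.from(xs.take(k))
  for ((x, i) <- xs.zipWithIndex.drop(k)) {
    val j = rng.nextInt(i + 1) // uniform in [0, i]
    if (j < k) reservoir(j) = x
  }
  reservoir.toSeq
}
```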
Edison Tung
3b9d9de583
Added KMeans examples
...
LocalKMeans runs locally with a randomly generated dataset.
SparkLocalKMeans takes an input file and runs KMeans on it.
2011-11-21 16:37:58 -08:00
Ankur Dave
ad4ebff42c
Deduplicate exceptions when printing them
...
The first time they appear, exceptions are printed in full, including
a stack trace. After that, they are printed in abbreviated form. They
are periodically reprinted in full; the reprint interval defaults to 5
seconds and is configurable using the property
spark.logging.exceptionPrintInterval.
2011-11-14 01:54:53 +00:00
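The dedup-with-periodic-reprint policy described above can be sketched as follows; the class and method names are illustrative, not the actual Spark logging code (only the property name `spark.logging.exceptionPrintInterval` comes from the commit message):

```scala
import scala.collection.mutable

// Interval-based exception deduplication: a full stack trace is printed
// the first time a given exception is seen, and again at most once per
// reprint interval; otherwise it would be printed in abbreviated form.
class ExceptionDedup(reprintIntervalMs: Long = 5000L) {
  private val lastFullPrint = mutable.Map.empty[String, Long]

  // True if this exception should be printed in full at time `now` (ms).
  def shouldPrintFull(e: Throwable, now: Long): Boolean = {
    val key = e.getClass.getName + ": " + e.getMessage
    lastFullPrint.get(key) match {
      case Some(last) if now - last < reprintIntervalMs => false
      case _ => lastFullPrint(key) = now; true
    }
  }
}
```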
Ankur Dave
35b6358a7c
Report errors in tasks to the driver via a Mesos status update
...
When a task throws an exception, the Spark executor previously just
logged it to a local file on the slave and exited. This commit causes
Spark to also report the exception back to the driver using a Mesos
status update, so the user doesn't have to look through a log file on
the slave.
Here's what the reporting currently looks like:
# ./run spark.examples.ExceptionHandlingTest master@203.0.113.1:5050
[...]
11/10/26 21:04:13 INFO spark.SimpleJob: Lost TID 1 (task 0:1)
11/10/26 21:04:13 INFO spark.SimpleJob: Loss was due to java.lang.Exception: Testing exception handling
[...]
11/10/26 21:04:16 INFO spark.SparkContext: Job finished in 5.988547328 s
2011-11-14 01:54:53 +00:00
Matei Zaharia
07532021fe
Bug fix: reject offers that we didn't find any tasks for
2011-11-08 23:05:54 -08:00
Matei Zaharia
13f6900ee6
Merge branch 'master' of github.com:mesos/spark
2011-11-08 21:46:03 -08:00
Matei Zaharia
c7d6f1a65c
Really upgrade to SBT 0.11.1 (through build.properties and plugin changes)
2011-11-08 21:45:29 -08:00
Ankur Dave
c5be7d2b22
Update Bagel unit tests to reflect API change
2011-11-08 19:56:44 +00:00
Matei Zaharia
9e4c79a4d3
Closure cleaner unit test
2011-11-08 00:40:15 -08:00
Matei Zaharia
f346e64637
Updates to the closure cleaner to work better with closures in classes.
...
Before, the cleaner attempted to clone $outer objects that were classes
(as opposed to nested closures) and preserve only their used fields,
which was bad because it would miss fields that are accessed indirectly
by methods, and in general it would confuse user code. Now we keep a
reference to those objects without cloning them. This is not perfect
because the user still needs to be careful of what they'll carry along
into closures, but it works better in some cases that seemed confusing
before. We need to improve the documentation on what variables get
passed along with a closure and possibly add some debugging tools for it
as well.
Fixes #71 -- that code now works in the REPL.
2011-11-08 00:33:28 -08:00
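The pitfall this cleaner work addresses can be illustrated in plain Scala (the class and fields below are hypothetical, not Spark code): a closure that reads an enclosing class's field captures the whole enclosing instance, dragging every other field along with it.

```scala
class Multiplier(val factor: Int, val unrelatedBigField: Array[Byte]) {
  // This closure reads `factor` through `this`, so the entire Multiplier
  // instance (including unrelatedBigField) is carried into the closure.
  def scaleAll(xs: Seq[Int]): Seq[Int] = xs.map(x => x * factor)

  // Copying the field to a local first means only the Int is captured.
  def scaleAllLocal(xs: Seq[Int]): Seq[Int] = {
    val f = factor
    xs.map(x => x * f)
  }
}
```

Both methods compute the same result; they differ only in what the closure transitively references, which is what matters when closures are serialized and shipped to workers.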
Matei Zaharia
7fd05cbb8f
Update to SBT 0.11.1
2011-11-07 20:28:08 -08:00
Matei Zaharia
63da22c025
Update REPL code to use our own version of JLineReader, which fixes #89.
...
I'm not entirely sure why this broke in the jump from Scala 2.9.0.1 to
2.9.1 -- maybe something about name resolution changed?
2011-11-07 20:16:25 -08:00
Matei Zaharia
3fad5e580f
Fix Scala version requirement in README
2011-11-03 22:42:36 -07:00
Matei Zaharia
c2b7fd6899
Make parallelize() work efficiently for ranges of Long, Double, etc
...
(splitting them into sub-ranges). Fixes #87.
2011-11-02 15:16:02 -07:00
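The sub-range splitting described here can be sketched as below; `splitRange` is a hypothetical name, not Spark's actual code:

```scala
// Split [start, end) into numSlices contiguous half-open sub-ranges
// without materializing the elements, so parallelizing a huge Range
// stays O(numSlices) in memory rather than O(n).
def splitRange(start: Long, end: Long, numSlices: Int): Seq[(Long, Long)] = {
  val n = end - start
  (0 until numSlices).map { i =>
    val lo = start + i * n / numSlices
    val hi = start + (i + 1) * n / numSlices
    (lo, hi) // half-open [lo, hi)
  }
}
```

Using integer arithmetic this way makes the slice boundaries meet exactly, so the sub-ranges cover the original range with no gaps or overlaps.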
Matei Zaharia
d4c8e69dc7
K-means example
2011-11-01 19:25:58 -07:00
Matei Zaharia
157279e9eb
Update Spark to work with the latest Mesos API
2011-10-30 14:10:56 -07:00
root
3a0e6c4363
Miscellaneous fixes:
...
- Executor should initialize logging properly
- groupByKey should allow custom partitioner
2011-10-17 18:07:35 +00:00
root
49505a0b0b
Switched Jetty to version 7.5 because 8.0 was causing a conflict with the log4j and Jetty libraries in Hadoop.
2011-10-17 18:06:41 +00:00
root
62aa820084
Merge branch 'ankur-master'
2011-10-14 02:14:07 +00:00
Ankur Dave
ab3889f627
Implement standalone WikipediaPageRank with custom serializer
2011-10-09 16:53:10 -07:00
Ankur Dave
cbdc01eecd
Update WikipediaPageRank to reflect Bagel API changes
2011-10-09 16:19:34 -07:00
Ankur Dave
6d707f6b63
Remove ShortestPath for now
2011-10-09 16:19:34 -07:00
Ankur Dave
0028caf3a4
Simplify and genericize type parameters in Bagel
2011-10-09 15:58:39 -07:00
Ankur Dave
2d7057bf5d
Implement PairRDDFunctions.partitionBy
2011-10-09 15:52:09 -07:00
Ankur Dave
06637cb69e
Fix PairRDDFunctions.groupWith partitioning
...
This commit fixes a bug in groupWith that was causing it to destroy
partitioning information. It replaces a call to map with a call to
mapValues, which preserves partitioning.
2011-10-09 15:48:46 -07:00
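The map-vs-mapValues distinction behind this fix shows up even on plain Scala maps (a loose analogy, since Spark's version is about the RDD partitioner rather than in-memory maps): map may rewrite keys, losing any key-based structure, while mapValues leaves keys untouched.

```scala
// A grouped result, keyed the way groupWith would key it.
val grouped = Map("a" -> Seq(1, 2), "b" -> Seq(3))

// mapValues-style transform: keys (and hence any key-based placement,
// like a partitioner in Spark's case) survive unchanged.
val sums: Map[String, Int] = grouped.view.mapValues(_.sum).toMap
```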
Ankur Dave
2911a783d6
Add custom partitioner support to PairRDDFunctions.combineByKey
2011-10-09 15:47:20 -07:00