ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Matei Zaharia	e269f6f7ea	Register RDDs with the MapOutputTracker even if they have no partitions. Fixes #105.	2012-01-05 15:59:20 -05:00
Matei Zaharia	5fd101d79e	Add dependency on Akka and Netty	2011-12-15 13:21:14 +01:00
Matei Zaharia	3034fc0d91	Merge commit 'ad4ebff42c1b738746b2b9ecfbb041b6d06e3e16'	2011-12-14 18:19:43 +01:00
Matei Zaharia	6a650cbbdf	Make Spark port default to 7077 so that it's not an ephemeral port that might be taken	2011-12-14 18:18:22 +01:00
Matei Zaharia	735843a049	Merge remote-tracking branch 'origin/charles-newhadoop'	2011-12-02 21:59:30 -08:00
Charles Reiss	66f05f383e	Add new Hadoop API reading support.	2011-12-01 14:02:10 -08:00
Charles Reiss	02d43e6986	Add new Hadoop API writing support.	2011-12-01 14:01:28 -08:00
Matei Zaharia	72c4839c5f	Fixed LocalFileLR to deal with a change in Scala IO sources (you can no longer iterate over a Source multiple times).	2011-12-01 13:52:12 -08:00
Edison Tung	42f8847a21	Revert de01b6deaaee1b43321e0aac330f4a98c0ea61c6^..HEAD	2011-12-01 13:43:25 -08:00
Edison Tung	de01b6deaa	Fixed bug in RDD: Math.min takes 2 args, not 1. This was not committed earlier for some reason.	2011-12-01 13:34:37 -08:00
Edison Tung	e1c814be4c	Renamed SparkLocalKMeans to SparkKMeans	2011-12-01 13:34:03 -08:00
Matei Zaharia	22b8fcf632	Added fold() and aggregate() operations that reuse an object to merge results into rather than requiring a new object allocation for each element merged. Fixes #95.	2011-11-30 11:37:47 -08:00
Matei Zaharia	09dd58b3a7	Send SPARK_JAVA_OPTS to slave nodes.	2011-11-30 11:34:58 -08:00
Edison Tung	a3bc012af8	Added takeSamples method. The takeSamples method takes a specified number of samples from the RDD and outputs them in an array.	2011-11-21 16:38:44 -08:00
Edison Tung	3b9d9de583	Added KMeans examples. LocalKMeans runs locally with a randomly generated dataset. SparkLocalKMeans takes an input file and runs KMeans on it.	2011-11-21 16:37:58 -08:00
Ankur Dave	ad4ebff42c	Deduplicate exceptions when printing them. The first time they appear, exceptions are printed in full, including a stack trace. After that, they are printed in abbreviated form. They are periodically reprinted in full; the reprint interval defaults to 5 seconds and is configurable using the property spark.logging.exceptionPrintInterval.	2011-11-14 01:54:53 +00:00
Ankur Dave	35b6358a7c	Report errors in tasks to the driver via a Mesos status update. When a task throws an exception, the Spark executor previously just logged it to a local file on the slave and exited. This commit causes Spark to also report the exception back to the driver using a Mesos status update, so the user doesn't have to look through a log file on the slave. Here's what the reporting currently looks like: # ./run spark.examples.ExceptionHandlingTest master@203.0.113.1:5050 [...] 11/10/26 21:04:13 INFO spark.SimpleJob: Lost TID 1 (task 0:1) 11/10/26 21:04:13 INFO spark.SimpleJob: Loss was due to java.lang.Exception: Testing exception handling [...] 11/10/26 21:04:16 INFO spark.SparkContext: Job finished in 5.988547328 s	2011-11-14 01:54:53 +00:00
Matei Zaharia	07532021fe	Bug fix: reject offers that we didn't find any tasks for	2011-11-08 23:05:54 -08:00
Matei Zaharia	13f6900ee6	Merge branch 'master' of github.com:mesos/spark	2011-11-08 21:46:03 -08:00
Matei Zaharia	c7d6f1a65c	Really upgrade to SBT 0.11.1 (through build.properties and plugin changes)	2011-11-08 21:45:29 -08:00
Ankur Dave	c5be7d2b22	Update Bagel unit tests to reflect API change	2011-11-08 19:56:44 +00:00
Matei Zaharia	9e4c79a4d3	Closure cleaner unit test	2011-11-08 00:40:15 -08:00
Matei Zaharia	f346e64637	Updates to the closure cleaner to work better with closures in classes. Before, the cleaner attempted to clone $outer objects that were classes (as opposed to nested closures) and preserve only their used fields, which was bad because it would miss fields that are accessed indirectly by methods, and in general it would confuse user code. Now we keep a reference to those objects without cloning them. This is not perfect because the user still needs to be careful of what they'll carry along into closures, but it works better in some cases that seemed confusing before. We need to improve the documentation on what variables get passed along with a closure and possibly add some debugging tools for it as well. Fixes #71 -- that code now works in the REPL.	2011-11-08 00:33:28 -08:00
Matei Zaharia	7fd05cbb8f	Update to SBT 0.11.1	2011-11-07 20:28:08 -08:00
Matei Zaharia	63da22c025	Update REPL code to use our own version of JLineReader, which fixes #89. I'm not entirely sure why this broke in the jump from Scala 2.9.0.1 to 2.9.1 -- maybe something about name resolution changed?	2011-11-07 20:16:25 -08:00
Matei Zaharia	3fad5e580f	Fix Scala version requirement in README	2011-11-03 22:42:36 -07:00
Matei Zaharia	c2b7fd6899	Make parallelize() work efficiently for ranges of Long, Double, etc (splitting them into sub-ranges). Fixes #87.	2011-11-02 15:16:02 -07:00
Matei Zaharia	d4c8e69dc7	K-means example	2011-11-01 19:25:58 -07:00
Matei Zaharia	157279e9eb	Update Spark to work with the latest Mesos API	2011-10-30 14:10:56 -07:00
root	3a0e6c4363	Miscellaneous fixes: Executor should initialize logging properly; groupByKey should allow custom partitioner.	2011-10-17 18:07:35 +00:00
root	49505a0b0b	Switched Jetty to version 7.5 because 8.0 was causing a conflict with the log4j and Jetty libraries in Hadoop.	2011-10-17 18:06:41 +00:00
root	62aa820084	Merge branch 'ankur-master'	2011-10-14 02:14:07 +00:00
Ankur Dave	ab3889f627	Implement standalone WikipediaPageRank with custom serializer	2011-10-09 16:53:10 -07:00
Ankur Dave	cbdc01eecd	Update WikipediaPageRank to reflect Bagel API changes	2011-10-09 16:19:34 -07:00
Ankur Dave	6d707f6b63	Remove ShortestPath for now	2011-10-09 16:19:34 -07:00
Ankur Dave	0028caf3a4	Simplify and genericize type parameters in Bagel	2011-10-09 15:58:39 -07:00
Ankur Dave	2d7057bf5d	Implement PairRDDFunctions.partitionBy	2011-10-09 15:52:09 -07:00
Ankur Dave	06637cb69e	Fix PairRDDFunctions.groupWith partitioning. This commit fixes a bug in groupWith that was causing it to destroy partitioning information. It replaces a call to map with a call to mapValues, which preserves partitioning.	2011-10-09 15:48:46 -07:00
Ankur Dave	2911a783d6	Add custom partitioner support to PairRDDFunctions.combineByKey	2011-10-09 15:47:20 -07:00
Ankur Dave	6c6e47e3cd	Use BufferedOutputStream in ShuffleMapTask	2011-10-09 15:43:31 -07:00
Matei Zaharia	6483b41377	Merge pull request #83 from ijuma/sbt-0.11: Upgrade to SBT 0.11.0	2011-09-30 21:52:18 -07:00
Ismael Juma	d76c0fc781	Upgrade to sbt-idea 0.11.0 final.	2011-09-27 23:13:38 +01:00
Ismael Juma	7e92ef9d19	Add workaround for bug in SBT (issue #206).	2011-09-27 00:04:59 +01:00
Ismael Juma	4019305afe	Set SCALA_VERSION to 2.9.1 (from 2.9.1.final) to match expectation of SBT 0.11.0	2011-09-26 22:44:41 +01:00
Ismael Juma	3562db6374	Include "spark-" prefix in project name (used when artifact is published).	2011-09-26 22:41:07 +01:00
Ismael Juma	28b5d5a2af	Upgrade compress-lzf to 0.8.4.	2011-09-26 22:32:05 +01:00
Ismael Juma	315e55fde3	Upgrade Jetty to 8.0.1.	2011-09-26 22:32:05 +01:00
Ismael Juma	ee980439e2	Use scalatest and scalacheck compiled against Scala 2.9.1.	2011-09-26 22:32:05 +01:00
Ismael Juma	bd774eb274	Use new layout for plugins definitions (recommended for SBT 0.11)	2011-09-26 22:32:05 +01:00
Ismael Juma	e39edcce60	Upgrade to SBT 0.11.0.	2011-09-26 22:24:29 +01:00

833 commits
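
Note on commit ad4ebff42c: the deduplication behaviour described in that message (full print on first sight, abbreviated print on repeats, full reprint after an interval that defaults to 5 seconds) can be sketched as below. This is only an illustration in Python, not the Scala code from the commit. The class name, method name, exception key, and abbreviated-message wording are all hypothetical. Only the 5-second default and the property name spark.logging.exceptionPrintInterval come from the commit message.

```python
import time


class ExceptionDeduplicator:
    """Hypothetical sketch: print an exception in full the first time,
    abbreviated on repeats, and in full again once the reprint interval
    has elapsed since the last full print."""

    def __init__(self, reprint_interval=5.0, clock=time.monotonic):
        # 5.0 mirrors the default the commit message gives for
        # spark.logging.exceptionPrintInterval; the rest is assumed.
        self.reprint_interval = reprint_interval
        self.clock = clock
        self._last_full = {}  # exception key -> time of last full print

    def format(self, exc_key, summary, stack_trace):
        now = self.clock()
        last = self._last_full.get(exc_key)
        if last is None or now - last >= self.reprint_interval:
            # First occurrence, or the interval has passed: print in full.
            self._last_full[exc_key] = now
            return f"{summary}\n{stack_trace}"
        # Seen recently: print the abbreviated form only.
        return f"{summary} (duplicate; stack trace suppressed)"
```

The clock is injectable so the interval logic can be exercised without sleeping. How the real implementation keys exceptions (by class, message, or stack trace) is not stated in the commit message, so the sketch leaves that choice to the caller.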