ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Ismael Juma	0fba22b3d2	Fix issue #65: Change @serializable to extends Serializable in 2.9 branch. Note that we use scala.Serializable introduced in Scala 2.9 instead of java.io.Serializable. Also, case classes inherit from scala.Serializable by default.	2011-08-02 10:16:33 +01:00
Matei Zaharia	8ea67307b9	Merge branch 'master' into scala-2.9	2011-07-14 14:47:12 -04:00
Matei Zaharia	9ac461d85d	Remove RDD.toString because it looked confusing	2011-07-14 14:39:32 -04:00
Matei Zaharia	38f38dda5b	Merge branch 'master' into scala-2.9	2011-07-14 12:42:02 -04:00
Matei Zaharia	969644df8e	Cleaned up a few issues to do with default parallelism levels. Also renamed HadoopFileWriter to HadoopWriter (since it's not only for files) and fixed a bug for lookup().	2011-07-14 12:40:56 -04:00
Matei Zaharia	d0c7958364	Merge branch 'master' into scala-2.9 Conflicts: core/src/main/scala/spark/HadoopFileWriter.scala	2011-07-13 23:09:33 -04:00
Matei Zaharia	9c0069188b	Updated save code to allow non-file-based OutputFormats and added a test for file-related stuff	2011-07-13 23:04:06 -04:00
Matei Zaharia	080869c6ef	Merge branch 'master' into scala-2.9	2011-07-13 00:20:08 -04:00
Matei Zaharia	842e14d567	Added mapPartitions operation and a bunch of tests for RDD ops	2011-07-13 00:19:52 -04:00
Matei Zaharia	9b568d37f7	Merge branch 'master' into scala-2.9 Conflicts: core/src/main/scala/spark/RDD.scala	2011-07-11 22:25:53 -04:00
Matei Zaharia	25c3a7781c	Moved PairRDD and SequenceFileRDD functions to separate source files	2011-07-10 00:06:15 -04:00
Matei Zaharia	393607d5ef	Merge branch 'master' into scala-2.9	2011-06-27 18:08:25 -07:00
Matei Zaharia	2f652f1656	Fix a compile error	2011-06-27 18:07:16 -07:00
Tathagata Das	3f08e1129f	Merge branch 'master' into td-rdd-save Conflicts: core/src/main/scala/spark/SparkContext.scala	2011-06-27 13:43:44 -07:00
Tathagata Das	ad842ac823	Merge branch 'master' into td-rdd-save Conflicts: core/src/main/scala/spark/RDD.scala	2011-06-27 13:39:11 -07:00
Matei Zaharia	bae8a97968	Merge branch 'master' into scala-2.9 Conflicts: repl/src/main/scala/spark/repl/SparkInterpreterLoop.scala	2011-06-26 19:22:27 -07:00
Tathagata Das	38f2ba99cc	Further changes to HadoopFileWriter. Implemented ability to save RDDs as SequenceFiles and ObjectFiles. 1> HadoopFileWriter changed to take class types as constructor parameters (no more generic type) 2> Multiple types of RDD.saveAsHadoopFile() implemented to provide more saving options 3> RDD.saveAsSequenceFile() automatically converts basic types to Writable types before saving as SequenceFile 4> RDD.saveAsObjectFile() serializes objects and saves them to an ObjectFile 5> SparkContext.objectFile() opens the saved ObjectFiles	2011-06-24 19:51:21 -07:00
Olivier Grisel	2e3531d8bf	Implemented RDD.leftOuterJoin and RDD.rightOuterJoin	2011-06-24 11:00:51 +02:00
Matei Zaharia	214250016a	Added simple version of lookup	2011-06-20 11:59:16 -07:00
Matei Zaharia	23b42af70a	Merge branch 'master' into scala-2.9	2011-06-19 23:06:21 -07:00
Matei Zaharia	23b1c309fb	Added pipe() operation on RDDs for mapping through a shell command.	2011-06-19 23:05:19 -07:00
Tathagata Das	b5e6645505	Cleaner reimplementation of HadoopFileWriter. Introduced TaskContext. 1> HadoopFileWriter works correctly with task failures 2> It can also take a user-specified JobConf object for configuration settings 3> A Task can now get information like stage ID, split ID, and attempt ID using the TaskContext class 4> Minor changes in SparkContext, DAGScheduler and subclasses to allow specification of TaskContext as a parameter	2011-06-16 20:57:57 -07:00
Tathagata Das	869836a2fa	Implemented TaskContext to hold contextual information (jobID, taskID, attemptID) of a task	2011-06-10 19:47:28 -07:00
Tathagata Das	389e56156f	HadoopFileWriter changed to use Hadoop's OutputCommitter	2011-06-09 15:29:22 -07:00
Tathagata Das	24d845833c	First-cut implementation of RDD.SaveAsText	2011-06-05 04:14:43 -07:00
Ismael Juma	82f10bd794	Remove unnecessary toStream calls.	2011-06-01 16:12:42 +01:00
Ismael Juma	1f27d94c48	Use Array.iterator instead of Iterator.fromArray as the latter is deprecated.	2011-05-26 22:04:42 +01:00
Matei Zaharia	cec427e777	Fixed a bug with preferred locations having changed meaning in new RDDs	2011-05-22 17:12:29 -07:00
Matei Zaharia	82329b0b28	Updated scheduler to support running on just some partitions of final RDD	2011-05-19 12:47:09 -07:00
Matei Zaharia	fd1d255821	Stop objectifying various trackers, caches, etc.	2011-05-17 12:41:13 -07:00
Matei Zaharia	16c886a581	Optimization for count()	2011-05-13 10:41:34 -07:00
Matei Zaharia	94ba95bcb2	Added flatMapValues	2011-04-12 19:51:58 -07:00
Matei Zaharia	467f056e29	Remove commented code	2011-03-06 23:38:41 -08:00
Matei Zaharia	bce95b8458	Finished cogroup stuff	2011-03-06 23:38:16 -08:00
Matei Zaharia	04c2d6a60c	stuff	2011-03-06 19:27:03 -08:00
Matei Zaharia	9e59afd710	More work on new RDD design	2011-02-27 19:15:52 -08:00
Matei Zaharia	f38f86d59e	More stuff	2011-02-27 14:27:12 -08:00
Matei Zaharia	2e6023f2bf	stuff	2011-02-26 23:41:44 -08:00
Matei Zaharia	309367c477	Initial work towards new RDD design	2011-02-26 23:15:33 -08:00
Matei Zaharia	99f3f23efa	Changed default shuffle to LocalFileShuffle because it's way faster for small files	2011-02-08 17:03:03 -08:00
Matei Zaharia	e5c4cd8a5e	Made examples and core subprojects	2011-02-01 15:11:08 -08:00

41 commits