ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
tdas	ae63972a89	Merge pull request #64 from mesos/td-rdd-save Functionality to save RDDs to Hadoop files	2011-06-27 13:44:55 -07:00
Tathagata Das	3f08e1129f	Merge branch 'master' into td-rdd-save Conflicts: core/src/main/scala/spark/SparkContext.scala	2011-06-27 13:43:44 -07:00
Tathagata Das	ad842ac823	Merge branch 'master' into td-rdd-save Conflicts: core/src/main/scala/spark/RDD.scala	2011-06-27 13:39:11 -07:00
Matei Zaharia	bae8a97968	Merge branch 'master' into scala-2.9 Conflicts: repl/src/main/scala/spark/repl/SparkInterpreterLoop.scala	2011-06-26 19:22:27 -07:00
Matei Zaharia	b187675b68	Print version number 0.3 in REPL	2011-06-26 18:27:01 -07:00
Matei Zaharia	c4dd68ae21	Merge branch 'mos-bt' This merge keeps only the broadcast work in mos-bt because the structure of shuffle has changed with the new RDD design. We still need some kind of parallel shuffle but that will be added later. Conflicts: core/src/main/scala/spark/BitTorrentBroadcast.scala core/src/main/scala/spark/ChainedBroadcast.scala core/src/main/scala/spark/RDD.scala core/src/main/scala/spark/SparkContext.scala core/src/main/scala/spark/Utils.scala core/src/main/scala/spark/shuffle/BasicLocalFileShuffle.scala core/src/main/scala/spark/shuffle/DfsShuffle.scala	2011-06-26 18:22:12 -07:00
Tathagata Das	38f2ba99cc	Further changes to HadoopFileWriter. Implemented ability to save RDDs as SequenceFiles and ObjectFiles. 1> HadoopFileWriter changed to take class types as constructor parameters (no more generic type) 2> Multiple types of RDD.saveAsHadoopFile() implemented to provide more saving options 3> RDD.saveAsSequenceFile() automatically converts basic types to Writable types before saving as SequenceFile 4> RDD.saveAsObjectFile() serializes objects and saves them to an ObjectFile 5> SparkContext.objectFile() opens the saved ObjectFiles	2011-06-24 19:51:21 -07:00
Matei Zaharia	b626562d54	Merge pull request #63 from ogrisel/outer-join Implemented RDD.leftOuterJoin and RDD.rightOuterJoin	2011-06-24 12:22:15 -07:00
Olivier Grisel	2e3531d8bf	Implemented RDD.leftOuterJoin and RDD.rightOuterJoin	2011-06-24 11:00:51 +02:00
Matei Zaharia	095dd9c444	Merge pull request #62 from ogrisel/cogroup-test Add missing test for RDD.groupWith	2011-06-23 10:12:24 -07:00
Matei Zaharia	e8e35d5fb5	Merge pull request #61 from ogrisel/better-readme Better readme	2011-06-23 10:11:14 -07:00
Tathagata Das	3d2befe831	Improved HadoopFileWriter (saves key and value classes to jobconf)	2011-06-23 08:11:22 -07:00
Olivier Grisel	7ef48a4df0	typo	2011-06-23 02:28:17 +02:00
Olivier Grisel	5b9e0a126d	format	2011-06-23 02:27:14 +02:00
Olivier Grisel	236bcd0d9b	Markdown rendering for the toplevel README.md to improve readability on github	2011-06-23 02:24:04 +02:00
Olivier Grisel	005d1605a4	add missing test for RDD.groupWith	2011-06-23 02:10:52 +02:00
Matei Zaharia	214250016a	Added simple version of lookup	2011-06-20 11:59:16 -07:00
Matei Zaharia	23b42af70a	Merge branch 'master' into scala-2.9	2011-06-19 23:06:21 -07:00
Matei Zaharia	23b1c309fb	Added pipe() operation on RDDs for mapping through a shell command.	2011-06-19 23:05:19 -07:00
Tathagata Das	b5e6645505	Cleaner reimplementation of HadoopFileWriter. Introduced TaskContext. 1> HadoopFileWriter works correctly with task failures 2> It can also take a user-specified JobConf object for configuration settings 3> A Task can now get information like stage ID, split ID, and attempt ID using TaskContext class 4> Minor changes in SparkContext, DAGScheduler and subclasses to allow specification of TaskContext as a parameter	2011-06-16 20:57:57 -07:00
Tathagata Das	869836a2fa	Implemented TaskContext to hold contextual information (jobID, taskID, attemptID) of a task	2011-06-10 19:47:28 -07:00
Tathagata Das	389e56156f	HadoopFileWriter changed to use Hadoop's OutputCommitter	2011-06-09 15:29:22 -07:00
Matei Zaharia	c62bb4091b	Merge remote-tracking branch 'origin/master' into scala-2.9	2011-06-07 00:42:23 -07:00
Matei Zaharia	a413b8e59d	Merge pull request #59 from ijuma/master Move managedStyle to SparkProject	2011-06-07 00:41:50 -07:00
Tathagata Das	24d845833c	First-cut implementation of RDD.SaveAsText	2011-06-05 04:14:43 -07:00
Ismael Juma	1ad4dcd3de	Move managedStyle to SparkProject. I had added it to DepJar by mistake.	2011-06-02 14:06:54 +01:00
Matei Zaharia	3297706ab2	Merge remote-tracking branch 'origin/master' into scala-2.9	2011-06-01 11:46:31 -07:00
Matei Zaharia	9bb448a151	Catch Throwable instead of Exception in LocalScheduler and Executor. Fixes #57.	2011-06-01 11:45:47 -07:00
Matei Zaharia	850fe3274e	Make the runJob API public. Fixes #56.	2011-06-01 11:38:44 -07:00
Matei Zaharia	0e5dbf2abd	Merge pull request #58 from ijuma/scala-2.9 Remove unnecessary toStream calls	2011-06-01 11:32:07 -07:00
Ismael Juma	82f10bd794	Remove unnecessary toStream calls.	2011-06-01 16:12:42 +01:00
Matei Zaharia	b49d1be65b	Ensure logging is initialized before any Spark threads run in the REPL	2011-05-31 23:54:48 -07:00
Matei Zaharia	10fe324845	Merge remote-tracking branch 'origin/master' into scala-2.9	2011-05-31 23:48:11 -07:00
Matei Zaharia	5166d76843	Ensure logging is initialized before spawning any threads to fix issue #45	2011-05-31 23:47:32 -07:00
Matei Zaharia	96daa31a01	Pass quoted arguments properly to run	2011-05-31 23:34:16 -07:00
Matei Zaharia	7862995569	Give SBT a bit more memory so it can do an update / compile / test in one JVM	2011-05-31 23:33:47 -07:00
Matei Zaharia	90f924202b	Another fix ported forward for the REPL	2011-05-31 23:11:49 -07:00
Matei Zaharia	3854d23dd4	Pass quoted arguments properly to run	2011-05-31 22:17:48 -07:00
Matei Zaharia	8012e388a8	Give SBT a bit more memory so it can do an update / compile / test in one JVM	2011-05-31 22:17:33 -07:00
Ismael Juma	3def9fdb96	Upgrade to scalacheck 1.9.	2011-05-31 22:11:33 -07:00
Matei Zaharia	0afd35a8dd	Some docs in ClosureCleaner	2011-05-31 22:06:30 -07:00
Matei Zaharia	73975d7491	Further fixes to interpreter (adding in some code generation changes I missed before and setting SparkEnv properly on the threads that execute each line in the 2.9 interpreter).	2011-05-31 22:05:24 -07:00
Matei Zaharia	d52660c969	Ported code generation changes from 2.8 interpreter (to use a class for each line's object rather than a singleton object so that we can ship these classes to worker nodes). This is pretty hairy stuff, which would be nice to avoid in the future by integrating with the interpreter some other way.	2011-05-31 19:23:15 -07:00
Matei Zaharia	beb9c117f0	Merge branch 'master' into scala-2.9 Conflicts: project/build/SparkProject.scala	2011-05-31 19:23:07 -07:00
Matei Zaharia	bcce6e8d01	Various work to use the 2.9 interpreter	2011-05-31 17:31:51 -07:00
Matei Zaharia	8b0390d344	Instantiate NullWritable properly in HadoopFile	2011-05-30 23:54:14 -07:00
Matei Zaharia	4096c2287e	Various fixes	2011-05-29 18:46:01 -07:00
Matei Zaharia	ef706ae959	Merge branch 'master' into new-rdds-protobuf Conflicts: run	2011-05-29 16:20:23 -07:00
Matei Zaharia	c501cff924	Executor was looking for the wrong constructor for ExecutorClassLoader	2011-05-29 16:15:59 -07:00
Matei Zaharia	50ac1d2a40	Merge remote-tracking branch 'ijuma/issue51'	2011-05-29 15:41:12 -07:00

2211 commits