Matei Zaharia
b187675b68
Print version number 0.3 in REPL
2011-06-26 18:27:01 -07:00
Matei Zaharia
c4dd68ae21
Merge branch 'mos-bt'
This merge keeps only the broadcast work in mos-bt because the structure
of shuffle has changed with the new RDD design. We still need some kind
of parallel shuffle but that will be added later.
Conflicts:
core/src/main/scala/spark/BitTorrentBroadcast.scala
core/src/main/scala/spark/ChainedBroadcast.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/Utils.scala
core/src/main/scala/spark/shuffle/BasicLocalFileShuffle.scala
core/src/main/scala/spark/shuffle/DfsShuffle.scala
2011-06-26 18:22:12 -07:00
Tathagata Das
38f2ba99cc
Further changes to HadoopFileWriter. Implemented ability to save RDDs as SequenceFiles and ObjectFiles.
1> HadoopFileWriter changed to take class types as constructor parameters (no more generic type)
2> Multiple types of RDD.saveAsHadoopFile() implemented to provide more saving options
3> RDD.saveAsSequenceFile() automatically converts basic types to Writable types before saving as SequenceFile
4> RDD.saveAsObjectFile() serializes objects and saves them to an ObjectFile
5> SparkContext.objectFile() opens the saved ObjectFiles
2011-06-24 19:51:21 -07:00
Matei Zaharia
b626562d54
Merge pull request #63 from ogrisel/outer-join
Implemented RDD.leftOuterJoin and RDD.rightOuterJoin
2011-06-24 12:22:15 -07:00
Olivier Grisel
2e3531d8bf
Implemented RDD.leftOuterJoin and RDD.rightOuterJoin
2011-06-24 11:00:51 +02:00
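For readers unfamiliar with outer-join semantics, here is a minimal sketch on plain Scala collections rather than RDDs. The object and helper names are hypothetical, and the Option-based result shape is an assumption about what RDD.leftOuterJoin and RDD.rightOuterJoin return. It is not the code from this commit.

```scala
object OuterJoinSketch {
  // Hypothetical helper on plain Seqs; illustrates the semantics only.
  // Every left pair is kept; a missing right-hand match becomes None.
  def leftOuterJoin[K, V, W](left: Seq[(K, V)],
                             right: Seq[(K, W)]): Seq[(K, (V, Option[W]))] = {
    // Index the right side by key so each left pair can look up its matches.
    val rightByKey: Map[K, Seq[W]] =
      right.groupBy(_._1).map { case (k, kws) => k -> kws.map(_._2) }
    left.flatMap { case (k, v) =>
      rightByKey.get(k) match {
        case Some(ws) => ws.map(w => (k, (v, Some(w): Option[W])))
        case None     => Seq((k, (v, None: Option[W])))
      }
    }
  }

  // The right outer join is the mirror image: every right pair is kept.
  def rightOuterJoin[K, V, W](left: Seq[(K, V)],
                              right: Seq[(K, W)]): Seq[(K, (Option[V], W))] =
    leftOuterJoin(right, left).map { case (k, (w, vOpt)) => (k, (vOpt, w)) }
}
```

With left = Seq(1 -> "a", 2 -> "b") and right = Seq(1 -> "x"), the left outer join keeps key 2 paired with None, while the right outer join yields only key 1.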
Matei Zaharia
095dd9c444
Merge pull request #62 from ogrisel/cogroup-test
Add missing test for RDD.groupWith
2011-06-23 10:12:24 -07:00
Matei Zaharia
e8e35d5fb5
Merge pull request #61 from ogrisel/better-readme
Better readme
2011-06-23 10:11:14 -07:00
Tathagata Das
3d2befe831
Improved HadoopFileWriter (saves key and value classes to jobconf)
2011-06-23 08:11:22 -07:00
Olivier Grisel
7ef48a4df0
typo
2011-06-23 02:28:17 +02:00
Olivier Grisel
5b9e0a126d
format
2011-06-23 02:27:14 +02:00
Olivier Grisel
236bcd0d9b
Markdown rendering for the toplevel README.md to improve readability on github
2011-06-23 02:24:04 +02:00
Olivier Grisel
005d1605a4
add missing test for RDD.groupWith
2011-06-23 02:10:52 +02:00
Matei Zaharia
214250016a
Added simple version of lookup
2011-06-20 11:59:16 -07:00
Matei Zaharia
23b42af70a
Merge branch 'master' into scala-2.9
2011-06-19 23:06:21 -07:00
Matei Zaharia
23b1c309fb
Added pipe() operation on RDDs for mapping through a shell command.
2011-06-19 23:05:19 -07:00
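To illustrate what "mapping through a shell command" means, here is a rough sketch that feeds a sequence of lines to an external process on stdin and collects its stdout lines. It uses scala.sys.process on a plain Seq; the object and method names are hypothetical, it assumes a Unix-like system where the command exists, and it is not the implementation of RDD.pipe().

```scala
import java.io.ByteArrayInputStream
import scala.sys.process._

object PipeSketch {
  // Hypothetical helper: write each element as one line to the command's
  // stdin, then return the command's stdout split back into lines.
  def pipeThrough(lines: Seq[String], command: Seq[String]): Seq[String] = {
    val input = new ByteArrayInputStream(
      lines.map(_ + "\n").mkString.getBytes("UTF-8"))
    // `#<` connects the stream to stdin; `!!` runs the process and returns stdout.
    val output = (Process(command) #< input).!!
    if (output.isEmpty) Seq.empty else output.split("\n").toSeq
  }
}
```

For example, PipeSketch.pipeThrough(Seq("a", "b"), Seq("tr", "a-z", "A-Z")) upper-cases each line where tr is available.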
Tathagata Das
b5e6645505
Cleaner reimplementation of HadoopFileWriter. Introduced TaskContext.
1> HadoopFileWriter works correctly with task failures
2> It can also take a user-specified JobConf object for configuration settings
3> A Task can now get information like stage ID, split ID, and attempt ID using TaskContext class
4> Minor changes in SparkContext, DAGScheduler and subclasses to allow specification of TaskContext as a parameter
2011-06-16 20:57:57 -07:00
Tathagata Das
869836a2fa
Implemented TaskContext to hold contextual information (jobID, taskID, attemptID) of a task
2011-06-10 19:47:28 -07:00
Tathagata Das
389e56156f
HadoopFileWriter changed to use Hadoop's OutputCommitter
2011-06-09 15:29:22 -07:00
Matei Zaharia
c62bb4091b
Merge remote-tracking branch 'origin/master' into scala-2.9
2011-06-07 00:42:23 -07:00
Matei Zaharia
a413b8e59d
Merge pull request #59 from ijuma/master
Move managedStyle to SparkProject
2011-06-07 00:41:50 -07:00
Tathagata Das
24d845833c
First-cut implementation of RDD.SaveAsText
2011-06-05 04:14:43 -07:00
Ismael Juma
1ad4dcd3de
Move managedStyle to SparkProject.
I had added it to DepJar by mistake.
2011-06-02 14:06:54 +01:00
Matei Zaharia
3297706ab2
Merge remote-tracking branch 'origin/master' into scala-2.9
2011-06-01 11:46:31 -07:00
Matei Zaharia
9bb448a151
Catch Throwable instead of Exception in LocalScheduler and Executor. Fixes #57.
2011-06-01 11:45:47 -07:00
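The difference between the two catch clauses can be sketched as follows. JVM errors such as NoClassDefFoundError extend Error rather than Exception, so a handler for Exception lets them escape. The object and method names are hypothetical, and the log does not say what issue #57 reported, so this shows only the general language behavior, not the change made in LocalScheduler or Executor.

```scala
object CatchThrowableSketch {
  // Handles only Exception subclasses; an Error propagates to the caller.
  def runCatchingException(task: () => Unit): String =
    try { task(); "ok" }
    catch { case e: Exception => "caught: " + e.getClass.getSimpleName }

  // Handles every Throwable, including Error subclasses.
  def runCatchingThrowable(task: () => Unit): String =
    try { task(); "ok" }
    catch { case t: Throwable => "caught: " + t.getClass.getSimpleName }
}
```

A task that throws NoClassDefFoundError is reported by runCatchingThrowable but propagates out of runCatchingException.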
Matei Zaharia
850fe3274e
Make the runJob API public. Fixes #56.
2011-06-01 11:38:44 -07:00
Matei Zaharia
0e5dbf2abd
Merge pull request #58 from ijuma/scala-2.9
Remove unnecessary toStream calls
2011-06-01 11:32:07 -07:00
Ismael Juma
82f10bd794
Remove unnecessary toStream calls.
2011-06-01 16:12:42 +01:00
Matei Zaharia
b49d1be65b
Ensure logging is initialized before any Spark threads run in the REPL
2011-05-31 23:54:48 -07:00
Matei Zaharia
10fe324845
Merge remote-tracking branch 'origin/master' into scala-2.9
2011-05-31 23:48:11 -07:00
Matei Zaharia
5166d76843
Ensure logging is initialized before spawning any threads to fix issue #45
2011-05-31 23:47:32 -07:00
Matei Zaharia
96daa31a01
Pass quoted arguments properly to run
2011-05-31 23:34:16 -07:00
Matei Zaharia
7862995569
Give SBT a bit more memory so it can do an update / compile / test in one JVM
2011-05-31 23:33:47 -07:00
Matei Zaharia
90f924202b
Another fix ported forward for the REPL
2011-05-31 23:11:49 -07:00
Matei Zaharia
3854d23dd4
Pass quoted arguments properly to run
2011-05-31 22:17:48 -07:00
Matei Zaharia
8012e388a8
Give SBT a bit more memory so it can do an update / compile / test in one JVM
2011-05-31 22:17:33 -07:00
Ismael Juma
3def9fdb96
Upgrade to scalacheck 1.9.
2011-05-31 22:11:33 -07:00
Matei Zaharia
0afd35a8dd
Some docs in ClosureCleaner
2011-05-31 22:06:30 -07:00
Matei Zaharia
73975d7491
Further fixes to interpreter (adding in some code generation changes I
missed before and setting SparkEnv properly on the threads that execute
each line in the 2.9 interpreter).
2011-05-31 22:05:24 -07:00
Matei Zaharia
d52660c969
Ported code generation changes from 2.8 interpreter (to use a class for
each line's object rather than a singleton object so that we can ship
these classes to worker nodes). This is pretty hairy stuff, which would
be nice to avoid in the future by integrating with the interpreter some
other way.
2011-05-31 19:23:15 -07:00
Matei Zaharia
beb9c117f0
Merge branch 'master' into scala-2.9
Conflicts:
project/build/SparkProject.scala
2011-05-31 19:23:07 -07:00
Matei Zaharia
bcce6e8d01
Various work to use the 2.9 interpreter
2011-05-31 17:31:51 -07:00
Matei Zaharia
8b0390d344
Instantiate NullWritable properly in HadoopFile
2011-05-30 23:54:14 -07:00
Matei Zaharia
4096c2287e
Various fixes
2011-05-29 18:46:01 -07:00
Matei Zaharia
ef706ae959
Merge branch 'master' into new-rdds-protobuf
Conflicts:
run
2011-05-29 16:20:23 -07:00
Matei Zaharia
c501cff924
Executor was looking for the wrong constructor for ExecutorClassLoader
2011-05-29 16:15:59 -07:00
Matei Zaharia
50ac1d2a40
Merge remote-tracking branch 'ijuma/issue51'
2011-05-29 15:41:12 -07:00
Ismael Juma
0c62ee4321
Depend on jetty-server in compile scope and upgrade to 7.4.2.
As Matei described: "We're using Jetty to run an HTTP server, not to embed Spark
in a webapp"
2011-05-29 20:12:50 +01:00
Matei Zaharia
22c8b84d8b
Merge pull request #53 from ijuma/master
Use explicit asInstanceOf instead of misleading unchecked pattern matching.
2011-05-27 10:20:54 -07:00
Ismael Juma
2c691295bf
SCALA_VERSION in run should be 2.9.0-1
2011-05-27 14:59:23 +01:00
Ismael Juma
1d75c6060a
Update to Scala 2.9.0-1 and disable repl module for now.
The repl module requires more complex work.
2011-05-27 14:59:23 +01:00