Matei Zaharia
b7f1f62ff5
bug fix
2011-07-09 18:53:02 -04:00
Matei Zaharia
003480f374
Register byte[] with Kryo serializer
2011-07-09 18:08:07 -04:00
Matei Zaharia
aea5cb4413
Added parallel shuffle fetcher
2011-07-09 17:25:56 -04:00
Matei Zaharia
4b1646a25f
Support for non-filesystem-based Hadoop data sources
2011-07-06 20:37:55 -04:00
Matei Zaharia
b0ecf1ee41
Don't pass a null context when running tasks locally
2011-06-27 22:50:43 -07:00
Matei Zaharia
2f652f1656
Fix a compile error
2011-06-27 18:07:16 -07:00
tdas
ae63972a89
Merge pull request #64 from mesos/td-rdd-save
...
Functionality to save RDDs to Hadoop files
2011-06-27 13:44:55 -07:00
Tathagata Das
3f08e1129f
Merge branch 'master' into td-rdd-save
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2011-06-27 13:43:44 -07:00
Tathagata Das
ad842ac823
Merge branch 'master' into td-rdd-save
...
Conflicts:
core/src/main/scala/spark/RDD.scala
2011-06-27 13:39:11 -07:00
Matei Zaharia
b187675b68
Print version number 0.3 in REPL
2011-06-26 18:27:01 -07:00
Matei Zaharia
c4dd68ae21
Merge branch 'mos-bt'
...
This merge keeps only the broadcast work in mos-bt because the structure
of shuffle has changed with the new RDD design. We still need some kind
of parallel shuffle but that will be added later.
Conflicts:
core/src/main/scala/spark/BitTorrentBroadcast.scala
core/src/main/scala/spark/ChainedBroadcast.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/Utils.scala
core/src/main/scala/spark/shuffle/BasicLocalFileShuffle.scala
core/src/main/scala/spark/shuffle/DfsShuffle.scala
2011-06-26 18:22:12 -07:00
Tathagata Das
38f2ba99cc
Further changes to HadoopFileWriter. Implemented ability to save RDDs as SequenceFiles and ObjectFiles.
...
1> HadoopFileWriter changed to take class types as constructor parameters (no more generic type)
2> Multiple types of RDD.saveAsHadoopFile() implemented to provide more saving options
3> RDD.saveAsSequenceFile() automatically converts basic types to Writable types before saving as SequenceFile
4> RDD.saveAsObjectFile() serializes objects and saves them to an ObjectFile
5> SparkContext.objectFile() opens the saved ObjectFiles
2011-06-24 19:51:21 -07:00
Matei Zaharia
b626562d54
Merge pull request #63 from ogrisel/outer-join
...
Implemented RDD.leftOuterJoin and RDD.rightOuterJoin
2011-06-24 12:22:15 -07:00
Olivier Grisel
2e3531d8bf
Implemented RDD.leftOuterJoin and RDD.rightOuterJoin
2011-06-24 11:00:51 +02:00
Matei Zaharia
095dd9c444
Merge pull request #62 from ogrisel/cogroup-test
...
Add missing test for RDD.groupWith
2011-06-23 10:12:24 -07:00
Matei Zaharia
e8e35d5fb5
Merge pull request #61 from ogrisel/better-readme
...
Better readme
2011-06-23 10:11:14 -07:00
Tathagata Das
3d2befe831
Improved HadoopFileWriter (saves key and value classes to jobconf)
2011-06-23 08:11:22 -07:00
Olivier Grisel
7ef48a4df0
typo
2011-06-23 02:28:17 +02:00
Olivier Grisel
5b9e0a126d
format
2011-06-23 02:27:14 +02:00
Olivier Grisel
236bcd0d9b
Markdown rendering for the toplevel README.md to improve readability on github
2011-06-23 02:24:04 +02:00
Olivier Grisel
005d1605a4
add missing test for RDD.groupWith
2011-06-23 02:10:52 +02:00
Matei Zaharia
214250016a
Added simple version of lookup
2011-06-20 11:59:16 -07:00
Matei Zaharia
23b1c309fb
Added pipe() operation on RDDs for mapping through a shell command.
2011-06-19 23:05:19 -07:00
Tathagata Das
b5e6645505
Cleaner reimplementation of HadoopFileWriter. Introduced TaskContext.
...
1> HadoopFileWriter works correctly with task failures
2> It can also take a user-specified JobConf object for configuration settings
3> A Task can now get information like stage ID, split ID, and attempt ID using TaskContext class
4> Minor changes in SparkContext, DAGScheduler and subclasses to allow specification of TaskContext as a parameter
2011-06-16 20:57:57 -07:00
Tathagata Das
869836a2fa
Implemented TaskContext to hold contextual information (jobID, taskID, attemptID) of a task
2011-06-10 19:47:28 -07:00
Tathagata Das
389e56156f
HadoopFileWriter changed to use Hadoop's OutputCommitter
2011-06-09 15:29:22 -07:00
Matei Zaharia
a413b8e59d
Merge pull request #59 from ijuma/master
...
Move managedStyle to SparkProject
2011-06-07 00:41:50 -07:00
Tathagata Das
24d845833c
First-cut implementation of RDD.SaveAsText
2011-06-05 04:14:43 -07:00
Ismael Juma
1ad4dcd3de
Move managedStyle to SparkProject.
...
I had added it to DepJar by mistake.
2011-06-02 14:06:54 +01:00
Matei Zaharia
9bb448a151
Catch Throwable instead of Exception in LocalScheduler and Executor. Fixes #57.
2011-06-01 11:45:47 -07:00
Matei Zaharia
850fe3274e
Make the runJob API public. Fixes #56.
2011-06-01 11:38:44 -07:00
Matei Zaharia
5166d76843
Ensure logging is initialized before spawning any threads to fix issue #45
2011-05-31 23:47:32 -07:00
Matei Zaharia
96daa31a01
Pass quoted arguments properly to run
2011-05-31 23:34:16 -07:00
Matei Zaharia
7862995569
Give SBT a bit more memory so it can do an update / compile / test in one JVM
2011-05-31 23:33:47 -07:00
Matei Zaharia
8b0390d344
Instantiate NullWritable properly in HadoopFile
2011-05-30 23:54:14 -07:00
Matei Zaharia
c501cff924
Executor was looking for the wrong constructor for ExecutorClassLoader
2011-05-29 16:15:59 -07:00
Matei Zaharia
50ac1d2a40
Merge remote-tracking branch 'ijuma/issue51'
2011-05-29 15:41:12 -07:00
Ismael Juma
0c62ee4321
Depend on jetty-server in compile scope and upgrade to 7.4.2.
...
As Matei described: "We're using Jetty to run an HTTP server, not to embed Spark
in a webapp"
2011-05-29 20:12:50 +01:00
Matei Zaharia
22c8b84d8b
Merge pull request #53 from ijuma/master
...
Use explicit asInstanceOf instead of misleading unchecked pattern matching.
2011-05-27 10:20:54 -07:00
Ismael Juma
e3b323321d
Use ManagedStyle.Maven.
2011-05-27 14:56:01 +01:00
Ismael Juma
3a6b0b8a57
Publish javadoc and sources.
2011-05-27 14:55:51 +01:00
Ismael Juma
59f1f42a9a
Update run to work with SBT managed dependencies and the newly introduced repl module.
2011-05-27 11:22:59 +01:00
Ismael Juma
3af6003c87
Update sbt to 0.7.7.
2011-05-27 11:22:59 +01:00
Ismael Juma
1396678baa
Move REPL classes to separate module.
2011-05-27 11:22:50 +01:00
Ismael Juma
3e8114ddbd
Change project.organization to org.spark-project to fit Maven convention.
2011-05-27 11:22:10 +01:00
Ismael Juma
7b7dfdb085
Set project.version to 0.3-SNAPSHOT.
2011-05-27 11:22:10 +01:00
Ismael Juma
051da8b4ad
Delete liblzf from lib as it's no longer used.
2011-05-27 11:22:10 +01:00
Ismael Juma
ae1a1f91f1
Remove several dependencies from git and configure them as SBT managed dependencies.
...
Upgrade some of the dependencies while at it.
2011-05-27 11:22:01 +01:00
Ismael Juma
164ef4c751
Use explicit asInstanceOf instead of misleading unchecked pattern matching.
...
Also enable -unchecked warnings in SBT build file.
2011-05-27 07:57:10 +01:00
Matei Zaharia
cfbe2da1a6
Merge pull request #52 from ijuma/master
...
Fix deprecations when compiled with Scala 2.8.1
2011-05-26 23:53:10 -07:00