Matei Zaharia
0aa23bf17e
Added a convenience method for getting the JAR file that loaded a class
(useful for jobs to pass their own JAR files to SparkContext).
2011-08-29 22:59:44 -07:00
Matei Zaharia
a161f00610
Made a log message slightly less ugly
2011-08-27 16:58:54 -07:00
Matei Zaharia
c22043f150
Minor fix: can use >= when checking memory
2011-08-02 19:11:17 -07:00
Ismael Juma
6ff57f5594
Use scala.math instead of Math as the latter is deprecated.
2011-08-02 10:25:47 +01:00
Ismael Juma
620de2dd1d
Change currentThread to Thread.currentThread as the former is deprecated.
2011-08-02 10:25:16 +01:00
Ismael Juma
0fba22b3d2
Fix issue #65: Change @serializable to extends Serializable in 2.9 branch
Note that we use scala.Serializable introduced in Scala 2.9 instead of
java.io.Serializable. Also, case classes inherit from scala.Serializable by
default.
2011-08-02 10:16:33 +01:00
Matei Zaharia
711575391d
Merge branch 'scala-2.9'
Conflicts:
project/build/SparkProject.scala
2011-08-01 15:25:26 -07:00
Matei Zaharia
4050d661c5
Updated to newest Mesos API, which includes better memory accounting
by specifying per-executor memory.
2011-08-01 13:54:48 -07:00
Matei Zaharia
d12122502b
Various improvements to Kryo serializer:
- Replaced modified Kryo version with the standard one augmented with
the kryo-serializers package, which includes support for classes with
no-arg constructors (that was why we had a modified Kryo before)
- The kryo-serializers version also fixes issue #72.
- Added a bunch of tests.
- Serialize maps and a few other common types properly by default.
2011-07-21 22:09:33 -07:00
Matei Zaharia
baa72e2747
Removed a debug statement that slipped in as a println
2011-07-21 16:09:33 -07:00
Matei Zaharia
2bfd7931e8
Merge branch 'new-rdds-protobuf'
Conflicts:
core/src/main/scala/spark/Executor.scala
core/src/main/scala/spark/RDD.scala
2011-07-21 16:08:39 -07:00
Matei Zaharia
1450fd74d9
Merge branch 'master' into scala-2.9
2011-07-14 17:37:24 -04:00
Matei Zaharia
ccf48388cd
Lowered default number of splits for files
2011-07-14 17:37:04 -04:00
Matei Zaharia
146a18c2a4
Merge branch 'master' into scala-2.9
2011-07-14 17:29:17 -04:00
Matei Zaharia
c8eb8b2b90
Set class loader for remote actors to fix a bug that happens in 2.9
2011-07-14 17:29:11 -04:00
Matei Zaharia
8ea67307b9
Merge branch 'master' into scala-2.9
2011-07-14 14:47:12 -04:00
Matei Zaharia
e4c3402d2d
Renamed ParallelArray to ParallelCollection
2011-07-14 14:47:01 -04:00
Matei Zaharia
9ac461d85d
Remove RDD.toString because it looked confusing
2011-07-14 14:39:32 -04:00
Matei Zaharia
797b4547c3
Fix tracking of updates in accumulators to solve an issue that would manifest in the 2.9 interpreter
2011-07-14 14:08:34 -04:00
Matei Zaharia
3efd9e94d8
Merge branch 'master' into scala-2.9
2011-07-14 12:42:57 -04:00
Matei Zaharia
0ccfe20755
Forgot to add a file
2011-07-14 12:42:50 -04:00
Matei Zaharia
38f38dda5b
Merge branch 'master' into scala-2.9
2011-07-14 12:42:02 -04:00
Matei Zaharia
969644df8e
Cleaned up a few issues to do with default parallelism levels. Also
renamed HadoopFileWriter to HadoopWriter (since it's not only for files)
and fixed a bug for lookup().
2011-07-14 12:40:56 -04:00
Matei Zaharia
2fb906e8e5
Merge branch 'master' into scala-2.9
2011-07-14 00:20:14 -04:00
Matei Zaharia
2604939f64
Simplified and documented code a little and added test
2011-07-14 00:19:00 -04:00
Matei Zaharia
2439e51a03
Merge branch 'master' into implicit-sequencefile
2011-07-13 23:20:22 -04:00
Matei Zaharia
d0c7958364
Merge branch 'master' into scala-2.9
Conflicts:
core/src/main/scala/spark/HadoopFileWriter.scala
2011-07-13 23:09:33 -04:00
Matei Zaharia
9c0069188b
Updated save code to allow non-file-based OutputFormats and added a test
for file-related stuff
2011-07-13 23:04:06 -04:00
Matei Zaharia
da8a3b8926
Increase default value of spark.locality.wait a little
2011-07-13 20:07:24 -04:00
Matei Zaharia
080869c6ef
Merge branch 'master' into scala-2.9
2011-07-13 00:20:08 -04:00
Matei Zaharia
842e14d567
Added mapPartitions operation and a bunch of tests for RDD ops
2011-07-13 00:19:52 -04:00
Matei Zaharia
9b568d37f7
Merge branch 'master' into scala-2.9
Conflicts:
core/src/main/scala/spark/RDD.scala
2011-07-11 22:25:53 -04:00
Matei Zaharia
d05fea24f3
Simplified parallel shuffle fetcher to use URLConnection
2011-07-11 22:12:36 -04:00
Matei Zaharia
25c3a7781c
Moved PairRDD and SequenceFileRDD functions to separate source files
2011-07-10 00:06:15 -04:00
Matei Zaharia
b7f1f62ff5
bug fix
2011-07-09 18:53:02 -04:00
Matei Zaharia
003480f374
Register byte[] with Kryo serializer
2011-07-09 18:08:07 -04:00
Matei Zaharia
aea5cb4413
Added parallel shuffle fetcher
2011-07-09 17:25:56 -04:00
Matei Zaharia
4b1646a25f
Support for non-filesystem-based Hadoop data sources
2011-07-06 20:37:55 -04:00
Matei Zaharia
07a97d47c2
Support for non-filesystem-based Hadoop data sources
2011-07-06 20:37:34 -04:00
Matei Zaharia
3488c386a9
Initial work to make stuff like sequenceFile[Int, Int] work without
requiring the user to provide a Writable type. The approach here might
not be the best but it seems to work correctly.
2011-06-28 17:07:04 -07:00
Matei Zaharia
5633299ec6
Merge remote-tracking branch 'origin/master' into scala-2.9
2011-06-27 22:50:59 -07:00
Matei Zaharia
b0ecf1ee41
Don't pass a null context when running tasks locally
2011-06-27 22:50:43 -07:00
Matei Zaharia
85cad5d9dd
Fixed HadoopFileWriter to compile for Scala 2.9
2011-06-27 22:44:14 -07:00
Matei Zaharia
393607d5ef
Merge branch 'master' into scala-2.9
2011-06-27 18:08:25 -07:00
Matei Zaharia
2f652f1656
Fix a compile error
2011-06-27 18:07:16 -07:00
Tathagata Das
3f08e1129f
Merge branch 'master' into td-rdd-save
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2011-06-27 13:43:44 -07:00
Tathagata Das
ad842ac823
Merge branch 'master' into td-rdd-save
Conflicts:
core/src/main/scala/spark/RDD.scala
2011-06-27 13:39:11 -07:00
Matei Zaharia
bae8a97968
Merge branch 'master' into scala-2.9
Conflicts:
repl/src/main/scala/spark/repl/SparkInterpreterLoop.scala
2011-06-26 19:22:27 -07:00
Matei Zaharia
c4dd68ae21
Merge branch 'mos-bt'
This merge keeps only the broadcast work in mos-bt because the structure
of shuffle has changed with the new RDD design. We still need some kind
of parallel shuffle but that will be added later.
Conflicts:
core/src/main/scala/spark/BitTorrentBroadcast.scala
core/src/main/scala/spark/ChainedBroadcast.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/Utils.scala
core/src/main/scala/spark/shuffle/BasicLocalFileShuffle.scala
core/src/main/scala/spark/shuffle/DfsShuffle.scala
2011-06-26 18:22:12 -07:00
Tathagata Das
38f2ba99cc
Further changes to HadoopFileWriter. Implemented ability to save RDDs as SequenceFiles and ObjectFiles.
1> HadoopFileWriter changed to take class types as constructor parameters (no more generic type)
2> Multiple types of RDD.saveAsHadoopFile() implemented to provide more saving options
3> RDD.saveAsSequenceFile() automatically converts basic types to Writable types before saving as SequenceFile
4> RDD.saveAsObjectFile() serializes objects and saves them to an ObjectFile
5> SparkContext.objectFile() opens the saved ObjectFiles
2011-06-24 19:51:21 -07:00