root
34eccedbf5
Fixed a rather bad bug in HDFS files that has been present for a while:
...
caching was not working because Split objects did not have a
consistent toString value
2010-10-03 05:06:06 +00:00
Justin Ma
6f0d2c1cbc
Added round-robin scheduling of tasks
2010-09-07 14:03:59 -07:00
Justin Ma
7a9ff1cc9a
- Got rid of 'Split' type parameter in RDD
...
- Added SampledRDD, SplitRDD and CartesianRDD
- Made Split a class rather than a type parameter
- Added numCores() to Scheduler to help set default level of parallelism
2010-08-31 12:08:09 -07:00
Justin Ma
156bccbe23
HdfsFile.scala: added a try/catch block to exit gracefully for corrupted gzip files
...
MesosScheduler.scala: formatted the slaveOffer() output to include the serialized task size
RDD.scala: added support for aggregating RDDs on a per-split basis
(aggregateSplit()) as well as for sampling without replacement (sample())
2010-08-18 15:25:57 -07:00
Matei Zaharia
b56ed67553
Updated code to work with Nexus->Mesos name change
2010-07-25 23:53:46 -04:00
Matei Zaharia
7d0eae17e3
Merge branch 'dev'
...
Conflicts:
src/scala/spark/HdfsFile.scala
src/scala/spark/NexusScheduler.scala
src/test/spark/repl/ReplSuite.scala
2010-06-27 15:21:54 -07:00
Matei Zaharia
cd247b7d86
Created common RDD superclass for distributed files and parallel arrays.
...
This also means that parallel arrays now get all the functionality files
used to have (filter, map, reduce, cache, etc).
2010-06-17 12:49:42 -07:00
Matei Zaharia
92246c843b
Initial work on 2.8 port
2010-06-10 21:50:55 -07:00
Matei Zaharia
06aac8a889
Imported changes from old repository (mostly Mosharaf's work,
...
plus some fault tolerance code).
2010-04-03 23:44:55 -07:00
Matei Zaharia
df29d0ea4c
Initial commit
2010-03-29 16:17:55 -07:00