for each split. This replaces the previous method of calling
split.toString, which would produce different results for the same split
each time it was deserialized (because the default toString implementation
includes the object's identity hash code rather than a stable identifier).
- Added SampledRDD, SplitRDD and CartesianRDD
- Made Split a class rather than a type parameter
- Added numCores() to Scheduler to help set default level of parallelism
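
The motivation for a per-split id can be illustrated with a minimal sketch.
DemoSplit, its index-based id and the roundTrip helper below are hypothetical
names invented for this example, not the classes touched by this change. The
sketch assumes plain Java serialization and shows that an explicit id survives
a serialize/deserialize round trip, whereas the default toString is tied to
the particular object instance.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical split type (an assumption for illustration only):
// it carries an explicit id derived from its index.
class DemoSplit(val index: Int) extends Serializable {
  val id: String = "DemoSplit-" + index
}

object SplitIdDemo {
  // Serialize an object to bytes and read it back, yielding a fresh copy.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val original = new DemoSplit(3)
    val copy = roundTrip(original)
    // The explicit id is identical for both copies of the same split.
    println(original.id == copy.id)
    // The default toString embeds an identity hash code, so it is not a
    // reliable key: the two copies will usually print different strings.
    println(original.toString + " vs " + copy.toString)
  }
}
```

Keying any per-split bookkeeping on id rather than toString means the key does
not depend on which deserialized copy of the split happens to be in hand.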
MesosScheduler.scala: formatted the slaveOffer() output to include the serialized task size
RDD.scala: added support for aggregating RDDs on a per-split basis
(aggregateSplit()) as well as for sampling without replacement (sample())
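
As a rough illustration of the two RDD.scala additions, the sketch below
models a dataset as a sequence of in-memory splits. It does not reproduce the
actual signatures of aggregateSplit() or sample(); the names aggregatePerSplit
and sampleWithoutReplacement, the fraction and seed parameters, and the use of
independent per-element draws are all assumptions made for this example.

```scala
import scala.util.Random

object PerSplitSketch {
  // Fold each split independently, producing one result per split.
  def aggregatePerSplit[T, U](splits: Seq[Seq[T]], zero: U)(op: (U, T) => U): Seq[U] =
    splits.map(_.foldLeft(zero)(op))

  // Sampling without replacement: every element is considered exactly once
  // and kept with probability `fraction`, so no element can appear twice.
  // The seed is combined with the split index (an assumption) so that each
  // split gets its own reproducible stream of random numbers.
  def sampleWithoutReplacement[T](splits: Seq[Seq[T]], fraction: Double, seed: Long): Seq[Seq[T]] =
    splits.zipWithIndex.map { case (split, i) =>
      val rng = new Random(seed + i)
      split.filter(_ => rng.nextDouble() < fraction)
    }

  def main(args: Array[String]): Unit = {
    val splits = Seq(Seq(1, 2, 3), Seq(4, 5))
    // One sum per split.
    println(aggregatePerSplit(splits, 0)(_ + _))
    // Roughly half of each split, with no duplicated elements.
    println(sampleWithoutReplacement(splits, 0.5, 42L))
  }
}
```

Because the sample is drawn split by split, it can be computed without moving
data between splits, and the same seed reproduces the same sample.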