spark-instrumented-optimizer/core/src/main/scala/spark
Matei Zaharia 64ba6a8c2c Simplify checkpointing code and RDD class a little:
- RDD's getDependencies and getSplits methods are now guaranteed to be
  called only once, so subclasses can safely do computation in there
  without worrying about caching the results.

- The "splits_" variable, which is cleared out when an RDD is
  checkpointed, is now managed in the RDD class.

- A few of the RDD subclasses are simpler.

- CheckpointRDD's compute() method no longer assumes that it is given a
  CheckpointRDDSplit; it works just as well on a split from the original
  RDD, because it only looks at the split's index. This is important
  because RDDs like UnionRDD and ZippedRDD keep the parent's splits as
  part of their own and would otherwise not work on checkpointed parents.

- RDD.iterator can now reuse cached data if an RDD is computed before it
  is checkpointed. It seems this did not happen before: it always called
  iterator() on the CheckpointRDD, which read from HDFS.
2013-01-28 22:30:12 -08:00
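
The behaviour described in the commit message above can be illustrated with a small, self-contained sketch. It is not the code from this commit: SketchRDD, SimpleSplit, CountingRDD, persistSplit and markCheckpointed are hypothetical names invented for illustration, and the in-memory map only stands in for Spark's cache. The sketch assumes three things taken from the message: the base class calls getSplits at most once and stores the result, the stored splits are dropped on checkpointing, and the checkpoint reader looks only at a split's index.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins; none of these are the real Spark classes.
trait Split { def index: Int }
final case class SimpleSplit(index: Int) extends Split

abstract class SketchRDD[T] {
  // Subclass hooks. The base class calls getSplits at most once, so a
  // subclass may do real work there without caching the result itself.
  protected def getSplits: Array[Split]
  protected def compute(split: Split): Iterator[T]

  // Plays the role of the "splits_" variable: managed only in the base class.
  private var cachedSplits: Option[Array[Split]] = None
  // Set once the RDD has been checkpointed.
  private var checkpointSplits: Option[Array[Split]] = None
  private var checkpointReader: Option[Int => Iterator[T]] = None
  // Stand-in for cached partition data, keyed by split index.
  private val cache = mutable.Map.empty[Int, Vector[T]]

  final def splits: Array[Split] = checkpointSplits.getOrElse {
    if (cachedSplits.isEmpty) cachedSplits = Some(getSplits)
    cachedSplits.get
  }

  // Compute one split and keep its data in the in-memory cache.
  def persistSplit(split: Split): Unit =
    cache(split.index) = compute(split).toVector

  // Record the checkpoint and clear the stored splits of the original RDD.
  def markCheckpointed(newSplits: Array[Split],
                       readByIndex: Int => Iterator[T]): Unit = {
    checkpointSplits = Some(newSplits)
    checkpointReader = Some(readByIndex)
    cachedSplits = None
  }

  // Prefer cached data, then the checkpoint, then recomputation.
  // The checkpoint reader receives only the index, so a split object
  // obtained before checkpointing still works.
  final def iterator(split: Split): Iterator[T] =
    cache.get(split.index) match {
      case Some(data) => data.iterator
      case None =>
        checkpointReader match {
          case Some(read) => read(split.index)
          case None       => compute(split)
        }
    }
}

// Hypothetical subclass that counts how often its hooks are invoked.
class CountingRDD(n: Int) extends SketchRDD[Int] {
  var getSplitsCalls = 0
  var computeCalls = 0

  protected def getSplits: Array[Split] = {
    getSplitsCalls += 1
    Array.tabulate[Split](n)(i => SimpleSplit(i))
  }

  protected def compute(split: Split): Iterator[Int] = {
    computeCalls += 1
    Iterator(split.index * 10, split.index * 10 + 1)
  }
}
```

In this sketch, a split that was cached before markCheckpointed is served from the cache afterwards, and an uncached split is read through the checkpoint reader by index alone, which is why a parent's original split objects remain usable.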
api Simplify checkpointing code and RDD class a little: 2013-01-28 22:30:12 -08:00
broadcast Merge branch 'master' into streaming 2013-01-20 12:47:55 -08:00
deploy Clean up BlockManagerUI a little (make it not be an object, merge with 2013-01-27 23:56:14 -08:00
executor Rename more things from slave to executor 2013-01-27 23:17:20 -08:00
network Refactor daemon thread pool creation. 2013-01-21 23:31:00 -08:00
partial Make more stuff private[spark] 2012-10-02 22:28:55 -07:00
rdd Simplify checkpointing code and RDD class a little: 2013-01-28 22:30:12 -08:00
scheduler Merge pull request #413 from pwendell/stage-logging 2013-01-28 22:01:52 -08:00
serializer More doc updates, and moved Serializer to a subpackage. 2012-10-12 18:19:21 -07:00
storage Some DEBUG-level log cleanup. 2013-01-28 20:29:35 -08:00
util Simplify checkpointing code and RDD class a little: 2013-01-28 22:30:12 -08:00
Accumulators.scala Minor cleanup. 2013-01-21 15:55:46 -06:00
Aggregator.scala Remove mapSideCombine field from Aggregator. 2012-10-13 14:59:20 -07:00
BlockStoreShuffleFetcher.scala Change ShuffleFetcher to return an Iterator. 2012-10-13 14:59:20 -07:00
Cache.scala Make classes package private 2012-10-02 19:00:19 -07:00
CacheManager.scala Simplify checkpointing code and RDD class a little: 2013-01-28 22:30:12 -08:00
ClosureCleaner.scala Make classes package private 2012-10-02 19:00:19 -07:00
Dependency.scala Updated PruneDependency to change "split" to "partition". 2013-01-23 22:22:03 -08:00
DoubleRDDFunctions.scala Added documentation to all the *RDDFunction classes, and moved them into 2012-10-09 18:38:36 -07:00
FetchFailedException.scala Make classes package private 2012-10-02 19:00:19 -07:00
HadoopWriter.scala Support for Hadoop 2 distributions such as cdh4 2012-10-18 16:08:54 -07:00
HttpFileServer.scala Don't download files to master's working directory. 2013-01-21 17:34:17 -08:00
HttpServer.scala Fix for hanging spark.HttpFileServer with kind of virtual network 2013-01-22 23:08:34 +09:00
JavaSerializer.scala More doc updates, and moved Serializer to a subpackage. 2012-10-12 18:19:21 -07:00
KryoSerializer.scala Fix compile error due to cherry-pick 2013-01-23 13:07:27 -08:00
Logging.scala Minor cleanup. 2013-01-21 15:55:46 -06:00
MapOutputTracker.scala Track workers by executor ID instead of hostname to allow multiple 2013-01-27 19:23:49 -08:00
package.scala Scaladoc documentation for some core Spark functionality 2012-10-04 22:59:36 -07:00
PairRDDFunctions.scala Simplify checkpointing code and RDD class a little: 2013-01-28 22:30:12 -08:00
ParallelCollection.scala Further simplify getOrElse call. 2013-01-21 21:30:24 -06:00
Partitioner.scala Raise exception when hashing Java arrays (SPARK-597) 2012-12-31 20:20:11 -08:00
RDD.scala Simplify checkpointing code and RDD class a little: 2013-01-28 22:30:12 -08:00
RDDCheckpointData.scala Simplify checkpointing code and RDD class a little: 2013-01-28 22:30:12 -08:00
SequenceFileRDDFunctions.scala Update Hadoop dependency to 1.0.3 as 0.20 has Sun specific dependencies. Also 2013-01-07 15:57:33 -08:00
SerializableWritable.scala Fix issue #65: Change @serializable to extends Serializable in 2.9 branch 2011-08-02 10:16:33 +01:00
ShuffleFetcher.scala Change ShuffleFetcher to return an Iterator. 2012-10-13 14:59:20 -07:00
SizeEstimator.scala Remove dependencies on sun jvm classes. Instead use reflection to infer 2013-01-07 15:57:18 -08:00
SoftReferenceCache.scala Make classes package private 2012-10-02 19:00:19 -07:00
SparkContext.scala add long and float accumulatorparams 2013-01-28 20:23:11 -08:00
SparkEnv.scala Track workers by executor ID instead of hostname to allow multiple 2013-01-27 19:23:49 -08:00
SparkException.scala Upgraded to Akka 2 and fixed test execution (which was still parallel 2012-06-28 23:51:28 -07:00
SparkFiles.java Allow PySpark's SparkFiles to be used from driver 2013-01-23 10:58:50 -08:00
Split.scala Various code style fixes, mostly from IntelliJ IDEA 2012-06-29 18:47:12 -07:00
TaskContext.scala Minor cleanup. 2013-01-21 15:55:46 -06:00
TaskEndReason.scala Stylistic changes and Public Accumulable and Broadcast 2012-10-02 19:28:37 -07:00
TaskState.scala Make classes package private 2012-10-02 19:00:19 -07:00
Utils.scala Clean up BlockManagerUI a little (make it not be an object, merge with 2013-01-27 23:56:14 -08:00