spark-instrumented-optimizer/core
Nathan Kronenfeld fba8ec39cc Add caching information to rdd.toDebugString
I find it useful to see where in an RDD's DAG data is cached, so I figured others might too.

I've added both the caching level, and the actual memory state of the RDD.

Some of this is redundant with the web UI (notably the actual memory state), but (a) that is temporary, and (b) putting it in the DAG tree shows some context that can help a lot.

For example:
```
(4) ShuffledRDD[3] at reduceByKey at <console>:14
 +-(4) MappedRDD[2] at map at <console>:14
    |  MapPartitionsRDD[1] at mapPartitions at <console>:12
    |  ParallelCollectionRDD[0] at parallelize at <console>:12
```
should change to
```
(4) ShuffledRDD[3] at reduceByKey at <console>:14 [Memory Deserialized 1x Replicated]
 |       CachedPartitions: 4; MemorySize: 50.8 MB; TachyonSize: 0.0 B; DiskSize: 0.0 B
 +-(4) MappedRDD[2] at map at <console>:14 [Memory Deserialized 1x Replicated]
    |  MapPartitionsRDD[1] at mapPartitions at <console>:12 [Memory Deserialized 1x Replicated]
    |      CachedPartitions: 4; MemorySize: 109.1 MB; TachyonSize: 0.0 B; DiskSize: 0.0 B
    |  ParallelCollectionRDD[0] at parallelize at <console>:12 [Memory Deserialized 1x Replicated]
```

Author: Nathan Kronenfeld <nkronenfeld@oculusinfo.com>

Closes #1535 from nkronenfeld/feature/debug-caching2 and squashes the following commits:

40490bc [Nathan Kronenfeld] Back out DeveloperAPI and arguments to RDD.toDebugString, reinstate memory output
794e6a3 [Nathan Kronenfeld] Attempt to merge mima changes from master
6fe9e80 [Nathan Kronenfeld] Add exclusions to allow for signature change in toDebugString (will back out if necessary)
31d6769 [Nathan Kronenfeld] Attempt to get rid of style errors.  Add comments for the new memory usage parameter.
a0f6f76 [Nathan Kronenfeld] Add parameter to RDD.toDebugString to allow detailed memory info to be shown or not.  Default is for it not to be shown.
f8f565a [Nathan Kronenfeld] Fix code style error
8f54287 [Nathan Kronenfeld] Changed string addition to string interpolation as per PR comments
2a0cd4d [Nathan Kronenfeld] Fixed a small formatting issue I forgot to copy over from the old branch
8fbecb6 [Nathan Kronenfeld] Add caching information to rdd.toDebugString
2014-08-14 22:15:33 -07:00
..
src Add caching information to rdd.toDebugString 2014-08-14 22:15:33 -07:00
pom.xml [SPARK-2806] core - upgrade to json4s-jackson 3.2.10 2014-08-05 21:59:10 -07:00