863ec0cb4d
This patch adds the functionality to display the RDD DAG on the SparkUI.

This DAG describes the relationships between
- an RDD and its dependencies,
- an RDD and its operation scopes, and
- an RDD's operation scopes and the stage / job hierarchy

An operation scope here refers to the existing public APIs that created the RDDs (e.g. `textFile`, `treeAggregate`). In the future, we can expand this to include higher level operations like SQL queries.

*Note: This blatantly stole a few lines of HTML and JavaScript from #5547 (thanks shroffpradyumn!)*

Here's what the job page looks like:
<img src="https://issues.apache.org/jira/secure/attachment/12730286/job-page.png" width="700px"/>
and the stage page:
<img src="https://issues.apache.org/jira/secure/attachment/12730287/stage-page.png" width="300px"/>

Author: Andrew Or <andrew@databricks.com>

Closes #5729 from andrewor14/viz2 and squashes the following commits:

666c03b [Andrew Or] Round corners of RDD boxes on stage page (minor)
01ba336 [Andrew Or] Change RDD cache color to red (minor)
6f9574a [Andrew Or] Add tests for RDDOperationScope
1c310e4 [Andrew Or] Wrap a few more RDD functions in an operation scope
3ffe566 [Andrew Or] Restore "null" as default for RDD name
5fdd89d [Andrew Or] children -> child (minor)
0d07a84 [Andrew Or] Fix python style
afb98e2 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
0d7aa32 [Andrew Or] Fix python tests
3459ab2 [Andrew Or] Fix tests
832443c [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
429e9e1 [Andrew Or] Display cached RDDs on the viz
b1f0fd1 [Andrew Or] Rename OperatorScope -> RDDOperationScope
31aae06 [Andrew Or] Extract visualization logic from listener
83f9c58 [Andrew Or] Implement a programmatic representation of operator scopes
5a7faf4 [Andrew Or] Rename references to viz scopes to viz clusters
ee33d52 [Andrew Or] Separate HTML generating code from listener
f9830a2 [Andrew Or] Refactor + clean up + document JS visualization code
b80cc52 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
0706992 [Andrew Or] Add link from jobs to stages
deb48a0 [Andrew Or] Translate stage boxes taking into account the width
5c7ce16 [Andrew Or] Connect RDDs across stages + update style
ab91416 [Andrew Or] Introduce visualization to the Job Page
5f07e9c [Andrew Or] Remove more return statements from scopes
5e388ea [Andrew Or] Fix line too long
43de96e [Andrew Or] Add parent IDs to StageInfo
6e2cfea [Andrew Or] Remove all return statements in `withScope`
d19c4da [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
7ef957c [Andrew Or] Fix scala style
4310271 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz2
aa868a9 [Andrew Or] Ensure that HadoopRDD is actually serializable
c3bfcae [Andrew Or] Re-implement scopes using closures instead of annotations
52187fc [Andrew Or] Rat excludes
09d361e [Andrew Or] Add ID to node label (minor)
71281fa [Andrew Or] Embed the viz in the UI in a toggleable manner
8dd5af2 [Andrew Or] Fill in documentation + miscellaneous minor changes
fe7816f [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz
205f838 [Andrew Or] Reimplement rendering with dagre-d3 instead of viz.js
5e22946 [Andrew Or] Merge branch 'master' of github.com:apache/spark into viz
6a7cdca [Andrew Or] Move RDD scope util methods and logic to its own file
494d5c2 [Andrew Or] Revert a few unintended style changes
9fac6f3 [Andrew Or] Re-implement scopes through annotations instead
f22f337 [Andrew Or] First working implementation of visualization with vis.js
2184348 [Andrew Or] Translate RDD information to dot file
5143523 [Andrew Or] Expose the necessary information in RDDInfo
a9ed4f9 [Andrew Or] Add a few missing scopes to certain RDD methods
6b3403b [Andrew Or] Scope all RDD methods
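
The commit message above describes operation scopes as the public API calls that created each RDD, and mentions that scopes were re-implemented with closures. As a rough illustration of that idea only, here is a minimal Python sketch. It is not Spark's code: `Graph`, `Scope`, `Node`, `with_scope`, and `new_node` are hypothetical names, and the two operation bodies are invented stand-ins that borrow the names `textFile` and `treeAggregate` from the message.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class Scope:
    """Hypothetical operation scope: a name plus an optional enclosing scope."""
    name: str
    parent: Optional["Scope"] = None

    def path(self) -> List[str]:
        # Walk up to the outermost scope and return names outermost-first.
        names: List[str] = []
        current: Optional[Scope] = self
        while current is not None:
            names.append(current.name)
            current = current.parent
        return list(reversed(names))


@dataclass
class Node:
    """Hypothetical stand-in for an RDD: an ID, its creating scope, its parents."""
    node_id: int
    scope: Optional[Scope]
    parents: List["Node"] = field(default_factory=list)


class Graph:
    def __init__(self) -> None:
        self._current: Optional[Scope] = None
        self._next_id = 0
        self.nodes: List[Node] = []

    def with_scope(self, name: str, body: Callable[[], T]) -> T:
        # Run the closure with `name` as the active scope, then restore the
        # previous scope even if the closure raises.
        previous = self._current
        self._current = Scope(name, previous)
        try:
            return body()
        finally:
            self._current = previous

    def new_node(self, *parents: Node) -> Node:
        # Every node records whichever scope is active when it is created.
        node = Node(self._next_id, self._current, list(parents))
        self._next_id += 1
        self.nodes.append(node)
        return node

    # Two invented "public API" operations, each wrapped in its own scope.
    def text_file(self) -> Node:
        return self.with_scope("textFile", lambda: self.new_node())

    def tree_aggregate(self, source: Node) -> Node:
        def body() -> Node:
            partial = self.new_node(source)  # intermediate node, same scope
            return self.new_node(partial)
        return self.with_scope("treeAggregate", body)


g = Graph()
lines = g.text_file()
result = g.tree_aggregate(lines)
for n in g.nodes:
    print(n.node_id, n.scope.path(), [p.node_id for p in n.parents])
```

Grouping the nodes by `scope` gives the kind of clustering the message describes (several RDDs drawn inside one operation box), while `parents` gives the dependency edges between them.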
79 lines
1.1 KiB
Plaintext
target
cache
.gitignore
.gitattributes
.project
.classpath
.mima-excludes
.generated-mima-excludes
.generated-mima-class-excludes
.generated-mima-member-excludes
.rat-excludes
.*md
derby.log
TAGS
RELEASE
control
docs
docker.properties.template
fairscheduler.xml.template
spark-defaults.conf.template
log4j.properties
log4j.properties.template
metrics.properties
metrics.properties.template
slaves
slaves.template
spark-env.sh
spark-env.cmd
spark-env.sh.template
log4j-defaults.properties
bootstrap-tooltip.js
jquery-1.11.1.min.js
d3.min.js
dagre-d3.min.js
graphlib-dot.min.js
sorttable.js
vis.min.js
vis.min.css
vis.map
.*avsc
.*txt
.*json
.*data
.*log
cloudpickle.py
heapq3.py
join.py
SparkExprTyper.scala
SparkILoop.scala
SparkILoopInit.scala
SparkIMain.scala
SparkImports.scala
SparkJLineCompletion.scala
SparkJLineReader.scala
SparkMemberHandlers.scala
SparkReplReporter.scala
sbt
sbt-launch-lib.bash
plugins.sbt
work
.*\.q
.*\.qv
golden
test.out/*
.*iml
service.properties
db.lck
build/*
dist/*
.*out
.*ipr
.*iws
logs
.*scalastyle-output.xml
.*dependency-reduced-pom.xml
known_translations
DESCRIPTION
NAMESPACE