spark-instrumented-optimizer/docs/img
Tathagata Das 7930209614 Merge pull request #497 from tdas/docs-update
Updated Spark Streaming Programming Guide

Here is the updated version of the Spark Streaming Programming Guide. This is still a work in progress, but the major changes are in place. So feedback is most welcome.

In general, I have tried to make the guide easier to understand even if the reader does not know much about Spark. The updated website is hosted here -

http://www.eecs.berkeley.edu/~tdas/spark_docs/streaming-programming-guide.html

The major changes are:
- Overview illustrates the use cases of Spark Streaming: various input sources and various output sources
- An example right after the overview to quickly give an idea of what a Spark Streaming program looks like
- Made the Java API and examples first-class citizens alongside Scala by using tabs to show both Scala and Java examples (similar to the AMPCamp tutorial's code tabs)
- Highlighted the DStream operations updateStateByKey and transform because of their powerful nature
- Updated driver node failure recovery text to highlight automatic recovery in Spark standalone mode
- Added information about linking to and using external input sources such as Kafka and Flume
- In general, reorganized the sections to better show the Basic section and the more advanced sections like Tuning and Recovery.

Todos:
- Links to the docs of external Kafka, Flume, etc
- Illustrate window operation with figure as well as example.

Author: Tathagata Das <tathagata.das1565@gmail.com>

== Merge branch commits ==

commit 18ff10556570b39d672beeb0a32075215cfcc944
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Tue Jan 28 21:49:30 2014 -0800

    Fixed a lot of broken links.

commit 34a5a6008dac2e107624c7ff0db0824ee5bae45f
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Tue Jan 28 18:02:28 2014 -0800

    Updated github url to use SPARK_GITHUB_URL variable.

commit f338a60ae8069e0a382d2cb170227e5757cc0b7a
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Mon Jan 27 22:42:42 2014 -0800

    More updates based on Patrick and Harvey's comments.

commit 89a81ff25726bf6d26163e0dd938290a79582c0f
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Mon Jan 27 13:08:34 2014 -0800

    Updated docs based on Patricks PR comments.

commit d5b6196b532b5746e019b959a79ea0cc013a8fc3
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Sun Jan 26 20:15:58 2014 -0800

    Added spark.streaming.unpersist config and info on StreamingListener interface.

commit e3dcb46ab83d7071f611d9b5008ba6bc16c9f951
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Sun Jan 26 18:41:12 2014 -0800

    Fixed docs on StreamingContext.getOrCreate.

commit 6c29524639463f11eec721e4d17a9d7159f2944b
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Thu Jan 23 18:49:39 2014 -0800

    Added example and figure for window operations, and links to Kafka and Flume API docs.

commit f06b964a51bb3b21cde2ff8bdea7d9785f6ce3a9
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Wed Jan 22 22:49:12 2014 -0800

    Fixed missing endhighlight tag in the MLlib guide.

commit 036a7d46187ea3f2a0fb8349ef78f10d6c0b43a9
Merge: eab351d a1cd185
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Wed Jan 22 22:17:42 2014 -0800

    Merge remote-tracking branch 'apache/master' into docs-update

commit eab351d05c0baef1d4b549e1581310087158d78d
Author: Tathagata Das <tathagata.das1565@gmail.com>
Date:   Wed Jan 22 22:17:15 2014 -0800

    Update Spark Streaming Programming Guide.
2014-01-28 21:51:05 -08:00
cluster-overview.png Updated cluster diagram to show caches 2013-09-08 13:51:57 -07:00
cluster-overview.pptx Updated cluster diagram to show caches 2013-09-08 13:51:57 -07:00
data_parallel_vs_graph_parallel.png WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. 2014-01-10 00:39:08 -08:00
edge-cut.png Strating to improve README. 2013-10-29 20:57:55 -07:00
edge_cut_vs_vertex_cut.png WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. 2014-01-10 00:39:08 -08:00
glyphicons-halflings-white.png Adding docs directory containing documentation currently on the wiki 2012-09-12 13:03:43 -07:00
glyphicons-halflings.png Adding docs directory containing documentation currently on the wiki 2012-09-12 13:03:43 -07:00
graph_analytics_pipeline.png WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. 2014-01-10 00:39:08 -08:00
graph_parallel.png Strating to improve README. 2013-10-29 20:57:55 -07:00
graphx_figures.pptx Finished docummenting join operators and revised some of the initial presentation. 2014-01-11 13:48:35 -08:00
graphx_logo.png WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. 2014-01-10 00:39:08 -08:00
graphx_performance_comparison.png WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. 2014-01-10 00:39:08 -08:00
incubator-logo.png More doc improvements + better warnings when you haven't built Spark 2013-08-30 12:41:25 -07:00
java-sm.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
property_graph.png More edits. 2014-01-10 23:52:24 -08:00
python-sm.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
scala-sm.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
spark-logo-77x40px-hd.png More crisp logo created from vector source (ai) and disabled 2012-09-13 15:27:33 -07:00
spark-logo-77x50px-hd.png Various enhancements to the programming guide and HTML/CSS 2012-09-25 23:26:56 -07:00
spark-logo-100x40px.png Replaces "Spark" word in nav bar with logo. 2012-09-13 12:08:12 -07:00
spark-logo-hd.png Added cluster overview doc, made logo higher-resolution, and added more 2013-09-08 00:29:11 -07:00
streaming-arch.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
streaming-dstream-ops.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
streaming-dstream-window.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
streaming-dstream.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
streaming-figures.pptx Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
streaming-flow.png Merge pull request #497 from tdas/docs-update 2014-01-28 21:51:05 -08:00
tables_and_graphs.png WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. 2014-01-10 00:39:08 -08:00
triplet.png More edits. 2014-01-10 23:52:24 -08:00
vertex-cut.png Strating to improve README. 2013-10-29 20:57:55 -07:00
vertex_routing_edge_tables.png WIP. Updating figures and cleaning up initial skeleton for GraphX Programming guide. 2014-01-10 00:39:08 -08:00