Commit graph

55 commits

Author SHA1 Message Date
Andrew Ray 7a8e587dc0 [SPARK-12211][DOC][GRAPHX] Fix version number in graphx doc for migration from 1.1
Migration from 1.1 section added to the GraphX doc in 1.2.0 (see https://spark.apache.org/docs/1.2.0/graphx-programming-guide.html#migrating-from-spark-11) uses \{{site.SPARK_VERSION}} as the version where changes were introduced, it should be just 1.2.

Author: Andrew Ray <ray.andrew@gmail.com>

Closes #10206 from aray/graphx-doc-1.1-migration.
2015-12-09 17:16:01 -08:00
Lukasz Piepiora a112d69fdc [SPARK-11174] [DOCS] Fix typo in the GraphX programming guide
This patch fixes a small typo in the GraphX programming guide

Author: Lukasz Piepiora <lpiepiora@gmail.com>

Closes #9160 from lpiepiora/11174-fix-typo-in-graphx-programming-guide.
2015-10-18 14:25:57 +01:00
Alexander Ulanov 1c843e2848 [SPARK-9508] GraphX Pregel docs update with new Pregel code
SPARK-9436 simplifies the Pregel code. graphx-programming-guide needs to be modified accordingly since it lists the old Pregel code

Author: Alexander Ulanov <nashb@yandex.ru>

Closes #7831 from avulanov/SPARK-9508-pregel-doc2.
2015-08-18 22:13:52 -07:00
Alexander Ulanov 90006f3c51 Pregel example type fix
Pregel example to express single source shortest path from https://spark.apache.org/docs/latest/graphx-programming-guide.html#pregel-api does not work due to incorrect type. The reason is that `GraphGenerators.logNormalGraph` returns the graph with `Long` vertices. Fixing `val graph: Graph[Int, Double]` to `val graph: Graph[Long, Double]`.

Author: Alexander Ulanov <nashb@yandex.ru>

Closes #7695 from avulanov/SPARK-9380-pregel-doc and squashes the following commits:

c269429 [Alexander Ulanov] Pregel example type fix
2015-07-28 01:33:31 +09:00
Brennon York 39fb579683 [SPARK-6510][GraphX]: Add Graph#minus method to act as Set#difference
Adds a `Graph#minus` method which will return only unique `VertexId`'s from the calling `VertexRDD`.

To demonstrate a basic example with pseudocode:

```
Set((0L,0),(1L,1)).minus(Set((1L,1),(2L,2)))
> Set((0L,0))
```

Author: Brennon York <brennon.york@capitalone.com>

Closes #5175 from brennonyork/SPARK-6510 and squashes the following commits:

248d5c8 [Brennon York] added minus(VertexRDD[VD]) method to avoid createUsingIndex and updated the mask operations to simplify with andNot call
3fb7cce [Brennon York] updated graphx doc to reflect the addition of minus method
6575d92 [Brennon York] updated mima exclude
aaa030b [Brennon York] completed graph#minus functionality
7227c0f [Brennon York] beginning work on minus functionality
2015-03-26 19:08:09 -07:00
DEBORAH SIEGEL e7d8ae444f aggregateMessages example in graphX doc
Examples illustrating difference between legacy mapReduceTriplets usage and aggregateMessages usage has type issues on the reduce for both operators.

Being just an example-  changed example to reduce the message String by concatenation. Although non-optimal for performance.

Author: DEBORAH SIEGEL <deborahsiegel@DEBORAHs-MacBook-Pro.local>

Closes #4853 from d3borah/master and squashes the following commits:

db54173 [DEBORAH SIEGEL] fixed aggregateMessages example in graphX doc
2015-03-02 10:15:32 -08:00
Benedikt Linse 5b8480e035 [GraphX] fixing 3 typos in the graphx programming guide
Corrected 3 Typos in the GraphX programming guide. I hope this is the correct way to contribute.

Author: Benedikt Linse <benedikt.linse@gmail.com>

Closes #4766 from 1123/master and squashes the following commits:

8a63812 [Benedikt Linse] fixing 3 typos in the graphx programming guide
2015-02-25 14:46:17 +00:00
Matei Zaharia 4d74f0601a [SPARK-5608] Improve SEO of Spark documentation pages
- Add meta description tags on some of the most important doc pages
- Shorten the titles of some pages to have more relevant keywords; for
  example there's no reason to have "Spark SQL Programming Guide - Spark
  1.2.0 documentation", we can just say "Spark SQL - Spark 1.2.0
  documentation".

Author: Matei Zaharia <matei@databricks.com>

Closes #4381 from mateiz/docs-seo and squashes the following commits:

4940563 [Matei Zaharia] [SPARK-5608] Improve SEO of Spark documentation pages
2015-02-05 11:12:50 -08:00
Reynold Xin b97070ec78 [Doc][GraphX] Remove Motivation section and did some minor update. 2014-11-21 00:29:02 -08:00
Joseph E. Gonzalez 377b068209 Updating GraphX programming guide and documentation
This pull request revises the programming guide to reflect changes in the GraphX API as well as the deprecated mapReduceTriplets operator.

Author: Joseph E. Gonzalez <joseph.e.gonzalez@gmail.com>

Closes #3359 from jegonzal/GraphXProgrammingGuide and squashes the following commits:

4421964 [Joseph E. Gonzalez] updating documentation for graphx
2014-11-19 16:53:33 -08:00
Matei Zaharia c8bf4131bc [SPARK-1566] consolidate programming guide, and general doc updates
This is a fairly large PR to clean up and update the docs for 1.0. The major changes are:

* A unified programming guide for all languages replaces language-specific ones and shows language-specific info in tabs
* New programming guide sections on key-value pairs, unit testing, input formats beyond text, migrating from 0.9, and passing functions to Spark
* Spark-submit guide moved to a separate page and expanded slightly
* Various cleanups of the menu system, security docs, and others
* Updated look of title bar to differentiate the docs from previous Spark versions

You can find the updated docs at http://people.apache.org/~matei/1.0-docs/_site/ and in particular http://people.apache.org/~matei/1.0-docs/_site/programming-guide.html.

Author: Matei Zaharia <matei@databricks.com>

Closes #896 from mateiz/1.0-docs and squashes the following commits:

03e6853 [Matei Zaharia] Some tweaks to configuration and YARN docs
0779508 [Matei Zaharia] tweak
ef671d4 [Matei Zaharia] Keep frames in JavaDoc links, and other small tweaks
1bf4112 [Matei Zaharia] Review comments
4414f88 [Matei Zaharia] tweaks
d04e979 [Matei Zaharia] Fix some old links to Java guide
a34ed33 [Matei Zaharia] tweak
541bb3b [Matei Zaharia] miscellaneous changes
fcefdec [Matei Zaharia] Moved submitting apps to separate doc
61d72b4 [Matei Zaharia] stuff
181f217 [Matei Zaharia] migration guide, remove old language guides
e11a0da [Matei Zaharia] Add more API functions
6a030a9 [Matei Zaharia] tweaks
8db0ae3 [Matei Zaharia] Added key-value pairs section
318d2c9 [Matei Zaharia] tweaks
1c81477 [Matei Zaharia] New section on basics and function syntax
e38f559 [Matei Zaharia] Actually added programming guide to Git
a33d6fe [Matei Zaharia] First pass at updating programming guide to support all languages, plus other tweaks throughout
3b6a876 [Matei Zaharia] More CSS tweaks
01ec8bf [Matei Zaharia] More CSS tweaks
e6d252e [Matei Zaharia] Change color of doc title bar to differentiate from 0.9.0
2014-05-30 00:34:33 -07:00
Ankur Dave 905173df57 Unify GraphImpl RDDs + other graph load optimizations
This PR makes the following changes, primarily in e4fbd329aef85fe2c38b0167255d2a712893d683:

1. *Unify RDDs to avoid zipPartitions.* A graph used to be four RDDs: vertices, edges, routing table, and triplet view. This commit merges them down to two: vertices (with routing table), and edges (with replicated vertices).

2. *Avoid duplicate shuffle in graph building.* We used to do two shuffles when building a graph: one to extract routing information from the edges and move it to the vertices, and another to find nonexistent vertices referred to by edges. With this commit, the latter is done as a side effect of the former.

3. *Avoid no-op shuffle when joins are fully eliminated.* This is a side effect of unifying the edges and the triplet view.

4. *Join elimination for mapTriplets.*

5. *Ship only the needed vertex attributes when upgrading the triplet view.* If the triplet view already contains source attributes, and we now need both attributes, only ship destination attributes rather than re-shipping both. This is done in `ReplicatedVertexView#upgrade`.

Author: Ankur Dave <ankurdave@gmail.com>

Closes #497 from ankurdave/unify-rdds and squashes the following commits:

332ab43 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
4933e2e [Ankur Dave] Exclude RoutingTable from binary compatibility check
5ba8789 [Ankur Dave] Add GraphX upgrade guide from Spark 0.9.1
13ac845 [Ankur Dave] Merge remote-tracking branch 'apache-spark/master' into unify-rdds
a04765c [Ankur Dave] Remove unnecessary toOps call
57202e8 [Ankur Dave] Replace case with pair parameter
75af062 [Ankur Dave] Add explicit return types
04d3ae5 [Ankur Dave] Convert implicit parameter to context bound
c88b269 [Ankur Dave] Revert upgradeIterator to if-in-a-loop
0d3584c [Ankur Dave] EdgePartition.size should be val
2a928b2 [Ankur Dave] Set locality wait
10b3596 [Ankur Dave] Clean up public API
ae36110 [Ankur Dave] Fix style errors
e4fbd32 [Ankur Dave] Unify GraphImpl RDDs + other graph load optimizations
d6d60e2 [Ankur Dave] In GraphLoader, coalesce to minEdgePartitions
62c7b78 [Ankur Dave] In Analytics, take PageRank numIter
d64e8d4 [Ankur Dave] Log current Pregel iteration
2014-05-10 14:48:07 -07:00
Matei Zaharia fc78384704 [SPARK-1439, SPARK-1440] Generate unified Scaladoc across projects and Javadocs
I used the sbt-unidoc plugin (https://github.com/sbt/sbt-unidoc) to create a unified Scaladoc of our public packages, and generate Javadocs as well. One limitation is that I haven't found an easy way to exclude packages in the Javadoc; there is a SBT task that identifies Java sources to run javadoc on, but it's been very difficult to modify it from outside to change what is set in the unidoc package. Some SBT-savvy people should help with this. The Javadoc site also lacks package-level descriptions and things like that, so we may want to look into that. We may decide not to post these right now if it's too limited compared to the Scala one.

Example of the built doc site: http://people.csail.mit.edu/matei/spark-unified-docs/

Author: Matei Zaharia <matei@databricks.com>

This patch had conflicts when merged, resolved by
Committer: Patrick Wendell <pwendell@gmail.com>

Closes #457 from mateiz/better-docs and squashes the following commits:

a63d4a3 [Matei Zaharia] Skip Java/Scala API docs for Python package
5ea1f43 [Matei Zaharia] Fix links to Java classes in Java guide, fix some JS for scrolling to anchors on page load
f05abc0 [Matei Zaharia] Don't include java.lang package names
995e992 [Matei Zaharia] Skip internal packages and class names with $ in JavaDoc
a14a93c [Matei Zaharia] typo
76ce64d [Matei Zaharia] Add groups to Javadoc index page, and a first package-info.java
ed6f994 [Matei Zaharia] Generate JavaDoc as well, add titles, update doc site to use unified docs
acb993d [Matei Zaharia] Add Unidoc plugin for the projects we want Unidoced
2014-04-21 21:57:40 -07:00
Sandy Ryza 698373211e SPARK-1183. Don't use "worker" to mean executor
Author: Sandy Ryza <sandy@cloudera.com>

Closes #120 from sryza/sandy-spark-1183 and squashes the following commits:

5066a4a [Sandy Ryza] Remove "worker" in a couple comments
0bd1e46 [Sandy Ryza] Remove --am-class from usage
bfc8fe0 [Sandy Ryza] Remove am-class from doc and fix yarn-alpha
607539f [Sandy Ryza] Address review comments
74d087a [Sandy Ryza] SPARK-1183. Don't use "worker" to mean executor
2014-03-13 12:11:33 -07:00
Reynold Xin 3d9e66d92a Merge pull request #436 from ankurdave/VertexId-case
Rename VertexID -> VertexId in GraphX
2014-01-14 23:17:05 -08:00
Ankur Dave f4d9019aa8 VertexID -> VertexId 2014-01-14 22:17:18 -08:00
Reynold Xin 3a386e2389 Merge pull request #424 from jegonzal/GraphXProgrammingGuide
Additional edits for clarity in the graphx programming guide.

Added an overview of the Graph and GraphOps functions and fixed numerous typos.
2014-01-14 21:52:50 -08:00
Ankur Dave 1210ec2945 Describe GraphX caching and uncaching in guide 2014-01-14 17:25:38 -08:00
Joseph E. Gonzalez 0bba7738a2 Additional edits for clarity in the graphx programming guide. 2014-01-14 10:31:54 -08:00
Joseph E. Gonzalez 486f37c59c Improving the graphx-programming-guide. 2014-01-14 09:43:33 -08:00
Joseph E. Gonzalez 4bafc4f41f adding documentation about EdgeRDD 2014-01-13 22:55:54 -08:00
Ankur Dave af645be5b8 Fix all code examples in guide 2014-01-13 22:29:45 -08:00
Ankur Dave 2cd9358ccf Finish 6f6f8c928c 2014-01-13 22:29:23 -08:00
Ankur Dave 6f6f8c928c Wrap methods in the appropriate class/object declaration 2014-01-13 21:55:35 -08:00
Ankur Dave 67795dbbfb Write Graph Builders section in guide 2014-01-13 21:45:11 -08:00
Ankur Dave e14a14bcde Remove K-Core and LDA sections from guide; they are unimplemented 2014-01-13 21:12:58 -08:00
Ankur Dave 59e4384e19 Fix Pregel SSSP example in programming guide 2014-01-13 21:02:38 -08:00
Joseph E. Gonzalez ee8931d2c6 Finished documenting vertexrdd. 2014-01-13 19:30:35 -08:00
Joseph E. Gonzalez 552de5d42e Finished second pass on pregel docs. 2014-01-13 18:40:43 -08:00
Joseph E. Gonzalez 622b7f7d39 Minor changes in graphx programming guide. 2014-01-13 18:40:43 -08:00
Joseph E. Gonzalez cfe4a29dcb Improvements in example code for the programming guide as well as adding serialization support for GraphImpl to address issues with failed closure capture. 2014-01-13 17:18:31 -08:00
Ankur Dave 1bd5cefcae Remove aggregateNeighbors 2014-01-13 17:03:03 -08:00
Ankur Dave 8038da2328 Merge pull request #2 from jegonzal/GraphXCCIssue
Improving documentation and identifying potential bug in CC calculation.
2014-01-13 14:59:30 -08:00
Ankur Dave 97cd27e31b Add graph loader links to doc 2014-01-13 14:54:48 -08:00
Ankur Dave 15ca89b11e Fix mapReduceTriplets links in doc 2014-01-13 14:54:33 -08:00
Joseph E. Gonzalez 80e4d98dc6 Improving documentation and identifying potential bug in CC calculation. 2014-01-13 13:40:16 -08:00
Joseph E. Gonzalez 66c9d0092a Tested and corrected all examples up to mask in the graphx-programming-guide. 2014-01-12 22:11:13 -08:00
Ankur Dave 1efe78a101 Use GraphLoader for algorithms examples in doc 2014-01-12 22:03:03 -08:00
Ankur Dave d691e9f47e Move algorithms to GraphOps 2014-01-12 21:47:16 -08:00
Ankur Dave 20c509b805 Add TriangleCount example 2014-01-12 21:41:32 -08:00
Joseph E. Gonzalez c787ff5640 Documenting Pregel API 2014-01-12 20:49:52 -08:00
Ankur Dave 7a4bb863c7 Add connected components example to doc 2014-01-12 16:58:18 -08:00
Ankur Dave 5e35d39e0f Add PageRank example and data 2014-01-12 13:10:53 -08:00
Ankur Dave f096f4eaf1 Link methods in programming guide; document VertexID 2014-01-12 10:55:29 -08:00
Joseph E. Gonzalez cf57b1b055 Correcting typos in documentation. 2014-01-11 17:13:10 -08:00
Joseph E. Gonzalez 64c4593586 Finished docummenting join operators and revised some of the initial presentation. 2014-01-11 13:48:35 -08:00
Ankur Dave 732333d78e Remove GraphLab 2014-01-11 11:49:35 -08:00
Joseph E. Gonzalez fac44bbe2c Finished documenting structural operators and starting join operators. 2014-01-11 11:28:01 -08:00
Joseph E. Gonzalez 1f45e4e572 starting structural operator discussion. 2014-01-11 09:27:00 -08:00
Joseph E. Gonzalez 0c9d39bbaa More organizational changes and dropping the benchmark plot. 2014-01-11 00:09:08 -08:00