Welcome to the Spark documentation!

This readme will walk you through navigating and building the Spark documentation, which is included
here with the Spark source code. You can also find documentation specific to release versions of
Spark at http://spark.apache.org/documentation.html.

Read on to learn more about viewing documentation in plain text (i.e., markdown) or building the
documentation yourself. Why build it yourself? So that you have the docs that correspond to
whichever version of Spark you currently have checked out of revision control.

## Generating the Documentation HTML

We include the Spark documentation as part of the source (as opposed to using a hosted wiki, such as
the GitHub wiki, as the definitive documentation) to enable the documentation to evolve along with
the source code and be captured by revision control (currently Git). This way the code automatically
includes the version of the documentation that is relevant regardless of which version or release
you have checked out or downloaded.

In this directory you will find text files formatted using Markdown, with an ".md" suffix. You can
read those text files directly if you want. Start with index.md.

The markdown code can be compiled to HTML using the [Jekyll tool](http://jekyllrb.com).
To use the `jekyll` command, you will need to have Jekyll installed.
The easiest way to do this is via a Ruby Gem; see the
[jekyll installation instructions](http://jekyllrb.com/docs/installation).
If `kramdown` is not already installed, install it with `sudo gem install kramdown`.

Execute `jekyll` from the `docs/` directory. Compiling the site with Jekyll will create a directory
called `_site` containing index.html as well as the rest of the compiled files.
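
To confirm that a build produced output, you can check for the file described above. The following
is a minimal sketch; `site_is_built` is a hypothetical helper (not part of Spark or Jekyll) and it
assumes only what this readme states, namely that the compiled site lands in `_site` and includes
index.html.

```shell
# Hypothetical helper: succeed if the given docs directory holds a compiled site.
# Assumption taken from the readme: Jekyll writes to `_site`, which contains index.html.
site_is_built() {
    docs_dir="${1:-.}"   # default to the current directory
    [ -f "$docs_dir/_site/index.html" ]
}
```

For example, from the `docs/` directory: `jekyll && site_is_built . && echo "site built"`.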

You can modify the default Jekyll build as follows:

    # Skip generating API docs (which takes a while)
    $ SKIP_SCALADOC=1 jekyll build
    # Serve content locally on port 4000
    $ jekyll serve --watch
    # Build the site with extra features used on the live page
    $ PRODUCTION=1 jekyll build
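
The `NAME=1 command` form used in these commands is ordinary shell syntax: the variable is set only
in the environment of that one command. The sketch below illustrates this, using `sh -c` as a
stand-in for `jekyll` (an assumption made so the snippet runs without Jekyll installed).

```shell
# Start from a clean state so the result does not depend on the caller's environment.
unset SKIP_SCALADOC

# `sh -c` stands in for `jekyll` here; it simply reports the value it sees.
inside=$(SKIP_SCALADOC=1 sh -c 'echo "${SKIP_SCALADOC:-unset}"')

# Back in the calling shell, the variable was never set.
after="${SKIP_SCALADOC:-unset}"

echo "inside the command: $inside"
echo "after the command:  $after"
```

The variable is visible only to the prefixed command, so a flag set this way does not carry over to
later `jekyll` runs in the same shell.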

## Pygments

We also use Pygments (http://pygments.org) for syntax highlighting in documentation markdown pages,
so you will also need to install that (it requires Python) by running `sudo easy_install Pygments`.
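
Before building, it can help to confirm that the required tools are on your `PATH`. The sketch
below is a hypothetical helper (`missing_tools` is not part of Spark). It assumes that finding the
`pygmentize` script, which Pygments installs, is an acceptable sign that Pygments is available.

```shell
# Hypothetical helper: print each named command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools jekyll pygmentize` prints nothing when both commands are found, and one
name per line for each command that is missing.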

To mark a block of code in your markdown to be syntax highlighted by jekyll during the compile
phase, use the following syntax:

    {% highlight scala %}
    // Your scala code goes here, you can replace scala with many other
    // supported languages too.
    {% endhighlight %}

## API Docs (Scaladoc and Epydoc)

You can build just the Spark scaladoc by running `sbt/sbt doc` from the SPARK_PROJECT_ROOT directory.

Similarly, you can build just the PySpark epydoc by running `epydoc --config epydoc.conf` from the
SPARK_PROJECT_ROOT/pyspark directory. Documentation is only generated for classes that are listed as
public in `__init__.py`.

When you run `jekyll` in the `docs` directory, it will also copy over the scaladoc for the various
Spark subprojects into the `docs` directory (and then also into the `_site` directory). We use a
jekyll plugin to run `sbt/sbt doc` before building the site, so if you haven't run it recently, the
build may take some time while it generates all of the scaladoc. The jekyll plugin also generates
the PySpark docs using [epydoc](http://epydoc.sourceforge.net/).

NOTE: To skip the step of building and copying over the Scala and Python API docs, run
`SKIP_API=1 jekyll`.