spark-instrumented-optimizer/docs/configuration.md
Andy Konwinski 52c29071a4 - Add docs/api to .gitignore
- Rework/expand the nav bar with more of the docs site
- Removing parts of docs about EC2 and Mesos that differentiate between
  running 0.5 and before
    - Merged subheadings from running-on-amazon-ec2.html that are still relevant
      (i.e., "Using a newer version of Spark" and "Accessing Data in S3") into
      ec2-scripts.html and deleted running-on-amazon-ec2.html
- Added some TODO comments to a few docs
- Updated the blurb about AMP Camp
- Renamed programming-guide to spark-programming-guide
- Fixing typos/etc. in Standalone Spark doc
2012-09-16 15:28:52 -07:00

layout: global
title: Spark Configuration

Spark is configured primarily through the conf/spark-env.sh script. This script doesn't exist in the Git repository, but you can create it by copying conf/spark-env.sh.template. Make sure the script is executable.
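As a minimal sketch of the step above, the commands below copy the template and mark the result executable. They assume the current directory is the top of your Spark checkout. The `if` guard is not something Spark requires; it only keeps the snippet harmless when run from another directory.

```shell
# Assumes the current directory is the top of your Spark checkout.
if [ -f conf/spark-env.sh.template ]; then
  # Start from the template shipped in the conf directory.
  cp conf/spark-env.sh.template conf/spark-env.sh
  # The script must be executable.
  chmod +x conf/spark-env.sh
fi
```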

Inside this script, you can set several environment variables:

  • SCALA_HOME to point to your Scala installation.
  • MESOS_NATIVE_LIBRARY if you are running on a Mesos cluster.
  • SPARK_MEM to set the amount of memory used per node. This should be in the same format as the JVM's -Xmx option, e.g. 300m or 1g.
  • SPARK_JAVA_OPTS to add JVM options. This includes system properties that you'd like to pass with -D.
  • SPARK_CLASSPATH to add elements to Spark's classpath.
  • SPARK_LIBRARY_PATH to add search directories for native libraries.

The spark-env.sh script is executed when you submit jobs with run, when you start the interpreter with spark-shell, and on each worker node of a Mesos cluster, where it sets up the environment for that worker.

The first setting to adjust is usually the memory (SPARK_MEM). Set it high enough to run your job but lower than the total memory on each machine, leaving at least 1 GB for the operating system.
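To make the variables described above concrete, here is an illustrative conf/spark-env.sh. Every path, value, and property name in it is a placeholder chosen for this example, not a recommended setting, and the 8 GB machine is a hypothetical. The use of `export` is also an assumption: it makes the variables visible to child processes, which may or may not be needed depending on how the launch scripts load this file.

```shell
#!/usr/bin/env bash
# Illustrative conf/spark-env.sh; all paths and values are placeholders.

# Location of the Scala installation (hypothetical path).
export SCALA_HOME=/opt/scala

# Only needed on a Mesos cluster (hypothetical path, left commented out).
# export MESOS_NATIVE_LIBRARY=/usr/local/lib/libmesos.so

# Memory per node, in the same format as the JVM's -Xmx option.
# On a hypothetical 8 GB machine, 6g leaves more than 1 GB for the OS.
export SPARK_MEM=6g

# Extra JVM options; -D passes a system property (this name is made up).
export SPARK_JAVA_OPTS="-Dmy.example.property=value"

# Extra classpath entries and native library directories (hypothetical paths).
export SPARK_CLASSPATH=/opt/myapp/lib/myapp.jar
export SPARK_LIBRARY_PATH=/opt/myapp/native
```

Adjust SPARK_MEM first, then add the other variables only as your job needs them.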

Logging Configuration

Spark uses log4j for logging. You can configure it by adding a log4j.properties file in the conf directory. One way to start is to copy the existing log4j.properties.template located there.
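If you would rather write the file by hand than copy the template, the sketch below creates a small log4j.properties that logs at INFO level to the console. The property keys are standard log4j 1.x settings, but the chosen level and pattern are assumptions for illustration, and the contents of the shipped template may differ. `CONF_DIR` is a hypothetical variable used only in this example, and the snippet overwrites any existing file at that path.

```shell
# CONF_DIR is a hypothetical variable for this example; it defaults to ./conf.
CONF_DIR="${CONF_DIR:-conf}"
mkdir -p "$CONF_DIR"

# Write a minimal console-only configuration (overwrites an existing file).
cat > "$CONF_DIR/log4j.properties" <<'EOF'
# Log everything at INFO and above to the console.
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
EOF
```

Changing INFO to WARN on the rootLogger line would reduce the amount of output.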