Commit graph

2402 commits

Author SHA1 Message Date
Aaron Davidson 42d8b8efe6 Address Matei's comments on documentation
Updates to the documentation and changing some logError()s to logWarning()s.
2013-10-10 00:33:47 -07:00
Prashant Sharma 026ab75661 Merge branch 'master' of github.com:apache/incubator-spark into scala-2.10 2013-10-10 09:42:55 +05:30
Matei Zaharia 478b2b7edc Fix PySpark docs and an overly long line of code after fdbae41e 2013-10-09 12:08:04 -07:00
Aaron Davidson 4ea8ee468f Add docs for standalone scheduler fault tolerance
Also fix a couple HTML/Markdown issues in other files.
2013-10-08 14:18:31 -07:00
Prashant Sharma 7be75682b9 Merge branch 'master' into wip-merge-master
Conflicts:
	bagel/pom.xml
	core/pom.xml
	core/src/test/scala/org/apache/spark/ui/UISuite.scala
	examples/pom.xml
	mllib/pom.xml
	pom.xml
	project/SparkBuild.scala
	repl/pom.xml
	streaming/pom.xml
	tools/pom.xml

In scala 2.10, a shorter representation is used for naming artifacts
 so changed to shorter scala version for artifacts and made it a property in pom.
2013-10-08 11:29:40 +05:30
Nick Pentreath a5e58b8f98 Merge branch 'master' into implicit-als 2013-10-07 11:46:17 +02:00
Patrick Wendell aa9fb84994 Merging build changes in from 0.8 2013-10-05 22:07:00 -07:00
Prashant Sharma c810ee0690 Merge branch 'master' into scala-2.10
Conflicts:
	core/src/test/scala/org/apache/spark/DistributedSuite.scala
	project/SparkBuild.scala
2013-10-05 15:52:57 +05:30
Nick Pentreath 93b96b44d7 Adding implicit feedback ALS to MLlib user guide 2013-10-04 14:39:44 +02:00
tgravescs 0fff4ee852 Adding in the --addJars option to make SparkContext.addJar work on yarn and cleanup
the classpaths
2013-10-03 11:52:16 -05:00
tgravescs bc3b20abdc Allow users to set the application name for Spark on Yarn 2013-10-02 12:54:17 -05:00
Prashant Sharma 5829692885 Merge branch 'master' into scala-2.10
Conflicts:
	core/src/main/scala/org/apache/spark/ui/jobs/JobProgressUI.scala
	docs/_config.yml
	project/SparkBuild.scala
	repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
2013-10-01 11:57:24 +05:30
shane-huang 84849baf88 Merge branch 'reorgscripts' into scripts-reorg 2013-09-27 09:28:33 +08:00
Prashant Sharma 604dc40996 Sync with master and some build fixes 2013-09-26 11:40:02 +05:30
Patrick Wendell 6079721fa1 Update build version in master 2013-09-24 11:41:51 -07:00
Y.CORP.YAHOO.COM\tgraves 9d4246863a Support distributed cache files and archives on spark on yarn and attempt to cleanup the staging directory on exit 2013-09-23 09:09:59 -05:00
shane-huang fcfe4f9204 add admin scripts to sbin
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-09-23 12:42:34 +08:00
shane-huang dfbdc9ddb7 added spark-class and spark-executor to sbin
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-09-23 11:28:58 +08:00
Jey Kottalam ac0dd99394 Fix typo in Maven build docs 2013-09-15 13:29:22 -07:00
Patrick Wendell dbd2c4fd94 Merge pull request #932 from pwendell/mesos-version
Bumping Mesos version to 0.13.0
2013-09-15 13:20:41 -07:00
Patrick Wendell c856860c5b Bumping Mesos version to 0.13.0 2013-09-15 12:46:26 -07:00
Patrick Wendell 362ea0c051 Explain yarn.version in Maven build docs 2013-09-15 12:40:49 -07:00
Prashant Sharma a90e0eff59 version changed 2.9.3 -> 2.10 in shell script. 2013-09-15 12:47:20 +05:30
Benjamin Hindman 8e2602dd70 More updates to Spark on Mesos documentation. 2013-09-11 16:08:54 -07:00
Benjamin Hindman a0f0c1bed2 Updated Spark on Mesos documentation. 2013-09-11 16:05:25 -07:00
Patrick Wendell bddf135670 Change port from 3030 to 4040 2013-09-11 10:01:38 -07:00
Matei Zaharia 2425eb85ca Update Python API features 2013-09-10 11:12:59 -07:00
Patrick Wendell cefee1ed1a Document fortran dependency for MLBase 2013-09-09 21:45:04 -07:00
Matei Zaharia 7a5c4b647b Small tweaks to MLlib docs 2013-09-08 21:47:24 -07:00
Matei Zaharia 7d3204b056 Merge pull request #905 from mateiz/docs2
Job scheduling and cluster mode docs
2013-09-08 21:39:12 -07:00
Matei Zaharia b458854977 Fix some review comments 2013-09-08 21:25:49 -07:00
Ameet Talwalkar 81a8bd46ac response to PR comments 2013-09-08 19:21:30 -07:00
Ameet Talwalkar bf280c8b0f Merge remote-tracking branch 'upstream/master' 2013-09-08 18:41:38 -07:00
Patrick Wendell f68848d95d Merge pull request #906 from pwendell/ganglia-sink
Clean-up of Metrics Code/Docs and Add Ganglia Sink
2013-09-08 18:32:16 -07:00
Ameet Talwalkar 5ac62dbbd0 updates based on comments to PR 2013-09-08 17:39:08 -07:00
Matei Zaharia 5a587fb98d Updated cluster diagram to show caches 2013-09-08 13:51:57 -07:00
Patrick Wendell c190b48bf5 Adding more docs and some code cleanup 2013-09-08 13:46:28 -07:00
Matei Zaharia af8ffdb73c Review comments 2013-09-08 13:36:50 -07:00
Matei Zaharia c0d375107f Some tweaks to CDH/HDP doc 2013-09-08 00:44:41 -07:00
Matei Zaharia f261d2a60f Added cluster overview doc, made logo higher-resolution, and added more
details on monitoring
2013-09-08 00:29:11 -07:00
Matei Zaharia 651a96adf7 More fair scheduler docs and property names.
Also changed uses of "job" terminology to "application" when they
referred to an entire Spark program, to avoid confusion.
2013-09-08 00:29:11 -07:00
Matei Zaharia 98fb69822c Work in progress:
- Add job scheduling docs
- Rename some fair scheduler properties
- Organize intro page better
- Link to Apache wiki for "contributing to Spark"
2013-09-08 00:29:11 -07:00
Matei Zaharia 38488aca8a Merge pull request #900 from pwendell/cdh-docs
Provide docs to describe running on CDH/HDP cluster.
2013-09-08 00:28:53 -07:00
Patrick Wendell 22b982d2bc File rename 2013-09-07 14:38:54 -07:00
Matei Zaharia cfde85e395 Merge pull request #901 from ooyala/2013-09/0.8-doc-changes
0.8 Doc changes for make-distribution.sh
2013-09-07 13:53:08 -07:00
Patrick Wendell 61c4762d45 Changes based on feedback 2013-09-07 11:55:10 -07:00
Evan Chan be1ee28ca6 CR feedback from Matei 2013-09-07 08:56:24 -07:00
Matei Zaharia afe46ba36e Merge pull request #892 from jey/fix-yarn-assembly
YARN build fixes
2013-09-07 07:28:51 -07:00
Evan Chan ff1dbf2106 Add references to make-distribution.sh 2013-09-06 14:20:44 -07:00
Evan Chan 88d53f0dff "launch" scripts is more accurate terminology 2013-09-06 14:03:44 -07:00
Evan Chan 5a18b854a7 Easier way to start the master 2013-09-06 13:59:43 -07:00
Evan Chan 76d5d2d3c5 Add notes about starting spark-shell 2013-09-06 13:53:00 -07:00
Patrick Wendell a2a0cf9d68 Docs describing Spark monitoring and instrumentation 2013-09-06 13:52:57 -07:00
Patrick Wendell e653a9d891 Provide docs to describe running on CDH/HDP cluster.
This doc consolidates information relevant to CDH/HDP users in a single place.
2013-09-06 13:49:57 -07:00
Jey Kottalam 35ed09f1d1 Clarify YARN example 2013-09-06 11:31:16 -07:00
Ameet Talwalkar d52edfa753 updated content 2013-09-05 21:06:50 -07:00
Y.CORP.YAHOO.COM\tgraves c8cc276110 Review comment changes and update to org.apache packaging 2013-09-03 10:50:21 -05:00
Y.CORP.YAHOO.COM\tgraves 547fc4a412 Merge remote-tracking branch 'mesos/master' into yarnUILink
Conflicts:
	core/src/main/scala/org/apache/spark/ui/UIUtils.scala
	core/src/main/scala/org/apache/spark/ui/jobs/PoolTable.scala
	core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala
	docs/running-on-yarn.md
2013-09-03 08:36:59 -05:00
Matei Zaharia 2615cad30b Some doc improvements
- List higher-level projects that run on Spark
- Tweak CSS
2013-09-02 13:35:28 -07:00
Matei Zaharia 9329a7d4cd Fix spark.io.compression.codec and change default codec to LZF 2013-09-02 10:15:22 -07:00
Matei Zaharia 9ee1e9db2e Doc improvements 2013-09-01 22:12:03 -07:00
Matei Zaharia 3db404a43a Run script fixes for Windows after package & assembly change 2013-09-01 23:45:57 +00:00
Matei Zaharia 0a8cc30921 Move some classes to more appropriate packages:
* RDD, *RDDFunctions -> org.apache.spark.rdd
* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
2013-09-01 14:13:16 -07:00
Matei Zaharia 5b4dea2143 More fixes 2013-09-01 14:13:16 -07:00
Matei Zaharia 5701eb92c7 Fix some URLs 2013-09-01 14:13:16 -07:00
Matei Zaharia debcf24389 Fix over-zealous find-and-replace in HTML 2013-09-01 14:13:16 -07:00
Matei Zaharia d27cd03f30 Fix more URLs in docs 2013-09-01 14:13:16 -07:00
Matei Zaharia 4f422032e5 Update docs for new package 2013-09-01 14:13:15 -07:00
Matei Zaharia 4d1cb59fe1 Small tweak to docs gradient 2013-09-01 14:13:15 -07:00
Matei Zaharia 46eecd110a Initial work to rename package to org.apache.spark 2013-09-01 14:13:13 -07:00
Patrick Wendell 0e375a3cc2 Add assembly plug in links 2013-09-01 09:43:42 -07:00
Patrick Wendell 6371febe18 Better docs 2013-08-31 19:09:06 -07:00
Matei Zaharia 9ddad0dcb4 Fixes suggested by Patrick 2013-08-31 17:40:33 -07:00
Matei Zaharia 4819baa658 More updates, describing changes to recommended use of environment vars
and new Python stuff
2013-08-31 14:21:10 -07:00
Matei Zaharia 4293533032 Update docs about HDFS versions 2013-08-30 15:04:43 -07:00
Y.CORP.YAHOO.COM\tgraves 96452eea56 fix up minor things 2013-08-30 16:04:31 -05:00
Y.CORP.YAHOO.COM\tgraves bac46266a9 Link the Spark UI to the Yarn UI 2013-08-30 15:55:32 -05:00
Matei Zaharia f3a964848d More doc improvements + better warnings when you haven't built Spark 2013-08-30 12:41:25 -07:00
Matei Zaharia 23762efda2 New hardware provisioning doc, and updates to menus 2013-08-30 10:16:26 -07:00
Matei Zaharia 1b0f69c623 Change docs color theme for 0.8 2013-08-30 10:15:58 -07:00
Matei Zaharia e11bc18294 Update Maven docs 2013-08-29 21:19:07 -07:00
Matei Zaharia 2de756ff19 Update some build instructions because only sbt assembly and mvn package
are now needed
2013-08-29 21:19:06 -07:00
Matei Zaharia 53cd50c069 Change build and run instructions to use assemblies
This commit makes Spark invocation saner by using an assembly JAR to
find all of Spark's dependencies instead of adding all the JARs in
lib_managed. It also packages the examples into an assembly and uses
that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
with two better-named scripts: "run-examples" for examples, and
"spark-class" for Spark internal classes (e.g. REPL, master, etc). This
is also designed to minimize the confusion people have in trying to use
"run" to run their own classes; it's not meant to do that, but now at
least if they look at it, they can modify run-examples to do a decent
job for them.

As part of this, Bagel's examples are also now properly moved to the
examples package instead of bagel.
2013-08-29 21:19:04 -07:00
Matei Zaharia baa84e7e4c Merge pull request #865 from tgravescs/fixtmpdir
Spark on Yarn should use yarn approved directories for spark.local.dir and tmp
2013-08-28 12:44:46 -07:00
Y.CORP.YAHOO.COM\tgraves 63dc635de6 fix typos 2013-08-26 17:06:20 -05:00
Y.CORP.YAHOO.COM\tgraves c9464c74a1 Add ability for user to specify environment variables 2013-08-26 16:44:27 -05:00
Y.CORP.YAHOO.COM\tgraves 6dd64e8bb2 Update docs and remove old reference to --user option 2013-08-26 14:29:24 -05:00
Patrick Wendell 2cfe52ef55 Version bump for ec2 docs 2013-08-24 15:16:53 -07:00
Patrick Wendell 4879685910 Merge remote-tracking branch 'mesos/master' into ec2-updates 2013-08-24 14:50:58 -07:00
Matei Zaharia 5a6ac12840 Merge pull request #701 from ScrapCodes/documentation-suggestions
Documentation suggestions for spark streaming.
2013-08-22 22:08:03 -07:00
Prashant Sharma 2bc348e92c Linking custom receiver guide 2013-08-23 09:44:02 +05:30
Prashant Sharma 39a1d58da4 Improved documentation for spark custom receiver 2013-08-23 09:38:50 +05:30
Jey Kottalam f9cc1fbf27 Remove references to unsupported Hadoop versions 2013-08-21 17:14:36 -07:00
Patrick Wendell 6be6b71c8c Merge branch 'master' into ec2-updates
Conflicts:
	ec2/spark_ec2.py
2013-08-21 15:34:31 -07:00
Jey Kottalam 6585f49841 Update build docs 2013-08-21 14:51:56 -07:00
Jey Kottalam 9c6f8df30f Update jekyll plugin to match docs/README.md 2013-08-21 12:57:56 -07:00
Matei Zaharia 53b1c30607 Update docs for Spark UI port 2013-08-20 22:57:11 -07:00
Matei Zaharia aa2b89d98d Merge remote-tracking branch 'jey/hadoop-agnostic'
Conflicts:
	core/src/main/scala/spark/PairRDDFunctions.scala
2013-08-20 10:14:15 -07:00
Matei Zaharia 2a4ed10210 Address some review comments:
- When a resourceOffers() call has multiple offers, force the TaskSets
  to consider them in increasing order of locality levels so that they
  get a chance to launch stuff locally across all offers

- Simplify ClusterScheduler.prioritizeContainers

- Add docs on the new configuration options
2013-08-18 19:51:07 -07:00
Jey Kottalam 14b6bcdf93 update YARN docs 2013-08-15 16:50:37 -07:00
Evan Sparks 4346f0a1e9 Merge pull request #809 from shivaram/sgd-cleanup
Clean up scaladoc in ML Lib.
2013-08-12 12:12:12 -07:00
Shivaram Venkataraman 8b5e3e2eb5 Add ML Lib scaladoc to API dropdown 2013-08-11 23:52:43 -07:00
Patrick Wendell 9244524146 Removing dead docs 2013-08-11 20:33:58 -07:00
Shivaram Venkataraman 4935a2558b Clean up scaladoc in ML Lib.
Also build and copy ML Lib scaladoc in Spark docs build.
Some more minor cleanup with respect to naming, test locations etc.
2013-08-11 19:02:43 -07:00
Matei Zaharia de6c4c995a Merge pull request #787 from ash211/master
Update spark-standalone.md
2013-08-06 17:09:50 -07:00
Andrew Ash afc2c80fdb Update spark-standalone.md 2013-08-07 00:44:43 +01:00
Patrick Wendell 5cc725a0e3 Merge branch 'master' into ec2-updates
Conflicts:
	ec2/deploy.generic/root/mesos-ec2/ec2-variables.sh
2013-07-31 21:35:12 -07:00
Patrick Wendell b7b627d5bb Updating relevant documentation 2013-07-31 21:28:27 -07:00
Matei Zaharia 3097d75d6f Merge remote-tracking branch 'dlyubimov/SPARK-827'
Conflicts:
	docs/configuration.md
2013-07-31 18:36:43 -07:00
Reynold Xin 5227043f84 Documentation update for compression codec. 2013-07-30 17:12:16 -07:00
Matei Zaharia 497f55755f Add docs about ipython 2013-07-29 02:51:43 -04:00
Dmitriy Lyubimov 0862494d44 typo 2013-07-27 23:16:20 -07:00
Dmitriy Lyubimov f5067abe85 changes per comments. 2013-07-27 23:08:00 -07:00
Ubuntu 88a0823c58 Consistently invoke bash with /usr/bin/env bash in scripts to make code more portable (JIRA Ticket SPARK-817) 2013-07-18 00:51:18 +00:00
Matei Zaharia af3c9d5042 Add Apache license headers and LICENSE and NOTICE files 2013-07-16 17:21:33 -07:00
Matei Zaharia d47c16f78d Add an option to disable reference tracking in Kryo 2013-07-15 01:55:54 +00:00
Andy Konwinski cd7259b4b8 Fixes typos in Spark Streaming Programming Guide
These typos were reported on the spark-users mailing list, see: https://groups.google.com/d/msg/spark-users/SyLGgJlKCrI/LpeBypOkSMUJ
2013-07-12 11:51:14 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
root 7cd490ef5b Clarify that PySpark is not supported on Windows 2013-07-01 06:26:43 +00:00
Matei Zaharia 5bbd0eec84 Update docs on SCALA_LIBRARY_PATH 2013-06-30 17:00:40 -07:00
Matei Zaharia 03d0b858c8 Made use of spark.executor.memory setting consistent and documented it
Conflicts:

	core/src/main/scala/spark/SparkContext.scala
2013-06-30 15:46:46 -07:00
Matei Zaharia aea727f68d Simplify Python docs a little to do substring search 2013-06-26 21:15:09 -07:00
Patrick Wendell a59c15a37e Adding config option for retained stages 2013-06-26 08:54:57 -07:00
Tathagata Das c89af0a7f9 Merge branch 'master' into streaming
Conflicts:
	.gitignore
2013-06-24 23:57:47 -07:00
Matei Zaharia b5df1cd668 ADD_JARS environment variable for spark-shell 2013-06-22 17:14:44 -07:00
Reynold Xin 0eab7a78b9 Fixed a couple typos and formatting problems in the YARN documentation. 2013-05-17 18:05:46 -07:00
Reynold Xin 7760d78b3a Merge branch 'master' of https://github.com/mridulm/spark 2013-05-17 17:58:36 -07:00
Mridul Muralidharan da2642bead Fix example jar name 2013-05-17 06:58:46 +05:30
Reynold Xin 3b3300383a Updated Scala version in docs generation ruby script. 2013-05-16 16:51:28 -07:00
Mridul Muralidharan f16c781709 Fix documentation to use yarn-standalone as master 2013-05-16 17:50:22 +05:30
Mridul Muralidharan 87540a7b38 Fix running on yarn documentation 2013-05-16 15:27:58 +05:30
Andrew Ash afcad7b3aa Docs: Mention spark shell's default for MASTER 2013-05-15 14:45:14 -03:00
Mridul Muralidharan ee37612bc9 1) Add support for HADOOP_CONF_DIR (and/or YARN_CONF_DIR - use either) : which is used to specify the client side configuration directory : which needs to be part of the CLASSPATH.
2) Move from var+=".." to var="$var.." : the former does not work on older bash shells unfortunately.
2013-05-11 11:12:22 +05:30
Matei Zaharia cf54b824ff Merge pull request #580 from pwendell/quickstart
SPARK-739 Have quickstart standlone job use README
2013-04-25 11:45:58 -07:00
Patrick Wendell a72134a6ac SPARK-739 Have quickstart standlone job use README 2013-04-25 10:39:28 -07:00
Mridul Muralidharan dd515ca3ee Attempt at fixing merge conflict 2013-04-24 09:24:17 +05:30
Mridul Muralidharan ac2e8e8720 Add some basic documentation 2013-04-19 00:13:19 +05:30
seanm ab0f834dbb adding spark.streaming.blockInterval property 2013-04-16 11:57:05 -06:00
Andy Konwinski 60a91b3b59 Update quick-start.md heading on Operations (not just Transformations). 2013-04-12 12:34:51 -07:00
Andrew Ash 6efc8cae8f Typos: cluser -> cluster 2013-04-10 13:44:10 -03:00
Matei Zaharia 65caa8f711 Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0'
Conflicts:
	docs/_config.yml
	project/SparkBuild.scala
2013-04-08 12:43:17 -04:00
Matei Zaharia a1586412d6 Updated link to SBT 2013-04-07 20:31:19 -04:00
Matei Zaharia 34a47b8bc9 Update Scala version in docs 2013-04-07 20:27:03 -04:00
Matei Zaharia a98996d1fe Merge pull request #545 from ash211/patch-1
Don't use deprecated Application in example
2013-03-29 22:12:15 -07:00
Jey Kottalam bc8ba222ff Bump development version to 0.8.0 2013-03-28 15:42:01 -07:00
Andrew Ash e8f3669c63 Update tuning.md
Make the example more compilable
2013-03-28 19:17:39 -03:00
Andrew Ash 4e2c965383 Don't use deprecated Application in example
As of 2.9.0 extending from Application is not recommended

http://www.scala-lang.org/api/2.9.3/index.html#scala.Application
2013-03-28 17:47:37 -03:00
Andy Konwinski 446b801b3b Fixing typos pointed out by Matei 2013-03-20 17:30:31 -07:00
Andy Konwinski ad7f0452ab Adds page to docs about building using Maven.
Adds links to new instructions in:
* The main Spark project README.md
* The docs nav menu called "More"
* The docs Overview page under the "Building" and "Where to Go from Here" sections
2013-03-17 15:02:40 -07:00
Andy Konwinski c9097628fc Fix broken link to YARN documentation. 2013-03-13 14:51:13 -07:00
Andy Konwinski cf73fbd305 Fix another broken link in quick start. 2013-03-13 02:23:44 -07:00
Andy Konwinski b63109763b Fix broken link in Quick Start. 2013-03-13 02:02:34 -07:00
Matei Zaharia db9b90fdbd Change version to 0.7.1-SNAPSHOT for development branch 2013-02-27 09:15:26 -08:00
Matei Zaharia 4f840f4e54 Added commented-out Google analytics code for website docs 2013-02-27 09:14:11 -08:00
Matei Zaharia fadeb1ddea More doc tweaks 2013-02-26 23:20:49 -08:00
Matei Zaharia 22334eafd9 Some tweaks to docs 2013-02-26 22:52:38 -08:00
Matei Zaharia 9a046e30ac Switch docs to use Akka repo instead of Typesafe 2013-02-25 22:18:47 -08:00
Matei Zaharia 7e67c626ee Change version number to 0.7.0 2013-02-25 20:30:47 -08:00
Tathagata Das 6a78ef0578 Merge pull request #500 from pwendell/streaming-docs
Minor changes based on feedback
2013-02-25 15:28:45 -08:00
Patrick Wendell 8316534eef meta-data 2013-02-25 15:27:04 -08:00
Patrick Wendell 918ee25867 One more change done with TD 2013-02-25 15:24:17 -08:00
Matei Zaharia 351ac5233e Some tweaks to docs 2013-02-25 15:19:05 -08:00
Matei Zaharia 2ae15353a1 Merge branch 'master' of github.com:mesos/spark 2013-02-25 15:14:16 -08:00
Matei Zaharia 490f056cdd Allow passing sparkHome and JARs to StreamingContext constructor
Also warns if spark.cleaner.ttl is not set in the version where you pass
your own SparkContext.
2013-02-25 15:13:30 -08:00
Patrick Wendell 07f2618769 Minor changes based on feedback 2013-02-25 15:09:59 -08:00
Patrick Wendell 50ce0516e6 Some changes to streaming failure docs.
TD gave me the go-ahead to just make these changes:
- Define stateful dstream
- Some minor wording fixes
2013-02-25 14:38:39 -08:00
Matei Zaharia 5d4a0ac794 Some tweaks to docs 2013-02-25 14:23:03 -08:00
Matei Zaharia 848321f910 Change doc color scheme slightly for Spark 0.7 (to differ from 0.6) 2013-02-25 13:15:30 -08:00
Matei Zaharia 3c7dcb61ab Use a single setting for disabling API doc build 2013-02-25 13:15:12 -08:00
Matei Zaharia d6e6abece3 Merge pull request #459 from stephenh/bettersplits
Change defaultPartitioner to use upstream split size.
2013-02-25 09:22:04 -08:00
Stephen Haberman 44032bc476 Merge branch 'master' into bettersplits
Conflicts:
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
	core/src/test/scala/spark/ShuffleSuite.scala
2013-02-24 22:08:14 -06:00
Tathagata Das abb5471865 Removing duplicate doc. 2013-02-24 16:32:44 -08:00
Tathagata Das 5ab37be983 Fixed class paths and dependencies based on Matei's comments. 2013-02-24 16:24:52 -08:00
Tathagata Das b4eb24de96 Updated streaming programming guide with Java API info, and comments from Patrick. 2013-02-23 23:59:45 -08:00
Tathagata Das d853aa9658 Change spark.cleaner.delay to spark.cleaner.ttl. Updated docs. 2013-02-23 17:42:26 -08:00
Tathagata Das 1cb725e417 Merge branch 'mesos-master' into streaming 2013-02-20 09:55:35 -08:00
Tathagata Das fb9956256d Merge branch 'mesos-master' into streaming
Conflicts:
	core/src/main/scala/spark/rdd/CheckpointRDD.scala
	streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Andy Konwinski ecd137a72d Fixes link to issue tracker in documentation page "Contributing to Spark". 2013-02-19 16:58:02 -08:00
Tathagata Das 9e82be1503 Merge branch 'streaming' into ScrapCodes-streaming-actor
Conflicts:
	docs/plugin-custom-receiver.md
	streaming/src/main/scala/spark/streaming/StreamingContext.scala
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
	streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
	streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
2013-02-19 02:48:50 -08:00
Tathagata Das 12ea14c211 Changed networkStream to socketStream and pluggableNetworkStream to become networkStream as a way to create streams from arbitrary network receiver. 2013-02-18 15:18:34 -08:00
Tathagata Das 6a6e6bda57 Merge branch 'streaming' into ScrapCode-streaming
Conflicts:
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
2013-02-18 13:26:12 -08:00
Tathagata Das 8ad561dc7d Added checkpointing and fault-tolerance semantics to the programming guide. Fixed default checkpoint interval to being a multiple of slide duration. Fixed visibility of some classes and objects to clean up docs. 2013-02-18 02:12:41 -08:00
Stephen Haberman 6cd68c31cb Update default.parallelism docs, have StandaloneSchedulerBackend use it.
Only brand new RDDs (e.g. parallelize and makeRDD) now use default
parallelism, everything else uses their largest parent's partitioner
or partition size.
2013-02-16 00:29:11 -06:00
Matei Zaharia e8663e0fe5 Merge pull request #461 from JoshRosen/fix/issue-tracker-link
Update issue tracker link in contributing guide
2013-02-13 18:42:17 -08:00
Matei Zaharia 05d2e94838 Use a separate memory setting for standalone cluster daemons
Conflicts:
	docs/_config.yml
2013-02-10 21:59:41 -08:00
Josh Rosen 131b56afd0 Update issue tracker link in contributing guide. 2013-02-10 13:28:31 -08:00
Matei Zaharia b1d809913b Merge pull request #460 from markhamstra/404
Fixed a 404 in 'Tuning Spark' -- missing '.html'
2013-02-10 13:01:09 -08:00
Mark Hamstra 4975dcdafc Fixed a 404 -- missing '.html' 2013-02-10 12:55:47 -08:00
Mark Hamstra b8863a79d3 Merge branch 'master' of https://github.com/mesos/spark into commutative
Conflicts:
	core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Mark Hamstra 934a53c8b6 Change docs on 'reduce' since the merging of local reduces no longer preserves
ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
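The commutativity requirement in the 'reduce' commit above can be illustrated with a plain-Python sketch. It does not use Spark: `merged_results` is a hypothetical helper that reduces each partition locally and then merges the partial results in every possible order, an assumption standing in for a merge step that does not preserve ordering.

```python
from functools import reduce
from itertools import permutations


def merged_results(partitions, op):
    """Reduce each partition locally, then merge the partials in every order.

    Returns the set of distinct final results. A set with a single element
    means the outcome does not depend on the merge order.
    """
    partials = [reduce(op, part) for part in partitions]
    return {reduce(op, order) for order in permutations(partials)}


def concat(x, y):
    # String concatenation is associative but NOT commutative.
    return x + y


def add(x, y):
    # Integer addition is both associative and commutative.
    return x + y


letters = [["a", "b"], ["c"], ["d", "e"]]
numbers = [[1, 2], [3], [4, 5]]

print(sorted(merged_results(letters, concat)))  # several different strings
print(merged_results(numbers, add))             # a single value
```

With an unordered merge, the associative-only function gives a result that depends on which partial arrives first, while the commutative one does not.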
Matei Zaharia 55327a283e Merge pull request #430 from pwendell/pyspark-guide
Minor improvements to PySpark docs
2013-01-30 15:35:29 -08:00
Patrick Wendell 3f945e3b83 Make module help available in python shell.
Also, adds a line in doc explaining how to use.
2013-01-30 15:04:06 -08:00
Patrick Wendell 58a7d320d7 Include packaging and launching pyspark in guide.
It's nicer if all the commands you need are made explicit.
2013-01-30 15:04:02 -08:00
Stephen Haberman 7dfb82a992 Replace old 'master' term with 'driver'. 2013-01-25 11:03:00 -06:00
Tathagata Das 155f31398d Made StorageLevel constructor private, and added StorageLevels.create() to the Java API. Updates scala and java programming guides. 2013-01-23 01:10:26 -08:00
Prashant Sharma d17065c4b5 actor as receiver 2013-01-22 13:28:29 +05:30
Matei Zaharia 76d7c0ce2b Add more Akka settings to docs 2013-01-21 13:10:33 -08:00
Matei Zaharia 2173f6c7ca Clarify the documentation on env variables for standalone mode 2013-01-21 13:03:20 -08:00
Matei Zaharia 86057ec7c8 Merge branch 'master' into streaming
Conflicts:
	core/src/main/scala/spark/api/python/PythonRDD.scala
2013-01-20 12:47:55 -08:00
Josh Rosen 9f54d7e1f5 Merge pull request #387 from mateiz/python-accumulators
Add accumulators to PySpark
2013-01-20 11:00:36 -08:00
Patrick Wendell 5f74ead636 Changes based on Matei's comment 2013-01-20 08:59:20 -08:00
Matei Zaharia ee5a07955c Fix Python guide to say accumulators are available 2013-01-20 02:11:58 -08:00
Patrick Wendell ecdff861f7 Clarifying log directory in EC2 guide 2013-01-19 22:59:35 -08:00
Prashant Sharma 56b9bd197c Plug in actor as stream receiver API 2013-01-19 22:04:07 +05:30
Tathagata Das cd1521cfdb Merge branch 'master' into streaming
Conflicts:
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/FilteredRDD.scala
	docs/_layouts/global.html
	docs/index.md
	run
2013-01-15 12:08:51 -08:00
Tathagata Das 0dbd411a56 Added documentation for PairDStreamFunctions. 2013-01-13 21:08:35 -08:00
Matei Zaharia fbb3fc4143 Merge pull request #346 from JoshRosen/python-api
Python API (PySpark)
2013-01-12 23:49:36 -08:00
Michael Heuer 480c4139bb add repositories section to simple job pom.xml 2013-01-11 11:40:17 -06:00
Josh Rosen b57dd0f160 Add mapPartitionsWithSplit() to PySpark. 2013-01-08 16:05:02 -08:00
Tathagata Das 237bac36e9 Renamed examples and added documentation. 2013-01-07 14:37:21 -08:00
Josh Rosen ce9f1bbe20 Add pyspark script to replace the other scripts.
Expand the PySpark programming guide.
2013-01-01 21:25:49 -08:00
Josh Rosen b58340dbd9 Rename top-level 'pyspark' directory to 'python' 2013-01-01 15:05:00 -08:00
Josh Rosen 170e451fbd Minor documentation and style fixes for PySpark. 2013-01-01 13:52:14 -08:00
Tathagata Das 02497f0cd4 Updated Streaming Programming Guide. 2013-01-01 12:21:32 -08:00
Tathagata Das 9e644402c1 Improved jekyll and scala docs. Made many classes and method private to remove them from scala docs. 2012-12-29 18:31:51 -08:00
Josh Rosen c5cee53f20 Merge remote-tracking branch 'origin/master' into python-api
Conflicts:
	docs/quick-start.md
2012-12-29 16:00:51 -08:00
Josh Rosen c2b105af34 Add documentation for Python API. 2012-12-28 22:51:28 -08:00
Josh Rosen 85b8f2c64f Add epydoc API documentation for PySpark. 2012-12-27 18:04:10 -08:00
Tathagata Das 7c33f76291 Merge branch 'mesos' into dev-merge 2012-12-26 19:19:07 -08:00
Reynold Xin c68a076037 Updated Kryo documentation for Kryo version update. 2012-12-21 16:03:17 -08:00
Reynold Xin eac566a7f4 Merge branch 'master' of github.com:mesos/spark into dev
Conflicts:
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/PairRDDFunctions.scala
	core/src/main/scala/spark/ParallelCollection.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/CartesianRDD.scala
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/CoalescedRDD.scala
	core/src/main/scala/spark/rdd/FilteredRDD.scala
	core/src/main/scala/spark/rdd/FlatMappedRDD.scala
	core/src/main/scala/spark/rdd/GlommedRDD.scala
	core/src/main/scala/spark/rdd/HadoopRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsWithSplitRDD.scala
	core/src/main/scala/spark/rdd/MappedRDD.scala
	core/src/main/scala/spark/rdd/PipedRDD.scala
	core/src/main/scala/spark/rdd/SampledRDD.scala
	core/src/main/scala/spark/rdd/ShuffledRDD.scala
	core/src/main/scala/spark/rdd/UnionRDD.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockManagerId.scala
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
	core/src/main/scala/spark/storage/StorageLevel.scala
	core/src/main/scala/spark/util/MetadataCleaner.scala
	core/src/main/scala/spark/util/TimeStampedHashMap.scala
	core/src/test/scala/spark/storage/BlockManagerSuite.scala
	run
2012-12-20 14:53:40 -08:00
Josh Rosen 1948f46093 Use spark-env.sh to configure standalone master. See SPARK-638.
Also fixed a typo in the standalone mode documentation.
2012-12-14 01:20:00 +00:00
Patrick Wendell 6ceb559994 Adding multi-jar constructor in quickstart 2012-11-27 23:32:24 -08:00
Tathagata Das 0fe2fc4d5e Merged branch mesos/master to branch dev. 2012-11-26 13:16:59 -08:00
Matei Zaharia 6adc7c965f Doc fix 2012-11-16 20:49:02 -08:00
Patrick Wendell d39ac5fbc1 Streaming programming guide. STREAMING-2 #resolve 2012-11-13 21:19:58 -08:00
Matei Zaharia 51477e8874 Merge pull request #294 from JoshRosen/docs/quickstart
Fix minor typos in quickstart and Scala programming guides
2012-10-27 16:56:39 -07:00
Josh Rosen 33bea24f8e Fix Spark groupId in Scala Programming Guide. 2012-10-26 15:01:28 -07:00
Josh Rosen c4aa10154e Fix minor typos in quick start guide. 2012-10-23 13:49:52 -07:00
Matei Zaharia 0967e71a00 Bump up version to 0.7.0-SNAPSHOT for master branch 2012-10-22 11:49:42 -07:00
Matei Zaharia 902a608187 Update version to 0.6.1-SNAPSHOT to show this is in development 2012-10-22 11:43:57 -07:00
Matei Zaharia 6999724ce8 Fix a path in the web UI 2012-10-20 23:33:37 -07:00
Patrick Wendell 7a03a0e35d Adding dependency repos in quickstart example 2012-10-14 11:48:24 -07:00
Matei Zaharia 4be12d97ec Some doc fixes, including showing version number in nav bar again 2012-10-13 19:05:11 -07:00
Matei Zaharia 19910c00c3 tweaks 2012-10-13 16:22:39 -07:00
Matei Zaharia 4a3e9cf69c Document how to configure SPARK_MEM & co on a per-job basis 2012-10-13 16:20:25 -07:00
Matei Zaharia 8d7b77bcb5 Some doc and usability improvements:
- Added a StorageLevels class for easy access to StorageLevel constants
  in Java
- Added doc comments on Function classes in Java
- Updated Accumulator and HadoopWriter docs slightly
2012-10-12 17:53:20 -07:00
Matei Zaharia 1183b30941 Merge branch 'dev' of github.com:mesos/spark into dev 2012-10-12 14:40:07 -07:00
Matei Zaharia 603b419fdf Tweak 2012-10-12 14:40:00 -07:00
Patrick Wendell 5788a92953 Updating Bagel build instructions 2012-10-09 22:39:29 -07:00
Patrick Wendell 0f760a0bd3 Updating programming guide with new link instructions 2012-10-09 22:39:29 -07:00
Patrick Wendell 4de5cc1ad4 Removing reference to publish-local in the quickstart 2012-10-09 22:39:29 -07:00
Patrick Wendell 8321e7f0c2 Fixing YARN instructions 2012-10-09 22:39:28 -07:00
Patrick Wendell 5013c785fd Adding SNAPSHOT to Spark version in doc config 2012-10-09 22:39:28 -07:00
Matei Zaharia bc0bc672d0 Updates to documentation:
- Edited quick start and tuning guide to simplify them a little
- Simplified top menu bar
- Made private a SparkContext constructor parameter that was left as
  public
- Various small fixes
2012-10-09 14:30:23 -07:00
Matei Zaharia 4780fee887 Merge pull request #260 from andyk/update-docs-to-use-version-vars
Updates docs to use the new version num vars and adds Spark version in nav bar
2012-10-09 09:46:20 -07:00
Andy Konwinski e1a724f39c Updating lots of docs to use the new special version number variables,
also adding the version to the navbar so it is easy to tell which
version of Spark these docs were compiled for.
2012-10-08 17:17:17 -07:00
Patrick Wendell 7887e171f4 Adding new download instructions 2012-10-08 17:05:13 -07:00
Andy Konwinski 56b91ab380 Updates README.md with instructions for running jekyll without building
scaladoc (i.e. run `SKIP_SCALADOC=1 jekyll`).
2012-10-08 12:15:44 -07:00
Andy Konwinski 89f8e1c2e7 Merge remote-tracking branch 'public-spark/dev' into add-version-vars-to-docs
Conflicts:
	docs/quick-start.md
2012-10-08 12:15:38 -07:00
Matei Zaharia 88152e2164 Merge pull request #254 from pwendell/quickstart-fix
Removing one link in quickstart
2012-10-08 11:08:29 -07:00
Andy Konwinski 45d03231d0 Adds liquid variables to docs templating system so that they can be used
throughout the docs: SPARK_VERSION, SCALA_VERSION, and MESOS_VERSION.

To use them, e.g. use {{site.SPARK_VERSION}}.

Also removes uses of {{HOME_PATH}} which were being resolved to ""
by the templating system anyway.
2012-10-08 10:30:38 -07:00
Andy Konwinski cd710a7718 Fixes the small gap above the nav menu dropdown boxes and the hoverable
menu items that was causing the dropdowns to go away when the user
moved their mouse down towards them.
2012-10-08 09:33:40 -07:00
Patrick Wendell 99aac2f6c4 Removing one link in quickstart 2012-10-08 08:53:27 -07:00
Matei Zaharia efc5423210 Made compression configurable separately for shuffle, broadcast and RDDs 2012-10-07 11:30:53 -07:00
Matei Zaharia dc28a3ac0a Modified shuffle to limit the maximum outstanding data size in bytes,
instead of the maximum number of outstanding fetches. This should make
it faster when there are many small map output files, as well as more
robust to overallocating memory on large map outputs.
2012-10-06 20:07:10 -07:00
Matei Zaharia e6e27a05d8 Links quick start from nav bar 2012-10-05 17:06:55 -07:00
Patrick Wendell e84c068fab Some additions to the Tuning Guide.
1. Slight change in organization
2. Added pre-requisites
3. Made a new section about determining memory footprint
   of an RDD
4. Other small changes
2012-10-03 14:06:34 -07:00
Matei Zaharia 42cd148507 Merge pull request #236 from pwendell/quickstart
A Spark "Quick Start" example
2012-10-03 08:31:43 -07:00
Patrick Wendell 35b767f478 Responding to Matei's comments 2012-10-02 23:54:03 -07:00
Shivaram Venkataraman 3d2b900b08 First cut at adding documentation for GC tuning 2012-10-02 20:07:18 -07:00
Patrick Wendell f78edf94cf A Spark "Quick Start" example
This commit includes a quick start example that covers:
1) Basic usage of the Spark shell
2) A simple Spark job in Scala
3) A simple Spark job in Java
2012-10-01 17:24:09 -07:00
Matei Zaharia 802aa8aef9 Some bug fixes and logging fixes for broadcast. 2012-10-01 15:20:42 -07:00
Reynold Xin f5812d0354 Added mapPartitionsWithSplit to the programming guide. 2012-09-29 01:31:36 -07:00
Josh Rosen 37c199bbb0 Allow controlling number of splits in distinct(). 2012-09-28 23:44:19 -07:00
Matei Zaharia 009b0e37e7 Added an option to compress blocks in the block store 2012-09-27 18:45:44 -07:00
Matei Zaharia 7bcb08cef5 Renamed storage levels to something cleaner; fixes #223. 2012-09-27 17:50:59 -07:00
Matei Zaharia bf18e0994e Minor typos 2012-09-26 23:53:38 -07:00
Matei Zaharia a4093f7563 Minor doc fixes 2012-09-26 23:22:15 -07:00
Matei Zaharia ea05fc130b Updates to standalone cluster, web UI and deploy docs. 2012-09-26 22:54:39 -07:00
Matei Zaharia 874a9fd407 More updates to docs, including tuning guide 2012-09-26 19:17:58 -07:00
Matei Zaharia 58eb44acbb Doc tweaks 2012-09-26 00:32:59 -07:00
Matei Zaharia d51d5e0582 Doc fixes 2012-09-25 23:59:04 -07:00
Matei Zaharia c5754bb939 Fixes to Java guide 2012-09-25 23:51:04 -07:00
Matei Zaharia f1246cc7c1 Various enhancements to the programming guide and HTML/CSS 2012-09-25 23:26:56 -07:00
Matei Zaharia 56c90485fd More updates to documentation 2012-09-25 19:31:07 -07:00
Matei Zaharia 1821bf1f1f Merge branch 'dev' of github.com:mesos/spark into dev 2012-09-25 15:46:27 -07:00
Matei Zaharia e47e11720f Documentation updates 2012-09-25 15:46:18 -07:00
Andy Konwinski 098351b735 Makes nav menu dropdowns show on hover instead of on click. 2012-09-25 15:42:50 -07:00
Matei Zaharia 30362a21e7 Update license info on deploy scripts 2012-09-25 14:43:47 -07:00
Matei Zaharia 60bce9574f Update Jekyll plugin to look at Scala 2.9.2 docs 2012-09-25 14:31:53 -07:00
Josh Rosen c94e9cc54a Add Java Programming Guide; fix broken doc links. 2012-09-16 20:46:46 -07:00
Andy Konwinski 52c29071a4 - Add docs/api to .gitignore
- Rework/expand the nav bar with more of the docs site
- Removing parts of docs about EC2 and Mesos that differentiate between
  running 0.5 and before
    - Merged subheadings from running-on-amazon-ec2.html that are still relevant
      (i.e., "Using a newer version of Spark" and "Accessing Data in S3") into
      ec2-scripts.html and deleted running-on-amazon-ec2.html
- Added some TODO comments to a few docs
- Updated the blurb about AMP Camp
- Renamed programming-guide to spark-programming-guide
- Fixing typos/etc. in Standalone Spark doc
2012-09-16 15:28:52 -07:00
Andy Konwinski 6765d9727e Adds a jekyll plugin (written in Ruby) to the _plugins directory
which generates scala doc by calling `sbt/sbt doc`, copies it over
to docs, and updates the links from the api webpage to now point to
the copied over scaladoc (making the _site directory easy to just
copy over to a public website).
2012-09-13 17:17:58 -07:00
Andy Konwinski 462cd8be60 Re-enabling responsive for better looking padding and more sane resizing,
but removed the collapsible stuff from the nav bar.
2012-09-13 15:39:35 -07:00
Andy Konwinski 5ec7a6665b More crisp logo created from vector source (ai) and disabled
responsive css (so nav menu doesn't switch to collapsed version
for narrow viewports).
2012-09-13 15:27:33 -07:00
Andy Konwinski b0207e2bfd Replaces "Spark" word in nav bar with logo. 2012-09-13 12:08:12 -07:00
Andy Konwinski 130f2f2f41 Merge remote-tracking branch 'public-spark/dev' into doc 2012-09-13 11:29:25 -07:00
Denny 638a8511fa Fixed navbar style problem 2012-09-13 09:56:27 -07:00
Denny 6d53b971b9 Added standalone and YARN docs. Merged standalone cluster into standalone doc 2012-09-13 09:47:54 -07:00
Andy Konwinski ca2c999e0f Making the link to api scaladocs work and migrating other code snippets
to use pygments syntax highlighting.
2012-09-12 23:25:07 -07:00
Andy Konwinski c4db09ea76 Adds ec2-scripts.md back (it was mistakenly removed earlier due to
git weirdness).
2012-09-12 20:56:32 -07:00
Matei Zaharia d3db46fdef Remove title from content in Bagel 2012-09-12 19:50:38 -07:00
Matei Zaharia 88181bee9e Small tweaks to generated doc pages 2012-09-12 19:47:31 -07:00
Andy Konwinski 1bcd09e093 Merge remote-tracking branch 'public-spark/dev' into doc 2012-09-12 19:28:55 -07:00
Andy Konwinski 35adccd008 Adds syntax highlighting (via pygments), and some style tweaks to make things
easier to read.
2012-09-12 19:27:44 -07:00
Reynold Xin 4b8b0d5c08 Added a link to AMPCamp website in Spark docs. 2012-09-12 17:54:16 -07:00
Andy Konwinski 95c5a376b5 Fixing a hanging sentence in docs/ec2-scripts.md 2012-09-12 16:17:24 -07:00
Andy Konwinski 3c7e40fa27 Removed the upper-case version of docs/EC2-Scripts.md. 2012-09-12 16:09:34 -07:00
Andy Konwinski 4d3a17c8d7 Fixing lots of broken links. 2012-09-12 16:06:18 -07:00
Andy Konwinski 49e98500a9 Updated base README to point to documentation site instead of wiki, updated
docs/README.md to describe use of Jekyll, and renamed things to make them
more consistent with the lower-case-with-hyphens convention.
2012-09-12 13:03:43 -07:00
Andy Konwinski 16da942d66 Adding docs directory containing documentation currently on the wiki
which can be compiled via jekyll, using the command `jekyll`. To compile
and run a local webserver to serve the doc as a website, run
`jekyll --server`.
2012-09-12 13:03:43 -07:00