Patrick Wendell
a2a0cf9d68
Docs describing Spark monitoring and instrumentation
2013-09-06 13:52:57 -07:00
Patrick Wendell
e653a9d891
Provide docs to describe running on CDH/HDP cluster.
...
This doc consolidates information relevant to CDH/HDP users in a single place.
2013-09-06 13:49:57 -07:00
Jey Kottalam
35ed09f1d1
Clarify YARN example
2013-09-06 11:31:16 -07:00
Ameet Talwalkar
d52edfa753
updated content
2013-09-05 21:06:50 -07:00
Y.CORP.YAHOO.COM\tgraves
c8cc276110
Review comment changes and update to org.apache packaging
2013-09-03 10:50:21 -05:00
Y.CORP.YAHOO.COM\tgraves
547fc4a412
Merge remote-tracking branch 'mesos/master' into yarnUILink
...
Conflicts:
core/src/main/scala/org/apache/spark/ui/UIUtils.scala
core/src/main/scala/org/apache/spark/ui/jobs/PoolTable.scala
core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala
docs/running-on-yarn.md
2013-09-03 08:36:59 -05:00
Matei Zaharia
2615cad30b
Some doc improvements
...
- List higher-level projects that run on Spark
- Tweak CSS
2013-09-02 13:35:28 -07:00
Matei Zaharia
9329a7d4cd
Fix spark.io.compression.codec and change default codec to LZF
2013-09-02 10:15:22 -07:00
Matei Zaharia
9ee1e9db2e
Doc improvements
2013-09-01 22:12:03 -07:00
Matei Zaharia
3db404a43a
Run script fixes for Windows after package & assembly change
2013-09-01 23:45:57 +00:00
Matei Zaharia
0a8cc30921
Move some classes to more appropriate packages:
...
* RDD, *RDDFunctions -> org.apache.spark.rdd
* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
2013-09-01 14:13:16 -07:00
Matei Zaharia
5b4dea2143
More fixes
2013-09-01 14:13:16 -07:00
Matei Zaharia
5701eb92c7
Fix some URLs
2013-09-01 14:13:16 -07:00
Matei Zaharia
debcf24389
Fix over-zealous find-and-replace in HTML
2013-09-01 14:13:16 -07:00
Matei Zaharia
d27cd03f30
Fix more URLs in docs
2013-09-01 14:13:16 -07:00
Matei Zaharia
4f422032e5
Update docs for new package
2013-09-01 14:13:15 -07:00
Matei Zaharia
4d1cb59fe1
Small tweak to docs gradient
2013-09-01 14:13:15 -07:00
Matei Zaharia
46eecd110a
Initial work to rename package to org.apache.spark
2013-09-01 14:13:13 -07:00
Patrick Wendell
0e375a3cc2
Add assembly plugin links
2013-09-01 09:43:42 -07:00
Patrick Wendell
6371febe18
Better docs
2013-08-31 19:09:06 -07:00
Matei Zaharia
9ddad0dcb4
Fixes suggested by Patrick
2013-08-31 17:40:33 -07:00
Matei Zaharia
4819baa658
More updates, describing changes to recommended use of environment vars and new Python stuff
2013-08-31 14:21:10 -07:00
Matei Zaharia
4293533032
Update docs about HDFS versions
2013-08-30 15:04:43 -07:00
Y.CORP.YAHOO.COM\tgraves
96452eea56
fix up minor things
2013-08-30 16:04:31 -05:00
Y.CORP.YAHOO.COM\tgraves
bac46266a9
Link the Spark UI to the Yarn UI
2013-08-30 15:55:32 -05:00
Matei Zaharia
f3a964848d
More doc improvements + better warnings when you haven't built Spark
2013-08-30 12:41:25 -07:00
Matei Zaharia
23762efda2
New hardware provisioning doc, and updates to menus
2013-08-30 10:16:26 -07:00
Matei Zaharia
1b0f69c623
Change docs color theme for 0.8
2013-08-30 10:15:58 -07:00
Matei Zaharia
e11bc18294
Update Maven docs
2013-08-29 21:19:07 -07:00
Matei Zaharia
2de756ff19
Update some build instructions because only sbt assembly and mvn package are now needed
2013-08-29 21:19:06 -07:00
Matei Zaharia
53cd50c069
Change build and run instructions to use assemblies
...
This commit makes Spark invocation saner by using an assembly JAR to
find all of Spark's dependencies instead of adding all the JARs in
lib_managed. It also packages the examples into an assembly and uses
that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
with two better-named scripts: "run-examples" for examples, and
"spark-class" for Spark internal classes (e.g. REPL, master, etc). This
is also designed to minimize the confusion people have in trying to use
"run" to run their own classes; it's not meant to do that, but now at
least if they look at it, they can modify run-examples to do a decent
job for them.
As part of this, Bagel's examples are also now properly moved to the
examples package instead of bagel.
2013-08-29 21:19:04 -07:00
Matei Zaharia
baa84e7e4c
Merge pull request #865 from tgravescs/fixtmpdir
...
Spark on YARN should use YARN-approved directories for spark.local.dir and tmp
2013-08-28 12:44:46 -07:00
Y.CORP.YAHOO.COM\tgraves
63dc635de6
fix typos
2013-08-26 17:06:20 -05:00
Y.CORP.YAHOO.COM\tgraves
c9464c74a1
Add ability for user to specify environment variables
2013-08-26 16:44:27 -05:00
Y.CORP.YAHOO.COM\tgraves
6dd64e8bb2
Update docs and remove old reference to --user option
2013-08-26 14:29:24 -05:00
Patrick Wendell
2cfe52ef55
Version bump for ec2 docs
2013-08-24 15:16:53 -07:00
Patrick Wendell
4879685910
Merge remote-tracking branch 'mesos/master' into ec2-updates
2013-08-24 14:50:58 -07:00
Matei Zaharia
5a6ac12840
Merge pull request #701 from ScrapCodes/documentation-suggestions
...
Documentation suggestions for spark streaming.
2013-08-22 22:08:03 -07:00
Prashant Sharma
2bc348e92c
Linking custom receiver guide
2013-08-23 09:44:02 +05:30
Prashant Sharma
39a1d58da4
Improved documentation for spark custom receiver
2013-08-23 09:38:50 +05:30
Jey Kottalam
f9cc1fbf27
Remove references to unsupported Hadoop versions
2013-08-21 17:14:36 -07:00
Patrick Wendell
6be6b71c8c
Merge branch 'master' into ec2-updates
...
Conflicts:
ec2/spark_ec2.py
2013-08-21 15:34:31 -07:00
Jey Kottalam
6585f49841
Update build docs
2013-08-21 14:51:56 -07:00
Jey Kottalam
9c6f8df30f
Update jekyll plugin to match docs/README.md
2013-08-21 12:57:56 -07:00
Matei Zaharia
53b1c30607
Update docs for Spark UI port
2013-08-20 22:57:11 -07:00
Matei Zaharia
aa2b89d98d
Merge remote-tracking branch 'jey/hadoop-agnostic'
...
Conflicts:
core/src/main/scala/spark/PairRDDFunctions.scala
2013-08-20 10:14:15 -07:00
Matei Zaharia
2a4ed10210
Address some review comments:
...
- When a resourceOffers() call has multiple offers, force the TaskSets
to consider them in increasing order of locality levels so that they
get a chance to launch stuff locally across all offers
- Simplify ClusterScheduler.prioritizeContainers
- Add docs on the new configuration options
2013-08-18 19:51:07 -07:00
Jey Kottalam
14b6bcdf93
update YARN docs
2013-08-15 16:50:37 -07:00
Evan Sparks
4346f0a1e9
Merge pull request #809 from shivaram/sgd-cleanup
...
Clean up scaladoc in ML Lib.
2013-08-12 12:12:12 -07:00
Shivaram Venkataraman
8b5e3e2eb5
Add ML Lib scaladoc to API dropdown
2013-08-11 23:52:43 -07:00
Patrick Wendell
9244524146
Removing dead docs
2013-08-11 20:33:58 -07:00
Shivaram Venkataraman
4935a2558b
Clean up scaladoc in ML Lib.
...
Also build and copy ML Lib scaladoc in Spark docs build.
Some more minor cleanup with respect to naming, test locations etc.
2013-08-11 19:02:43 -07:00
Matei Zaharia
de6c4c995a
Merge pull request #787 from ash211/master
...
Update spark-standalone.md
2013-08-06 17:09:50 -07:00
Andrew Ash
afc2c80fdb
Update spark-standalone.md
2013-08-07 00:44:43 +01:00
Patrick Wendell
5cc725a0e3
Merge branch 'master' into ec2-updates
...
Conflicts:
ec2/deploy.generic/root/mesos-ec2/ec2-variables.sh
2013-07-31 21:35:12 -07:00
Patrick Wendell
b7b627d5bb
Updating relevant documentation
2013-07-31 21:28:27 -07:00
Matei Zaharia
3097d75d6f
Merge remote-tracking branch 'dlyubimov/SPARK-827'
...
Conflicts:
docs/configuration.md
2013-07-31 18:36:43 -07:00
Reynold Xin
5227043f84
Documentation update for compression codec.
2013-07-30 17:12:16 -07:00
Matei Zaharia
497f55755f
Add docs about ipython
2013-07-29 02:51:43 -04:00
Dmitriy Lyubimov
0862494d44
typo
2013-07-27 23:16:20 -07:00
Dmitriy Lyubimov
f5067abe85
changes per comments.
2013-07-27 23:08:00 -07:00
Ubuntu
88a0823c58
Consistently invoke bash with /usr/bin/env bash in scripts to make code more portable (JIRA Ticket SPARK-817)
2013-07-18 00:51:18 +00:00
Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Matei Zaharia
d47c16f78d
Add an option to disable reference tracking in Kryo
2013-07-15 01:55:54 +00:00
Andy Konwinski
cd7259b4b8
Fixes typos in Spark Streaming Programming Guide
...
These typos were reported on the spark-users mailing list, see: https://groups.google.com/d/msg/spark-users/SyLGgJlKCrI/LpeBypOkSMUJ
2013-07-12 11:51:14 -07:00
Matei Zaharia
1ffadb2d9e
Merge remote-tracking branch 'pwendell/ui-updates'
...
Conflicts:
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
2013-07-06 15:51:41 -07:00
root
7cd490ef5b
Clarify that PySpark is not supported on Windows
2013-07-01 06:26:43 +00:00
Matei Zaharia
5bbd0eec84
Update docs on SCALA_LIBRARY_PATH
2013-06-30 17:00:40 -07:00
Matei Zaharia
03d0b858c8
Made use of spark.executor.memory setting consistent and documented it
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2013-06-30 15:46:46 -07:00
Matei Zaharia
aea727f68d
Simplify Python docs a little to do substring search
2013-06-26 21:15:09 -07:00
Patrick Wendell
a59c15a37e
Adding config option for retained stages
2013-06-26 08:54:57 -07:00
Tathagata Das
c89af0a7f9
Merge branch 'master' into streaming
...
Conflicts:
.gitignore
2013-06-24 23:57:47 -07:00
Matei Zaharia
b5df1cd668
ADD_JARS environment variable for spark-shell
2013-06-22 17:14:44 -07:00
Reynold Xin
0eab7a78b9
Fixed a couple of typos and formatting problems in the YARN documentation.
2013-05-17 18:05:46 -07:00
Reynold Xin
7760d78b3a
Merge branch 'master' of https://github.com/mridulm/spark
2013-05-17 17:58:36 -07:00
Mridul Muralidharan
da2642bead
Fix example jar name
2013-05-17 06:58:46 +05:30
Reynold Xin
3b3300383a
Updated Scala version in docs generation ruby script.
2013-05-16 16:51:28 -07:00
Mridul Muralidharan
f16c781709
Fix documentation to use yarn-standalone as master
2013-05-16 17:50:22 +05:30
Mridul Muralidharan
87540a7b38
Fix running on yarn documentation
2013-05-16 15:27:58 +05:30
Andrew Ash
afcad7b3aa
Docs: Mention spark shell's default for MASTER
2013-05-15 14:45:14 -03:00
Mridul Muralidharan
ee37612bc9
1) Add support for HADOOP_CONF_DIR (or YARN_CONF_DIR; use either), which specifies the client-side configuration directory and needs to be part of the CLASSPATH.
...
2) Move from var+=".." to var="$var..", since the former unfortunately does not work on older bash shells.
2013-05-11 11:12:22 +05:30
Matei Zaharia
cf54b824ff
Merge pull request #580 from pwendell/quickstart
...
SPARK-739 Have quickstart standalone job use README
2013-04-25 11:45:58 -07:00
Patrick Wendell
a72134a6ac
SPARK-739 Have quickstart standalone job use README
2013-04-25 10:39:28 -07:00
Mridul Muralidharan
dd515ca3ee
Attempt at fixing merge conflict
2013-04-24 09:24:17 +05:30
Mridul Muralidharan
ac2e8e8720
Add some basic documentation
2013-04-19 00:13:19 +05:30
seanm
ab0f834dbb
adding spark.streaming.blockInterval property
2013-04-16 11:57:05 -06:00
Andy Konwinski
60a91b3b59
Update quick-start.md heading on Operations (not just Transformations).
2013-04-12 12:34:51 -07:00
Andrew Ash
6efc8cae8f
Typos: cluser -> cluster
2013-04-10 13:44:10 -03:00
Matei Zaharia
65caa8f711
Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0'
...
Conflicts:
docs/_config.yml
project/SparkBuild.scala
2013-04-08 12:43:17 -04:00
Matei Zaharia
a1586412d6
Updated link to SBT
2013-04-07 20:31:19 -04:00
Matei Zaharia
34a47b8bc9
Update Scala version in docs
2013-04-07 20:27:03 -04:00
Matei Zaharia
a98996d1fe
Merge pull request #545 from ash211/patch-1
...
Don't use deprecated Application in example
2013-03-29 22:12:15 -07:00
Jey Kottalam
bc8ba222ff
Bump development version to 0.8.0
2013-03-28 15:42:01 -07:00
Andrew Ash
e8f3669c63
Update tuning.md
...
Make the example more compilable
2013-03-28 19:17:39 -03:00
Andrew Ash
4e2c965383
Don't use deprecated Application in example
...
As of 2.9.0 extending from Application is not recommended
http://www.scala-lang.org/api/2.9.3/index.html#scala.Application
2013-03-28 17:47:37 -03:00
Andy Konwinski
446b801b3b
Fixing typos pointed out by Matei
2013-03-20 17:30:31 -07:00
Andy Konwinski
ad7f0452ab
Adds page to docs about building using Maven.
...
Adds links to new instructions in:
* The main Spark project README.md
* The docs nav menu called "More"
* The docs Overview page under the "Building" and "Where to Go from Here" sections
2013-03-17 15:02:40 -07:00
Andy Konwinski
c9097628fc
Fix broken link to YARN documentation.
2013-03-13 14:51:13 -07:00
Andy Konwinski
cf73fbd305
Fix another broken link in quick start.
2013-03-13 02:23:44 -07:00
Andy Konwinski
b63109763b
Fix broken link in Quick Start.
2013-03-13 02:02:34 -07:00
Matei Zaharia
db9b90fdbd
Change version to 0.7.1-SNAPSHOT for development branch
2013-02-27 09:15:26 -08:00
Matei Zaharia
4f840f4e54
Added commented-out Google analytics code for website docs
2013-02-27 09:14:11 -08:00
Matei Zaharia
fadeb1ddea
More doc tweaks
2013-02-26 23:20:49 -08:00
Matei Zaharia
22334eafd9
Some tweaks to docs
2013-02-26 22:52:38 -08:00
Matei Zaharia
9a046e30ac
Switch docs to use Akka repo instead of Typesafe
2013-02-25 22:18:47 -08:00
Matei Zaharia
7e67c626ee
Change version number to 0.7.0
2013-02-25 20:30:47 -08:00
Tathagata Das
6a78ef0578
Merge pull request #500 from pwendell/streaming-docs
...
Minor changes based on feedback
2013-02-25 15:28:45 -08:00
Patrick Wendell
8316534eef
meta-data
2013-02-25 15:27:04 -08:00
Patrick Wendell
918ee25867
One more change done with TD
2013-02-25 15:24:17 -08:00
Matei Zaharia
351ac5233e
Some tweaks to docs
2013-02-25 15:19:05 -08:00
Matei Zaharia
2ae15353a1
Merge branch 'master' of github.com:mesos/spark
2013-02-25 15:14:16 -08:00
Matei Zaharia
490f056cdd
Allow passing sparkHome and JARs to StreamingContext constructor
...
Also warns if spark.cleaner.ttl is not set in the version where you pass
your own SparkContext.
2013-02-25 15:13:30 -08:00
Patrick Wendell
07f2618769
Minor changes based on feedback
2013-02-25 15:09:59 -08:00
Patrick Wendell
50ce0516e6
Some changes to streaming failure docs.
...
TD gave me the go-ahead to just make these changes:
- Define stateful dstream
- Some minor wording fixes
2013-02-25 14:38:39 -08:00
Matei Zaharia
5d4a0ac794
Some tweaks to docs
2013-02-25 14:23:03 -08:00
Matei Zaharia
848321f910
Change doc color scheme slightly for Spark 0.7 (to differ from 0.6)
2013-02-25 13:15:30 -08:00
Matei Zaharia
3c7dcb61ab
Use a single setting for disabling API doc build
2013-02-25 13:15:12 -08:00
Matei Zaharia
d6e6abece3
Merge pull request #459 from stephenh/bettersplits
...
Change defaultPartitioner to use upstream split size.
2013-02-25 09:22:04 -08:00
Stephen Haberman
44032bc476
Merge branch 'master' into bettersplits
...
Conflicts:
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/test/scala/spark/ShuffleSuite.scala
2013-02-24 22:08:14 -06:00
Tathagata Das
abb5471865
Removing duplicate doc.
2013-02-24 16:32:44 -08:00
Tathagata Das
5ab37be983
Fixed class paths and dependencies based on Matei's comments.
2013-02-24 16:24:52 -08:00
Tathagata Das
b4eb24de96
Updated streaming programming guide with Java API info, and comments from Patrick.
2013-02-23 23:59:45 -08:00
Tathagata Das
d853aa9658
Change spark.cleaner.delay to spark.cleaner.ttl. Updated docs.
2013-02-23 17:42:26 -08:00
Tathagata Das
1cb725e417
Merge branch 'mesos-master' into streaming
2013-02-20 09:55:35 -08:00
Tathagata Das
fb9956256d
Merge branch 'mesos-master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CheckpointRDD.scala
streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Andy Konwinski
ecd137a72d
Fixes link to issue tracker in documentation page "Contributing to Spark".
2013-02-19 16:58:02 -08:00
Tathagata Das
9e82be1503
Merge branch 'streaming' into ScrapCodes-streaming-actor
...
Conflicts:
docs/plugin-custom-receiver.md
streaming/src/main/scala/spark/streaming/StreamingContext.scala
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
2013-02-19 02:48:50 -08:00
Tathagata Das
12ea14c211
Changed networkStream to socketStream, and pluggableNetworkStream to networkStream, as a way to create streams from an arbitrary network receiver.
2013-02-18 15:18:34 -08:00
Tathagata Das
6a6e6bda57
Merge branch 'streaming' into ScrapCode-streaming
...
Conflicts:
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
2013-02-18 13:26:12 -08:00
Tathagata Das
8ad561dc7d
Added checkpointing and fault-tolerance semantics to the programming guide. Fixed default checkpoint interval to be a multiple of the slide duration. Fixed visibility of some classes and objects to clean up docs.
2013-02-18 02:12:41 -08:00
Stephen Haberman
6cd68c31cb
Update default.parallelism docs, have StandaloneSchedulerBackend use it.
...
Only brand new RDDs (e.g. parallelize and makeRDD) now use default
parallelism; everything else uses its largest parent's partitioner
or partition size.
2013-02-16 00:29:11 -06:00
Matei Zaharia
e8663e0fe5
Merge pull request #461 from JoshRosen/fix/issue-tracker-link
...
Update issue tracker link in contributing guide
2013-02-13 18:42:17 -08:00
Matei Zaharia
05d2e94838
Use a separate memory setting for standalone cluster daemons
...
Conflicts:
docs/_config.yml
2013-02-10 21:59:41 -08:00
Josh Rosen
131b56afd0
Update issue tracker link in contributing guide.
2013-02-10 13:28:31 -08:00
Matei Zaharia
b1d809913b
Merge pull request #460 from markhamstra/404
...
Fixed a 404 in 'Tuning Spark' -- missing '.html'
2013-02-10 13:01:09 -08:00
Mark Hamstra
4975dcdafc
Fixed a 404 -- missing '.html'
2013-02-10 12:55:47 -08:00
Mark Hamstra
b8863a79d3
Merge branch 'master' of https://github.com/mesos/spark into commutative
...
Conflicts:
core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Mark Hamstra
934a53c8b6
Change docs on 'reduce' since the merging of local reduces no longer preserves ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Matei Zaharia
55327a283e
Merge pull request #430 from pwendell/pyspark-guide
...
Minor improvements to PySpark docs
2013-01-30 15:35:29 -08:00
Patrick Wendell
3f945e3b83
Make module help available in python shell.
...
Also adds a line to the docs explaining how to use it.
2013-01-30 15:04:06 -08:00
Patrick Wendell
58a7d320d7
Include packaging and launching pyspark in guide.
...
It's nicer if all the commands you need are made explicit.
2013-01-30 15:04:02 -08:00
Stephen Haberman
7dfb82a992
Replace old 'master' term with 'driver'.
2013-01-25 11:03:00 -06:00
Tathagata Das
155f31398d
Made StorageLevel constructor private, and added StorageLevels.create() to the Java API. Updates Scala and Java programming guides.
2013-01-23 01:10:26 -08:00
Prashant Sharma
d17065c4b5
actor as receiver
2013-01-22 13:28:29 +05:30
Matei Zaharia
76d7c0ce2b
Add more Akka settings to docs
2013-01-21 13:10:33 -08:00
Matei Zaharia
2173f6c7ca
Clarify the documentation on env variables for standalone mode
2013-01-21 13:03:20 -08:00
Matei Zaharia
86057ec7c8
Merge branch 'master' into streaming
...
Conflicts:
core/src/main/scala/spark/api/python/PythonRDD.scala
2013-01-20 12:47:55 -08:00
Josh Rosen
9f54d7e1f5
Merge pull request #387 from mateiz/python-accumulators
...
Add accumulators to PySpark
2013-01-20 11:00:36 -08:00
Patrick Wendell
5f74ead636
Changes based on Matei's comment
2013-01-20 08:59:20 -08:00
Matei Zaharia
ee5a07955c
Fix Python guide to say accumulators are available
2013-01-20 02:11:58 -08:00