Patrick Wendell
5d460253d6
Merge pull request #228 from pwendell/master
...
Document missing configs and set shuffle consolidation to false.
2013-12-05 12:31:24 -08:00
Patrick Wendell
1450b8ef87
Small changes from Matei review
2013-12-04 18:49:32 -08:00
Patrick Wendell
b1c6fa1584
Document missing configs and set shuffle consolidation to false.
2013-12-04 18:39:34 -08:00
Andrew Ash
0c5af38b86
Typo: applicaton
2013-12-04 12:30:25 -08:00
Prashant Sharma
17987778da
Merge branch 'master' into wip-scala-2.10
...
Conflicts:
core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala
core/src/main/scala/org/apache/spark/rdd/MapPartitionsWithContextRDD.scala
core/src/main/scala/org/apache/spark/rdd/RDD.scala
python/pyspark/rdd.py
2013-11-27 14:44:12 +05:30
Prashant Sharma
54862af5ee
Improvements from the review comments and followed Boy Scout Rule.
2013-11-27 14:26:28 +05:30
Prashant Sharma
dca946ff67
Documenting the newly added spark properties.
2013-11-26 20:47:38 +05:30
Andrew Ash
08afef37a0
Update tuning.md
...
Clarify when serializer is used based on recent user@ mailing list discussion.
2013-11-25 17:08:52 -08:00
Matei Zaharia
eb4296c8f7
Merge pull request #101 from colorant/yarn-client-scheduler
...
For SPARK-527, Support spark-shell when running on YARN
sync to trunk and resubmit here
In the current YARN mode, the application runs inside the Application Master as a user program, so the whole Spark context lives on a remote node.
This approach does not support applications that involve local interaction and need to run where they are launched.
So in this pull request I have added a YarnClientClusterScheduler and backend.
With this scheduler, the user application is launched locally, while the executors are launched by YARN on remote nodes with a thin AM that only launches the executors and monitors the Driver Actor status, so that when the client app is done, it can finish the YARN application as well.
This enables spark-shell to run on YARN.
This also enables other Spark applications to run the Spark context locally with a master URL of "yarn-client". Thus, e.g., SparkPi can print its result locally on the console instead of in the log of the remote machine where the AM is running.
Docs also updated to show how to use this yarn-client mode.
2013-11-25 15:25:29 -08:00
Prashant Sharma
44fd30d3fb
Merge branch 'master' into scala-2.10-wip
...
Conflicts:
core/src/main/scala/org/apache/spark/rdd/RDD.scala
project/SparkBuild.scala
2013-11-25 18:10:54 +05:30
Reynold Xin
6bcac986b2
Merge branch 'master' of github.com:apache/incubator-spark
2013-11-25 15:47:47 +08:00
Matei Zaharia
859d62dc2a
Merge pull request #151 from russellcardullo/add-graphite-sink
...
Add graphite sink for metrics
This adds a metrics sink for graphite. The sink must
be configured with the host and port of a graphite node
and optionally may be configured with a prefix that will
be prepended to all metrics that are sent to graphite.
2013-11-24 16:19:51 -08:00
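To illustrate the host/port/prefix configuration described in the commit above, here is a minimal Python sketch of how a prefix might be prepended to a metric name before the metric is sent using Graphite's plaintext protocol (one "path value timestamp" line per metric). This is not the sink's actual implementation, which lives in Spark's Scala code; the function names and the choice of a dot as the separator between prefix and metric name are assumptions made for illustration.

```python
import socket
from typing import Iterable, Optional


def graphite_line(name: str, value: float, timestamp: int,
                  prefix: Optional[str] = None) -> str:
    """Format one metric as a Graphite plaintext line: 'path value timestamp\\n'.

    Hypothetical helper: joining prefix and name with a dot is an assumption.
    """
    path = f"{prefix}.{name}" if prefix else name
    return f"{path} {value} {timestamp}\n"


def send_metrics(host: str, port: int, lines: Iterable[str]) -> None:
    """Send already-formatted lines to a Graphite node over TCP (hypothetical helper)."""
    with socket.create_connection((host, port)) as conn:
        for line in lines:
            conn.sendall(line.encode("ascii"))
```

The prefix is optional, matching the commit description: when it is omitted, the metric name is sent unchanged.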
Raymond Liu
ab3cefde53
Add YarnClientClusterScheduler and Backend.
...
With this scheduler, the user application is launched locally,
while the executors are launched by YARN on remote nodes.
This enables spark-shell to run on YARN.
2013-11-22 09:23:27 +08:00
Prashant Sharma
95d8dbce91
Merge branch 'master' of github.com:apache/incubator-spark into scala-2.10-temp
...
Conflicts:
core/src/main/scala/org/apache/spark/util/collection/PrimitiveVector.scala
streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
2013-11-21 12:34:46 +05:30
Neal Wiggins
21b5478ed6
Fix Kryo Serializer buffer inconsistency
...
The documentation here is inconsistent with the coded default and other documentation.
2013-11-20 16:19:25 -08:00
tgravescs
4093e9393a
Improve Spark on Yarn Error handling
2013-11-19 12:44:00 -06:00
Aaron Davidson
f629ba95b6
Various merge corrections
...
I've diff'd this patch against my own -- since they were both created
independently, this means that two sets of eyes have gone over all the
merge conflicts that were created, so I'm feeling significantly more
confident in the resulting PR.
@rxin has looked at the changes to the repl and is resoundingly
confident that they are correct.
2013-11-14 22:13:09 -08:00
RIA-pierre-borckmans
bef398e572
Fixed typos in the CDH4 distributions version codes.
2013-11-14 11:33:48 +01:00
Raymond Liu
a60620b76a
Merge branch 'master' into scala-2.10
2013-11-14 12:44:19 +08:00
Raymond Liu
0f2e3c6e31
Merge branch 'master' into scala-2.10
2013-11-13 16:55:11 +08:00
Russell Cardullo
ef85a51f85
Add graphite sink for metrics
...
This adds a metrics sink for graphite. The sink must
be configured with the host and port of a graphite node
and optionally may be configured with a prefix that will
be prepended to all metrics that are sent to graphite.
2013-11-08 16:36:03 -08:00
Reynold Xin
551a43fd3d
Merge branch 'master' of github.com:apache/incubator-spark into mergemerge
...
Conflicts:
README.md
core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala
core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala
core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala
2013-11-04 21:02:36 -08:00
tgravescs
a35472e1dd
Allow spark on yarn to be run from HDFS. Allows the spark.jar, app.jar, and log4j.properties to be put into hdfs.
2013-11-04 16:16:28 -06:00
Fabrizio (Misto) Milo
3f89354c45
fix persistent-hdfs
2013-11-01 17:47:37 -07:00
Evan Chan
e54a37fe15
Document all the URIs for addJar/addFile
2013-11-01 10:58:11 -07:00
Ankur Dave
5064f9b2d2
Merge remote-tracking branch 'spark-upstream/master'
...
Conflicts:
project/SparkBuild.scala
2013-10-30 15:59:09 -07:00
Joseph E. Gonzalez
41b3122120
Starting to improve README.
2013-10-29 20:57:55 -07:00
Patrick Wendell
08c1a42d7d
Add a repartition operator.
...
This patch adds an operator called repartition with more straightforward
semantics than the current `coalesce` operator. There are a few use cases
where this operator is useful:
1. If a user wants to increase the number of partitions in the RDD. This
is more common now with streaming. E.g. a user is ingesting data on one
node but they want to add more partitions to ensure parallelism of
subsequent operations across threads or the cluster.
Right now they have to call rdd.coalesce(numSplits, shuffle=true) - that's
super confusing.
2. If a user has input data where the number of partitions is not known. E.g.
> sc.textFile("some file").coalesce(50)....
This is both semantically vague (am I growing or shrinking this RDD?) and
may not work correctly if the base RDD has fewer than 50 partitions.
The new operator forces a shuffle every time, so it will always produce exactly
the requested number of partitions. It also throws an exception rather than
silently not working if a bad input is passed.
I am currently adding streaming tests (requires refactoring some of the test
suite to allow testing at partition granularity), so this is not ready for
merge yet. But feedback is welcome.
2013-10-24 14:31:33 -07:00
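The difference between the two operators described in the commit above can be sketched with a toy model in plain Python, where an RDD is represented as a list of partitions. This is only an illustration of the semantics stated in the commit message, not Spark code: the function names are hypothetical, the round-robin placement stands in for a real shuffle, and treating a non-positive partition count as the "bad input" is an assumption.

```python
from typing import List, Sequence


def coalesce_no_shuffle(partitions: Sequence[List[int]], n: int) -> List[List[int]]:
    """Toy model of merging partitions without a shuffle: it can shrink, never grow."""
    if n >= len(partitions):
        # Asking for more partitions than exist silently leaves the count unchanged.
        return [list(p) for p in partitions]
    merged: List[List[int]] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i % n].extend(part)
    return merged


def repartition(partitions: Sequence[List[int]], n: int) -> List[List[int]]:
    """Toy model of a forced shuffle: always returns exactly n partitions."""
    if n <= 0:
        # Assumption: a non-positive count is the kind of bad input that is rejected.
        raise ValueError("number of partitions must be positive")
    out: List[List[int]] = [[] for _ in range(n)]
    i = 0
    for part in partitions:
        for item in part:
            out[i % n].append(item)  # round-robin stands in for the shuffle
            i += 1
    return out
```

With two input partitions, asking the no-shuffle model for 50 still yields two partitions, while the shuffle model yields 50 (most of them empty), which is the ambiguity the commit message calls out.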
Matei Zaharia
452aa36d67
Merge pull request #97 from ewencp/pyspark-system-properties
...
Add classmethod to SparkContext to set system properties.
Add a new classmethod to SparkContext to set system properties, as is
possible in Scala/Java. Unlike the Java/Scala implementations, there's
no access to System until the JVM bridge is created. Since
SparkContext handles that, move the initialization of the JVM
connection to a separate classmethod that can safely be called
repeatedly as long as the same instance (or no instance) is provided.
2013-10-22 23:15:33 -07:00
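The pattern described in the commit above, a classmethod that sets up a shared bridge once and can safely be called again, can be sketched as follows. This is a toy stand-in, not PySpark's code: `FakeGateway`, `ToyContext`, and all method names are hypothetical, and a plain dict plays the role of the JVM's system properties.

```python
class FakeGateway:
    """Hypothetical stand-in for the JVM bridge; a dict models the system properties."""

    def __init__(self):
        self.system_properties = {}


class ToyContext:
    _gateway = None  # shared bridge, created at most once
    _active = None   # the single context instance allowed to be active

    @classmethod
    def _ensure_initialized(cls, instance=None):
        """Create the bridge if needed; safe to call repeatedly with the
        same instance or with no instance."""
        if cls._gateway is None:
            cls._gateway = FakeGateway()
        if instance is not None:
            if cls._active is not None and cls._active is not instance:
                raise ValueError("another context is already active")
            cls._active = instance

    @classmethod
    def set_system_property(cls, key, value):
        """Set a property before any context exists by initializing the bridge first."""
        cls._ensure_initialized()
        cls._gateway.system_properties[key] = value

    def __init__(self):
        ToyContext._ensure_initialized(self)
```

Because the bridge is created lazily inside the classmethod, a property can be set before the first context is constructed, and constructing the context afterwards reuses the same bridge.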
Ewen Cheslack-Postava
c8748c25eb
Add notes to python documentation about using SparkContext.setSystemProperty.
2013-10-22 11:49:52 -07:00
Aaron Davidson
962bec97ee
Docs: Fix links to RDD API documentation
2013-10-22 09:39:36 -07:00
Reynold Xin
f628804c02
Merge pull request #76 from pwendell/master
...
Clarify compression property.
Clarifies that this governs compression of internal data, not input
data or output data.
2013-10-18 23:19:42 -07:00
Patrick Wendell
6b62836285
Clarify compression property.
...
Clarifies that this governs compression of internal data, not input
data or output data.
2013-10-18 23:08:44 -07:00
Mosharaf Chowdhury
35b2415fb3
Code styling. Updated doc.
2013-10-17 13:14:12 -07:00
Matei Zaharia
8f11c36fe1
Merge remote-tracking branch 'tgravescs/sparkYarnDistCache'
...
Closes #11
Conflicts:
docs/running-on-yarn.md
yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala
2013-10-10 19:34:33 -07:00
Matei Zaharia
c71499b779
Merge pull request #19 from aarondav/master-zk
...
Standalone Scheduler fault tolerance using ZooKeeper
This patch implements full distributed fault tolerance for standalone scheduler Masters.
There is only one master Leader at a time, which is actively serving scheduling
requests. If this Leader crashes, another master will eventually be elected, reconstruct
the state from the first Master, and continue serving scheduling requests.
Leader election is performed using the ZooKeeper leader election pattern. We try to minimize
the use of ZooKeeper and the assumptions about ZooKeeper's behavior, so there is a layer of
retries and session monitoring on top of the ZooKeeper client.
Master failover follows directly from the single-node Master recovery via the file
system (patch d5a96fe), save that the Master state is stored in ZooKeeper instead.
Configuration:
By default, no recovery mechanism is enabled (spark.deploy.recoveryMode = NONE).
By setting spark.deploy.recoveryMode to ZOOKEEPER and setting spark.deploy.zookeeper.url
to an appropriate ZooKeeper URL, ZooKeeper recovery mode is enabled.
By setting spark.deploy.recoveryMode to FILESYSTEM and setting spark.deploy.recoveryDirectory
to an appropriate directory accessible by the Master, we will keep the behavior from d5a96fe.
Additionally, places where a Master could be specified by a spark:// url can now take
comma-delimited lists to specify backup masters. Note that this is only used for registration
of NEW Workers and application Clients. Once a Worker or Client has registered with the
Master Leader, it is "in the system" and will never need to register again.
2013-10-10 17:16:42 -07:00
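As an illustration of the comma-delimited master list mentioned in the commit above, the following Python sketch splits such a URL into one address per candidate Master. It is not Spark's parsing code: the function name is hypothetical, and the assumption that the list shares a single leading spark:// scheme (as in spark://host1:7077,host2:7077) is made for illustration.

```python
from typing import List


def parse_master_url(url: str) -> List[str]:
    """Split 'spark://h1:p1,h2:p2' into ['spark://h1:p1', 'spark://h2:p2'].

    Hypothetical helper; assumes one shared 'spark://' prefix for the whole list.
    """
    scheme = "spark://"
    if not url.startswith(scheme):
        raise ValueError(f"expected a {scheme} URL, got: {url!r}")
    hosts = [h.strip() for h in url[len(scheme):].split(",") if h.strip()]
    if not hosts:
        raise ValueError("no master addresses found")
    return [scheme + h for h in hosts]
```

A new Worker or application Client could then try each returned address in turn when registering, which matches the commit's note that the list is only used for initial registration.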
Aaron Davidson
66c20635fa
Minor clarification and cleanup to spark-standalone.md
2013-10-10 14:45:12 -07:00
Aaron Davidson
42d8b8efe6
Address Matei's comments on documentation
...
Updates to the documentation and changing some logError()s to logWarning()s.
2013-10-10 00:33:47 -07:00
Prashant Sharma
026ab75661
Merge branch 'master' of github.com:apache/incubator-spark into scala-2.10
2013-10-10 09:42:55 +05:30
Matei Zaharia
478b2b7edc
Fix PySpark docs and an overly long line of code after fdbae41e
2013-10-09 12:08:04 -07:00
Aaron Davidson
4ea8ee468f
Add docs for standalone scheduler fault tolerance
...
Also fix a couple HTML/Markdown issues in other files.
2013-10-08 14:18:31 -07:00
Prashant Sharma
7be75682b9
Merge branch 'master' into wip-merge-master
...
Conflicts:
bagel/pom.xml
core/pom.xml
core/src/test/scala/org/apache/spark/ui/UISuite.scala
examples/pom.xml
mllib/pom.xml
pom.xml
project/SparkBuild.scala
repl/pom.xml
streaming/pom.xml
tools/pom.xml
In scala 2.10, a shorter representation is used for naming artifacts
so changed to shorter scala version for artifacts and made it a property in pom.
2013-10-08 11:29:40 +05:30
Nick Pentreath
a5e58b8f98
Merge branch 'master' into implicit-als
2013-10-07 11:46:17 +02:00
Patrick Wendell
aa9fb84994
Merging build changes in from 0.8
2013-10-05 22:07:00 -07:00
Prashant Sharma
c810ee0690
Merge branch 'master' into scala-2.10
...
Conflicts:
core/src/test/scala/org/apache/spark/DistributedSuite.scala
project/SparkBuild.scala
2013-10-05 15:52:57 +05:30
Nick Pentreath
93b96b44d7
Adding implicit feedback ALS to MLlib user guide
2013-10-04 14:39:44 +02:00
tgravescs
0fff4ee852
Adding in the --addJars option to make SparkContext.addJar work on yarn and cleanup
...
the classpaths
2013-10-03 11:52:16 -05:00
tgravescs
bc3b20abdc
Allow users to set the application name for Spark on Yarn
2013-10-02 12:54:17 -05:00
Prashant Sharma
5829692885
Merge branch 'master' into scala-2.10
...
Conflicts:
core/src/main/scala/org/apache/spark/ui/jobs/JobProgressUI.scala
docs/_config.yml
project/SparkBuild.scala
repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala
2013-10-01 11:57:24 +05:30
shane-huang
84849baf88
Merge branch 'reorgscripts' into scripts-reorg
2013-09-27 09:28:33 +08:00
Prashant Sharma
604dc40996
Sync with master and some build fixes
2013-09-26 11:40:02 +05:30
Patrick Wendell
6079721fa1
Update build version in master
2013-09-24 11:41:51 -07:00
Y.CORP.YAHOO.COM\tgraves
9d4246863a
Support distributed cache files and archives on spark on yarn and attempt to cleanup the staging directory on exit
2013-09-23 09:09:59 -05:00
shane-huang
fcfe4f9204
add admin scripts to sbin
...
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-09-23 12:42:34 +08:00
shane-huang
dfbdc9ddb7
added spark-class and spark-executor to sbin
...
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-09-23 11:28:58 +08:00
Jey Kottalam
ac0dd99394
Fix typo in Maven build docs
2013-09-15 13:29:22 -07:00
Patrick Wendell
dbd2c4fd94
Merge pull request #932 from pwendell/mesos-version
...
Bumping Mesos version to 0.13.0
2013-09-15 13:20:41 -07:00
Patrick Wendell
c856860c5b
Bumping Mesos version to 0.13.0
2013-09-15 12:46:26 -07:00
Patrick Wendell
362ea0c051
Explain yarn.version in Maven build docs
2013-09-15 12:40:49 -07:00
Prashant Sharma
a90e0eff59
version changed 2.9.3 -> 2.10 in shell script.
2013-09-15 12:47:20 +05:30
Benjamin Hindman
8e2602dd70
More updates to Spark on Mesos documentation.
2013-09-11 16:08:54 -07:00
Benjamin Hindman
a0f0c1bed2
Updated Spark on Mesos documentation.
2013-09-11 16:05:25 -07:00
Patrick Wendell
bddf135670
Change port from 3030 to 4040
2013-09-11 10:01:38 -07:00
Matei Zaharia
2425eb85ca
Update Python API features
2013-09-10 11:12:59 -07:00
Patrick Wendell
cefee1ed1a
Document fortran dependency for MLBase
2013-09-09 21:45:04 -07:00
Matei Zaharia
7a5c4b647b
Small tweaks to MLlib docs
2013-09-08 21:47:24 -07:00
Matei Zaharia
7d3204b056
Merge pull request #905 from mateiz/docs2
...
Job scheduling and cluster mode docs
2013-09-08 21:39:12 -07:00
Matei Zaharia
b458854977
Fix some review comments
2013-09-08 21:25:49 -07:00
Ameet Talwalkar
81a8bd46ac
response to PR comments
2013-09-08 19:21:30 -07:00
Ameet Talwalkar
bf280c8b0f
Merge remote-tracking branch 'upstream/master'
2013-09-08 18:41:38 -07:00
Patrick Wendell
f68848d95d
Merge pull request #906 from pwendell/ganglia-sink
...
Clean-up of Metrics Code/Docs and Add Ganglia Sink
2013-09-08 18:32:16 -07:00
Ameet Talwalkar
5ac62dbbd0
updates based on comments to PR
2013-09-08 17:39:08 -07:00
Matei Zaharia
5a587fb98d
Updated cluster diagram to show caches
2013-09-08 13:51:57 -07:00
Patrick Wendell
c190b48bf5
Adding more docs and some code cleanup
2013-09-08 13:46:28 -07:00
Matei Zaharia
af8ffdb73c
Review comments
2013-09-08 13:36:50 -07:00
Matei Zaharia
c0d375107f
Some tweaks to CDH/HDP doc
2013-09-08 00:44:41 -07:00
Matei Zaharia
f261d2a60f
Added cluster overview doc, made logo higher-resolution, and added more
...
details on monitoring
2013-09-08 00:29:11 -07:00
Matei Zaharia
651a96adf7
More fair scheduler docs and property names.
...
Also changed uses of "job" terminology to "application" when they
referred to an entire Spark program, to avoid confusion.
2013-09-08 00:29:11 -07:00
Matei Zaharia
98fb69822c
Work in progress:
...
- Add job scheduling docs
- Rename some fair scheduler properties
- Organize intro page better
- Link to Apache wiki for "contributing to Spark"
2013-09-08 00:29:11 -07:00
Matei Zaharia
38488aca8a
Merge pull request #900 from pwendell/cdh-docs
...
Provide docs to describe running on CDH/HDP cluster.
2013-09-08 00:28:53 -07:00
Patrick Wendell
22b982d2bc
File rename
2013-09-07 14:38:54 -07:00
Matei Zaharia
cfde85e395
Merge pull request #901 from ooyala/2013-09/0.8-doc-changes
...
0.8 Doc changes for make-distribution.sh
2013-09-07 13:53:08 -07:00
Patrick Wendell
61c4762d45
Changes based on feedback
2013-09-07 11:55:10 -07:00
Evan Chan
be1ee28ca6
CR feedback from Matei
2013-09-07 08:56:24 -07:00
Matei Zaharia
afe46ba36e
Merge pull request #892 from jey/fix-yarn-assembly
...
YARN build fixes
2013-09-07 07:28:51 -07:00
Evan Chan
ff1dbf2106
Add references to make-distribution.sh
2013-09-06 14:20:44 -07:00
Evan Chan
88d53f0dff
"launch" scripts is more accurate terminology
2013-09-06 14:03:44 -07:00
Evan Chan
5a18b854a7
Easier way to start the master
2013-09-06 13:59:43 -07:00
Evan Chan
76d5d2d3c5
Add notes about starting spark-shell
2013-09-06 13:53:00 -07:00
Patrick Wendell
a2a0cf9d68
Docs describing Spark monitoring and instrumentation
2013-09-06 13:52:57 -07:00
Patrick Wendell
e653a9d891
Provide docs to describe running on CDH/HDP cluster.
...
This doc consolidates information relevant to CDH/HDP users in a single place.
2013-09-06 13:49:57 -07:00
Jey Kottalam
35ed09f1d1
Clarify YARN example
2013-09-06 11:31:16 -07:00
Ameet Talwalkar
d52edfa753
updated content
2013-09-05 21:06:50 -07:00
Y.CORP.YAHOO.COM\tgraves
c8cc276110
Review comment changes and update to org.apache packaging
2013-09-03 10:50:21 -05:00
Y.CORP.YAHOO.COM\tgraves
547fc4a412
Merge remote-tracking branch 'mesos/master' into yarnUILink
...
Conflicts:
core/src/main/scala/org/apache/spark/ui/UIUtils.scala
core/src/main/scala/org/apache/spark/ui/jobs/PoolTable.scala
core/src/main/scala/org/apache/spark/ui/jobs/StageTable.scala
docs/running-on-yarn.md
2013-09-03 08:36:59 -05:00
Matei Zaharia
2615cad30b
Some doc improvements
...
- List higher-level projects that run on Spark
- Tweak CSS
2013-09-02 13:35:28 -07:00
Matei Zaharia
9329a7d4cd
Fix spark.io.compression.codec and change default codec to LZF
2013-09-02 10:15:22 -07:00
Matei Zaharia
9ee1e9db2e
Doc improvements
2013-09-01 22:12:03 -07:00
Matei Zaharia
3db404a43a
Run script fixes for Windows after package & assembly change
2013-09-01 23:45:57 +00:00
Matei Zaharia
0a8cc30921
Move some classes to more appropriate packages:
...
* RDD, *RDDFunctions -> org.apache.spark.rdd
* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
2013-09-01 14:13:16 -07:00
Matei Zaharia
5b4dea2143
More fixes
2013-09-01 14:13:16 -07:00
Matei Zaharia
5701eb92c7
Fix some URLs
2013-09-01 14:13:16 -07:00
Matei Zaharia
debcf24389
Fix over-zealous find-and-replace in HTML
2013-09-01 14:13:16 -07:00
Matei Zaharia
d27cd03f30
Fix more URLs in docs
2013-09-01 14:13:16 -07:00
Matei Zaharia
4f422032e5
Update docs for new package
2013-09-01 14:13:15 -07:00
Matei Zaharia
4d1cb59fe1
Small tweak to docs gradient
2013-09-01 14:13:15 -07:00
Matei Zaharia
46eecd110a
Initial work to rename package to org.apache.spark
2013-09-01 14:13:13 -07:00
Patrick Wendell
0e375a3cc2
Add assembly plugin links
2013-09-01 09:43:42 -07:00
Patrick Wendell
6371febe18
Better docs
2013-08-31 19:09:06 -07:00
Matei Zaharia
9ddad0dcb4
Fixes suggested by Patrick
2013-08-31 17:40:33 -07:00
Matei Zaharia
4819baa658
More updates, describing changes to recommended use of environment vars
...
and new Python stuff
2013-08-31 14:21:10 -07:00
Matei Zaharia
4293533032
Update docs about HDFS versions
2013-08-30 15:04:43 -07:00
Y.CORP.YAHOO.COM\tgraves
96452eea56
fix up minor things
2013-08-30 16:04:31 -05:00
Y.CORP.YAHOO.COM\tgraves
bac46266a9
Link the Spark UI to the Yarn UI
2013-08-30 15:55:32 -05:00
Matei Zaharia
f3a964848d
More doc improvements + better warnings when you haven't built Spark
2013-08-30 12:41:25 -07:00
Matei Zaharia
23762efda2
New hardware provisioning doc, and updates to menus
2013-08-30 10:16:26 -07:00
Matei Zaharia
1b0f69c623
Change docs color theme for 0.8
2013-08-30 10:15:58 -07:00
Matei Zaharia
e11bc18294
Update Maven docs
2013-08-29 21:19:07 -07:00
Matei Zaharia
2de756ff19
Update some build instructions because only sbt assembly and mvn package
...
are now needed
2013-08-29 21:19:06 -07:00
Matei Zaharia
53cd50c069
Change build and run instructions to use assemblies
...
This commit makes Spark invocation saner by using an assembly JAR to
find all of Spark's dependencies instead of adding all the JARs in
lib_managed. It also packages the examples into an assembly and uses
that as SPARK_EXAMPLES_JAR. Finally, it replaces the old "run" script
with two better-named scripts: "run-examples" for examples, and
"spark-class" for Spark internal classes (e.g. REPL, master, etc). This
is also designed to minimize the confusion people have in trying to use
"run" to run their own classes; it's not meant to do that, but now at
least if they look at it, they can modify run-examples to do a decent
job for them.
As part of this, Bagel's examples are also now properly moved to the
examples package instead of bagel.
2013-08-29 21:19:04 -07:00
Matei Zaharia
baa84e7e4c
Merge pull request #865 from tgravescs/fixtmpdir
...
Spark on Yarn should use yarn approved directories for spark.local.dir and tmp
2013-08-28 12:44:46 -07:00
Y.CORP.YAHOO.COM\tgraves
63dc635de6
fix typos
2013-08-26 17:06:20 -05:00
Y.CORP.YAHOO.COM\tgraves
c9464c74a1
Add ability for user to specify environment variables
2013-08-26 16:44:27 -05:00
Y.CORP.YAHOO.COM\tgraves
6dd64e8bb2
Update docs and remove old reference to --user option
2013-08-26 14:29:24 -05:00
Patrick Wendell
2cfe52ef55
Version bump for ec2 docs
2013-08-24 15:16:53 -07:00
Patrick Wendell
4879685910
Merge remote-tracking branch 'mesos/master' into ec2-updates
2013-08-24 14:50:58 -07:00
Matei Zaharia
5a6ac12840
Merge pull request #701 from ScrapCodes/documentation-suggestions
...
Documentation suggestions for spark streaming.
2013-08-22 22:08:03 -07:00
Prashant Sharma
2bc348e92c
Linking custom receiver guide
2013-08-23 09:44:02 +05:30
Prashant Sharma
39a1d58da4
Improved documentation for spark custom receiver
2013-08-23 09:38:50 +05:30
Jey Kottalam
f9cc1fbf27
Remove references to unsupported Hadoop versions
2013-08-21 17:14:36 -07:00
Patrick Wendell
6be6b71c8c
Merge branch 'master' into ec2-updates
...
Conflicts:
ec2/spark_ec2.py
2013-08-21 15:34:31 -07:00
Jey Kottalam
6585f49841
Update build docs
2013-08-21 14:51:56 -07:00
Jey Kottalam
9c6f8df30f
Update jekyll plugin to match docs/README.md
2013-08-21 12:57:56 -07:00
Matei Zaharia
53b1c30607
Update docs for Spark UI port
2013-08-20 22:57:11 -07:00
Matei Zaharia
aa2b89d98d
Merge remote-tracking branch 'jey/hadoop-agnostic'
...
Conflicts:
core/src/main/scala/spark/PairRDDFunctions.scala
2013-08-20 10:14:15 -07:00
Matei Zaharia
2a4ed10210
Address some review comments:
...
- When a resourceOffers() call has multiple offers, force the TaskSets
to consider them in increasing order of locality levels so that they
get a chance to launch stuff locally across all offers
- Simplify ClusterScheduler.prioritizeContainers
- Add docs on the new configuration options
2013-08-18 19:51:07 -07:00
Jey Kottalam
14b6bcdf93
update YARN docs
2013-08-15 16:50:37 -07:00
Evan Sparks
4346f0a1e9
Merge pull request #809 from shivaram/sgd-cleanup
...
Clean up scaladoc in ML Lib.
2013-08-12 12:12:12 -07:00
Shivaram Venkataraman
8b5e3e2eb5
Add ML Lib scaladoc to API dropdown
2013-08-11 23:52:43 -07:00
Patrick Wendell
9244524146
Removing dead docs
2013-08-11 20:33:58 -07:00
Shivaram Venkataraman
4935a2558b
Clean up scaladoc in ML Lib.
...
Also build and copy ML Lib scaladoc in Spark docs build.
Some more minor cleanup with respect to naming, test locations etc.
2013-08-11 19:02:43 -07:00
Matei Zaharia
de6c4c995a
Merge pull request #787 from ash211/master
...
Update spark-standalone.md
2013-08-06 17:09:50 -07:00
Andrew Ash
afc2c80fdb
Update spark-standalone.md
2013-08-07 00:44:43 +01:00
Patrick Wendell
5cc725a0e3
Merge branch 'master' into ec2-updates
...
Conflicts:
ec2/deploy.generic/root/mesos-ec2/ec2-variables.sh
2013-07-31 21:35:12 -07:00
Patrick Wendell
b7b627d5bb
Updating relevant documentation
2013-07-31 21:28:27 -07:00
Matei Zaharia
3097d75d6f
Merge remote-tracking branch 'dlyubimov/SPARK-827'
...
Conflicts:
docs/configuration.md
2013-07-31 18:36:43 -07:00
Reynold Xin
5227043f84
Documentation update for compression codec.
2013-07-30 17:12:16 -07:00
Matei Zaharia
497f55755f
Add docs about ipython
2013-07-29 02:51:43 -04:00
Dmitriy Lyubimov
0862494d44
typo
2013-07-27 23:16:20 -07:00
Dmitriy Lyubimov
f5067abe85
changes per comments.
2013-07-27 23:08:00 -07:00
Ubuntu
88a0823c58
Consistently invoke bash with /usr/bin/env bash in scripts to make code more portable (JIRA Ticket SPARK-817)
2013-07-18 00:51:18 +00:00
Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Matei Zaharia
d47c16f78d
Add an option to disable reference tracking in Kryo
2013-07-15 01:55:54 +00:00
Andy Konwinski
cd7259b4b8
Fixes typos in Spark Streaming Programming Guide
...
These typos were reported on the spark-users mailing list, see: https://groups.google.com/d/msg/spark-users/SyLGgJlKCrI/LpeBypOkSMUJ
2013-07-12 11:51:14 -07:00
Matei Zaharia
1ffadb2d9e
Merge remote-tracking branch 'pwendell/ui-updates'
...
Conflicts:
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
2013-07-06 15:51:41 -07:00
root
7cd490ef5b
Clarify that PySpark is not supported on Windows
2013-07-01 06:26:43 +00:00
Matei Zaharia
5bbd0eec84
Update docs on SCALA_LIBRARY_PATH
2013-06-30 17:00:40 -07:00
Matei Zaharia
03d0b858c8
Made use of spark.executor.memory setting consistent and documented it
...
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2013-06-30 15:46:46 -07:00
Matei Zaharia
aea727f68d
Simplify Python docs a little to do substring search
2013-06-26 21:15:09 -07:00
Patrick Wendell
a59c15a37e
Adding config option for retained stages
2013-06-26 08:54:57 -07:00
Tathagata Das
c89af0a7f9
Merge branch 'master' into streaming
...
Conflicts:
.gitignore
2013-06-24 23:57:47 -07:00
Matei Zaharia
b5df1cd668
ADD_JARS environment variable for spark-shell
2013-06-22 17:14:44 -07:00
Reynold Xin
0eab7a78b9
Fixed a couple of typos and formatting problems in the YARN documentation.
2013-05-17 18:05:46 -07:00
Reynold Xin
7760d78b3a
Merge branch 'master' of https://github.com/mridulm/spark
2013-05-17 17:58:36 -07:00
Mridul Muralidharan
da2642bead
Fix example jar name
2013-05-17 06:58:46 +05:30
Reynold Xin
3b3300383a
Updated Scala version in docs generation ruby script.
2013-05-16 16:51:28 -07:00
Mridul Muralidharan
f16c781709
Fix documentation to use yarn-standalone as master
2013-05-16 17:50:22 +05:30
Mridul Muralidharan
87540a7b38
Fix running on yarn documentation
2013-05-16 15:27:58 +05:30
Andrew Ash
afcad7b3aa
Docs: Mention spark shell's default for MASTER
2013-05-15 14:45:14 -03:00
Mridul Muralidharan
ee37612bc9
1) Add support for HADOOP_CONF_DIR (and/or YARN_CONF_DIR - use either) : which is used to specify the client side configuration directory : which needs to be part of the CLASSPATH.
...
2) Move from var+=".." to var="$var.." : the former does not work on older bash shells unfortunately.
2013-05-11 11:12:22 +05:30
Matei Zaharia
cf54b824ff
Merge pull request #580 from pwendell/quickstart
...
SPARK-739 Have quickstart standalone job use README
2013-04-25 11:45:58 -07:00
Patrick Wendell
a72134a6ac
SPARK-739 Have quickstart standalone job use README
2013-04-25 10:39:28 -07:00
Mridul Muralidharan
dd515ca3ee
Attempt at fixing merge conflict
2013-04-24 09:24:17 +05:30
Mridul Muralidharan
ac2e8e8720
Add some basic documentation
2013-04-19 00:13:19 +05:30
seanm
ab0f834dbb
adding spark.streaming.blockInterval property
2013-04-16 11:57:05 -06:00
Andy Konwinski
60a91b3b59
Update quick-start.md heading on Operations (not just Transformations).
2013-04-12 12:34:51 -07:00
Andrew Ash
6efc8cae8f
Typos: cluser -> cluster
2013-04-10 13:44:10 -03:00
Matei Zaharia
65caa8f711
Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0'
...
Conflicts:
docs/_config.yml
project/SparkBuild.scala
2013-04-08 12:43:17 -04:00
Matei Zaharia
a1586412d6
Updated link to SBT
2013-04-07 20:31:19 -04:00
Matei Zaharia
34a47b8bc9
Update Scala version in docs
2013-04-07 20:27:03 -04:00
Matei Zaharia
a98996d1fe
Merge pull request #545 from ash211/patch-1
...
Don't use deprecated Application in example
2013-03-29 22:12:15 -07:00
Jey Kottalam
bc8ba222ff
Bump development version to 0.8.0
2013-03-28 15:42:01 -07:00
Andrew Ash
e8f3669c63
Update tuning.md
...
Make the example more compilable
2013-03-28 19:17:39 -03:00
Andrew Ash
4e2c965383
Don't use deprecated Application in example
...
As of 2.9.0 extending from Application is not recommended
http://www.scala-lang.org/api/2.9.3/index.html#scala.Application
2013-03-28 17:47:37 -03:00
Andy Konwinski
446b801b3b
Fixing typos pointed out by Matei
2013-03-20 17:30:31 -07:00
Andy Konwinski
ad7f0452ab
Adds page to docs about building using Maven.
...
Adds links to new instructions in:
* The main Spark project README.md
* The docs nav menu called "More"
* The docs Overview page under the "Building" and "Where to Go from Here" sections
2013-03-17 15:02:40 -07:00
Andy Konwinski
c9097628fc
Fix broken link to YARN documentation.
2013-03-13 14:51:13 -07:00
Andy Konwinski
cf73fbd305
Fix another broken link in quick start.
2013-03-13 02:23:44 -07:00
Andy Konwinski
b63109763b
Fix broken link in Quick Start.
2013-03-13 02:02:34 -07:00
Matei Zaharia
db9b90fdbd
Change version to 0.7.1-SNAPSHOT for development branch
2013-02-27 09:15:26 -08:00
Matei Zaharia
4f840f4e54
Added commented-out Google analytics code for website docs
2013-02-27 09:14:11 -08:00
Matei Zaharia
fadeb1ddea
More doc tweaks
2013-02-26 23:20:49 -08:00
Matei Zaharia
22334eafd9
Some tweaks to docs
2013-02-26 22:52:38 -08:00
Matei Zaharia
9a046e30ac
Switch docs to use Akka repo instead of Typesafe
2013-02-25 22:18:47 -08:00
Matei Zaharia
7e67c626ee
Change version number to 0.7.0
2013-02-25 20:30:47 -08:00
Tathagata Das
6a78ef0578
Merge pull request #500 from pwendell/streaming-docs
...
Minor changes based on feedback
2013-02-25 15:28:45 -08:00
Patrick Wendell
8316534eef
meta-data
2013-02-25 15:27:04 -08:00
Patrick Wendell
918ee25867
One more change done with TD
2013-02-25 15:24:17 -08:00
Matei Zaharia
351ac5233e
Some tweaks to docs
2013-02-25 15:19:05 -08:00
Matei Zaharia
2ae15353a1
Merge branch 'master' of github.com:mesos/spark
2013-02-25 15:14:16 -08:00
Matei Zaharia
490f056cdd
Allow passing sparkHome and JARs to StreamingContext constructor
...
Also warns if spark.cleaner.ttl is not set in the version where you pass
your own SparkContext.
2013-02-25 15:13:30 -08:00
Patrick Wendell
07f2618769
Minor changes based on feedback
2013-02-25 15:09:59 -08:00
Patrick Wendell
50ce0516e6
Some changes to streaming failure docs.
...
TD gave me the go-ahead to just make these changes:
- Define stateful dstream
- Some minor wording fixes
2013-02-25 14:38:39 -08:00
Matei Zaharia
5d4a0ac794
Some tweaks to docs
2013-02-25 14:23:03 -08:00
Matei Zaharia
848321f910
Change doc color scheme slightly for Spark 0.7 (to differ from 0.6)
2013-02-25 13:15:30 -08:00
Matei Zaharia
3c7dcb61ab
Use a single setting for disabling API doc build
2013-02-25 13:15:12 -08:00
Matei Zaharia
d6e6abece3
Merge pull request #459 from stephenh/bettersplits
...
Change defaultPartitioner to use upstream split size.
2013-02-25 09:22:04 -08:00
Stephen Haberman
44032bc476
Merge branch 'master' into bettersplits
...
Conflicts:
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/test/scala/spark/ShuffleSuite.scala
2013-02-24 22:08:14 -06:00
Tathagata Das
abb5471865
Removing duplicate doc.
2013-02-24 16:32:44 -08:00
Tathagata Das
5ab37be983
Fixed class paths and dependencies based on Matei's comments.
2013-02-24 16:24:52 -08:00
Tathagata Das
b4eb24de96
Updated streaming programming guide with Java API info, and comments from Patrick.
2013-02-23 23:59:45 -08:00
Tathagata Das
d853aa9658
Change spark.cleaner.delay to spark.cleaner.ttl. Updated docs.
2013-02-23 17:42:26 -08:00
Tathagata Das
1cb725e417
Merge branch 'mesos-master' into streaming
2013-02-20 09:55:35 -08:00
Tathagata Das
fb9956256d
Merge branch 'mesos-master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CheckpointRDD.scala
streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Andy Konwinski
ecd137a72d
Fixes link to issue tracker in documentation page "Contributing to Spark".
2013-02-19 16:58:02 -08:00
Tathagata Das
9e82be1503
Merge branch 'streaming' into ScrapCodes-streaming-actor
...
Conflicts:
docs/plugin-custom-receiver.md
streaming/src/main/scala/spark/streaming/StreamingContext.scala
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
2013-02-19 02:48:50 -08:00
Tathagata Das
12ea14c211
Changed networkStream to socketStream and pluggableNetworkStream to become networkStream as a way to create streams from arbitrary network receiver.
2013-02-18 15:18:34 -08:00
Tathagata Das
6a6e6bda57
Merge branch 'streaming' into ScrapCode-streaming
...
Conflicts:
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
2013-02-18 13:26:12 -08:00
Tathagata Das
8ad561dc7d
Added checkpointing and fault-tolerance semantics to the programming guide. Fixed default checkpoint interval to be a multiple of slide duration. Fixed visibility of some classes and objects to clean up docs.
2013-02-18 02:12:41 -08:00
Stephen Haberman
6cd68c31cb
Update default.parallelism docs, have StandaloneSchedulerBackend use it.
...
Only brand new RDDs (e.g. parallelize and makeRDD) now use default
parallelism, everything else uses their largest parent's partitioner
or partition size.
2013-02-16 00:29:11 -06:00
Matei Zaharia
e8663e0fe5
Merge pull request #461 from JoshRosen/fix/issue-tracker-link
...
Update issue tracker link in contributing guide
2013-02-13 18:42:17 -08:00
Matei Zaharia
05d2e94838
Use a separate memory setting for standalone cluster daemons
...
Conflicts:
docs/_config.yml
2013-02-10 21:59:41 -08:00
Josh Rosen
131b56afd0
Update issue tracker link in contributing guide.
2013-02-10 13:28:31 -08:00
Matei Zaharia
b1d809913b
Merge pull request #460 from markhamstra/404
...
Fixed a 404 in 'Tuning Spark' -- missing '.html'
2013-02-10 13:01:09 -08:00
Mark Hamstra
4975dcdafc
Fixed a 404 -- missing '.html'
2013-02-10 12:55:47 -08:00
Mark Hamstra
b8863a79d3
Merge branch 'master' of https://github.com/mesos/spark into commutative
...
Conflicts:
core/src/main/scala/spark/RDD.scala
2013-02-08 18:26:00 -08:00
Mark Hamstra
934a53c8b6
Change docs on 'reduce' since the merging of local reduces no longer preserves
...
ordering, so the reduce function must also be commutative.
2013-02-05 22:19:58 -08:00
Matei Zaharia
55327a283e
Merge pull request #430 from pwendell/pyspark-guide
...
Minor improvements to PySpark docs
2013-01-30 15:35:29 -08:00
Patrick Wendell
3f945e3b83
Make module help available in python shell.
...
Also, adds a line in doc explaining how to use.
2013-01-30 15:04:06 -08:00
Patrick Wendell
58a7d320d7
Include packaging and launching pyspark in guide.
...
It's nicer if all the commands you need are made explicit.
2013-01-30 15:04:02 -08:00
Stephen Haberman
7dfb82a992
Replace old 'master' term with 'driver'.
2013-01-25 11:03:00 -06:00
Tathagata Das
155f31398d
Made StorageLevel constructor private, and added StorageLevels.create() to the Java API. Updates scala and java programming guides.
2013-01-23 01:10:26 -08:00
Prashant Sharma
d17065c4b5
actor as receiver
2013-01-22 13:28:29 +05:30
Matei Zaharia
76d7c0ce2b
Add more Akka settings to docs
2013-01-21 13:10:33 -08:00
Matei Zaharia
2173f6c7ca
Clarify the documentation on env variables for standalone mode
2013-01-21 13:03:20 -08:00
Matei Zaharia
86057ec7c8
Merge branch 'master' into streaming
...
Conflicts:
core/src/main/scala/spark/api/python/PythonRDD.scala
2013-01-20 12:47:55 -08:00
Josh Rosen
9f54d7e1f5
Merge pull request #387 from mateiz/python-accumulators
...
Add accumulators to PySpark
2013-01-20 11:00:36 -08:00
Patrick Wendell
5f74ead636
Changes based on Matei's comment
2013-01-20 08:59:20 -08:00
Matei Zaharia
ee5a07955c
Fix Python guide to say accumulators are available
2013-01-20 02:11:58 -08:00
Patrick Wendell
ecdff861f7
Clarifying log directory in EC2 guide
2013-01-19 22:59:35 -08:00
Prashant Sharma
56b9bd197c
Plug in actor as stream receiver API
2013-01-19 22:04:07 +05:30
Tathagata Das
cd1521cfdb
Merge branch 'master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/rdd/FilteredRDD.scala
docs/_layouts/global.html
docs/index.md
run
2013-01-15 12:08:51 -08:00
Tathagata Das
0dbd411a56
Added documentation for PairDStreamFunctions.
2013-01-13 21:08:35 -08:00
Matei Zaharia
fbb3fc4143
Merge pull request #346 from JoshRosen/python-api
...
Python API (PySpark)
2013-01-12 23:49:36 -08:00
Michael Heuer
480c4139bb
add repositories section to simple job pom.xml
2013-01-11 11:40:17 -06:00
Josh Rosen
b57dd0f160
Add mapPartitionsWithSplit() to PySpark.
2013-01-08 16:05:02 -08:00
Tathagata Das
237bac36e9
Renamed examples and added documentation.
2013-01-07 14:37:21 -08:00
Josh Rosen
ce9f1bbe20
Add pyspark script to replace the other scripts.
...
Expand the PySpark programming guide.
2013-01-01 21:25:49 -08:00
Josh Rosen
b58340dbd9
Rename top-level 'pyspark' directory to 'python'
2013-01-01 15:05:00 -08:00
Josh Rosen
170e451fbd
Minor documentation and style fixes for PySpark.
2013-01-01 13:52:14 -08:00
Tathagata Das
02497f0cd4
Updated Streaming Programming Guide.
2013-01-01 12:21:32 -08:00
Tathagata Das
9e644402c1
Improved jekyll and scala docs. Made many classes and methods private to remove them from scala docs.
2012-12-29 18:31:51 -08:00
Josh Rosen
c5cee53f20
Merge remote-tracking branch 'origin/master' into python-api
...
Conflicts:
docs/quick-start.md
2012-12-29 16:00:51 -08:00
Josh Rosen
c2b105af34
Add documentation for Python API.
2012-12-28 22:51:28 -08:00
Josh Rosen
85b8f2c64f
Add epydoc API documentation for PySpark.
2012-12-27 18:04:10 -08:00
Tathagata Das
7c33f76291
Merge branch 'mesos' into dev-merge
2012-12-26 19:19:07 -08:00
Reynold Xin
c68a076037
Updated Kryo documentation for Kryo version update.
2012-12-21 16:03:17 -08:00
Reynold Xin
eac566a7f4
Merge branch 'master' of github.com:mesos/spark into dev
...
Conflicts:
core/src/main/scala/spark/MapOutputTracker.scala
core/src/main/scala/spark/PairRDDFunctions.scala
core/src/main/scala/spark/ParallelCollection.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/rdd/BlockRDD.scala
core/src/main/scala/spark/rdd/CartesianRDD.scala
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/rdd/CoalescedRDD.scala
core/src/main/scala/spark/rdd/FilteredRDD.scala
core/src/main/scala/spark/rdd/FlatMappedRDD.scala
core/src/main/scala/spark/rdd/GlommedRDD.scala
core/src/main/scala/spark/rdd/HadoopRDD.scala
core/src/main/scala/spark/rdd/MapPartitionsRDD.scala
core/src/main/scala/spark/rdd/MapPartitionsWithSplitRDD.scala
core/src/main/scala/spark/rdd/MappedRDD.scala
core/src/main/scala/spark/rdd/PipedRDD.scala
core/src/main/scala/spark/rdd/SampledRDD.scala
core/src/main/scala/spark/rdd/ShuffledRDD.scala
core/src/main/scala/spark/rdd/UnionRDD.scala
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/BlockManagerId.scala
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/main/scala/spark/storage/StorageLevel.scala
core/src/main/scala/spark/util/MetadataCleaner.scala
core/src/main/scala/spark/util/TimeStampedHashMap.scala
core/src/test/scala/spark/storage/BlockManagerSuite.scala
run
2012-12-20 14:53:40 -08:00
Josh Rosen
1948f46093
Use spark-env.sh to configure standalone master. See SPARK-638.
...
Also fixed a typo in the standalone mode documentation.
2012-12-14 01:20:00 +00:00
Patrick Wendell
6ceb559994
Adding multi-jar constructor in quickstart
2012-11-27 23:32:24 -08:00
Tathagata Das
0fe2fc4d5e
Merged branch mesos/master to branch dev.
2012-11-26 13:16:59 -08:00
Matei Zaharia
6adc7c965f
Doc fix
2012-11-16 20:49:02 -08:00
Patrick Wendell
d39ac5fbc1
Streaming programming guide. STREAMING-2 #resolve
2012-11-13 21:19:58 -08:00
Matei Zaharia
51477e8874
Merge pull request #294 from JoshRosen/docs/quickstart
...
Fix minor typos in quickstart and Scala programming guides
2012-10-27 16:56:39 -07:00
Josh Rosen
33bea24f8e
Fix Spark groupId in Scala Programming Guide.
2012-10-26 15:01:28 -07:00
Josh Rosen
c4aa10154e
Fix minor typos in quick start guide.
2012-10-23 13:49:52 -07:00
Matei Zaharia
0967e71a00
Bump up version to 0.7.0-SNAPSHOT for master branch
2012-10-22 11:49:42 -07:00
Matei Zaharia
902a608187
Update version to 0.6.1-SNAPSHOT to show this is in development
2012-10-22 11:43:57 -07:00
Matei Zaharia
6999724ce8
Fix a path in the web UI
2012-10-20 23:33:37 -07:00
Patrick Wendell
7a03a0e35d
Adding dependency repos in quickstart example
2012-10-14 11:48:24 -07:00
Matei Zaharia
4be12d97ec
Some doc fixes, including showing version number in nav bar again
2012-10-13 19:05:11 -07:00
Matei Zaharia
19910c00c3
tweaks
2012-10-13 16:22:39 -07:00
Matei Zaharia
4a3e9cf69c
Document how to configure SPARK_MEM & co on a per-job basis
2012-10-13 16:20:25 -07:00
Matei Zaharia
8d7b77bcb5
Some doc and usability improvements:
...
- Added a StorageLevels class for easy access to StorageLevel constants
in Java
- Added doc comments on Function classes in Java
- Updated Accumulator and HadoopWriter docs slightly
2012-10-12 17:53:20 -07:00
Matei Zaharia
1183b30941
Merge branch 'dev' of github.com:mesos/spark into dev
2012-10-12 14:40:07 -07:00
Matei Zaharia
603b419fdf
Tweak
2012-10-12 14:40:00 -07:00
Patrick Wendell
5788a92953
Updating Bagel build instructions
2012-10-09 22:39:29 -07:00
Patrick Wendell
0f760a0bd3
Updating programming guide with new link instructions
2012-10-09 22:39:29 -07:00
Patrick Wendell
4de5cc1ad4
Removing reference to publish-local in the quickstart
2012-10-09 22:39:29 -07:00
Patrick Wendell
8321e7f0c2
Fixing YARN instructions
2012-10-09 22:39:28 -07:00
Patrick Wendell
5013c785fd
Adding SNAPSHOT to Spark version in doc config
2012-10-09 22:39:28 -07:00
Matei Zaharia
bc0bc672d0
Updates to documentation:
...
- Edited quick start and tuning guide to simplify them a little
- Simplified top menu bar
- Made private a SparkContext constructor parameter that was left as
public
- Various small fixes
2012-10-09 14:30:23 -07:00
Matei Zaharia
4780fee887
Merge pull request #260 from andyk/update-docs-to-use-version-vars
...
Updates docs to use the new version num vars and adds Spark version in nav bar
2012-10-09 09:46:20 -07:00
Andy Konwinski
e1a724f39c
Updating lots of docs to use the new special version number variables,
...
also adding the version to the navbar so it is easy to tell which
version of Spark these docs were compiled for.
2012-10-08 17:17:17 -07:00
Patrick Wendell
7887e171f4
Adding new download instructions
2012-10-08 17:05:13 -07:00
Andy Konwinski
56b91ab380
Updates README.md with instructions for running jekyll without building
...
scaladoc (i.e. run `SKIP_SCALADOC=1 jekyll`).
2012-10-08 12:15:44 -07:00
Andy Konwinski
89f8e1c2e7
Merge remote-tracking branch 'public-spark/dev' into add-version-vars-to-docs
...
Conflicts:
docs/quick-start.md
2012-10-08 12:15:38 -07:00
Matei Zaharia
88152e2164
Merge pull request #254 from pwendell/quickstart-fix
...
Removing one link in quickstart
2012-10-08 11:08:29 -07:00
Andy Konwinski
45d03231d0
Adds liquid variables to docs templating system so that they can be used
...
throughout the docs: SPARK_VERSION, SCALA_VERSION, and MESOS_VERSION.
To use them, e.g. use {{site.SPARK_VERSION}}.
Also removes uses of {{HOME_PATH}} which were being resolved to ""
by the templating system anyway.
2012-10-08 10:30:38 -07:00
Andy Konwinski
cd710a7718
Fixes the small gap above the nav menu dropdown boxes and the hoverable
...
menu items that was causing the dropdowns to go away when the user
moved their mouse down towards them.
2012-10-08 09:33:40 -07:00
Patrick Wendell
99aac2f6c4
Removing one link in quickstart
2012-10-08 08:53:27 -07:00
Matei Zaharia
efc5423210
Made compression configurable separately for shuffle, broadcast and RDDs
2012-10-07 11:30:53 -07:00
Matei Zaharia
dc28a3ac0a
Modified shuffle to limit the maximum outstanding data size in bytes,
...
instead of the maximum number of outstanding fetches. This should make
it faster when there are many small map output files, as well as more
robust to overallocating memory on large map outputs.
2012-10-06 20:07:10 -07:00
Matei Zaharia
e6e27a05d8
Links quick start from nav bar
2012-10-05 17:06:55 -07:00
Patrick Wendell
e84c068fab
Some additions to the Tuning Guide.
...
1. Slight change in organization
2. Added pre-requisites
3. Made a new section about determining memory footprint
of an RDD
4. Other small changes
2012-10-03 14:06:34 -07:00
Matei Zaharia
42cd148507
Merge pull request #236 from pwendell/quickstart
...
A Spark "Quick Start" example
2012-10-03 08:31:43 -07:00
Patrick Wendell
35b767f478
Responding to Matei's comments
2012-10-02 23:54:03 -07:00
Shivaram Venkataraman
3d2b900b08
First cut at adding documentation for GC tuning
2012-10-02 20:07:18 -07:00
Patrick Wendell
f78edf94cf
A Spark "Quick Start" example
...
This commit includes a quick start example that covers:
1) Basic usage of the Spark shell
2) A simple Spark job in Scala
3) A simple Spark job in Java
2012-10-01 17:24:09 -07:00
Matei Zaharia
802aa8aef9
Some bug fixes and logging fixes for broadcast.
2012-10-01 15:20:42 -07:00
Reynold Xin
f5812d0354
Added mapPartitionsWithSplit to the programming guide.
2012-09-29 01:31:36 -07:00
Josh Rosen
37c199bbb0
Allow controlling number of splits in distinct().
2012-09-28 23:44:19 -07:00
Matei Zaharia
009b0e37e7
Added an option to compress blocks in the block store
2012-09-27 18:45:44 -07:00
Matei Zaharia
7bcb08cef5
Renamed storage levels to something cleaner; fixes #223.
2012-09-27 17:50:59 -07:00
Matei Zaharia
bf18e0994e
Minor typos
2012-09-26 23:53:38 -07:00
Matei Zaharia
a4093f7563
Minor doc fixes
2012-09-26 23:22:15 -07:00
Matei Zaharia
ea05fc130b
Updates to standalone cluster, web UI and deploy docs.
2012-09-26 22:54:39 -07:00
Matei Zaharia
874a9fd407
More updates to docs, including tuning guide
2012-09-26 19:17:58 -07:00
Matei Zaharia
58eb44acbb
Doc tweaks
2012-09-26 00:32:59 -07:00
Matei Zaharia
d51d5e0582
Doc fixes
2012-09-25 23:59:04 -07:00
Matei Zaharia
c5754bb939
Fixes to Java guide
2012-09-25 23:51:04 -07:00
Matei Zaharia
f1246cc7c1
Various enhancements to the programming guide and HTML/CSS
2012-09-25 23:26:56 -07:00
Matei Zaharia
56c90485fd
More updates to documentation
2012-09-25 19:31:07 -07:00
Matei Zaharia
1821bf1f1f
Merge branch 'dev' of github.com:mesos/spark into dev
2012-09-25 15:46:27 -07:00
Matei Zaharia
e47e11720f
Documentation updates
2012-09-25 15:46:18 -07:00
Andy Konwinski
098351b735
Makes nav menu dropdowns show on hover instead of on click.
2012-09-25 15:42:50 -07:00
Matei Zaharia
30362a21e7
Update license info on deploy scripts
2012-09-25 14:43:47 -07:00
Matei Zaharia
60bce9574f
Update Jekyll plugin to look at Scala 2.9.2 docs
2012-09-25 14:31:53 -07:00
Josh Rosen
c94e9cc54a
Add Java Programming Guide; fix broken doc links.
2012-09-16 20:46:46 -07:00
Andy Konwinski
52c29071a4
- Add docs/api to .gitignore
...
- Rework/expand the nav bar with more of the docs site
- Removing parts of docs about EC2 and Mesos that differentiate between
running 0.5 and before
- Merged subheadings from running-on-amazon-ec2.html that are still relevant
(i.e., "Using a newer version of Spark" and "Accessing Data in S3") into
ec2-scripts.html and deleted running-on-amazon-ec2.html
- Added some TODO comments to a few docs
- Updated the blurb about AMP Camp
- Renamed programming-guide to spark-programming-guide
- Fixing typos/etc. in Standalone Spark doc
2012-09-16 15:28:52 -07:00
Andy Konwinski
6765d9727e
Adds a jekyll plugin (written in Ruby) to the _plugins directory
...
which generates scala doc by calling `sbt/sbt doc`, copies it over
to docs, and updates the links from the api webpage to now point to
the copied over scaladoc (making the _site directory easy to just
copy over to a public website).
2012-09-13 17:17:58 -07:00
Andy Konwinski
462cd8be60
Re-enabling responsive for better looking padding and more sane resizing,
...
but removed the collapsible stuff from the nav bar.
2012-09-13 15:39:35 -07:00
Andy Konwinski
5ec7a6665b
More crisp logo created from vector source (ai) and disabled
...
responsive css (so nav menu doesn't switch to collapsed version
for narrow viewports).
2012-09-13 15:27:33 -07:00
Andy Konwinski
b0207e2bfd
Replaces "Spark" word in nav bar with logo.
2012-09-13 12:08:12 -07:00
Andy Konwinski
130f2f2f41
Merge remote-tracking branch 'public-spark/dev' into doc
2012-09-13 11:29:25 -07:00
Denny
638a8511fa
Fixed navbar style problem
2012-09-13 09:56:27 -07:00
Denny
6d53b971b9
Added standalone and YARN docs. Merged standalone cluster into standalone doc
2012-09-13 09:47:54 -07:00
Andy Konwinski
ca2c999e0f
Making the link to api scaladocs work and migrating other code snippets
...
to use pygments syntax highlighting.
2012-09-12 23:25:07 -07:00
Andy Konwinski
c4db09ea76
Adds ec2-scripts.md back (it was mistakenly removed earlier due to
...
git weirdness).
2012-09-12 20:56:32 -07:00
Matei Zaharia
d3db46fdef
Remove title from content in Bagel
2012-09-12 19:50:38 -07:00
Matei Zaharia
88181bee9e
Small tweaks to generated doc pages
2012-09-12 19:47:31 -07:00
Andy Konwinski
1bcd09e093
Merge remote-tracking branch 'public-spark/dev' into doc
2012-09-12 19:28:55 -07:00
Andy Konwinski
35adccd008
Adds syntax highlighting (via pygments), and some style tweaks to make things
...
easier to read.
2012-09-12 19:27:44 -07:00
Reynold Xin
4b8b0d5c08
Added a link to AMPCamp website in Spark docs.
2012-09-12 17:54:16 -07:00
Andy Konwinski
95c5a376b5
Fixing a hanging sentence in docs/ec2-scripts.md
2012-09-12 16:17:24 -07:00
Andy Konwinski
3c7e40fa27
Removed the upper-case version of docs/EC2-Scripts.md.
2012-09-12 16:09:34 -07:00
Andy Konwinski
4d3a17c8d7
Fixing lots of broken links.
2012-09-12 16:06:18 -07:00
Andy Konwinski
49e98500a9
Updated base README to point to documentation site instead of wiki, updated
...
docs/README.md to describe use of Jekyll, and renamed things to make them
more consistent with the lower-case-with-hyphens convention.
2012-09-12 13:03:43 -07:00
Andy Konwinski
16da942d66
Adding docs directory containing documentation currently on the wiki
...
which can be compiled via jekyll, using the command `jekyll`. To compile
and run a local webserver to serve the doc as a website, run
`jekyll --server`.
2012-09-12 13:03:43 -07:00