1248 commits

Author SHA1 Message Date
Tom Graves 49aff7b9ad [SPARK-10432] spark.port.maxRetries documentation is unclear
Author: Tom Graves <tgraves@yahoo-inc.com>

Closes #8585 from tgravescs/SPARK-10432.
2015-09-03 13:46:16 -07:00
zhuol ec01280533 [SPARK-4223] [CORE] Support * in acls.
SPARK-4223.

Currently we support setting view and modify acls but you have to specify a list of users. It would be nice to support * meaning all users have access.

Manual tests to verify that: "*" works for any user in:
a. Spark ui: view and kill stage. Done.
b. Spark history server. Done.
c. Yarn application killing. Done.

Author: zhuol <zhuol@yahoo-inc.com>

Closes #8398 from zhuoliu/4223.
2015-09-01 11:14:59 -10:00
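
The `*` behaviour described in SPARK-4223 above can be illustrated with a minimal sketch. This is not Spark's implementation: the helper names `parse_acl` and `has_access` are hypothetical, and the comma-separated format is an assumption based on the commit's mention of "a list of users".

```python
# Illustrative only: NOT Spark's code. All names here are hypothetical.
WILDCARD = "*"


def parse_acl(raw):
    """Split a comma-separated ACL string into a set of trimmed user names."""
    return {name.strip() for name in raw.split(",") if name.strip()}


def has_access(user, acl):
    """Grant access if the ACL holds the wildcard or names the user explicitly."""
    return WILDCARD in acl or user in acl


view_acls = parse_acl("alice, bob")  # explicit list of users
open_acls = parse_acl("*")           # wildcard: every user has access
```

With these definitions, `has_access("carol", view_acls)` is false while `has_access("carol", open_acls)` is true, which is the distinction the commit's manual tests exercise.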
Sean Owen 3f63bd6023 [SPARK-10398] [DOCS] Migrate Spark download page to use new lua mirroring scripts
Migrate Apache download closer.cgi refs to new closer.lua

This is the bit of the change that affects the project docs; I'm implementing the changes to the Apache site separately.

Author: Sean Owen <sowen@cloudera.com>

Closes #8557 from srowen/SPARK-10398.
2015-09-01 20:06:01 +01:00
Xiangrui Meng ca69fc8efd [SPARK-10331] [MLLIB] Update example code in ml-guide
* The example code was added in 1.2, before `createDataFrame`. This PR switches to `createDataFrame`. Java code still uses JavaBean.
* assume `sqlContext` is available
* fix some minor issues from previous code review

jkbradley srowen feynmanliang

Author: Xiangrui Meng <meng@databricks.com>

Closes #8518 from mengxr/SPARK-10331.
2015-08-29 23:57:09 -07:00
Xiangrui Meng 905fbe498b [SPARK-10348] [MLLIB] updates ml-guide
* replace `ML Dataset` by `DataFrame` to unify the abstraction
* ML algorithms -> pipeline components to describe the main concept
* remove Scala API doc links from the main guide
* `Section Title` -> `Section title` to be consistent with other section titles in MLlib guide
* modified lines break at 100 chars or periods

jkbradley feynmanliang

Author: Xiangrui Meng <meng@databricks.com>

Closes #8517 from mengxr/SPARK-10348.
2015-08-29 23:26:23 -07:00
GuoQiang Li 5369be8068 [SPARK-10350] [DOC] [SQL] Removed duplicated option description from SQL guide
Author: GuoQiang Li <witgo@qq.com>

Closes #8520 from witgo/SPARK-10350.
2015-08-29 13:20:27 -07:00
martinzapletal e8ea5bafee [SPARK-9910] [ML] User guide for train validation split
Author: martinzapletal <zapletal-martin@email.cz>

Closes #8377 from zapletal-martin/SPARK-9910.
2015-08-28 21:03:48 -07:00
Xiangrui Meng 88032ecaf0 [SPARK-9671] [MLLIB] re-org user guide and add migration guide
This PR updates the MLlib user guide and adds migration guide for 1.4->1.5.

* merge migration guide for `spark.mllib` and `spark.ml` packages
* remove dependency section from `spark.ml` guide
* move the paragraph about `spark.mllib` and `spark.ml` to the top and recommend `spark.ml`
* move Sam's talk to footnote to make the section focus on dependencies

Minor changes to code examples and other wording will be in a separate PR.

jkbradley srowen feynmanliang

Author: Xiangrui Meng <meng@databricks.com>

Closes #8498 from mengxr/SPARK-9671.
2015-08-28 13:53:31 -07:00
Yuhao Yang e2a843090c [SPARK-9890] [DOC] [ML] User guide for CountVectorizer
jira: https://issues.apache.org/jira/browse/SPARK-9890

Document with Scala and Java examples.

Author: Yuhao Yang <hhbyyh@gmail.com>

Closes #8487 from hhbyyh/cvDoc.
2015-08-28 08:00:44 -07:00
Keiji Yoshida 18294cd871 Fix DynamodDB/DynamoDB typo in Kinesis Integration doc
Fix DynamodDB/DynamoDB typo in Kinesis Integration doc

Author: Keiji Yoshida <yoshida.keiji.84@gmail.com>

Closes #8501 from yosssi/patch-1.
2015-08-28 09:36:50 +01:00
Feynman Liang af0e1249b1 [SPARK-9905] [ML] [DOC] Adds LinearRegressionSummary user guide
* Adds user guide for `LinearRegressionSummary`
* Fixes unresolved issues in #8197

CC jkbradley mengxr

Author: Feynman Liang <fliang@databricks.com>

Closes #8491 from feynmanliang/SPARK-9905.
2015-08-27 21:55:20 -07:00
MechCoder 30734d45fb [SPARK-9911] [DOC] [ML] Update Userguide for Evaluator
I added a small note about the different types of evaluator and the metrics used.

Author: MechCoder <manojkumarsivaraj334@gmail.com>

Closes #8304 from MechCoder/multiclass_evaluator.
2015-08-27 21:44:06 -07:00
Yin Huai b3dd569ad4 [SPARK-10287] [SQL] Fixes JSONRelation refreshing on read path
https://issues.apache.org/jira/browse/SPARK-10287

After porting json to HadoopFsRelation, it seems hard to keep the behavior of picking up new files automatically for JSON. This PR removes this behavior, so JSON is consistent with others (ORC and Parquet).

Author: Yin Huai <yhuai@databricks.com>

Closes #8469 from yhuai/jsonRefresh.
2015-08-27 16:11:25 -07:00
Feynman Liang 5bfe9e1111 [SPARK-9680] [MLLIB] [DOC] StopWordsRemovers user guide and Java compatibility test
* Adds user guide for `ml.feature.StopWordsRemover`, ran code examples on my machine
* Cleans up scaladocs for public methods
* Adds test for Java compatibility
* Follow-up Python user guide code example is tracked by SPARK-10249

Author: Feynman Liang <fliang@databricks.com>

Closes #8436 from feynmanliang/SPARK-10230.
2015-08-27 16:10:37 -07:00
MechCoder c94ecdfc5b [SPARK-9906] [ML] User guide for LogisticRegressionSummary
User guide for LogisticRegression summaries

Author: MechCoder <manojkumarsivaraj334@gmail.com>
Author: Manoj Kumar <mks542@nyu.edu>
Author: Feynman Liang <fliang@databricks.com>

Closes #8197 from MechCoder/log_summary_user_guide.
2015-08-27 15:33:43 -07:00
Yuhao Yang 6185cdd2af [SPARK-9901] User guide for RowMatrix Tall-and-skinny QR
jira: https://issues.apache.org/jira/browse/SPARK-9901

The jira covers only the document update. I can further provide example code for QR (like the ones for SVD and PCA) in a separate PR.

Author: Yuhao Yang <hhbyyh@gmail.com>

Closes #8462 from hhbyyh/qrDoc.
2015-08-27 13:57:20 -07:00
CodingCat 84baa5e9b5 [SPARK-10315] remove document on spark.akka.failure-detector.threshold
https://issues.apache.org/jira/browse/SPARK-10315

This parameter is no longer used, and the current document contains a mistake: it should be 'akka.remote.watch-failure-detector.threshold'.

Author: CodingCat <zhunansjtu@gmail.com>

Closes #8483 from CodingCat/SPARK_10315.
2015-08-27 20:19:09 +01:00
Michael Armbrust dc86a227e4 [SPARK-9148] [SPARK-10252] [SQL] Update SQL Programming Guide
Author: Michael Armbrust <michael@databricks.com>

Closes #8441 from marmbrus/documentation.
2015-08-27 11:45:15 -07:00
Moussa Taifi 9625d13d57 [DOCS] [STREAMING] [KAFKA] Fix typo in exactly once semantics
Fix Typo in exactly once semantics
[Semantics of output operations] link

Author: Moussa Taifi <moutai10@gmail.com>

Closes #8468 from moutai/patch-3.
2015-08-27 10:34:47 +01:00
Cheng Lian 0fac144f6b [SPARK-9424] [SQL] Parquet programming guide updates for 1.5
Author: Cheng Lian <lian@databricks.com>

Closes #8467 from liancheng/spark-9424/parquet-docs-for-1.5.
2015-08-26 18:14:54 -07:00
Feynman Liang 125205cdb3 [SPARK-9888] [MLLIB] User guide for new LDA features
* Adds two new sections to LDA's user guide; one for each optimizer/model
* Documents new features added to LDA (e.g. topXXXperXXX, asymmetric priors, hyperparameter optimization)
* Cleans up a TODO and sets a default parameter in LDA code

jkbradley hhbyyh

Author: Feynman Liang <fliang@databricks.com>

Closes #8254 from feynmanliang/SPARK-9888.
2015-08-25 17:39:20 -07:00
Yuhao Yang b37f0cc1b4 [SPARK-8531] [ML] Update ML user guide for MinMaxScaler
jira: https://issues.apache.org/jira/browse/SPARK-8531

Update ML user guide for MinMaxScaler

Author: Yuhao Yang <hhbyyh@gmail.com>
Author: unknown <yuhaoyan@yuhaoyan-MOBL1.ccr.corp.intel.com>

Closes #7211 from hhbyyh/minmaxdoc.
2015-08-25 10:54:03 -07:00
Joseph K. Bradley 13db11cb08 [SPARK-10061] [DOC] ML ensemble docs
User guide for spark.ml GBTs and Random Forests.
The examples are copied from the decision tree guide and modified to run.

I caught some issues I had somehow missed in the tree guide as well.

I have run all examples, including Java ones.  (Of course, I thought I had previously as well...)

CC: mengxr manishamde yanboliang

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #8369 from jkbradley/ml-ensemble-docs.
2015-08-24 15:38:54 -07:00
Keiji Yoshida 623c675fde Update streaming-programming-guide.md
Update `See the Scala example` to `See the Java example`.

Author: Keiji Yoshida <yoshida.keiji.84@gmail.com>

Closes #8376 from yosssi/patch-1.
2015-08-23 11:04:29 +01:00
Keiji Yoshida 46fcb9e0db Update programming-guide.md
Update `lineLengths.persist();` to `lineLengths.persist(StorageLevel.MEMORY_ONLY());` because `JavaRDD#persist` needs a parameter of `StorageLevel`.

Author: Keiji Yoshida <yoshida.keiji.84@gmail.com>

Closes #8372 from yosssi/patch-1.
2015-08-22 02:38:10 -07:00
Xusen Yin 630a994e6a [SPARK-9893] User guide with Java test suite for VectorSlicer
Add user guide for `VectorSlicer`, with Java test suite and Python version VectorSlicer.

Note that the Python version does not yet support selecting by names.

Author: Xusen Yin <yinxusen@gmail.com>

Closes #8267 from yinxusen/SPARK-9893.
2015-08-21 16:30:12 -07:00
Alexander Ulanov dcfe0c5cde [SPARK-9846] [DOCS] User guide for Multilayer Perceptron Classifier
Added user guide for multilayer perceptron classifier:
  - Simplified description of the multilayer perceptron classifier
  - Example code for Scala and Java

Author: Alexander Ulanov <nashb@yandex.ru>

Closes #8262 from avulanov/SPARK-9846-mlpc-docs.
2015-08-20 20:02:27 -07:00
Eric Liang 8e0a072f78 [SPARK-9895] User Guide for RFormula Feature Transformer
mengxr

Author: Eric Liang <ekl@databricks.com>

Closes #8293 from ericl/docs-2.
2015-08-19 15:43:08 -07:00
Marcelo Vanzin 5fd53c64bb [SPARK-9833] [YARN] Add options to disable delegation token retrieval.
This allows skipping the code that tries to talk to Hive and HBase to
fetch delegation tokens, in case that somehow conflicts with the application
being run.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #8134 from vanzin/SPARK-9833.
2015-08-19 10:51:59 -07:00
Yanbo Liang 802b5b8791 [SPARK-10084] [MLLIB] [DOC] Add Python example for mllib FP-growth user guide
1. Add Python example for mllib FP-growth user guide.
2. Correct mistakes in the Scala and Java examples.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #8279 from yanboliang/spark-10084.
2015-08-19 08:53:34 -07:00
Joseph K. Bradley 39e4ebd521 [SPARK-10060] [ML] [DOC] spark.ml DecisionTree user guide
New user guide section ml-decision-tree.md, including code examples.

I have run all examples, including the Java ones.

CC: manishamde yanboliang mengxr

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #8244 from jkbradley/ml-dt-docs.
2015-08-19 07:38:27 -07:00
lewuathe ba2a07e2b6 [SPARK-9977] [DOCS] Update documentation for StringIndexer
`StringIndexer` writes the indexed label to a new column, so a subsequent estimator in the pipeline must read that new column if it wants to use the string-indexed label.
I think it is better to make this explicit in the documentation.

Author: lewuathe <lewuathe@me.com>

Closes #8205 from Lewuathe/SPARK-9977.
2015-08-19 09:54:03 +01:00
Sean Owen f141efeafb [SPARK-10070] [DOCS] Remove Guava dependencies in user guides
`Lists.newArrayList` -> `Arrays.asList`

CC jkbradley feynmanliang

Anybody into replacing usages of `Lists.newArrayList` in the examples / source code too? This method isn't useful in Java 7 and beyond.

Author: Sean Owen <sowen@cloudera.com>

Closes #8272 from srowen/SPARK-10070.
2015-08-19 09:41:09 +01:00
Bill Chambers b23c4d3ffc Fix Broken Link
Link was broken because it included tick marks.

Author: Bill Chambers <wchambers@ischool.berkeley.edu>

Closes #8302 from anabranch/patch-1.
2015-08-19 00:05:01 -07:00
Alexander Ulanov 1c843e2848 [SPARK-9508] GraphX Pregel docs update with new Pregel code
SPARK-9436 simplifies the Pregel code. The graphx-programming-guide needs to be modified accordingly, since it lists the old Pregel code.

Author: Alexander Ulanov <nashb@yandex.ru>

Closes #7831 from avulanov/SPARK-9508-pregel-doc2.
2015-08-18 22:13:52 -07:00
Davies Liu de3223872a [SPARK-9705] [DOC] fix docs about Python version
cc JoshRosen

Author: Davies Liu <davies@databricks.com>

Closes #8245 from davies/python_doc.
2015-08-18 22:11:27 -07:00
Feynman Liang badf7fa650 [SPARK-8473] [SPARK-9889] [ML] User guide and example code for DCT
mengxr jkbradley

Author: Feynman Liang <fliang@databricks.com>

Closes #8184 from feynmanliang/SPARK-9889-DCT-docs.
2015-08-18 17:54:49 -07:00
Dennis Huo 9b731fad2b [SPARK-9782] [YARN] Support YARN application tags via SparkConf
Add a new test case in yarn/ClientSuite which checks how the various SparkConf
and ClientArguments propagate into the ApplicationSubmissionContext.

Author: Dennis Huo <dhuo@google.com>

Closes #8072 from dennishuo/dhuo-yarn-application-tags.
2015-08-18 14:34:20 -07:00
Piotr Migdal 8bae9015b7 [SPARK-10085] [MLLIB] [DOCS] removed unnecessary numpy array import
See https://issues.apache.org/jira/browse/SPARK-10085

Author: Piotr Migdal <pmigdal@gmail.com>

Closes #8284 from stared/spark-10085.
2015-08-18 12:59:28 -07:00
Yanbo Liang 747c2ba800 [SPARK-10032] [PYSPARK] [DOC] Add Python example for mllib LDAModel user guide
Add Python example for mllib LDAModel user guide

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #8227 from yanboliang/spark-10032.
2015-08-18 12:56:36 -07:00
Yanbo Liang f4fa61effe [SPARK-10029] [MLLIB] [DOC] Add Python examples for mllib IsotonicRegression user guide
Add Python examples for mllib IsotonicRegression user guide

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #8225 from yanboliang/spark-10029.
2015-08-18 12:55:36 -07:00
Feynman Liang f5ea391290 [SPARK-9900] [MLLIB] User guide for Association Rules
Updates FPM user guide to include Association Rules.

Author: Feynman Liang <fliang@databricks.com>

Closes #8207 from feynmanliang/SPARK-9900-arules.
2015-08-18 12:53:57 -07:00
jose.cambronero c90c605dc6 [SPARK-9902] [MLLIB] Add Java and Python examples to user guide for 1-sample KS test
Added doc examples for Python.

Author: jose.cambronero <jose.cambronero@cloudera.com>

Closes #8154 from josepablocam/spark_9902.
2015-08-17 19:09:45 -07:00
Sandy Ryza f9d1a92aa1 [SPARK-7707] User guide and example code for KernelDensity
Author: Sandy Ryza <sandy@cloudera.com>

Closes #8230 from sryza/sandy-spark-7707.
2015-08-17 17:57:51 -07:00
Feynman Liang 0b6b017613 [SPARK-9898] [MLLIB] Prefix Span user guide
Adds user guide for `PrefixSpan`, including Scala and Java example code.

mengxr zhangjiajin

Author: Feynman Liang <fliang@databricks.com>

Closes #8253 from feynmanliang/SPARK-9898.
2015-08-17 17:53:24 -07:00
Yanbo Liang 0076e82123 [SPARK-9768] [PYSPARK] [ML] Add Python API and user guide for ml.feature.ElementwiseProduct
Add Python API, user guide and example for ml.feature.ElementwiseProduct.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #8061 from yanboliang/SPARK-9768.
2015-08-17 17:25:41 -07:00
Feynman Liang fdaf17f63f [SPARK-10068] [MLLIB] Adds links to MLlib types, algos, utilities listing
mengxr jkbradley

Author: Feynman Liang <fliang@databricks.com>

Closes #8255 from feynmanliang/SPARK-10068.
2015-08-17 15:42:14 -07:00
Reynold Xin e5fd60415f [SPARK-9934] Deprecate NIO ConnectionManager.
Deprecate NIO ConnectionManager in Spark 1.5.0, before removing it in Spark 1.6.0.

Author: Reynold Xin <rxin@databricks.com>

Closes #8162 from rxin/SPARK-9934.
2015-08-14 20:55:32 -07:00
Rosstin 7a539ef3b1 [SPARK-8965] [DOCS] Add ml-guide Python Example: Estimator, Transformer, and Param
Added ml-guide Python Example: Estimator, Transformer, and Param
/docs/_site/ml-guide.html

Author: Rosstin <asterazul@gmail.com>

Closes #8081 from Rosstin/SPARK-8965.
2015-08-13 09:18:39 -07:00
Niranjan Padmanabhan 738f353988 [SPARK-9092] Fixed incompatibility when both num-executors and dynamic allocation are set.
Now, dynamic allocation is set to false when num-executors is explicitly specified as an argument. Consequently, executorAllocationManager is not initialized in the SparkContext.

Author: Niranjan Padmanabhan <niranjan.padmanabhan@cloudera.com>

Closes #7657 from neurons/SPARK-9092.
2015-08-12 16:10:21 -07:00
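
The precedence rule described in SPARK-9092 above (an explicit num-executors value switches dynamic allocation off) can be sketched as follows. This is a simplified illustration, not the actual Spark change: the function `dynamic_allocation_enabled` and its `num_executors` parameter are hypothetical, and the configuration is modelled as a plain dictionary of strings keyed by `spark.dynamicAllocation.enabled`.

```python
# Illustrative only: a hypothetical helper, not code from the Spark patch.
def dynamic_allocation_enabled(conf, num_executors=None):
    """Decide whether dynamic allocation should be active.

    An explicitly supplied executor count always wins, so dynamic
    allocation is treated as disabled whenever num_executors is given.
    """
    if num_executors is not None:
        return False
    return conf.get("spark.dynamicAllocation.enabled", "false").lower() == "true"


conf = {"spark.dynamicAllocation.enabled": "true"}
only_dynamic = dynamic_allocation_enabled(conf)                 # no explicit count
both_set = dynamic_allocation_enabled(conf, num_executors=4)    # explicit count given
```

Here `only_dynamic` is true and `both_set` is false, so in the second case a caller would skip creating an allocation manager, mirroring the outcome the commit message describes.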