[SPARK-20011][ML][DOCS] Clarify documentation for ALS 'rank' parameter

## What changes were proposed in this pull request?

This change updates the API documentation and the collaborative filtering documentation page to resolve inconsistent descriptions of the ALS `rank` parameter.

 - [DOCS] was previously: "rank is the number of latent factors in the model."
 - [API] was previously:  "rank - number of features to use"

This change describes rank in both places consistently as:

 - "Number of features to use (also referred to as the number of latent factors)"
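
As a rough illustration of what the parameter controls, the sketch below builds user and item factor matrices by hand. It is not Spark's implementation: the helper names `init_factors` and `predict` and the toy sizes are assumptions made for this example only. It shows that `rank` fixes the length of every user and item vector, and that a predicted rating is the dot product of two such vectors.

```python
import random


def init_factors(n_rows, rank, seed):
    """Return an n_rows x rank matrix of small random values (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]


def predict(user_factors, item_factors, user, item):
    """Predicted rating: dot product of two vectors of length `rank`."""
    return sum(u * v for u, v in zip(user_factors[user], item_factors[item]))


rank = 3                  # number of features (latent factors) per user and per item
n_users, n_items = 4, 5   # toy sizes, chosen only for illustration

user_factors = init_factors(n_users, rank, seed=0)
item_factors = init_factors(n_items, rank, seed=1)

# Every user and every item is described by exactly `rank` numbers,
# so the model size grows linearly with rank.
n_params = rank * (n_users + n_items)

score = predict(user_factors, item_factors, user=0, item=2)
```

In this sketch a larger `rank` gives each user and item more numbers to describe it, at the cost of more parameters to fit. That trade-off is why the two phrasings, "number of features" and "number of latent factors", describe the same quantity.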

Author: Chris Snow <chris.snow@uk.ibm.com>

Author: christopher snow <chsnow123@gmail.com>

Closes #17345 from snowch/SPARK-20011.
christopher snow 2017-03-21 13:23:59 +00:00 committed by Sean Owen
parent d2dcd6792f
commit 7620aed828
3 changed files with 11 additions and 11 deletions

@@ -20,7 +20,7 @@ algorithm to learn these latent factors. The implementation in `spark.mllib` has
 following parameters:
 * *numBlocks* is the number of blocks used to parallelize computation (set to -1 to auto-configure).
-* *rank* is the number of latent factors in the model.
+* *rank* is the number of features to use (also referred to as the number of latent factors).
 * *iterations* is the number of iterations of ALS to run. ALS typically converges to a reasonable
   solution in 20 iterations or less.
 * *lambda* specifies the regularization parameter in ALS.

@@ -301,7 +301,7 @@ object ALS {
    * level of parallelism.
    *
    * @param ratings RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda regularization parameter
    * @param blocks level of parallelism to split computation into
@@ -326,7 +326,7 @@ object ALS {
    * level of parallelism.
    *
    * @param ratings RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda regularization parameter
    * @param blocks level of parallelism to split computation into
@@ -349,7 +349,7 @@ object ALS {
    * parallelism automatically based on the number of partitions in `ratings`.
    *
    * @param ratings RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda regularization parameter
    */
@@ -366,7 +366,7 @@ object ALS {
    * parallelism automatically based on the number of partitions in `ratings`.
    *
    * @param ratings RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    */
   @Since("0.8.0")
@@ -383,7 +383,7 @@ object ALS {
    * a level of parallelism given by `blocks`.
    *
    * @param ratings RDD of (userID, productID, rating) pairs
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda regularization parameter
    * @param blocks level of parallelism to split computation into
@@ -410,7 +410,7 @@ object ALS {
    * iteratively with a configurable level of parallelism.
    *
    * @param ratings RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda regularization parameter
    * @param blocks level of parallelism to split computation into
@@ -436,7 +436,7 @@ object ALS {
    * partitions in `ratings`.
    *
    * @param ratings RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    * @param lambda regularization parameter
    * @param alpha confidence parameter
@@ -455,7 +455,7 @@ object ALS {
    * partitions in `ratings`.
    *
    * @param ratings RDD of [[Rating]] objects with userID, productID, and rating
-   * @param rank number of features to use
+   * @param rank number of features to use (also referred to as the number of latent factors)
    * @param iterations number of iterations of ALS
    */
   @Since("0.8.1")

@@ -249,7 +249,7 @@ class ALS(object):
         :param ratings:
           RDD of `Rating` or (userID, productID, rating) tuple.
         :param rank:
-          Rank of the feature matrices computed (number of features).
+          Number of features to use (also referred to as the number of latent factors).
         :param iterations:
           Number of iterations of ALS.
           (default: 5)
@@ -287,7 +287,7 @@ class ALS(object):
         :param ratings:
           RDD of `Rating` or (userID, productID, rating) tuple.
         :param rank:
-          Rank of the feature matrices computed (number of features).
+          Number of features to use (also referred to as the number of latent factors).
         :param iterations:
           Number of iterations of ALS.
           (default: 5)