[SPARK-6266] [MLLIB] PySpark SparseVector missing doc for size, indices, values

Write missing pydocs in `SparseVector` attributes.

Author: lewuathe <lewuathe@me.com>

Closes #7290 from Lewuathe/SPARK-6266 and squashes the following commits:

51d9895 [lewuathe] Update docs
0480d35 [lewuathe] Merge branch 'master' into SPARK-6266
ba42cf3 [lewuathe] [SPARK-6266] PySpark SparseVector missing doc for size, indices, values
lewuathe 2015-07-09 08:16:26 -07:00 committed by Xiangrui Meng
parent 09cb0d9c2d
commit f88b12537e

@@ -445,8 +445,10 @@ class SparseVector(Vector):
         values (sorted by index).
 
         :param size: Size of the vector.
-        :param args: Non-zero entries, as a dictionary, list of tupes,
-               or two sorted lists containing indices and values.
+        :param args: Active entries, as a dictionary {index: value, ...},
+          a list of tuples [(index, value), ...], or a list of strictly
+          increasing indices and a list of corresponding values [index, ...],
+          [value, ...]. Inactive entries are treated as zeros.
 
         >>> SparseVector(4, {1: 1.0, 3: 5.5})
         SparseVector(4, {1: 1.0, 3: 5.5})
@@ -456,6 +458,7 @@ class SparseVector(Vector):
         SparseVector(4, {1: 1.0, 3: 5.5})
         """
         self.size = int(size)
+        """ Size of the vector. """
         assert 1 <= len(args) <= 2, "must pass either 2 or 3 arguments"
         if len(args) == 1:
             pairs = args[0]
@@ -463,7 +466,9 @@ class SparseVector(Vector):
                 pairs = pairs.items()
             pairs = sorted(pairs)
             self.indices = np.array([p[0] for p in pairs], dtype=np.int32)
+            """ A list of indices corresponding to active entries. """
             self.values = np.array([p[1] for p in pairs], dtype=np.float64)
+            """ A list of values corresponding to active entries. """
         else:
             if isinstance(args[0], bytes):
                 assert isinstance(args[1], bytes), "values should be string too"
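
For readers who want to see the three documented input forms side by side, here is a minimal pure-Python sketch. It is not the PySpark implementation: `normalize_sparse_args` and `to_dense` are hypothetical helpers written for illustration, they use plain lists instead of NumPy arrays, and they skip the `bytes` branch shown in the diff. The strictly-increasing check on the two-list form is an assumption drawn from the docstring wording, not from the code in this commit.

```python
def normalize_sparse_args(size, *args):
    """Hypothetical helper: turn any of the three documented input forms
    into (size, indices, values). Illustration only, not PySpark code."""
    if not 1 <= len(args) <= 2:
        raise ValueError("expected one or two arguments after size")
    if len(args) == 1:
        # Form 1: dict {index: value, ...}
        # Form 2: list of tuples [(index, value), ...]
        pairs = args[0]
        if isinstance(pairs, dict):
            pairs = pairs.items()
        pairs = sorted(pairs)  # order active entries by index
        indices = [int(i) for i, _ in pairs]
        values = [float(v) for _, v in pairs]
    else:
        # Form 3: separate lists [index, ...] and [value, ...]
        indices = [int(i) for i in args[0]]
        values = [float(v) for v in args[1]]
        if len(indices) != len(values):
            raise ValueError("indices and values must have the same length")
        # Assumed check, based on the docstring's "strictly increasing".
        if any(a >= b for a, b in zip(indices, indices[1:])):
            raise ValueError("indices must be strictly increasing")
    return int(size), indices, values


def to_dense(size, indices, values):
    """Expand to a dense list; inactive entries are treated as zeros."""
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


if __name__ == "__main__":
    for form in ({1: 1.0, 3: 5.5}, [(3, 5.5), (1, 1.0)]):
        print(normalize_sparse_args(4, form))
    size, indices, values = normalize_sparse_args(4, [1, 3], [1.0, 5.5])
    print(to_dense(size, indices, values))
```

All three forms yield the same `size`, `indices`, and `values`, which are the attributes this commit documents on `SparseVector`.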