spark-instrumented-optimizer/python/pyspark/mllib/linalg
Henry D a32c92c0cd [SPARK-28140][MLLIB][PYTHON] Accept DataFrames in RowMatrix and IndexedRowMatrix constructors
## What changes were proposed in this pull request?

In both cases, the input `DataFrame` schema must contain only the information that's required for the matrix object, so a vector column in the case of `RowMatrix` and long and vector columns for `IndexedRowMatrix`.

## How was this patch tested?

Unit tests that verify:
- `RowMatrix` and `IndexedRowMatrix` can be created from `DataFrame`s
- If the schema does not match expectations, we throw an `IllegalArgumentException`

Please review https://spark.apache.org/contributing.html before opening a pull request.

Closes #24953 from henrydavidge/row-matrix-df.

Authored-by: Henry D <henrydavidge@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-07-09 16:39:21 -05:00
..
__init__.py [SPARK-28206][PYTHON] Remove the legacy Epydoc in PySpark API documentation 2019-07-05 10:08:22 -07:00
distributed.py [SPARK-28140][MLLIB][PYTHON] Accept DataFrames in RowMatrix and IndexedRowMatrix constructors 2019-07-09 16:39:21 -05:00