2ecbe02d5b
Replaces a number of occurrences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in Spark). It also removes some files that I forgot to delete with #10207.

Author: Timothy Hunter <timhunter@databricks.com>

Closes #10234 from thunterdb/12212.
layout | title | displayTitle |
---|---|---|
global | Clustering - spark.ml | Clustering - spark.ml |
This section introduces the `spark.ml` pipeline API for clustering, the counterpart to the clustering algorithms in `spark.mllib`.
Table of Contents
- This will become a table of contents (this text will be scraped).
{:toc}
Latent Dirichlet allocation (LDA)
`LDA` is implemented as an `Estimator` that supports both `EMLDAOptimizer` and `OnlineLDAOptimizer`, and generates a `LDAModel` as the base model. Expert users may cast a `LDAModel` generated by `EMLDAOptimizer` to a `DistributedLDAModel` if needed.
Refer to the Scala API docs for more details.
{% include_example scala/org/apache/spark/examples/ml/LDAExample.scala %}
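As a complement to the bundled example, the following is a minimal sketch of the cast to `DistributedLDAModel` mentioned above. It assumes Spark 2.0 or later, where feature vectors come from `org.apache.spark.ml.linalg` (older releases may expect vectors from a different package). The object name `LDACastSketch`, the helpers `toyData` and `fitEm`, and the toy corpus are hypothetical and exist only for illustration.

```scala
import org.apache.spark.ml.clustering.{DistributedLDAModel, LDA, LDAModel}
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}

object LDACastSketch {

  // Hypothetical toy corpus: each row is a term-count vector over a 5-word vocabulary.
  def toyData(spark: SparkSession): DataFrame =
    spark.createDataFrame(Seq(
      (0L, Vectors.dense(1.0, 2.0, 0.0, 0.0, 3.0)),
      (1L, Vectors.dense(0.0, 1.0, 4.0, 2.0, 0.0)),
      (2L, Vectors.dense(2.0, 0.0, 0.0, 1.0, 5.0)),
      (3L, Vectors.dense(0.0, 3.0, 3.0, 1.0, 0.0))
    )).toDF("id", "features")

  // Fit LDA with the EM optimizer. The declared result type is the base LDAModel.
  def fitEm(data: DataFrame): LDAModel =
    new LDA()
      .setK(2)
      .setMaxIter(10)
      .setOptimizer("em")
      .setFeaturesCol("features")
      .fit(data)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LDACastSketch")
      .master("local[2]")
      .getOrCreate()

    val model = fitEm(toyData(spark))

    // A pattern match performs the cast safely; it only succeeds for EM-trained models.
    model match {
      case distributed: DistributedLDAModel =>
        println(s"Training log-likelihood: ${distributed.trainingLogLikelihood}")
      case other =>
        println(s"Model is not distributed (isDistributed = ${other.isDistributed})")
    }

    // Available on every LDAModel, regardless of the optimizer used.
    model.describeTopics(3).show(false)

    spark.stop()
  }
}
```

A pattern match is used here instead of `asInstanceOf` so that a model trained with the online optimizer falls through to the second case rather than throwing a `ClassCastException`.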
Refer to the Java API docs for more details.
{% include_example java/org/apache/spark/examples/ml/JavaLDAExample.java %}