---
layout: global
title: Clustering - spark.ml
displayTitle: Clustering - spark.ml
---
In this section, we introduce the pipeline API for [clustering in mllib](mllib-clustering.html).

**Table of Contents**

* This will become a table of contents (this text will be scraped).
{:toc}

## Latent Dirichlet allocation (LDA)

`LDA` is implemented as an `Estimator` that supports both `EMLDAOptimizer` and `OnlineLDAOptimizer`,
and generates an `LDAModel` as the base model. Expert users may cast an `LDAModel` generated by
`EMLDAOptimizer` to a `DistributedLDAModel` if needed.
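The cast mentioned above can be sketched as follows. This is a minimal illustration, not the official example: the object and method names (`LdaCastSketch`, `fitWithEm`, `asDistributed`) and the `numTopics` parameter are hypothetical, and the input `DataFrame` is assumed to hold term-count vectors in the default `features` column.

```scala
import org.apache.spark.ml.clustering.{DistributedLDAModel, LDA, LDAModel}
import org.apache.spark.sql.DataFrame

// Hypothetical helper object, for illustration only.
object LdaCastSketch {

  // Fits LDA with the EM optimizer. `dataset` is assumed to contain
  // term-count vectors in the default "features" column.
  def fitWithEm(dataset: DataFrame, numTopics: Int): LDAModel =
    new LDA()
      .setK(numTopics)
      .setMaxIter(10)
      .setOptimizer("em") // "online" would select OnlineLDAOptimizer instead
      .fit(dataset)

  // Narrows the base LDAModel to a DistributedLDAModel when the model
  // really is one; returns None otherwise instead of throwing.
  def asDistributed(model: LDAModel): Option[DistributedLDAModel] =
    model match {
      case distributed: DistributedLDAModel => Some(distributed)
      case _ => None
    }
}
```

A pattern match is used here rather than `asInstanceOf` so that a model trained with the online optimizer yields `None` rather than a `ClassCastException`.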
<div class="codetabs">

<div data-lang="scala" markdown="1">

Refer to the [Scala API docs](api/scala/index.html#org.apache.spark.ml.clustering.LDA) for more details.

{% include_example scala/org/apache/spark/examples/ml/LDAExample.scala %}
</div>

<div data-lang="java" markdown="1">

Refer to the [Java API docs](api/java/org/apache/spark/ml/clustering/LDA.html) for more details.

{% include_example java/org/apache/spark/examples/ml/JavaLDAExample.java %}
</div>

</div>