---
layout: global
title: PMML model export - RDD-based API
displayTitle: PMML model export - RDD-based API
---

* Table of contents
{:toc}

## `spark.mllib` supported models

`spark.mllib` supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)).

The table below lists the `spark.mllib` models that can be exported to PMML and the equivalent PMML model for each.

<table class="table">
  <thead>
    <tr><th><code>spark.mllib</code> model</th><th>PMML model</th></tr>
  </thead>
  <tbody>
    <tr>
      <td>KMeansModel</td><td>ClusteringModel</td>
    </tr>
    <tr>
      <td>LinearRegressionModel</td><td>RegressionModel (functionName="regression")</td>
    </tr>
    <tr>
      <td>RidgeRegressionModel</td><td>RegressionModel (functionName="regression")</td>
    </tr>
    <tr>
      <td>LassoModel</td><td>RegressionModel (functionName="regression")</td>
    </tr>
    <tr>
      <td>SVMModel</td><td>RegressionModel (functionName="classification" normalizationMethod="none")</td>
    </tr>
    <tr>
      <td>Binary LogisticRegressionModel</td><td>RegressionModel (functionName="classification" normalizationMethod="logit")</td>
    </tr>
  </tbody>
</table>

## Examples
<div class="codetabs">

<div data-lang="scala" markdown="1">
To export a supported `model` (see table above) to PMML, call `model.toPMML`.

Besides exporting the PMML model to a String (`model.toPMML`, as described above), you can write it to a local file, to a path on a distributed file system through a `SparkContext`, or to an `OutputStream`.
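The export targets can be sketched as follows. This is a minimal illustration, not the official example: the object name `PMMLExportTargetsSketch`, the cluster centers, and the temporary file are assumptions made for the sketch, and the model is constructed directly from its centers so that no `SparkContext` or training data is needed.

```scala
import java.io.ByteArrayOutputStream
import java.nio.file.Files

import org.apache.spark.mllib.clustering.KMeansModel
import org.apache.spark.mllib.linalg.Vectors

// Hypothetical wrapper object, used only for this sketch.
object PMMLExportTargetsSketch {
  // Build a small KMeansModel from two illustrative cluster centers.
  val model = new KMeansModel(Array(
    Vectors.dense(0.1, 0.1, 0.1),
    Vectors.dense(9.0, 9.0, 9.0)))

  // Export to a String.
  val pmmlString: String = model.toPMML()

  // Export to an OutputStream (here an in-memory buffer).
  val stream = new ByteArrayOutputStream()
  model.toPMML(stream)
  val pmmlFromStream: String = stream.toString("UTF-8")

  // Export to a local file (here a temporary file chosen for illustration).
  val localPath: String = Files.createTempFile("kmeans", ".xml").toString
  model.toPMML(localPath)

  // Export to a distributed file system. This needs an existing
  // SparkContext `sc`; the path is a placeholder:
  //   model.toPMML(sc, "/tmp/kmeans")
}
```

The variants differ only in where the PMML document is written, so the same model can be exported to several destinations.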

Refer to the [`KMeans` Scala docs](api/scala/index.html#org.apache.spark.mllib.clustering.KMeans) and [`Vectors` Scala docs](api/scala/index.html#org.apache.spark.mllib.linalg.Vectors$) for details on the API.

Here is a complete example of building a `KMeansModel` and printing it out in PMML format:
{% include_example scala/org/apache/spark/examples/mllib/PMMLModelExportExample.scala %}

For unsupported models, either the model has no `.toPMML` method or calling it throws an `IllegalArgumentException`.
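The second case can be sketched with a multiclass logistic regression model, since the table above lists only the binary variant as supported. This is a hedged illustration: the object name `UnsupportedPMMLExportSketch` and the weights are invented for the sketch, and it assumes the export rejects models with more than two classes, which may vary by Spark version.

```scala
import org.apache.spark.mllib.classification.LogisticRegressionModel
import org.apache.spark.mllib.linalg.Vectors

// Hypothetical wrapper object, used only for this sketch.
object UnsupportedPMMLExportSketch {
  // A 3-class model over 2 features, built from illustrative weights:
  // (3 - 1) classes * 2 features = 4 weights, with a zero intercept.
  val multiclassModel = new LogisticRegressionModel(
    Vectors.dense(0.1, 0.2, 0.3, 0.4), 0.0, 2, 3)

  // Left holds the error message if the export is rejected,
  // Right holds the PMML document if it succeeds.
  val exported: Either[String, String] =
    try {
      Right(multiclassModel.toPMML())
    } catch {
      case e: IllegalArgumentException => Left(e.getMessage)
    }
}
```

Wrapping the call this way lets calling code fall back to another persistence mechanism when PMML export is not available for a model.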

</div>

</div>