spark-instrumented-optimizer/docs/mllib-classification-regression.md
Joseph K. Bradley d20559b157 [SPARK-5974] [SPARK-5980] [mllib] [python] [docs] Update ML guide with save/load, Python GBT
* Add GradientBoostedTrees Python examples to ML guide
  * I ran these in the pyspark shell, and they worked.
* Add save/load to examples in ML guide
* Added note to python docs about predict,transform not working within RDD actions,transformations in some cases (See SPARK-5981)

CC: mengxr

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #4750 from jkbradley/SPARK-5974 and squashes the following commits:

c410e38 [Joseph K. Bradley] Added note to LabeledPoint about attributes
bcae18b [Joseph K. Bradley] Added import of models for save/load examples in ml guide.  Fixed line length for tree.py, feature.py (but not other ML Pyspark files yet).
6d81c3e [Joseph K. Bradley] completed python GBT examples
9903309 [Joseph K. Bradley] Added note to python docs about predict,transform not working within RDD actions,transformations in some cases
c7dfad8 [Joseph K. Bradley] Added model save/load to ML guide.  Added GBT examples to ML guide
2015-02-25 16:13:17 -08:00

42 lines
1.7 KiB
Markdown

---
layout: global
title: Classification and Regression - MLlib
displayTitle: <a href="mllib-guide.html">MLlib</a> - Classification and Regression
---
MLlib supports various methods for
[binary classification](http://en.wikipedia.org/wiki/Binary_classification),
[multiclass
classification](http://en.wikipedia.org/wiki/Multiclass_classification), and
[regression analysis](http://en.wikipedia.org/wiki/Regression_analysis). The table below outlines
the supported algorithms for each type of problem.
<table class="table">
<thead>
<tr><th>Problem Type</th><th>Supported Methods</th></tr>
</thead>
<tbody>
<tr>
<td>Binary Classification</td><td>linear SVMs, logistic regression, decision trees, random forests, gradient-boosted trees, naive Bayes</td>
</tr>
<tr>
<td>Multiclass Classification</td><td>decision trees, random forests, naive Bayes</td>
</tr>
<tr>
<td>Regression</td><td>linear least squares, Lasso, ridge regression, decision trees, random forests, gradient-boosted trees, isotonic regression</td>
</tr>
</tbody>
</table>
More details for these methods can be found here:
* [Linear models](mllib-linear-methods.html)
* [binary classification (SVMs, logistic regression)](mllib-linear-methods.html#binary-classification)
* [linear regression (least squares, Lasso, ridge)](mllib-linear-methods.html#linear-least-squares-lasso-and-ridge-regression)
* [Decision trees](mllib-decision-tree.html)
* [Ensembles of decision trees](mllib-ensembles.html)
* [random forests](mllib-ensembles.html#random-forests)
* [gradient-boosted trees](mllib-ensembles.html#gradient-boosted-trees-gbts)
* [Naive Bayes](mllib-naive-bayes.html)
* [Isotonic regression](mllib-isotonic-regression.html)