---
layout: global
title: MLlib - Optimization
---

* Table of contents
{:toc}


# Gradient Descent Primitive

[Gradient descent](http://en.wikipedia.org/wiki/Gradient_descent) and its
stochastic variants are first-order optimization methods that are
well-suited for large-scale and distributed computation. Gradient descent
methods aim to find a local minimum of a function by iteratively taking steps
in the direction of the negative gradient of the function at the current point,
i.e., the current parameter value. MLlib includes gradient descent as a
low-level primitive, upon which various ML algorithms are built. It has the
following parameters:

* *gradient* is a class that computes the stochastic gradient of the function
being optimized, i.e., with respect to a single training example, at the
current parameter value. MLlib includes gradient classes for common loss
functions, e.g., hinge, logistic, and least-squares. The gradient class takes as
input a training example, its label, and the current parameter value.
* *updater* is a class that updates weights in each iteration of gradient
descent. MLlib includes updaters for cases without regularization, as well as
L1 and L2 regularizers.
* *stepSize* is a scalar value denoting the initial step size for gradient
descent. All updaters in MLlib use a step size of stepSize / sqrt(t) at the
t-th iteration.
* *numIterations* is the number of iterations to run.
* *regParam* is the regularization parameter when using L1 or L2 regularization.
* *miniBatchFraction* is the fraction of the data used to compute the gradient
at each iteration.
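
To make the roles of these parameters concrete, here is a minimal
single-machine sketch in plain Python. It is not MLlib's API or implementation:
the function names (`least_squares_gradient`, `simple_update`, `l2_update`,
`mini_batch_sgd`) are hypothetical, the mini-batch is assumed to be a
fixed-size sample drawn without replacement, and the gradient is assumed to be
averaged over the mini-batch. The only behavior taken from the text above is
the step size of stepSize / sqrt(t) at iteration t.

```python
import math
import random


def least_squares_gradient(features, label, weights):
    """Gradient of 0.5 * (w . x - y)^2 with respect to w, for one example."""
    error = sum(w * x for w, x in zip(weights, features)) - label
    return [error * x for x in features]


def simple_update(weights, gradient, step_size, iteration, reg_param):
    """Update without regularization (reg_param is unused)."""
    step = step_size / math.sqrt(iteration)  # step size decays as 1 / sqrt(t)
    return [w - step * g for w, g in zip(weights, gradient)]


def l2_update(weights, gradient, step_size, iteration, reg_param):
    """Illustrative update for the penalty 0.5 * reg_param * ||w||^2."""
    step = step_size / math.sqrt(iteration)
    return [w - step * (g + reg_param * w) for w, g in zip(weights, gradient)]


def mini_batch_sgd(data, gradient, updater, step_size, num_iterations,
                   reg_param, mini_batch_fraction, initial_weights, seed=0):
    """Run mini-batch gradient descent over a list of (features, label) pairs."""
    rng = random.Random(seed)
    weights = list(initial_weights)
    # Assumption: a fixed-size mini-batch, sampled without replacement.
    batch_size = max(1, round(mini_batch_fraction * len(data)))
    for t in range(1, num_iterations + 1):
        batch = rng.sample(data, batch_size)
        total = [0.0] * len(weights)
        for features, label in batch:
            grad = gradient(features, label, weights)
            total = [a + g for a, g in zip(total, grad)]
        # Assumption: the per-example gradients are averaged over the batch.
        mean_gradient = [g / batch_size for g in total]
        weights = updater(weights, mean_gradient, step_size, t, reg_param)
    return weights


# Noise-free synthetic data generated from the weights (2.0, -3.0).
rng = random.Random(1)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(100)]
data = [((x1, x2), 2.0 * x1 - 3.0 * x2) for x1, x2 in points]

weights = mini_batch_sgd(data, least_squares_gradient, simple_update,
                         step_size=1.0, num_iterations=500, reg_param=0.0,
                         mini_batch_fraction=0.5, initial_weights=[0.0, 0.0])
print(weights)  # should be close to the generating weights
```

Passing `l2_update` with a positive `reg_param` in place of `simple_update`
shrinks the weights toward zero at each step. Keeping the gradient and the
updater as separate arguments mirrors the split between the *gradient* and
*updater* parameters described above.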

Available algorithms for gradient descent:

* [GradientDescent](api/mllib/index.html#org.apache.spark.mllib.optimization.GradientDescent)