# What form of regularization is this?

Discussion in 'Numerical Analysis' started by spasmous, Jul 30, 2006.

1. ### spasmousGuest

Tikhonov regularization is ( A'*A + w^2*I )x = A'*b. What happens if
the identity matrix I in this equation is replaced with a diagonal
matrix containing the diagonal of A'*A? Is that a sensible to do?
Thanks for any pointers.

spasmous, Jul 30, 2006

2. ### Brian BorchersGuest

I've seen this idea before in the context of the Levenberg-Marquardt
method for nonlinear least squares problems, where you solve a series
of regularized linear least squares problems involving the Jacobian J.
I've seen this described as "multiplying the diagonal of J'*J by
(1+alpha)." Here, your w^2 would be alpha. This is sometimes called
"multiplicative damping", as opposed to the "additive damping" that

Unfortunately, when one of the diagonal elements of J'*J is zero,
multiplicative damping produces a matrix that is still singular, no
matter how large the alpha parameter gets. Thus the method can fail.
It's easy to produce specific examples of this. Additive damping does
not fail in the same way. Despite this, some authors (e.g. Press's
Numerical Recipes book) still describe the LM algorithm with
multiplicative damping.

In the context of a linear least squares problem, the additive damping
Tikhonov regularization approach solves the constrained minimization
problem

min || x ||^2
|| Ax-b ||^2 <= delta^2

or equivalently

min || Ax-b ||^2
|| x ||^2 <= epsilon^2

It's easy to show that w^2 (respective 1/w^2) is a Lagrange
multiplier. The multiplicative damping approach doesn't have this
interpretation.

I can't imagine any situation where the multiplicative damping
approach would be appropriate for an ill-conditioned linear least
squares problem even if you believe it works adequately within the LM
algorithm for nonlinear least squares problems.

All of this is discussed in the book "Parameter Estimation and Inverse
Problems" by Aster, Borchers, and Thurber.

Brian Borchers, Jul 31, 2006

3. ### Ray KoopmanGuest

Solving (A'A + w^2 C)x = A'b, where C is symmetric positive definite,
miminizes (Ax-b)'(Ax-b) + w^2 x'Cx. Does taking C = diag(A'A) make
sense as some sort of penalty function?

Ray Koopman, Jul 31, 2006
4. ### Arnold NeumaierGuest

Generally, yes, namely as long as A has no zero column, so that all
diagonal entries of A^TA are positive. (If A has zero columns,
simply delete them before solving the equation...)

This choice makes the regularization scaling invariant.
For the interpretation of the regularization term in terms of
solving ensembles of similar problems, see
A. Neumaier,
Solving ill-conditioned and singular linear systems:
A tutorial on regularization
SIAM Review 40 (1998), 636-666.
http://www.mat.univie.ac.at/~neum/papers.html#regtutorial

The constrained problem
min || Ax-b ||^2
x^TCx<= const
also produces the equation (A'A + w^2 C)x = A'b, with w^2 as a
multiplier. Thus there is nothing special to recommend the identity
over a scaling invariant choice of C, unless you are sure to
have solid information about the size of x^Tx but none about that
of x^TCx.

Arnold Neumaier

Arnold Neumaier, Jul 31, 2006