What form of regularization is this?

Discussion in 'Numerical Analysis' started by spasmous, Jul 30, 2006.

  1. spasmous

    spasmous Guest

    Tikhonov regularization is ( A'*A + w^2*I )x = A'*b. What happens if
    the identity matrix I in this equation is replaced with a diagonal
    matrix containing the diagonal of A'*A? Is that a sensible to do?
    Thanks for any pointers.
     
    spasmous, Jul 30, 2006
    #1
    1. Advertisements

  2. I've seen this idea before in the context of the Levenberg-Marquardt
    method for nonlinear least squares problems, where you solve a series
    of regularized linear least squares problems involving the Jacobian J.
    I've seen this described as "multiplying the diagonal of J'*J by
    (1+alpha)." Here, your w^2 would be alpha. This is sometimes called
    "multiplicative damping", as opposed to the "additive damping" that
    you get by adding w^2*I.

    Unfortunately, when one of the diagonal elements of J'*J is zero,
    multiplicative damping produces a matrix that is still singular, no
    matter how large the alpha parameter gets. Thus the method can fail.
    It's easy to produce specific examples of this. Additive damping does
    not fail in the same way. Despite this, some authors (e.g. Press's
    Numerical Recipes book) still describe the LM algorithm with
    multiplicative damping.

    In the context of a linear least squares problem, the additive damping
    Tikhonov regularization approach solves the constrained minimization
    problem

    min || x ||^2
    || Ax-b ||^2 <= delta^2

    or equivalently

    min || Ax-b ||^2
    || x ||^2 <= epsilon^2

    It's easy to show that w^2 (respective 1/w^2) is a Lagrange
    multiplier. The multiplicative damping approach doesn't have this
    interpretation.

    I can't imagine any situation where the multiplicative damping
    approach would be appropriate for an ill-conditioned linear least
    squares problem even if you believe it works adequately within the LM
    algorithm for nonlinear least squares problems.

    All of this is discussed in the book "Parameter Estimation and Inverse
    Problems" by Aster, Borchers, and Thurber.
     
    Brian Borchers, Jul 31, 2006
    #2
    1. Advertisements

  3. spasmous

    Ray Koopman Guest

    Solving (A'A + w^2 C)x = A'b, where C is symmetric positive definite,
    miminizes (Ax-b)'(Ax-b) + w^2 x'Cx. Does taking C = diag(A'A) make
    sense as some sort of penalty function?
     
    Ray Koopman, Jul 31, 2006
    #3
  4. Generally, yes, namely as long as A has no zero column, so that all
    diagonal entries of A^TA are positive. (If A has zero columns,
    simply delete them before solving the equation...)

    This choice makes the regularization scaling invariant.
    For the interpretation of the regularization term in terms of
    solving ensembles of similar problems, see
    A. Neumaier,
    Solving ill-conditioned and singular linear systems:
    A tutorial on regularization
    SIAM Review 40 (1998), 636-666.
    http://www.mat.univie.ac.at/~neum/papers.html#regtutorial

    The constrained problem
    min || Ax-b ||^2
    x^TCx<= const
    also produces the equation (A'A + w^2 C)x = A'b, with w^2 as a
    multiplier. Thus there is nothing special to recommend the identity
    over a scaling invariant choice of C, unless you are sure to
    have solid information about the size of x^Tx but none about that
    of x^TCx.


    Arnold Neumaier
     
    Arnold Neumaier, Jul 31, 2006
    #4
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.