Simple formula for Moore-Penrose pseudoinverse

Discussion in 'Math Research' started by Stephen Parrott, Dec 2, 2011.

  1. Engineers and physicists find use for the so-called
    "Moore-Penrose pseudoinverse" of an operator F: H --> K
    between Hilbert spaces H and K (which can be real or complex).
    Typically, H and K are finite dimensional
    (which will be assumed below for simplicity),
    and F is viewed as a matrix.

    A common notation for the Moore-Penrose pseudoinverse
    (henceforth simply called "pseudoinverse" is F with a superscript + , F^+,
    but for ASCII simplicity I will call the pseudoinverse G : K --> H.

    Let Null(F) denote the nullspace of F and In(F) its initial space
    (defined as the orthogonal complement of Null(F)). As usual, Range(F)
    will denote its range. Note for future reference that In(F) = In(F*F),
    where F* denotes the adjoint of F.

    Most mathematicians would probably define the pseudoinverse G of F
    as the unique operator K --> H such that:

    (i) G is a left inverse for the restriction of F to its initial
    space, i.e.,

    GFx = x for all x in In(F),

    which defines G on Range(F), and

    (ii) G annihilates the orthogonal complement of Range(F).

    Definitions in the engineering literature are typically expressed in terms
    of matrices and are much more complicated, e.g.,
    http://en.wikipedia.org/wiki/Moore-Penrose_pseudoinverse.

    It is almost immediate that the pseudoinverse G is given by
    the simple formula

    (*) G = (F*F)^(-1) F*,

    where (F*F)^(-1) denotes the inverse of the restriction of F*F to
    In(F*F) = In(F). To see this, just multiply on the right by F to get (i),
    and check (ii) separately.

    Oddly, formulas for G in the engineering literatures are typically much
    more
    complicated and generally require diagonalizing F*F, which can be difficult
    to do explicitly for F*F large (because its characteristic polynomial
    will typically have no algebraically expressible roots).
    However, F*F^(-1) can be obtained without diagonalizing F*F. (Just
    choose a basis for In(F), represent F*F|In(F) as a matrix with respect
    to this basis, and invert.)

    I was recently surprised to discover the simple formula (*) when
    I had occasion to actually calculate a pseudo-inverse. It is inconceivable
    to me that it could be previously unknown, and I would be grateful
    for a reference, which I shall need for a physics paper.

    For an audience of mathematicians, one could simply state (*)
    with minimal comment, but for people who may be more comfortable thinking
    in terms of of matrices, (*) may not seem obvious, and a reference is
    probably
    necessary. Of course, I could devote a few paragraphs to proving (*),
    but the paper will have a space limitation which I'd rather use for
    the subjects of the paper.

    A reference accessible online would be the most helpful. I have
    retired to a rural area of Nevada which is 200 miles from the nearest
    research library, so to access a book I generally have to buy it.

    I thank in advance anyone kind enough to be of help.

    Stephen Parrott
     
    Stephen Parrott, Dec 2, 2011
    #1
    1. Advertisements

  2. Your formula is indeed well known, you might look
    into books on numerical analysis dealing with least squares,
    I recommend the book of Ake Bjoerck
    ''numerical solution of least squares problems'' (SIAM)
    or Lawson-Hanson: ''solving least squares problems'' (Prentice Hall)
    It has however severe disadvantages: firstly you assume that F has full column
    rank (otherwise the inverse does not exist), secondly, this approach
    known in connection with the ''normal equations solution'' of linear
    least squares problems, is subject to an unnatural (i.e. avoidable) bad
    rounding error amplification (known as ''bad conditioning'').
    and moreover, F* F might loose ''sparsity'' even if F is very sparse.
    Hence the proper way of using the Moore-Penrose pseudoinverse in
    numerical work is using the singular value decomposition, which completely
    avoids building F* F . In the large scale case this can be computed also
    approximately using only the dominant subspaces.
    hth
    peter
     
    Peter Spellucci, Dec 4, 2011
    #2
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.