Correct Offset in Logistic Regression

Discussion in 'Scientific Statistics Math' started by R, Dec 7, 2011.

  1. R

    R Guest

    Hi Ray and others,

    Yes, another logistic regression question, but very brief, I
    promise. :)

    Suppose we have the following dataset (just an illustration):

    group   x       events   trials
    1       24000   2        10
    0       20000   3        20

    (Note that group is dummy coded)

    If we wanted to hand-calculate an adjusted rate ("probability of
    event per 1000 units of x"), we would apply the following formula:

    rate per 1000 x | group1 = 2 / (10*24) = 0.0083
    rate per 1000 x | group0 = 3 / (20*20) = 0.0075
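
    The hand calculation above can be sketched in a few lines of Python.
    The function name rate_per_1000_x and the dictionary layout are
    illustrative choices, not part of the original post; the numbers are
    the illustration data from the table.

```python
def rate_per_1000_x(events, trials, x):
    """Observed proportion of events, divided by x expressed in thousands."""
    return events / (trials * (x / 1000))

# Illustration data keyed by group: (events, trials, x)
data = {1: (2, 10, 24000), 0: (3, 20, 20000)}

for group, (events, trials, x) in data.items():
    print(group, round(rate_per_1000_x(events, trials, x), 4))
```

    This should print 0.0083 for group 1 and 0.0075 for group 0,
    matching the figures above.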

    Now, if I were to roll out the data with x repeating within group, it
    would look like this:

    group x event
    1 24000 0
    1 24000 1
    1 24000 1
    1 24000 0
    ..
    ..
    ..
    0 10000 1
    0 10000 1
    0 10000 0
    0 10000 1
    ..
    ..
    ..

    If I fit a log-binomial model with ln(x/1000) as the offset on the
    rolled out data, I get the same group-specific hand calculated rates
    above.
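
    One way to see why the log-binomial fit reproduces the hand
    calculation: with only an intercept and a group dummy, the model is
    saturated, so (assuming the fit converges) the fitted probability in
    each group equals the observed proportion, and the offset can simply
    be subtracted on the log scale. The sketch below uses that closed form
    rather than an actual model fit; the function name is hypothetical.

```python
import math

def log_binomial_coefs(p0, p1, x0, x1, per=1000):
    """Closed-form coefficients for log(p) = b0 + b1*group + ln(x/per).

    Assumes a saturated model, so fitted probabilities equal the
    observed proportions p0 (group 0) and p1 (group 1).
    """
    b0 = math.log(p0) - math.log(x0 / per)
    b1 = math.log(p1) - math.log(x1 / per) - b0
    return b0, b1

b0, b1 = log_binomial_coefs(p0=3 / 20, p1=2 / 10, x0=20000, x1=24000)

# Exponentiating the linear predictor without the offset gives the rate.
print("group 0:", round(math.exp(b0), 4))
print("group 1:", round(math.exp(b0 + b1), 4))
```

    Because the link is the log, removing the offset divides the
    probability by x/1000, which is exactly the hand calculation.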

    Here's the catch. I want to obtain the hand calculated rates from the
    parameter estimates derived from a standard binary logistic regression
    on the rolled out data.

    So, I thought the correct approach would be to apply the logit
    transformation to x/100,000 before entering it into the linear
    predictor as an offset:

    offset= ln[x_per100,000 / (1 - x_per100,000)]

    (Note that if I use x_per1000, I am not able to calculate the x logits
    because I end up trying to take the natural log of a negative value.)
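
    The domain problem in the note above can be checked directly. This is
    a small sketch with a hypothetical logit helper: x/1000 is 24 or 20,
    so 1 - x/1000 is negative and the log is undefined, whereas x/100,000
    falls between 0 and 1.

```python
import math

def logit(p):
    """Log-odds of p; only defined for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError(f"logit undefined for {p}")
    return math.log(p / (1 - p))

for x in (24000, 20000):
    for per in (1000, 100_000):
        try:
            print(x, per, round(logit(x / per), 4))
        except ValueError as err:
            print(x, per, err)
```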

    I then fit the logistic regression model and calculate the rates:

    logit[pr(event=1)] = b0 + b1*group + offset

    rate per 1000 x | group 1 = (exp(b0 + b1) / [1 + exp(b0 + b1)]) / 100
    rate per 1000 x | group 0 = (exp(b0) / [1 + exp(b0)]) / 100
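
    A numerical sketch of the back-transformation above, under the same
    saturated-model assumption (fitted probabilities equal the observed
    proportions, so no fitting routine is needed). The helper names are
    hypothetical. It suggests where the mismatch comes from: subtracting a
    logit offset divides the odds, not the probability, so the result is
    OR / (1 + OR) / 100 rather than p / (x/1000).

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(z):
    return 1 / (1 + math.exp(-z))

def logistic_backtransform(events, trials, x):
    """Rate formula from the post, applied to a saturated logistic fit."""
    p_hat = events / trials              # fitted probability (saturated model)
    offset = logit(x / 100_000)          # offset as defined in the post
    linpred = logit(p_hat) - offset      # b0 for group 0, b0 + b1 for group 1
    return expit(linpred) / 100

data = {1: (2, 10, 24000), 0: (3, 20, 20000)}
for group, (events, trials, x) in data.items():
    hand = events / (trials * (x / 1000))
    print(group, round(logistic_backtransform(events, trials, x), 4), round(hand, 4))
```

    Under these assumptions the logistic back-transformation gives
    roughly 0.0044 and 0.0041, against hand-calculated rates of 0.0083
    and 0.0075.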

    But the rates are no longer the same. What am I doing wrong? I realize
    there are several assumptions I'm making here--much could be said
    about this approach. Right now, however, I want to figure out the
    correct transformation such that I am able to obtain the same rates
    via logistic regression.

    Any help would be much appreciated. Thanks so much for any guidance
    provided.

    Ryan
     
    R, Dec 7, 2011
    #1

  2. R

    R Guest

    Small correction--rolled out of illustration data for group 0 should
    show that x is 20000, not 10000. Sorry.

    Ryan
     
    R, Dec 7, 2011
    #2

  3. R

    Rich Ulrich Guest

    [snip, a bunch]
    You don't ever "apply the logit transformation" to anything
    other than P/Q where Q is 1-P. That is the definition of a logit.
    So if I understand what you are saying here, it is not sensible.
    ... and that is not correct arithmetic.

    x as a rate per 100,000 is a 10 times larger number than the
    same x per 10,000; where the *latter*, in the example given, was a
    fraction, and lets you compute that (meaningless) natural log.

    ... and, as a general principle, if your model leads you to taking
    the log of a negative value (either for data-in-hand, or for
    conceivable data), then you need a new model.
     
    Rich Ulrich, Dec 7, 2011
    #3
  4. R

    R Guest

    Hi Rich,

    Thank you for your reply.

    I misspoke by calling the term "x_per100000". It really is event per
    100,000 x units. Regardless, the general question stands. Is it
    possible to obtain the desired "rate" via logistic regression? Suppose
    you were presented with the data above and were asked to obtain the
    probability of event per 1000 "x" units using parameter estimates from
    logistic regression. Clearly a log-binomial or Poisson model would
    *work* by transforming x, ln(x/1000), and entering it into the linear
    predictor as an offset. Can a transformation be applied to "x" such
    that one could obtain the same "rate" via logistic regression?

    Ryan
     
    R, Dec 7, 2011
    #4
  5. R

    R Guest

    Please disregard this thread. It wasn't fully thought through, and
    there are far more direct, pragmatic ways to construct a model which
    serve my purposes. It was really more of an intellectual exercise
    which requires further thought.

    Thanks,

    Ryan
     
    R, Dec 8, 2011
    #5