# Correct Offset in Logistic Regression

Discussion in 'Scientific Statistics Math' started by R, Dec 7, 2011.

1. ### R Guest

Hi Ray and others,

Yes, another logistic regression question, but a very brief one, I
promise. Suppose we have the following dataset (just an illustration):

group   x       # events   trials
1       24000   2          10
0       20000   3          20

(Note that group is dummy coded)

If we wanted to hand-calculate an adjusted rate ("probability of
event per 1000 units of x"), we would apply the following formula:

rate per 1000 x | group1 = 2 / (10*24) = 0.0083
rate per 1000 x | group0 = 3 / (20*20) = 0.0075
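
The hand calculation above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the post; the names `data` and `rate_per_1000_x` are made up for illustration.

```python
# Illustrative data from the table above: group -> (events, trials, x).
data = {1: (2, 10, 24000), 0: (3, 20, 20000)}

def rate_per_1000_x(events, trials, x):
    """Observed event proportion divided by x expressed in thousands."""
    return events / (trials * (x / 1000))

rates = {g: rate_per_1000_x(*row) for g, row in data.items()}
for g, r in rates.items():
    print(f"group {g}: {r:.4f}")
```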

Now, if I were to roll out the data with x repeating within group, it
would look like this:

group x event
1 24000 0
1 24000 1
1 24000 1
1 24000 0
..
..
..
0 10000 1
0 10000 1
0 10000 0
0 10000 1
..
..
..

If I fit a log-binomial model with ln(x/1000) as the offset on the
rolled out data, I get the same group-specific hand-calculated rates
as above.
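
A sketch of why the log-binomial fit reproduces the hand calculation, assuming the model contains only the intercept and the group dummy. With two groups and two parameters the model is saturated, so the fitted probability in each group equals events/trials and the coefficients can be written down without an iterative fit. The function name `log_binomial_coefs` is hypothetical.

```python
import math

# Illustrative data from the post: group -> (events, trials, x).
data = {1: (2, 10, 24000), 0: (3, 20, 20000)}

def log_binomial_coefs(data, per=1000):
    """Closed-form coefficients of the saturated log-link model
    ln(p) = b0 + b1*group + ln(x/per)."""
    linear = {}
    for g, (events, trials, x) in data.items():
        p = events / trials                           # fitted probability
        linear[g] = math.log(p) - math.log(x / per)   # b0 + b1*g
    b0 = linear[0]
    b1 = linear[1] - linear[0]
    return b0, b1

b0, b1 = log_binomial_coefs(data)
rate_group0 = math.exp(b0)        # p0 / (x0/1000)
rate_group1 = math.exp(b0 + b1)   # p1 / (x1/1000)
print(f"{rate_group1:.4f} {rate_group0:.4f}")
```

Because the link is a log, subtracting the offset on the linear scale is the same as dividing the probability by x/1000, which is exactly the hand-calculated rate.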

Here's the catch. I want to obtain the hand calculated rates from the
parameter estimates derived from a standard binary logistic regression
on the rolled out data.

So, I thought the correct approach would be to apply the logit
transformation to x/100,000 before entering it into the linear
predictor as an offset:

offset= ln[x_per100,000 / (1 - x_per100,000)]

(Note that if I use x_per1000, I am not able to calculate the x logits
because I end up trying to take the natural log of a negative value.)

I then fit the logistic regression model and calculate the rates:

logit[pr(event=1)] = b0 + b1*group + offset

rate per 1000 x | group 1 = (exp(b0 + b1) / [1 + exp(b0 + b1)]) / 100
rate per 1000 x | group 0 = (exp(b0) / [1 + exp(b0)]) / 100

But the rates are no longer the same. What am I doing wrong? I realize
there are several assumptions I'm making here, and much could be said
about them, but what I'm after is the correct transformation such that
I am able to obtain the same rates via logistic regression.
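
The mismatch can be reproduced numerically. The sketch below again assumes a saturated model (intercept plus group dummy), so the fitted probability per group is events/trials and b0 + b1*group equals logit(p) minus the offset. The helper names are illustrative, not from any library.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(z):
    return 1 / (1 + math.exp(-z))

# Illustrative data from the post: group -> (events, trials, x).
data = {1: (2, 10, 24000), 0: (3, 20, 20000)}

def attempted_rate(events, trials, x):
    """Rate obtained from the logit-offset approach described above."""
    p = events / trials
    offset = logit(x / 100_000)      # logit of x per 100,000
    linear = logit(p) - offset       # b0 + b1*group in the saturated fit
    return expit(linear) / 100

attempted = {g: attempted_rate(*row) for g, row in data.items()}
hand = {g: e / (t * (x / 1000)) for g, (e, t, x) in data.items()}
for g in data:
    print(f"group {g}: attempted {attempted[g]:.4f}, hand {hand[g]:.4f}")
```

On the log scale, exp(ln p - ln q) is p/q, but expit(logit p - logit q) is not p/q, so subtracting a logit offset does not divide the probability by the exposure.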

Any help would be much appreciated. Thanks so much for any guidance
provided.

Ryan

R, Dec 7, 2011

2. ### R Guest

Small correction--rolled out of illustration data for group 0 should
show that x is 20000, not 10000. Sorry.

Ryan

R, Dec 7, 2011

3. ### Rich Ulrich Guest

[snip, a bunch]
You don't ever "apply the logit transformation" to anything
other than P/Q where Q is 1-P. That is the definition of a logit.
So if I understand what you are saying here, it is not sensible.
... and that is not correct arithmetic.

x as a rate per 100,000 is a 10 times larger number than the
same x per 10,000; where the *latter*, in the example given, was a
fraction, and lets you compute that (meaningless) natural log.

... and, as a general principle, if your model leads you to taking
the log of a negative value (either for data-in-hand, or for
conceivable data), then you need a new model.

Rich Ulrich, Dec 7, 2011

4. ### R Guest

Hi Rich,

I misspoke by calling the term "x_per100000". It really is event per
100,000 x units. Regardless, the general question stands. Is it
possible to obtain the desired "rate" via logistic regression? Suppose
you were presented with the data above and were asked to obtain the
probability of event per 1000 "x" units using parameter estimates from
logistic regression. Clearly a log-binomial or Poisson model would
*work* by transforming x, ln(x/1000), and entering it into the linear
predictor as an offset. Can a transformation be applied to "x" such
that one could obtain the same "rate" via logistic regression?
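
One possible reading of this question, sketched under the assumption of a saturated model (intercept plus group dummy) and a logistic fit that uses ln(x/1000) as its offset: the exponentiated coefficients then give odds per 1000 x rather than a probability per 1000 x, while the rate can still be recovered from the fitted probability. This is an illustration with made-up variable names, not a general answer.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def expit(z):
    return 1 / (1 + math.exp(-z))

# Illustrative data from the post: group -> (events, trials, x).
data = {1: (2, 10, 24000), 0: (3, 20, 20000)}

odds_per_1000 = {}
rate_per_1000 = {}
for g, (events, trials, x) in data.items():
    offset = math.log(x / 1000)
    # Saturated logistic fit: fitted probability equals events/trials,
    # so b0 + b1*g is logit(p) minus the offset.
    linear = logit(events / trials) - offset
    odds_per_1000[g] = math.exp(linear)     # odds / (x/1000), not the rate
    p_hat = expit(linear + offset)          # fitted probability
    rate_per_1000[g] = p_hat / (x / 1000)   # matches the hand calculation

for g in data:
    print(f"group {g}: odds/1000x {odds_per_1000[g]:.4f}, "
          f"rate/1000x {rate_per_1000[g]:.4f}")
```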

Ryan

R, Dec 7, 2011

5. ### R Guest

There are far more direct, pragmatic ways to construct a model that
serves my purposes. It was really more of an intellectual exercise,
one that requires further thought.

Thanks,

Ryan

R, Dec 8, 2011