Help with probability&stat problem

Discussion in 'Scientific Statistics Math' started by tutorny, Jun 11, 2007.

  1. tutorny

    Jack Tomsky Guest

    My response

    My calculation of the probability as 0.9753 was exact, while your calculation of 0.9786 was a mediocre approximation. The professor would have given me an A for my answer and given you an F for your answer.

    Jack
     
    Jack Tomsky, Jun 12, 2007
    #21

  2. Binomial law DF: a simple program

    N=110, p=0.2

    ________p(X<=110) = 1.0000000000
    ________p(X<=10) = 0.0015595228
    ________p(X<=22) = 0.5567271744
    ________p(X<=30) = 0.9752864841
    ________p(X<=50) = 0.9999999996
    ________
    ________Licas


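    REM "BINcum": cumulative binomial F(a) = p(X<=a), X = Bin(p, N)
    CLS
    DEFDBL A-Z
    PRINT " F(a) = p(X<=a) X=Bin(p, N) "
    INPUT " p , N "; p, n
    INPUT " a ( a<=n ) "; a
    w = p / (1 - p)                  ' odds p/(1-p) used in the recurrence
    DIM pp(n)
    pp(0) = (1 - p) ^ n: s = pp(0)   ' p(X=0) starts the running sum
    IF a = 0 THEN GOTO 10
    FOR j = 0 TO n - 1
    IF j > a - 1 THEN GOTO 10        ' stop once p(X=a) has been added
    ' p(X=j+1) = p(X=j) * (n-j)/(j+1) * p/(1-p)
    pp(j + 1) = pp(j) * (n - j) / (j + 1) * w
    s = s + pp(j + 1)
    NEXT j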
    10 LOCATE 10, 50: PRINT USING "#.########## "; s
    END
     
    Luis A. Afonso, Jun 13, 2007
    #22
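For readers without a BASIC interpreter, the recurrence used in the listing above can be sketched in Python. This is a minimal sketch, not Licas's code: the function name `binom_cdf` is an assumption made for illustration, and the loop mirrors the update p(X=j+1) = p(X=j) * (n-j)/(j+1) * p/(1-p).

```python
def binom_cdf(a, n, p):
    """Return P(X <= a) for X ~ Bin(n, p) by summing the pmf recursively."""
    term = (1.0 - p) ** n          # P(X = 0)
    total = term
    odds = p / (1.0 - p)
    for j in range(min(a, n)):
        # P(X = j+1) = P(X = j) * (n - j) / (j + 1) * p / (1 - p)
        term *= (n - j) / (j + 1) * odds
        total += term
    return total


if __name__ == "__main__":
    # Same cases as the table in the post: N=110, p=0.2
    for a in (10, 22, 30, 50, 110):
        print(f"p(X<={a}) = {binom_cdf(a, 110, 0.2):.10f}")
```

With N=110 and p=0.2 this should agree with the table in the post to the printed precision, since both use double-precision arithmetic.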

  3. tutorny

    Jack Tomsky Guest

    Binomial law DF: a simple program


    Now for the same cases, compare these simple exact results with the Afonso normal approximation employing his continuity correction.

    p(X<=110)= 1.0000000000
    p(X<=10) = 0.0030607156
    p(X<=22) = 0.5474347428
    p(X<=30) = 0.9786231409
    p(X<=50) = 1.0000000000

    The conclusion is that it is better to use the simple exact binomial calculations.

    Jack
     
    Jack Tomsky, Jun 13, 2007
    #23
  4. EXACT : p(X<=10) = 0.00156

    Normal Approximation

    *** Without C.C.
    Z = (10 - 22)/sqrt(0.2*0.8*110) = -12/4.1952… = -2.8604. _________________ F(Z) = 0.00212
    *** With C.C.
    ________Z = (10 - 22 + 0.5)/ 4.1952… = -2.9796
    ________________________ F(Z) = 0.00144


    The differences approx-EXACT are respectively

    _______Without = +0.00056
    _______With C.C. = -0.00012


    ... and all of Jack Tomsky's argumentation falls to the ground.
    *********************************************

    Licas
     
    Luis A. Afonso, Jun 13, 2007
    #24
  5. tutorny

    Anon. Guest

    What seems very silly about this latest spat is that Jack showed the
    exact test, and made the comparison to the approximation that the OP had
    used. Luis has used a different approximation, and come to a slightly
    different answer.

    But you both appear to agree on the exact calculation.

    Bob

    --
    Bob O'Hara

    Dept. of Mathematics and Statistics
    P.O. Box 68 (Gustaf Hällströmin katu 2b)
    FIN-00014 University of Helsinki
    Finland

    Telephone: +358-9-191 51479
    Mobile: +358 50 599 0540
    Fax: +358-9-191 51400
    WWW: http://www.RNI.Helsinki.FI/~boh/
    Journal of Negative Results - EEB: http://www.jnr-eeb.org
     
    Anon., Jun 13, 2007
    #25
  6. Added Note

    1) The discussion was about whether the C.C. worsens the results compared with the raw calculation. IT DID NOT.
    2) The two normal approximations have rather different accuracies (they ARE NOT slightly different, contrary to what Bob said).
    ______Relative errors (absolute values):
    ______ Not using C.C.___ (212-156)/156 = 35.9%
    _________ using C.C.___ (144-156)/156 = 7.7 %
    Therefore
    ______this last procedure improves the result almost FIVE TIMES. Using it is an indisputable MUST.
    **********************

    Licas
     
    Luis A. Afonso, Jun 13, 2007
    #26
  7. tutorny

    Jack Tomsky Guest

    EXACT : p(X<=10) = 0.00156


    (10-22+0.5)/4.1952 = -2.7412, not 2.9796. Thus F(z) = 0.00306, not 0.00144.

    Jack
     
    Jack Tomsky, Jun 13, 2007
    #27
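The arithmetic disputed in the posts above can be checked with a few lines of Python. This is only a sketch: the helper names `phi` and `normal_approx` are assumptions introduced for illustration, and the continuity correction is taken to mean adding 0.5 to x when evaluating P(X <= x), as in the expression (10 - 22 + 0.5)/4.1952.

```python
import math


def phi(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_approx(x, n, p, continuity=True):
    """Normal approximation to P(X <= x) for X ~ Bin(n, p).

    Returns the pair (z, F(z)).
    """
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    shift = 0.5 if continuity else 0.0   # the correction adds 0.5 to x
    z = (x + shift - mean) / sd
    return z, phi(z)


if __name__ == "__main__":
    for cc in (False, True):
        z, prob = normal_approx(10, 110, 0.2, continuity=cc)
        print(f"C.C.={cc}: z = {z:.4f}, F(z) = {prob:.6f}")
```

Because the correction moves z toward zero in the lower tail, the corrected probability is necessarily the larger of the two, which is the point at issue for x = 10.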
  8. tutorny

    Jack Tomsky Guest

    Added Note


    The exact probability calculated from the binomial distribution is 0.0015595.

    The Afonso normal approximation is 0.002115.

    The "improved" Afonso normal approximation with a correction factor is 0.003061, which is worse.

    Jack
     
    Jack Tomsky, Jun 13, 2007
    #28
  9. tutorny

    Anon. Guest

    I'll leave it to others to decide whether 0.9717 and 0.9786 are more
    than slightly different.

    But I find this curious: are you really saying that one must use the
    Normal approximation with a continuity correction, rather than the exact
    binomial calculation? If so, why? And why can't this be disputed?

    Bob

     
    Anon., Jun 13, 2007
    #29
  10. tutorny

    Jack Tomsky Guest

    What seems very silly about this latest spat is that


    Bob, what this shows is that with a p of 0.20, the binomial distribution is too asymmetrical to be effectively approximated by a symmetrical normal distribution, either with a mean of 22 or 22.5. The Poisson would probably give a much better approximation.

    Jack
     
    Jack Tomsky, Jun 13, 2007
    #30
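The Poisson suggestion in the post above can be tried directly. The following Python sketch is an illustration only: the function names are assumptions, and the Poisson mean is taken to be N*p = 22, which is the usual matching choice but is not stated in the thread. It prints the exact binomial value next to the Poisson value so the two can be compared case by case.

```python
import math


def poisson_cdf(x, lam):
    """Return P(Y <= x) for Y ~ Poisson(lam)."""
    term = math.exp(-lam)          # P(Y = 0)
    total = term
    for k in range(1, x + 1):
        term *= lam / k            # P(Y = k) = P(Y = k-1) * lam / k
        total += term
    return total


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Bin(n, p), summed term by term."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


if __name__ == "__main__":
    n, p = 110, 0.2
    for x in (10, 22, 30):
        exact = binom_cdf(x, n, p)
        approx = poisson_cdf(x, n * p)   # assumption: Poisson mean = N*p
        print(f"x={x}: exact={exact:.6f}  poisson={approx:.6f}")
```

Whether the Poisson value is closer than the normal approximations depends on the point x being evaluated, so it is worth checking each case rather than assuming it.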
  11. Bob wrote:


    *** I'll leave it to others to decide whether 0.9717 and 0.9786 are more than slightly different, ***

    My response

    You SHOULD NOT leave it to others but THINK ABOUT it yourself. The way I posted it is the *one* to use when the sign of the error is unimportant.
    You made an error of analysis: the errors must be judged not by the Z values but by the associated tail probabilities, evidently.
    *********************

    Bob:

    *** But i find this curious: are you really saying that one must use the Normal approximation with a continuity correction, rather than the exact binomial calculation? If so, why? And why can't this be disputed? ***

    My response

    OF COURSE NOT: I didn’t and I’ll never say such nonsense.

    The reason the normal approximation is used is that it is immediate. The exact way, on the contrary, needs computer programming. If that is available, do not hesitate: use it. If not, the only *decent* way is to use the continuity correction.

    IT'S SIMPLE, ISN'T IT?
    **********************

    Licas
     
    Luis A. Afonso, Jun 13, 2007
    #31
  12. tutorny

    Jack Tomsky Guest


    For the binomial distribution calculations in Excel, you don't need to know BASIC programming. You just use the BINOMDIST function. It takes about 20 seconds to type in the arguments.

    The normal distribution, based on NORMSDIST, takes longer because you have to calculate the argument (x-pN+0.5)/sqrt(p*(1-p)*N). What you achieve with the added time consumed is an inaccurate approximation of an asymmetric distribution by a symmetric distribution.

    Jack
     
    Jack Tomsky, Jun 13, 2007
    #32
  13. tutorny

    Jack Tomsky Guest


    I think that it is indecent to approximate the exact Prob(X <=10) of .001560 by a normal approximation of 0.002115 and then to apply a correction factor to make it even worse at 0.003061.

    Jack
     
    Jack Tomsky, Jun 13, 2007
    #33
  14. Follow-up, in SHORT, the continuity correction leads to better results:
    ______With C.C._____EXACT_______Without C.C.
    ______0.00144______0.00156________0.00212____
    Diff___0.00012_____________________0.00056____

    ***************
    See my post Jun 13, 2007, 7:29 AM. And APPRECIATE that Jack Tomsky DID FALSIFY my evaluation

    I WROTE (ipsis verbis, please check):
    Date: Jun 13, 2007 7:29 AM
    Author: Luis A. Afonso
    Subject: Re: Help with probability&stat problem

    EXACT : p(X<=10) = 0.00156
    Normal Approximation
    *** Without C.C.
    Z = (10 - 22)/sqrt(0.2*0.8*110) = -12/4.1952 = -2.8604.
    ________________ F(Z) = 0.00212
    *** With C.C.
    ________Z = (10 - 22 + 0.5)/ 4.1952 = -2.9796
    ________________________ F(Z) = 0.00144
    The differences approx-EXACT are respectively

    _______Without = +0.00056
    _______With C.C. = -0.00012.
    ********

    Jack's response:

    The exact probability calculated from the binomial distribution is 0.0015595. The Afonso normal approximation is 0.002115. The "improved" Afonso normal approximation with a correction factor is 0.003061, which is worse. Jack

    *******************************
    Licas
     
    Luis A. Afonso, Jun 13, 2007
    #34
  15. tutorny

    Anon. Guest

    Note that you describe it as something that must be done, i.e. that
    using the exact test must not be done, and don't qualify it at all. I
    guess you just let your rhetoric run away with you, which is why I checked.

    But there, it's all cleared up now.

    Bob

     
    Anon., Jun 13, 2007
    #35
  16. tutorny

    Jack Tomsky Guest

    Follow-up, in SHORT, the continuity correction leads


    I was correcting your arithmetic. The actual result with CC is 0.00306, leading to a difference of 0.00306-0.00156 = 0.00150, which is even worse than the 0.00056 error without the CC.





    Do we agree that 10-22+0.5 = -11.5? Do we agree that -11.5/4.1952 = -2.7412? Do we agree that F(-2.7412) = 0.003061?

    I thought that you would appreciate that I was able to check out your calculations and correct the arithmetic. Or was it your BASIC program which did the miscalculation?

    If the normal distribution used for the approximation has a smaller mean of 21.5 instead of 22, then the cdf must be larger for all x. So it should have been a red flag that you would get a smaller estimate with the CC than without the CC.

    Jack
     
    Jack Tomsky, Jun 13, 2007
    #36
  17. Everybody curious to learn the logic of this procedure should consult

    _____Wikipedia: Continuity correction

    ____Licas
     
    Luis A. Afonso, Jun 13, 2007
    #37
  18. tutorny

    Jack Tomsky Guest

    Everybody curious to learn the logic of this

    What Wikipedia failed to mention is that one could get the exact result from Excel in about 20 seconds, using BINOMDIST and inputting the three arguments plus "TRUE" to obtain the cumulative binomial. For example, in the case of p = 0.20, N = 110 and x = 10, the normal approximation overestimates the true probability and then adding the continuity correction term further exaggerates the error. It is also so complicated to apply that a typical user such as Afonso could not calculate the numbers correctly.

    Jack
     
    Jack Tomsky, Jun 14, 2007
    #38
  19. The BASIC program I made, whose listing is presented in this thread, provides the cumulative probability INSTANTANEOUSLY even if I enter X=110 (N=110, p=0.2). I wonder why anyone would prefer the extremely slow Excel. TWENTY SECONDS when X=30? WHAT AN ETERNITY!!!
    *********
    Licas
     
    Luis A. Afonso, Jun 14, 2007
    #39
  20. The BASIC program I made, whose listing is presented in this thread, provides the cumulative probability INSTANTANEOUSLY even if I enter X=110 (N=110, p=0.2). I wonder why anyone would prefer the extremely slow Excel. TWENTY SECONDS when X=30? WHAT AN ETERNITY!!!
    *********
    IMPROVED PROGRAM

    This program (listing below) can evaluate the cumulative probabilities of Bin(p=0.000001, N=10^6). It takes about 10 seconds to evaluate F(X=10^6), giving the value 1.0000000000D+000.

    Licas


    REM "BINcum"
    CLS
    DEFDBL A-Z
    PRINT " F(a) = p(X<=a) X=Bin(p, N) "
    INPUT " p , N "; p, n
    INPUT " a ( a<=n ) "; a
    w = p / (1 - p)
    ante = (1 - p) ^ n: s = ante
    IF s = 0 THEN GOTO 20
    IF a = 0 THEN GOTO 10
    FOR j = 0 TO n - 1
    IF j > a - 1 THEN GOTO 10
    post = ante * (n - j) / (j + 1) * w
    s = s + post
    ante = post
    NEXT j
    10 LOCATE 10, 50: PRINT USING "##.##########^^^^^"; s
    END
    20 PRINT " p(0)=0 "
     
    Luis A. Afonso, Jun 14, 2007
    #40
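The improved listing above stops with " p(0)=0 " when (1 - p)^N underflows to zero, which happens for large N unless p is tiny. One possible way around this, sketched here in Python as a swapped-in technique and not something proposed in the thread, is to build each term in log space with `math.lgamma` before exponentiating. The function name `binom_cdf_log` is an assumption made for illustration.

```python
import math


def binom_cdf_log(a, n, p):
    """P(X <= a) for X ~ Bin(n, p), with each pmf term built in log space."""
    if p <= 0.0:
        return 1.0                       # all mass at X = 0
    if p >= 1.0:
        return 1.0 if a >= n else 0.0    # all mass at X = n
    log_p = math.log(p)
    log_q = math.log1p(-p)               # log(1 - p), accurate for small p
    log_n_fact = math.lgamma(n + 1)
    total = 0.0
    for k in range(min(a, n) + 1):
        # log of C(n, k) * p^k * (1-p)^(n-k)
        log_term = (log_n_fact - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                    + k * log_p + (n - k) * log_q)
        total += math.exp(log_term)      # tiny terms simply contribute 0.0
    return min(total, 1.0)


if __name__ == "__main__":
    print(f"{binom_cdf_log(30, 110, 0.2):.10f}")
    # (1 - 0.2) ** 10000 underflows to 0.0, so the recurrence cannot start here
    print(f"{binom_cdf_log(2000, 10000, 0.2):.6f}")
```

The trade-off is speed and a little accuracy: every term costs three `lgamma` calls, and for very large N the rounding in `lgamma` limits the result to roughly nine significant digits, whereas the recurrence needs only one multiplication per term.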