# A cowÂ´s story

Discussion in 'Scientific Statistics Math' started by Luis A. Afonso, Dec 1, 2010.

1. ### Luis A. AfonsoGuest

A cowÂ´s story

MCM (Monte Carlo Method) could be a good choice to provide Hypotheses Tests correct p-values to which only an approximate distribution of the test statistics is known.
Iâ€™ll illustrate by solving a problem issued in Probability and Statistical Inference /Robert V. Hogg, Elliot A. concerning the Chi-square gof (goodness of fit) test proposed by K. Pearson (1900).
Data is the fat production (in milk) by 90 cows, one intend though follows a normal (Gaussian) Distribution, H0. In order to follow the rule: to expect more than five values by class we choose to divide data in k= 15 classes, then 6 values are expected in each equal probability one. Therefore the class extremes were (-infinity, -1.5010), (-1.5010, -1.1108), . . . , (1.5010, +infinity) and distribution contents were (after ordering):
___2__4__4__5__5__5__5__6__6__6__6__7__7__10__12__
Chi= 13.66(6): Chi-square, 15 - 1 - 2 = 12 degrees of freedom and p(by chi-square) = 0.104, with alpha=5% there are no sufficient evidence to reject H0.
What happens if we do not want to approximate the test statistics and use random numbers instead simulating the class contents is that one get p = 0.150 with 400000 simulated samples. This result assures that the null hypotheses should not be rejected.

Luis

REM "butter1"
CLS : PRINT : PRINT
PRINT " Butterfat (in ponds) produced by 90 COWS"
DEFDBL A-Z
INPUT " all = "; all
n = 90
DIM x(90), L(91), h(50), inn(50)
DATA 486,537,513,583,453,510,570,500,458,555
DATA 618,327,350,643,500,497,421,505,637,599
DATA 392,574,492,635,460,696,593,422,499,524
DATA 539,339,472,427,532,470,417,437,388,481
DATA 537,489,418,434,466,464,544,475,608,444
DATA 573,611,586,613,645,540,494,532,691,478
DATA 513,583,457,612,628,516,452,501,453,643
DATA 541,439,627,619,617,394,607,502,395,470
DATA 531,526,496,561,491,380,345,274,672,509
FOR i = 1 TO 90: READ x(i): x = x(i): sx = sx + x / 90
sxx = sxx + x * x: NEXT i
mean = sx
var = sxx - n * mean * mean: var = var / (n - 1)
sd = SQR(var)
PRINT USING " ###.### "; mean; sd
FOR i = 1 TO n: x(i) = (x(i) - mean) / sd: NEXT i
PRINT " 15 classes <> 6/class , p=1/15= 0.016666... "
E = 6
DATA -100,-1.5010, -1.1108,-.8416, -.6229,-.4307,-.2533,-.0837
DATA .0837, .2533,.4307, .6229,.8416,1.1108,1.5010,100
FOR j = 1 TO 16
READ L(j): PRINT USING "## ####.#### "; j; L(j); : NEXT j : PRINT : PRINT
FOR tt = 1 TO n: h = x(tt)
FOR y = 1 TO 15
IF h < L(y + 1) AND h > L(y) THEN h(y) = h(y) + 1
NEXT y: NEXT tt
PRINT " contents/class "
FOR u = 1 TO 15: PRINT USING "####"; h(u);
chi = chi + (h(u) - E) ^ 2 / E: NEXT u
COLOR 14: PRINT USING " ###.###"; chi
COLOR 7
FOR i = 1 TO all
LOCATE 16, 50: PRINT USING "########"; all - i
qqui = 0: RANDOMIZE TIMER
FOR t = 1 TO n: g = INT(RND * 16) + 1
inn(g) = inn(g) + 1
NEXT t
qqui = 0
FOR f = 1 TO 16: qqui = qqui + (inn(f) - E) ^ 2 / E
NEXT f
COLOR 7
IF qqui > chi THEN outt = outt + 1 / all
FOR t = 1 TO 16: inn(t) = 0: NEXT t
NEXT i
LOCATE 17, 60: COLOR 12
PRINT USING " p value= #.### "; outt
END

Luis A. Afonso, Dec 1, 2010

2. ### Luis A. AfonsoGuest

10 classes, 9values/classe

Let be k=10, test statistics is 4.444; Chi-Square critical value (95% confidence level) is 0.87984 = 0.880.
By Monte Carlo (program V-10) I got (5 essays): 0.878, 0.873, 0.875, 0.875, 0.874;
_______ mean = 0.875 +/- 0.001.
This correct value of probability does validate the one obtained by the Chi- Square Distribution.

Luis

REM "V-10"
CLS : PRINT : PRINT " ****** V-10 ****** ":
PRINT " Butterfat (in pounds) produced by 90 COWS"
DEFDBL A-Z
INPUT " all = "; all
n = 90
DIM x(90), L(91), h(50), inn(50)
DATA 486,537,513,583,453,510,570,500,458,555
DATA 618,327,350,643,500,497,421,505,637,599
DATA 392,574,492,635,460,696,593,422,499,524
DATA 539,339,472,427,532,470,417,437,388,481
DATA 537,489,418,434,466,464,544,475,608,444
DATA 573,611,586,613,645,540,494,532,691,478
DATA 513,583,457,612,628,516,452,501,453,643
DATA 541,439,627,619,617,394,607,502,395,470
DATA 531,526,496,561,491,380,345,274,672,509
FOR i = 1 TO 90: READ x(i): x = x(i)
sx = sx + x / 90
sxx = sxx + x * x: NEXT i
mean = sx
ssd = sxx - n * mean * mean: var = ssd / (n - 1)
sd = SQR(var)
PRINT USING " ###.### "; mean; sd
FOR i = 1 TO n: x(i) = (x(i) - mean) / sd: NEXT i
PRINT " 10 classes <> 9/class , p=1/10= 0.01 "
kla = 10: E = n / kla
REM
DATA -100, -1.2816, -0.8416, -0.5244,-0.2533, 0.0000
DATA 0.2533,0.5244,0.8416,1.2816,100
FOR j = 0 TO kla
PRINT USING "## ####.#### "; j; L(j); : NEXT j
PRINT : PRINT
REM
FOR tt = 1 TO n: h = x(tt)
FOR y = 0 TO kla - 1
IF h < L(y + 1) AND h > L(y)
THEN h(y + 1) = h(y + 1) + 1
NEXT y: NEXT tt
PRINT " contents/class "
FOR u = 1 TO kla: PRINT USING "####"; h(u);
chi = chi + (h(u) - E) ^ 2 / E: NEXT u
COLOR 14
PRINT USING " ###.###"; chi: COLOR 7
FOR i = 1 TO all
LOCATE 16, 50
PRINT USING "########"; all - i
qqui = 0: RANDOMIZE TIMER
FOR t = 1 TO n: g = INT(RND * kla) + 1
inn(g) = inn(g) + 1
NEXT t
qqui = 0
FOR f = 1 TO kla
qqui = qqui + (inn(f) - E) ^ 2 / E: NEXT f
COLOR 7
IF qqui > chi THEN outt = outt + 1 / all
FOR t = 1 TO kla: inn(t) = 0: NEXT t
NEXT i
LOCATE 17, 60: COLOR 12
PRINT USING " p value= #.### "; outt
END

Luis A. Afonso, Dec 4, 2010

3. ### Luis A. AfonsoGuest

All Three

Itâ€™s a routine to provide CRITICAL VALUES (right tail)

_n=90/400 000 _______________5%_______1%___ __________________JB_______5.37______12.45__
__________________ U_______3.65______ 6.68__
__________________ V_______2.81______ 7.86__
_____/1 000 000 ____JB_______5.36_____ 12.49__
__________________ U_______3.65______ 6.68__
__________________ V_______2.81______ 7.84__

The Jarque-Bera Test significance alpha is unable to assure a similar alpha for U and V as well.

DATA COWS provided JB=0.577, U=0.269, V=0.308 showing that if the test is very LOW there is not conflict with the U and V critical values: all three are not rejected.
However, in general, the error remains: The Jarque â€“Bera Test is not reliable, it should be banned, because acceptance DOES NOT ASSURE us that Skewness and Kurtosis are acceptable at the same confidence level.

Luis

Luis A. Afonso, Dec 19, 2010