Jarque-Bera test: confidence intervals for normal data

Discussion in 'Scientific Statistics Math' started by Luis A. Afonso, Mar 7, 2007.

  1. You are stupid, even for a Biologist, Jack.

As I stressed before (in the FIRST post of this THREAD, Mar 7), the values I found (by Monte Carlo simulation) are the 95% and 99% fractiles of the JB (Jarque-Bera) statistic. They allow testing the hypothesis H0: the sample comes from a normal Distribution, against H1 (Ha): the sample is non-normal.
    If you had noted before that JB is never negative, because it is a sum of squares (multiplied by constants), then consequently the CONFIDENCE INTERVALS are [0, fractile(1-alpha)], for the two significance levels alpha = 0.05 and 0.01.
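    The Monte Carlo recipe is short in modern terms. Here is a minimal Python sketch of it (not the original BASIC programs; the replication count is reduced here, so the fractiles are only approximate):

    ```python
    import random

    def jb_stat(xs):
        """Jarque-Bera statistic: JB = (n/6) * (S^2 + (K - 3)^2 / 4)."""
        n = len(xs)
        m = sum(xs) / n
        m2 = sum((x - m) ** 2 for x in xs) / n
        m3 = sum((x - m) ** 3 for x in xs) / n
        m4 = sum((x - m) ** 4 for x in xs) / n
        skew = m3 / m2 ** 1.5
        kurt = m4 / m2 ** 2
        return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

    def jb_fractiles(n, reps=20000, seed=1):
        """Empirical 95% and 99% fractiles of JB under H0 (normal samples)."""
        rng = random.Random(seed)
        sims = sorted(jb_stat([rng.gauss(0, 1) for _ in range(n)])
                      for _ in range(reps))
        return sims[int(0.95 * reps)], sims[int(0.99 * reps)]

    q95, q99 = jb_fractiles(20)
    ```

    The no-rejection interval for JB at the 5% level is then [0, q95], since JB is never negative.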

I am answering your comments; would you do the same with mine?

If you are so SURE of what you are saying, you should openly criticize, as you do me, the authors of

    1) Jarque - Bera Test and its Competitors for Testing Normality, Thorsten Thadewald and Herbert Buning (March 14, 2004),
    2) Precise finite-sample quantiles of the Jarque-Bera adjusted Lagrange multiplier test, Diethelm Wurtz and Helmut G. Katzgraber (August 2005).

    IS IT A DEAL? Let’s see how your guts are!!!!!!!

    __________licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 10, 2007
    #21

  2. Luis A. Afonso

    Jack Tomsky Guest

    You are stupid, even for a Biologist, Jack.



    These are not confidence intervals because then every sample would have the same confidence interval.

    Jack
     
    Jack Tomsky, Mar 10, 2007
    #22

  3. *** These are not confidence intervals because then every sample would have the same confidence interval. Jack ***
... and consequently, facing the population of normal N(0,1) random samples of size n, you deny that 95% of the means are expected to lie in the interval
    ______[- 1.960 / sqrt(n) , 1.960 / sqrt(n)]
    Isn't that so?
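    That claim about the means is easy to check by simulation (a quick Python sketch, not part of the original posts; the replication count is modest, so the proportion is approximate):

    ```python
    import random

    def coverage(n, reps=20000, seed=2):
        """Fraction of N(0,1) sample means inside [-1.960/sqrt(n), 1.960/sqrt(n)]."""
        rng = random.Random(seed)
        half = 1.960 / n ** 0.5
        hits = sum(
            abs(sum(rng.gauss(0, 1) for _ in range(n)) / n) <= half
            for _ in range(reps)
        )
        return hits / reps

    cov = coverage(25)
    ```

    For any sample size n the simulated proportion lands close to 0.95, as stated.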


    Facing the Wikipedia definition:

    *** In statistics, a confidence interval (CI) for a population parameter is an interval between two numbers with an associated probability p which is generated from a random sample of an underlying population, such that if the sampling was repeated numerous times and the confidence interval recalculated from each sample according to the same method, a proportion p of the confidence intervals would contain the population parameter in question.***

What is the algorithm, IN SIMULATION terms?
    It is to synthesize a great number (2 million) of n-sized samples, to evaluate the test statistic (JB, for example), to store its values, and then to evaluate the fractiles from the empirical distribution. ALL FREQUENTIST STATISTICIANS (like me) are very comfortable with this procedure, and from the middle of the 60's to now there are millions of papers in this context.
    Are you so *brave* as to state that they are all WRONG?
    If so, fight them and leave me in peace and quiet; I am not even an *atom* in this CROWD. In fact I am working for you: I pointed out at least 4 teams that followed the procedure I used. GO AND EAT THEM! If you are interested I will give you the names of a full battalion, even an army!!!!!!!!!

    _________licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 10, 2007
    #23
  4. Luis A. Afonso

    Jack Tomsky Guest

    *** These are not confidence intervals because then


    Under the Afonso Theory of Statistics, all confidence intervals are the same and are independent of the sample. Any information contained in the sample is ignored. Even your Wikipedia reference says that confidence intervals depend on the sample.

    Similarly, under the Afonso Theory of Statistics, one is never allowed to accept the hypothesis that 8/13 is greater than 5/13.

    Jack
     
    Jack Tomsky, Mar 11, 2007
    #24
5. I hesitate to classify you:
    _________a clown?
    _________a madman?
    (I discard the hypothesis that you are stupid).

    MEANWHILE
I have the reward of finding out, and showing with pleasure to the Readers, that with a very modest tool (a computer) and scarce programming skills, everyone can *replicate* results that are both *educative* and *exact*. As exact as the ones we read in textbooks and tables.

    CONCLUSION
___*Everyone is in error except Jack Leon Tomsky*. What an odd thing!!!!!!!!

    ________licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 11, 2007
    #25
  6. Luis A. Afonso

    Jack Tomsky Guest

    I do hesitate to classify you:



    There are no books in any language which give confidence intervals independent of the sample. I challenge you to find any book which does this.

    Jack
     
    Jack Tomsky, Mar 11, 2007
    #26
  7. Jack:

The critical (WRONG) idea of NOT ADMITTING that simulated samples are SAMPLES in their own right puts you in such a PECULIAR situation that you are compelled to deny all the work that has been done from H. LILLIEFORS (1967) up to now.
    I repeat:
    Put down your comments and criticisms (you can use the title *Against the wrong way scientists find critical values, or simulated samples are not samples*) and try to publish them, if you are sufficiently persuaded you are RIGHT.
    I doubt that the referees of a serious journal would admit such *trash*. This is the second time I invite you to do so.
    (Take good care: 'Everyone is in error except me' is a symptom of madness, or of GENIUS.)
    YOU HAVE TO CHOOSE: EITHER YOU ARE A CRAZY OLD MAN, OR YOU ARE SUCH A GENIUS THAT YOU WILL LEAD A REVOLUTION IN STATISTICS. (The choice is yours.)
    MEANWHILE I would appreciate it if you did not disturb my work with your PECULIAR idea, not accepted by statisticians, of what are genuine, truthful samples and what are simulated, false ones.
    IS IT A DEAL?
    _________licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 11, 2007
    #27
  8. Luis A. Afonso

    Jack Tomsky Guest

    Jack:

For years now, Wikipedia and I have been trying in vain to teach you about confidence intervals, hypothesis tests, and the difference between parameters and sample statistics. You still maintain that confidence intervals are independent of the sample, that the null hypothesis can never be accepted, and that we can never know if 8/13 > 5/13.

    What is it about you that you're incapable of learning? I will continue to correct your errors in the forum until you get it right.

    Jack
     
    Jack Tomsky, Mar 11, 2007
    #28
9. IF we are able to deduce, from first principles, the mathematical expression of a sample statistic's distribution, the quantiles (say 5% - 95%, 1% - 99%) provide us with the critical values, and therefore the acceptance intervals for the parameters under study in the hypothesis test.
    If we are not able, we can (possibly) simulate the sample statistic a sufficiently high number of times (for example 2 million) and from this *population* evaluate the critical values (for a pre-defined alpha). The number of simulations is directly connected with the *precision* with which the critical value is obtained.
    Cf.
    *How many replications in Monte-Carlo replications?
    ____V.K. Stokes.

    _______licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 11, 2007
    #29
  10. Luis A. Afonso

    Jack Tomsky Guest

    IF we are able to deduce, by the first principles,


    This quote from V.K. Stokes has nothing to do with confidence intervals, the subject of your thread.

    Jack
     
    Jack Tomsky, Mar 11, 2007
    #30
  11. wwwpub.utdallas.edu/~herve/Abdi-Lillie2007-pretty.pdf


    MY VALUES




size___mine: 5%___1%_____Conover: 5%___1%_____Abdi: 5%____1%
    _10____0.264____0.305______.258_____.294______.2616____.3037
    _15____0.220____0.255______.220_____.257______.2196____.2545
    _20____0.192____0.224______.190_____.231______.1920____.2226
    _25____0.174____0.202______.173_____.200______.1726____.2010
    _30____0.159____0.185______.161_____.187______.1590____.1848
    _35____0.148____0.172_________________________.1478____.1720
    _40____0.139____0.161_________________________.1386____.1616
    _45____0.131____0.152_________________________.1309____.1525
    _50____0.124____0.145_________________________.1246____.1457

    (for each sample size, 500,000 samples were simulated
    in my work, 100,000 by Abdi & Molin).
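    For comparison, the Lilliefors criterion itself can be simulated the same way (a rough Python sketch, not the original program; `reps` is far below the 500,000 used above, and estimating the standard deviation with the n-1 denominator is an assumption about the exact recipe):

    ```python
    import random
    from statistics import NormalDist

    def lilliefors_stat(xs):
        """Lilliefors D: max distance between the empirical CDF and the
        normal CDF with mean and standard deviation estimated from xs."""
        n = len(xs)
        xs = sorted(xs)
        m = sum(xs) / n
        s = (sum((x - m) ** 2 for x in xs) / (n - 1)) ** 0.5
        nd = NormalDist(m, s)
        d = 0.0
        for i, x in enumerate(xs):
            c = nd.cdf(x)
            # compare against the ECDF just before and just after the jump at x
            d = max(d, abs(c - i / n), abs(c - (i + 1) / n))
        return d

    def lillie_crit(n, alpha=0.05, reps=20000, seed=3):
        """Empirical (1 - alpha) fractile of D under H0 (normal samples)."""
        rng = random.Random(seed)
        sims = sorted(lilliefors_stat([rng.gauss(0, 1) for _ in range(n)])
                      for _ in range(reps))
        return sims[int((1 - alpha) * reps)]

    crit = lillie_crit(20)
    ```

    For n = 20 at the 5% level this lands near the 0.190-0.192 figures in the table above.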


JACK TOMSKY is so unlearned and shameless that he deserves to be exposed every time he posts an opinion on Hypothesis Tests. Less experienced people should bear in mind that HE IS A CLOWN.

Readers, do appreciate what I found on the WEB.

    *** Lilliefors/Van Soest´s test of normality ***
    _____Hervé Abdi & Paul Molin

    1. OVERVIEW

The normality assumption is at the core of the majority of standard statistical procedures, and it is important to be able to test this assumption. In addition, showing that a sample does not come from a normally distributed population is sometimes of importance per se. Among the many procedures used to test this assumption, one of the most well-known is a modification of the Kolmogorov-Smirnov test of goodness of fit, generally referred to as the Lilliefors test for normality (or Lilliefors test for short). This test was developed independently by Lilliefors (1967) and by Van Soest (1967). The null hypothesis for this test is that the error is normally distributed (i.e. there is no difference between the observed distribution f and the normal distribution). The alternative hypothesis is that the error is not normally distributed.
    Like most statistical tests, this test of normality defines a criterion and gives its sampling distribution. When the probability associated with the criterion is smaller than a given [alpha]-level, the alternative hypothesis is accepted (i.e. we conclude that the sample does not come from a normal distribution). An interesting peculiarity of the Lilliefors test is the technique used to derive the sampling distribution of the criterion. In general, mathematical statisticians derive the sampling distribution of the criterion using analytical techniques. However, in this case this approach fails, and consequently Lilliefors decided to calculate an approximation of the sampling distribution by using the Monte Carlo technique.
    Essentially the procedure consists of extracting a large number of samples from a Normal Population and computing the value of the criterion for each of these samples. The empirical distribution of the values of the criterion gives an approximation of the sampling distribution of the criterion under the null hypothesis.
    Specifically, both Lilliefors and Van Soest used, for each sample size chosen, 1000 random samples derived from a standardized normal distribution to approximate the sampling distribution of a Kolmogorov-Smirnov criterion of goodness of fit. The critical values given by Lilliefors and Van Soest are quite similar, the relative error being of the order of 10^(-2).
    According to Lilliefors (1967) this test of normality is more powerful than other procedures for a wide range of nonnormal conditions. Dagnelie (1968) indicated, in addition, that the critical values reported by Lilliefors can be approximated by an analytical formula. Such a formula facilitates writing computer routines because it eliminates the risk of creating errors when keying in the values of the table. Recently, Molin and Abdi (1998) refined the approximation given by Dagnelie and computed new tables using a larger number of runs (i.e. K = 100,000) in their simulations. ***
    (End of citation).

    ____TOMSKY´s ABSOLUTELY K.O.!!!!!

    ____licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 12, 2007
    #31
  12. Luis A. Afonso

    Jack Tomsky Guest

    wwwpub.utdallas.edu/~herve/Abdi-Lillie2007-pretty.pdf



Although there is no evidence that anyone has ever used any of Afonso's faulty statistics, it is important that his errors be corrected so that no one will ever think that confidence levels and significance levels are synonymous, that null hypotheses are never allowed to be accepted, and that no one can tell if 8/13 is greater than 5/13.

    Jack




     
    Jack Tomsky, Mar 12, 2007
    #32
  13. YES, I REPEAT MY STATEMENT

*** ... allowed to be accepted, and that no one can tell if 8/13 is greater than 5/13. ***

In the proper CONTEXT I never denied it.
    I adopted the NOTATION *a/b* with a precise meaning: that there are *a* successes in *b* trials. I wasted my time writing a full thread making this clear. You, unethically, and your *boss* Bob Ling read the notation as plain fractions, intentionally (in order to attack me)!!! And you are repeating it ad nauseam with the same purpose.
    When I wrote (as you say) 8/13 > 5/13, only one interpretation is valid: comparing the event of 8 successes in 13 trials with 5/13, we can state that the latter is less favorable (to successes) at the alpha significance level.
    (I do not remember exactly, but I think it was 5%.)
    IS THIS THE LAST TIME YOU USE THIS *TRASH* TO BULLY ME? IS IT?

    _________licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 12, 2007
    #33
14. In a DOZEN posts, Jack Tomsky wanted to stop my job of finding the critical values of the Jarque-Bera test. He was not successful.
    Meanwhile two points became obvious:
    1) Total ignorance of the existence of this test (showing a very weak awareness of keeping up to date, even from the Web's material). This test has been known since 1980.
    2) The most serious: ignorance of a 40-year-old technique to reach confidence intervals (by simulation).
    In his *opaque* mind,
    ____a confidence interval can only be obtained through a real sample, and it is unique.
    Consequently, for him, the procedure:
    a) Simulating a great number of samples (1 million),
    b) For each of them evaluating the sample statistic under study,
    c) And from this empirical distribution getting the quantiles of interest for the test
    is WRONG, ABUSIVE, CONDEMNABLE.

    This procedure, since 1967 through H. Lilliefors, is currently used for the goal in view.
    To ignore it nowadays is ABSOLUTELY NOT ACCEPTABLE among statistically learned people.
    What is the Reader's opinion about it?

    _______licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 13, 2007
    #34
  15. Test J-B: POWER for exponential samples


Conventionally, *beta* denotes the probability of making a type II error (i.e. accepting the hypothesis H0 when we should not).
    *** The power, 1-beta, is the probability of rejecting H0 when we should do so. ***
    When we are dealing with a GOF (goodness of fit) test, the null hypothesis is H0: the sample was drawn from the Population with law W. The power is the probability of rejecting H0 when it is false, i.e., when the population has a law different from W, therefore when the alternative hypothesis, Ha, holds.
    This time we test random samples from the exponential law with density
    _____ f(x) = (1 / L)*exp(-x / L) ___ L real positive,
    _____ 0 <= x < infinity.

    For alpha=5% exponential samples (L=1):

    __N______________Power
    __10______________0.332__
    __15______________0.496__
    __20______________0.631__
    __25______________0.734__
    __30______________0.821__
    __35______________0.884__
    __40______________0.928__
    __45______________0.957__
    __50______________0.977__

(Note: the power does not vary with L.)

    _______licas (Luis A. Afonso)


    REM "JBexp"
    CLS
    DEFDBL A-Z
    PRINT " JB test for exp. distr. "
    INPUT " LAMBDA = "; lbd
    INPUT " sample size = "; nn
    DIM w(1, 50)
REM 5% critical values of JB for n = 10 to 50 (Monte Carlo fractiles)
    DATA 2.54,2.71,2.87,3.02,3.16,3.29,3.41,3.52
    DATA 3.62,3.72,3.81,3.89,3.96,4.03,4.09,4.15
    DATA 4.21,4.26,4.31,4.36,4.40,4.44,4.48,4.52
    DATA 4.56,4.59,4.62,4.66,4.69,4.72,4.74,4.77
    DATA 4.80,4.82,4.85,4.87,4.89,4.91,4.92,4.94
    DATA 4.95
    FOR t = 10 TO 50: READ w(1, t): NEXT t
    jc = w(1, nn)
    PRINT jc
    DIM x(nn)
all = 40000
    RANDOMIZE TIMER ' seed once, before the replication loop
    FOR k = 1 TO all
    LOCATE 4, 50
    PRINT USING "##########"; all - k
    s = 0
    FOR i = 1 TO nn
    ' inverse-CDF sample from f(x) = (1/L)*exp(-x/L), which has mean L
    x(i) = -lbd * LOG(1 - RND)
    s = s + x(i) / nn
    NEXT i
    m1 = s: m2 = 0: m3 = 0: m4 = 0
    FOR j = 1 TO nn: d = x(j) - m1
    m2 = m2 + d * d / nn
    m3 = m3 + d * d * d / nn
    m4 = m4 + d * d * d * d / nn
    NEXT j
    SK = m3 / (m2 ^ (1.5))
    Ku = m4 / (m2 * m2)
    JB = (nn / 6) * (SK * SK + (Ku - 3) * (Ku - 3) / 4)
    IF JB > jc THEN ww = ww + 1
    LOCATE 6, 50
    PRINT USING "#.###"; ww / k
NEXT k: END
     
    Luis A. Afonso, Mar 13, 2007
    #35
  16. J-B test, POWER for Chi-square



    From Wikipedia:

    *** The power of a statistical test is the probability that the test will reject a false null hypothesis (that it will not make a Type II error). As power increases, the chances of a Type II error decrease, and vice versa. The probability of a Type II error is referred to as *beta*.
    Statistical power depends on:
    a)__the statistical significance criterion used in the test
    b)__the size of the difference or the strength of the similarity (that is, the effect size) in the population ***
    ____________________________

    TABLE
Jarque-Bera normality test, 5% significance level: POWER for Chi-squared distributions with df degrees of freedom.

    ______df=3_______5_______7_______10__
    N=
    __10__0.251____0.176____0.146____0.117_
    __20__0.492____0.351____0.276____0.216_
    __30__0.687____0.500____0.396____0.313_
    __40__0.821____0.635____0.506____0.394_
    __50__0.913____0.746____0.617____0.480_
    _____________________________________


For each distribution (column) the power increases from N=10 to 50, whereas for each line (N constant) it decreases as df increases, because the Chi-squared distributions become progressively more alike the normal one. For this reason the Jarque-Bera test seems progressively less able to distinguish them.

    _________licas (Luis A. Afonso)

    REM "JBchi"
    CLS
    DEFDBL A-Z
    PRINT " JB test for CHI "
    INPUT " sample size = "; nn
    INPUT " df = "; df
    pi = 4 * ATN(1)
    DIM w(1, 50)
REM 5% critical values of JB for n = 10 to 50 (Monte Carlo fractiles)
    DATA 2.54,2.71,2.87,3.02,3.16,3.29,3.41,3.52
    DATA 3.62,3.72,3.81,3.89,3.96,4.03,4.09,4.15
    DATA 4.21,4.26,4.31,4.36,4.40,4.44,4.48,4.52
    DATA 4.56,4.59,4.62,4.66,4.69,4.72,4.74,4.77
    DATA 4.80,4.82,4.85,4.87,4.89,4.91,4.92,4.94
    DATA 4.95
    FOR t = 10 TO 50: READ w(1, t): NEXT t
    jc = w(1, nn)
    DIM x(nn)
all = 40000
    RANDOMIZE TIMER ' seed once, before the replication loop
    FOR k = 1 TO all
    LOCATE 4, 50
    PRINT USING "##########"; all - k
    s = 0
    FOR i = 1 TO nn: x(i) = 0
    ' chi-square(df) variate: sum of df squared N(0,1) values (Box-Muller)
    FOR dgg = 1 TO df
    a = SQR(-2 * LOG(RND))
    x = a * COS(2 * pi * RND)
    x(i) = x(i) + x * x
    NEXT dgg
    s = s + x(i)
    NEXT i
    m1 = s / nn: m2 = 0: m3 = 0: m4 = 0
    FOR j = 1 TO nn: d = x(j) - m1
    m2 = m2 + d * d / nn
    m3 = m3 + d * d * d / nn
    m4 = m4 + d * d * d * d / nn
    NEXT j
    SK = m3 / (m2 ^ (1.5))
    Ku = m4 / (m2 * m2)
    JB = (nn / 6) * (SK * SK + (Ku - 3) * (Ku - 3) / 4)
    IF JB > jc THEN ww = ww + 1
    LOCATE 6, 50
    PRINT USING "#.###"; ww / k
    NEXT k: END
     
    Luis A. Afonso, Mar 14, 2007
    #36
  17. JB by Bootstrap: reporting a failure


    The procedure

From a unique normal *source sample* of size N, a set of B bootstrap samples (of the same size) is simulated, and the JB statistic is evaluated for each.
    Analysing this set, I count how many of these *pseudo-samples* have a JB greater than the 5% significance level critical value. The frequency of this occurrence is the *Bootstrap* significance level (BSL).
    _______________________________________
    size = 10
    100 *sources* each one Bootstrapped 4000 times ____values from 9% to 81%, mode = 15% , with 8 occurrences.

    size = 20
    idem
    ____values from 6% to 84%, mode = 8% with 12 occurrences.

    size = 30
    idem
    ____values from 5% to 73%, mode = 6% with 12 occurrences.

    size = 40
    idem
    ____values from 5% to 100%, mode = 6% with 15 occurrences.

    size = 50
    idem
    ____values from 5% to 99%, mode = 8% with 13 occurrences.

    _______________________________________
    size = 10
    100 *sources* each one Bootstrapped 10000 times____values from 8% to 79%, mode = 12%-13%, with 9 occurrences each.

    Conclusion
    The Bootstrap doesn´t work for the Jarque-Bera test.
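    A minimal sketch of the failing procedure (Python, not the original program; one source sample only, and the 5% critical value 3.81 for n = 20 is taken from the DATA table in the power programs above):

    ```python
    import random

    def jb_stat(xs):
        """Jarque-Bera statistic of one sample."""
        n = len(xs)
        m = sum(xs) / n
        m2 = sum((x - m) ** 2 for x in xs) / n
        m3 = sum((x - m) ** 3 for x in xs) / n
        m4 = sum((x - m) ** 4 for x in xs) / n
        return n / 6 * ((m3 / m2 ** 1.5) ** 2 + (m4 / m2 ** 2 - 3) ** 2 / 4)

    def bootstrap_sig_level(source, crit, b=4000, seed=4):
        """BSL: fraction of bootstrap resamples of `source` (sampling with
        replacement) whose JB exceeds the 5% critical value `crit`."""
        rng = random.Random(seed)
        n = len(source)
        hits = sum(jb_stat([rng.choice(source) for _ in range(n)]) > crit
                   for _ in range(b))
        return hits / b

    rng = random.Random(0)
    source = [rng.gauss(0, 1) for _ in range(20)]
    bsl = bootstrap_sig_level(source, crit=3.81)  # crit from the n = 20 JB table
    ```

    Because the BSL depends entirely on the luck of the single source sample, it scatters widely around the nominal 5%, which is the failure reported above.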

    ________licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 15, 2007
    #37
  18. Significance level, alpha, by CDF



Acceptance (no-rejection) interval,
    right-bounded: the interval (-infinity, b] such that
    ________1 - alpha = F(b)

    The rejection region is (b, infinity), defined by
    ________alpha = 1 - F(b)

    This way of defining them is the same whether the distribution is continuous at *b*, i.e. p(X=b) = 0, or has a discontinuity there.
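    For a continuous F the bound b is simply the (1 - alpha) quantile of F; a small Python illustration with the standard normal (when F has jumps, an exact b with F(b) = 1 - alpha may not exist, and one then takes the smallest b with F(b) >= 1 - alpha):

    ```python
    from statistics import NormalDist

    alpha = 0.05
    F = NormalDist(0, 1)          # a continuous CDF, for illustration
    b = F.inv_cdf(1 - alpha)      # right bound of the acceptance interval (-inf, b]
    # the defining identity 1 - alpha = F(b) holds exactly here
    assert abs(F.cdf(b) - (1 - alpha)) < 1e-12
    ```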
     
    Luis A. Afonso, Mar 16, 2007
    #38
  19. Luis A. Afonso

    Jack Tomsky Guest

    Significance level, alpha, by CDF

    What happens if there is no b such that 1- alpha = F(b)? Then your acceptance region is undefined.

    Jack
     
    Jack Tomsky, Mar 16, 2007
    #39
  20. Jack Tomsky wrote:

    *** What happens if there is no b such that 1- alpha = F(b)? Then your acceptance region is undefined. Jack ***

    My response

That sounds like *homework* to me, *master* Jack. You should tell us what you have found on this matter so far. Ask your teacher to direct you the right way.

    ________licas (Luis A. Afonso)
     
    Luis A. Afonso, Mar 16, 2007
    #40
