# Probability of getting relative frequency (or probability) distribution

Discussion in 'Scientific Statistics Math' started by Petar Milin, Sep 19, 2006.

1. ### Petar Milin (Guest)

Hello,
I am not a mathematician but a psychologist, still learning math.
Nevertheless, I have a problem in my research:
There is one distribution of relative frequencies which I treat as the default.
There is another one, and I am interested in how probable it is to get the second one if the first is the default. I am thinking of a Chi-square test and its p-level, but I am not sure about either my reasoning or the test itself.
Is it possible to get the probability of obtaining a distribution from the default one? Even better, could I calculate the entropy of that event? How?
Hope I was clear enough.

Sincerely,
Petar Milin

Petar Milin, Sep 19, 2006

2. ### m00es (Guest)

Let's use an example. Let's say that based on my default distribution,
I EXPECT the value 0 to occur with relative frequency .1, the value 1
with relative frequency .2, the value 2 with relative frequency .3, the
value 3 with relative frequency .2, the value 4 with relative frequency
.1, and the value 5 with relative frequency .1.

Now I draw a sample of n = 100 individuals and OBSERVE the following
results. 15 values are equal to 0, 15 values are equal to 1, 25 values
are equal to 2, 30 values are equal to 3, 10 values are equal to 4, and
5 values are equal to 5.

Then the probability of this particular outcome can be obtained by
using the multinomial distribution. Specifically, the probability of
this particular outcome is:

100! / (15! 15! 25! 30! 10! 5!) * .1^15 * .2^15 * .3^25 * .2^30 * .1^10
* .1^5 = damn small
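For anyone who wants to verify that number without multiplying factorials by hand, SciPy's multinomial distribution can evaluate it directly (a sketch; assumes SciPy is installed):

```python
from scipy.stats import multinomial

# Expected (default) probabilities for the values 0..5
p_default = [0.1, 0.2, 0.3, 0.2, 0.1, 0.1]
# Observed counts in the sample of n = 100
observed = [15, 15, 25, 30, 10, 5]

# Probability of exactly this outcome under the default distribution
prob = multinomial.pmf(observed, n=100, p=p_default)
print(prob)  # indeed a very small number
```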

What you can also do is to calculate the probability of the particular
outcome you observed or a more extreme outcome, assuming that the n =
100 values were drawn from the default distribution. Note that, based
on the default distribution, I would EXPECT: 10 values equal to 0, 20
values equal to 1, 30 values equal to 2, 20 values equal to 3, 10
values equal to 4, and 10 values equal to 5. The probability of the
outcome you observed is:

100! / (15! 15! 25! 30! 10! 5!) * .1^15 * .2^15 * .3^25 * .2^30 * .1^10
* .1^5 = damn small

A more "extreme" outcome than the one you observed would be to observe
16 values equal to 0, 14 values equal to 1, and the rest the same as
above. Note that this is more extreme in the sense that the observed
values would deviate even more strongly from the values you would
expect (10 values of 0, 20 values of 1, and so on).

100! / (16! 14! 25! 30! 10! 5!) * .1^16 * .2^14 * .3^25 * .2^30 * .1^10
* .1^5 = damn small
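The same SciPy call (a sketch, assuming SciPy is installed) confirms that this outcome is indeed less probable under the default distribution than the one actually observed:

```python
from scipy.stats import multinomial

p_default = [0.1, 0.2, 0.3, 0.2, 0.1, 0.1]

# Probability of the observed outcome and of the "more extreme" one
p_observed = multinomial.pmf([15, 15, 25, 30, 10, 5], n=100, p=p_default)
p_extreme = multinomial.pmf([16, 14, 25, 30, 10, 5], n=100, p=p_default)

# The outcome that deviates more strongly from the expected counts
# is also less probable under the default distribution.
print(p_extreme < p_observed)  # True
```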

Now you could find all possible outcomes that are "more extreme" and
add up their probabilities. That will take a long time to do by hand. A
good approximation for this goes as follows.

Calculate: X2 = sum( (observed value - expected value)^2 / expected
value )

So, X2 = (15-10)^2/10 + (15-20)^2/20 + (25-30)^2/30 + (30-20)^2/20 +
(10-10)^2/10 + (5-10)^2/10 = 12.08

Then look up in a chi-square table the probability of observing
12.08 or higher when there are 5 degrees of freedom (5 is the
number of possible values minus 1). You will find that the probability
is .03. Therefore, the probability of the particular outcome you
observed or a more extreme outcome is .03. This is a chi-square test of
the fit of the observed distribution to an expected one.
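The whole calculation above is a one-liner in SciPy (a sketch; note that `chisquare` takes counts, not relative frequencies):

```python
from scipy.stats import chisquare

observed = [15, 15, 25, 30, 10, 5]
expected = [10, 20, 30, 20, 10, 10]  # n = 100 times the default probabilities

# Pearson chi-square goodness-of-fit test, df = 6 - 1 = 5
stat, pvalue = chisquare(f_obs=observed, f_exp=expected)
print(stat)    # about 12.08
print(pvalue)  # about 0.03
```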

Hope this helps,

m00es

m00es, Sep 19, 2006

3. ### Petar Milin (Guest)

Thank you very much! That's what I thought could be done when I mentioned the Chi-square test.
However, is there a way to calculate the entropy H of the OBSERVED distribution, given the EXPECTED one?
I would really like to have entropy measures, if possible, for many theoretical reasons.
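One standard formalization of "entropy of the observed distribution given the expected one" is the relative entropy (Kullback-Leibler divergence) of the observed relative frequencies from the expected ones; whether it matches the theoretical need here is a judgment call, but as an illustration it can be computed with SciPy (a sketch, using the counts from the example above):

```python
from scipy.stats import entropy

observed_freq = [0.15, 0.15, 0.25, 0.30, 0.10, 0.05]
expected_freq = [0.10, 0.20, 0.30, 0.20, 0.10, 0.10]

# Relative entropy (KL divergence) of observed from expected, in nats
kl = entropy(observed_freq, qk=expected_freq)
print(kl)  # about 0.059

# 2 * n * KL is the likelihood-ratio (G) statistic, approximately
# chi-square distributed with the same df as the Pearson test above
g = 2 * 100 * kl
print(g)  # about 11.8, close to the Pearson X2 of 12.08
```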

Petar Milin

Petar Milin, Sep 20, 2006