# Compare two sets of proportions

Discussion in 'Scientific Statistics Math' started by SDC, Dec 11, 2010.

1. ### SDC

Hello,

I would like a test to tell me if two sets of proportions are "similar". I
have a set of proportions p1, p2, ..., pn and q1, q2, ..., qn (where
p1 + p2 + ... + pn = 1 and q1 + q2 + ... + qn = 1) and would like a test that tells me if:

p1 = q1 and p2 = q2 and ... and pn = qn.

(I also have the counts behind the proportions if that is useful information
to have)

Thanks.

SDC, Dec 11, 2010

2. ### danheyman

The first test is to plot the pairs (p_i,q_i); if it looks like a 45
deg. straight line you're in business. If most points fit, examine the
others to see if there might be a good reason they don't fit. (Fewer
counts perhaps.) If you want to use a formal test, you need to have a
probability model. The chi-square goodness-of-fit test might apply,
but without a probability model it can't be justified.

danheyman, Dec 11, 2010
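
As a rough, non-graphical companion to the plot suggested in the post above, the sketch below lists each pair (p_i, q_i) with its vertical distance from the 45-degree line q = p, largest first, next to the survey count behind it. The counts and reference proportions are made-up illustration values, not the original poster's data, and `deviations_from_identity` is a hypothetical helper name.

```python
def deviations_from_identity(p, q):
    """Return (group, p_i, q_i, q_i - p_i) tuples, largest |q_i - p_i| first."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    rows = [(i + 1, pi, qi, qi - pi) for i, (pi, qi) in enumerate(zip(p, q))]
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


# Hypothetical survey counts for six age groups (assumed for illustration).
survey_counts = [30, 45, 50, 40, 25, 10]
n = sum(survey_counts)
p = [count / n for count in survey_counts]

# Hypothetical census proportions (assumed for illustration).
q = [0.15, 0.20, 0.25, 0.20, 0.12, 0.08]

ranked = deviations_from_identity(p, q)
for group, pi, qi, dev in ranked:
    count = survey_counts[group - 1]
    print(f"group {group}: p={pi:.3f} q={qi:.3f} q-p={dev:+.3f} (count={count})")
```

Points near the top of this listing are the ones to examine first, for example to check whether a small count explains the gap.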

3. ### Rich Ulrich

The proportions give you something that is useful
to eyeball.

The Ns are needed for a test, since they are what
give some indication of how much anyone should
believe a proportion.

The usual test is the Pearson contingency table
chi-squared. Google can find you a chi-square calculator.
Here is one -
http://faculty.vassar.edu/lowry/newcs.html

I suggest that you compute and look at both the row and
column proportions while you are reckoning how uniform
the proportions are. For your k x 2 table, whatever
unbalance exists might be easier to see by contrasting
the sets of 2 instead of looking at the sets of k.

Rich Ulrich, Dec 11, 2010
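
A minimal sketch of the Pearson contingency-table chi-squared described in the post above, for a k x 2 table of counts, together with the row and column proportions it suggests inspecting. The table values are invented for illustration (rows are age groups, columns are survey and census counts), and the function names are hypothetical; all counts and totals are assumed to be positive.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def row_proportions(table):
    """Each cell divided by its row total (share of a group coming from each source)."""
    return [[cell / sum(row) for cell in row] for row in table]


def column_proportions(table):
    """Each cell divided by its column total (the p's and q's of the thread)."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[cell / col_totals[j] for j, cell in enumerate(row)] for row in table]


# Hypothetical 6 x 2 table: (survey count, census count) per age group.
table = [
    [30, 150_000],
    [45, 200_000],
    [50, 250_000],
    [40, 200_000],
    [25, 120_000],
    [10, 80_000],
]

stat, df = pearson_chi_square(table)
print(f"chi-square = {stat:.3f} on {df} degrees of freedom")
for rp, cp in zip(row_proportions(table), column_proportions(table)):
    print("row:", [f"{x:.5f}" for x in rp], "col:", [f"{x:.3f}" for x in cp])
```

The statistic is then compared with a chi-square distribution on (k - 1) degrees of freedom; for six groups that is 5, where the 5% critical value is about 11.07.
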
4. ### SDC

A bit more background. The p and q represent the proportion of people in
each of six age groups. What I want to ask is whether the proportions within
each age group are the same, considering all age groups simultaneously. The p's
are from my survey and the q's are from the national census, so the counts
behind the p's are much, much smaller than those behind the q's.

I thought that the assumption would be that each sample was from a
multivariate multinomial distribution and the test would come from this
assumption?

SDC, Dec 12, 2010
5. ### Luis A. Afonso

Luis A. Afonso, Dec 12, 2010
6. ### SDC

Thanks, so if I have say six age groups then I think I only have 6 pairs to
compare, e.g. if p are my survey proportions and q are those from the census
the calculations are:

Abs(p1 - q1)
Abs(p2 - q2)
Abs(p3 - q3)
Abs(p4 - q4)
Abs(p5 - q5)
Abs(p6 - q6)

SDC, Dec 12, 2010
7. ### Rich Ulrich

The Pearson chi-square test of a contingency table, or for
goodness of fit, can be justified by any of several distributional
assumptions, including Poisson and normal.

If you want to use the population proportions, then
what you test will be the observed Ns for a fixed set
of expected p's -- "goodness of fit" to the expected
proportions. The test still requires that you provide your
original total-N and the set of proportions, or, equivalently,
the set of Ns for the separate cells.

The formula for calculation by hand will be shorter for
using the proportions instead of the Total-pop numbers.
You will find more on-line calculators that are set up for
comparing two sets of Ns.

Rich Ulrich, Dec 13, 2010
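
The goodness-of-fit calculation described in the post above can be sketched as follows: observed survey counts are compared with the counts expected under a fixed set of proportions. The data are invented for illustration, `goodness_of_fit` and `chi2_sf` are hypothetical helper names, and the tail probability uses the standard finite-sum expression for the chi-square distribution with integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    chi = math.sqrt(x)
    term, total = 0.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        term = chi if r == 1 else term * x / (2 * r - 1)
        total += term
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(chi / math.sqrt(2)) + 2 * density * total


def goodness_of_fit(observed, expected_props):
    """Chi-square statistic, df, and p-value for counts vs fixed proportions."""
    n = sum(observed)
    stat = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_props))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)


# Hypothetical survey counts and census proportions (assumed for illustration).
survey_counts = [30, 45, 50, 40, 25, 10]
census_props = [0.15, 0.20, 0.25, 0.20, 0.12, 0.08]

stat, df, p_value = goodness_of_fit(survey_counts, census_props)
print(f"chi-square = {stat:.4f}, df = {df}, p = {p_value:.4f}")
```

Note that only the survey total and the census proportions enter the calculation; the census counts themselves are not needed, as the post points out.
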
8. ### Luis A. Afonso

Date: Dec 12, 2010 1:00 PM
Author: SDC
Subject: Re: Compare two sets of proportions

Abs(p1 - q1)
Abs(p2 - q2)
Abs(p3 - q3)
Abs(p4 - q4)
Abs(p5 - q5)
Abs(p6 - q6)

My response

Use a two-tailed t-test in each of the FIVE comparisons

Luis

Luis A. Afonso, Dec 13, 2010
9. ### Bruce Weaver

Here's an online calculator for the goodness of fit test, should you
decide to go that way.

Bruce Weaver, Dec 13, 2010
10. ### Luis A. Afonso

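p_j = national proportion (so p_i is the survey proportion, and n_i, n_j are the corresponding sample sizes)

$$Z(i, j) = \frac{|p_i - p_j|}{s}, \qquad s = \sqrt{\frac{p_i (1 - p_i)}{n_i} + \frac{p_j (1 - p_j)}{n_j}}$$

Approximately, dropping the census term, which is negligible when n_j is very large:

$$Z(i, j) \approx \frac{|p_i - p_j|}{\sqrt{p_i (1 - p_i) / n_i}}$$

If n_i > 100 (say), you could use the normal distribution; otherwise it is better to use Student's t with n_i - 1 degrees of freedom.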

Luis

Luis A. Afonso, Dec 13, 2010
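
A minimal sketch of the per-group Z statistic from the post above, with both the full and the approximate denominator. It assumes n_i is the total survey sample size and n_j the total census count (the post does not say which counts are meant), uses the normal reference distribution that the post suggests for n_i > 100, and runs on invented data; `z_per_group` is a hypothetical helper name.

```python
import math


def z_per_group(survey_counts, census_props, census_total=None):
    """Return (z, two-sided normal p-value) for each group.

    If census_total is None, the census variance term is dropped
    (the approximate form); otherwise the full standard error is used.
    Assumes every group has a count strictly between 0 and the total.
    """
    n_i = sum(survey_counts)  # assumed meaning of n_i: survey sample size
    results = []
    for count, p_j in zip(survey_counts, census_props):
        p_i = count / n_i
        variance = p_i * (1 - p_i) / n_i
        if census_total is not None:
            variance += p_j * (1 - p_j) / census_total
        z = abs(p_i - p_j) / math.sqrt(variance)
        p_two_sided = math.erfc(z / math.sqrt(2))  # P(|N(0,1)| > z)
        results.append((z, p_two_sided))
    return results


# Hypothetical data (assumed for illustration only).
survey_counts = [30, 45, 50, 40, 25, 10]
census_props = [0.15, 0.20, 0.25, 0.20, 0.12, 0.08]

z_full = z_per_group(survey_counts, census_props, census_total=1_000_000)
z_approx = z_per_group(survey_counts, census_props)

for group, ((zf, pf), (za, pa)) in enumerate(zip(z_full, z_approx), start=1):
    print(f"group {group}: z_full={zf:.4f} z_approx={za:.4f} p={pf:.4f}")
```

With a census total in the millions, the full and approximate statistics agree to several decimal places, which is why the simpler denominator is usually adequate here.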