# Using Percentages in Pearson's R

Hello all,

First time here, long time usenet junkie. I have a random question on
using Pearson's R, and I'm not exactly a statistician.

If I were looking for a relationship between population and unemployment,
would I use the unemployment rate or the number of unemployed?
Certainly the .08 or .09 unemployment number will work in stats
packages, but .08 in Los Angeles is far different than .09 in some small
town.
So when you take the mean of the population of a long list of cities
and the mean of the percent or number of unemployed and plug these into
Pearson's, using percent seems like it would always give a very low R,
regardless of +/-.

Been to a lot of websites and done a lot of reading but needed to ask
somewhere.

Any feedback would be appreciated.

It depends on what you want to talk about -- the unemployment rate,
or the number of unemployed people. The problem arises only when you
say just "unemployment", which is ambiguous. (The same sort of thing
happens in discussions of "crime".)

Thanks Ray, have been banging this around for a few days and want to
make sure I use SPSS the right way, and of course get the right result.
Pearson's uses averages, so how do you use an average of percentages that
Seems like the unemployment rate, for example, is used to normalize raw
data between cities so you can compare them. But then in Pearson's one
variable is population, and you would want to use the # of unemployed,
not the %, as the other variable.
Doesn't prove anything, but the question might be whether there is a
relationship between population and unemployment.

If there is much variability in the sizes of the cities then it's
a foregone conclusion that the number of unemployed will correlate
positively with size. However, finding that the unemployment rate
correlates with size would be informative. Just remember that much
work has been done on this by many people using techniques that are
more complicated than simple correlation, so don't take your initial
results too seriously, one way or the other.
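
To illustrate the point about counts versus rates, here is a minimal Python sketch. The city data are simulated, not real: populations are drawn log-uniformly between 10,000 and 10,000,000, and unemployment rates are drawn independently of population, so by construction there is no true relationship between rate and size. The helper `pearson_r` and all variable names are made up for this illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


random.seed(0)  # fixed seed so the simulated cities are reproducible

# Assumption: 200 hypothetical cities, from small towns to large metros.
populations = [10 ** random.uniform(4, 7) for _ in range(200)]

# Assumption: rates between 4% and 12%, drawn independently of city size.
rates = [random.uniform(0.04, 0.12) for _ in populations]

# Number of unemployed is simply population times rate.
unemployed = [p * r for p, r in zip(populations, rates)]

r_count = pearson_r(populations, unemployed)
r_rate = pearson_r(populations, rates)

print(f"population vs. number unemployed: r = {r_count:.2f}")
print(f"population vs. unemployment rate: r = {r_rate:.2f}")
```

On simulated data like this, the count correlation comes out strongly positive simply because bigger cities have more of everyone, while the rate correlation stays near zero because no rate-size relationship was built in.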

I was just using this as my example. Really I'm just after confirmation
that it is better to use the raw data for a correlation than a percentage,
because of the "weight" it gives in the Pearson's formula. The result could
still be + or -, and no, low, or high, but it will calculate the right way.
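
On the "weight" point, one thing that can be checked directly: Pearson's formula centers each variable on its own mean and divides by its own spread, so the small magnitude of a rate such as .08 does not by itself push R toward zero. A minimal sketch, using five made-up cities (the numbers and the `pearson_r` helper are hypothetical, for illustration only):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Made-up populations and unemployment rates for five hypothetical cities.
populations = [12_000, 85_000, 240_000, 1_100_000, 3_900_000]
rates = [0.09, 0.06, 0.07, 0.05, 0.08]

# The same rates expressed as percentages (9, 6, 7, 5, 8).
percents = [100 * r for r in rates]

r_proportion = pearson_r(populations, rates)
r_percent = pearson_r(populations, percents)

# Rescaling a variable by a positive constant leaves r unchanged
# (up to floating-point rounding).
print(math.isclose(r_proportion, r_percent, rel_tol=1e-9))
```

So whether R is high or low for the rate depends on how the rates vary with population, not on the rates being small numbers. Whether the rate or the count is the right variable is still the question of what you want to talk about.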

