# unemployment statistics

Discussion in 'Scientific Statistics Math' started by PT, Oct 29, 2011.

1. ### PTGuest

This post is motivated by a remark in Steven
Landsburg's "The Armchair Economist", regarding
unemployment statistics.

In his book, "The Armchair Economist", he discusses
the problem of estimating the average length of
unemployment, among the (known) unemployed,
at a particular moment of time. He states that it is
biased in the upward direction, because one with
a longer period out of work has a greater chance,
i.e. more time, to be selected/sampled than
someone of relatively shorter duration. Assume
a simple sampling method, i.e. telephoning random
individuals listed as collecting unemployment.

Now, there seems to me an obvious logical fallacy
here. In addition, there is a fundamental question:
given an unbiased uniform sample of a population,
a sample statistic must converge to the true statistic,
must it not? Regardless of the underlying
distribution of that population. Are there any
exceptions to this rule?

If this is merely a blip by Landsburg, it's no big deal,
everyone gets a few passes. But it's more troubling
if his belief represents the consensus of the
economics community; can such an error really
be the norm among a community of professional academics?

I am not 100% sure of my position, though, as Mr. Landsburg
is a very smart man, and I may have missed something.

PT, Oct 29, 2011

2. ### kymGuest

....

It's not just Landsburg.

But it would seem obvious if you want to measure some
"average length of time" at a particular instant in time using
a sampling method, it will be biased unless e.g. weighting is used.
Indeed, there are even recent pubs discussing methods for computing
average length of unemployment using non-obvious weighting.
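
As a rough illustration of the weighting idea (a sketch only, not any published method): suppose, hypothetically, that the completed length of each sampled spell were known. A snapshot survey catches a spell with probability proportional to its length, so weighting each sampled spell by 1/length undoes that. The population of 80% five-day and 20% fifty-day spells, the sample size, and the function names are all made up for the example.

```python
import random
import statistics


def length_biased_sample(lengths, k, rng):
    # A snapshot survey catches a spell with probability proportional
    # to how long the spell lasts.
    return rng.choices(lengths, weights=lengths, k=k)


def inverse_length_weighted_mean(sampled):
    # With weights w_i = 1/l_i, sum(w_i * l_i) / sum(w_i) = n / sum(1/l_i),
    # i.e. the harmonic mean of the sampled lengths.
    return len(sampled) / sum(1.0 / l for l in sampled)


rng = random.Random(1)
# Hypothetical population of completed spells, in days.
population = [5.0] * 8000 + [50.0] * 2000

sample = length_biased_sample(population, 5000, rng)
true_mean = statistics.fmean(population)          # mean over all spells
naive = statistics.fmean(sample)                  # pulled up by long spells
corrected = inverse_length_weighted_mean(sample)  # close to true_mean

print(true_mean, round(naive, 1), round(corrected, 1))
```

A telephone survey only learns how long someone has been unemployed so far, not the completed length, so this exact weighting cannot be applied to such data. It only shows why some correction is needed.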

kym, Oct 29, 2011

3. ### kymGuest

In sci.math wrote:
[...]

OK. To forestall more email, consider 3 different ways we might
measure some kind of "average length of unemployment".

Take all the people that become unemployed today, wait until each
becomes re-employed or leaves the workforce (in which case we may or
may not account for them). Take average. This would be a "prospective"
average length of unemployment now.

Take all the people that become employed today. Take the average
length of the unemployment spell each has just finished.
Adjust for those that have never been employed before. This would
be a "retrospective" average.

Take all those whose midpoint unemployment is today, and take
the average.

Unfortunately, 2 of these involve prescience. All we have is
a telephone survey. We do know how long someone has been unemployed,
but not how long before they will be re-employed. If we just
assume they are at their mid-point we will have a biased estimate.
What we probably really want to know is the prospective figure anyway.
If an average worker became unemployed now, how long would they expect to
remain unemployed?

kym, Oct 29, 2011
4. ### Rich UlrichGuest

Here is your problem. Just exactly *what* is the
"population"? What defines it? And what, then, constitutes
an "unbiased uniform sample" of that population? We
regularly talk about something like the "sampling frame".

For the usual question of "how long does today's new episode
of unemployment last?", your typical phone sample of a
cross-section (Are you unemployed today?) will miss almost
everyone whose unemployment was only a few days.
Sampling bias, for that question.
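
A small simulation can make this concrete. It is only a sketch under assumed conditions: spells start uniformly over a long window (a steady inflow), and spell lengths are exponential with a mean of 10 days. Neither assumption comes from real unemployment data, and all names are illustrative.

```python
import random
import statistics


def simulate_survey(n_spells=200_000, horizon=1000.0, mean_len=10.0,
                    survey_day=500.0, seed=0):
    """Compare all spells with the spells in progress on one survey day."""
    rng = random.Random(seed)
    all_lengths = []
    elapsed_at_survey = []   # time already unemployed when the phone rings
    full_len_surveyed = []   # completed length of the spells caught
    for _ in range(n_spells):
        start = rng.uniform(0.0, horizon)
        length = rng.expovariate(1.0 / mean_len)
        all_lengths.append(length)
        # The cross-section only sees spells that cover the survey day.
        if start <= survey_day < start + length:
            elapsed_at_survey.append(survey_day - start)
            full_len_surveyed.append(length)
    return {
        "mean_new_spell": statistics.fmean(all_lengths),
        "mean_full_length_surveyed": statistics.fmean(full_len_surveyed),
        "mean_elapsed_surveyed": statistics.fmean(elapsed_at_survey),
        "n_surveyed": len(full_len_surveyed),
    }


if __name__ == "__main__":
    for name, value in simulate_survey().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the spells caught by the snapshot have a mean completed length of roughly twice that of new spells. The mean time already spent unemployed at the call comes out near the new-spell mean, but that coincidence is specific to the exponential distribution.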

On the other hand, you *can* frame the question in
such a way that the answer matches the sampling frame;
in that case, the answer is unbiased.

For the piece that you paraphrased, it is not clear that he
must be talking about what *I* consider the usual
question in economics. On the other hand, now that you
know the correct perspective, you can check and see
whether he is actually that sloppy, or if you missed some
important distinction.

Epidemiologists work with similar "biases" when evaluating
the two technical quantities of Incidence and Prevalence
of a disease. So it is a regular cause for being careful.

Rich Ulrich, Oct 29, 2011
5. ### jgharstonGuest

Err. Isn't the answer to just perform the relevant data
extract on the DSS record system?

JGH

jgharston, Oct 31, 2011
6. ### PerseusGuest

By weighting you mean some form of correction, I suppose. But to
correct the damn thing you must know how many are unemployed. Then
there is the question of phoning. What if an unemployed person has no
phone and only gave the number of a friend or relative who has one?
What would be the result of such a poll?

This obstacle can also be corrected, but you would need to know the
probability that an unemployed person has no phone. Then, if poor
people spend more time at work when they have a job, what is the
probability that a given phone number does not answer, even at 7 pm,
because the person is working and cannot pick up? Those data are
missing.
Of course this can be weighted too, but if we weight too much, we might
as well work out how many unemployed people exist and not waste a
dollar making phone calls. You can weight the whole damn calculation
and say there are only 7.1% unemployed. Nobody is going to
bother to verify that shit.

Perseus

Perseus, Oct 31, 2011
7. ### PTGuest

All members of the set characterized by
'unemployed'.
Samples chosen according to a uniform
distribution. Presumably one has access
to a random number generator.
?
Don't know that one.

.... and "how many days?"
hmmmm...
But if the distribution is stationary (or
nearly), the short-term unemployed person
will be replaced by another, newly unemployed.
So that effect cancels.

This observation is also an answer to the
claim "the long term unemployed has a greater
chance to be selected". We note that the
short-timers move in and out of the population
more often, and hence have a greater chance
of selection, as there are more of them.

But at a simpler level, I find both of these
'bias explanations' specious.

Look, we have a random variable. We sample
from a population of that variable. We get
a bunch of numbers, and infer statistics;
"Hello, are you out of work? If yes, how
many days?"

What could be simpler?

Seen this way, the 'long term unemployed
represent an upward bias' objection looks
specious. Of course, the larger numbers
shift the average upwards, they're SUPPOSED
to do that! But it's not a bias.

(another point: a particular individual
out of work 100 days could have been sampled
on any day of that period, uniformly. So he
might contribute a small number, as well as
a large. But this argument isn't necessary. )
It isn't so precisely spelled out.
Examples? And definitions -

Are there any pathological distributions,
which resist estimation by uniform sampling?
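
One standard example, offered as a sketch rather than a full answer: for the Cauchy distribution the sample mean does not settle down however large the uniform sample, because the distribution has no finite mean. The median still behaves. This concerns the general convergence question only; unemployment durations are bounded and have a finite mean, so it does not bear on the bias discussed in this thread. The sample sizes and helper names below are arbitrary choices.

```python
import math
import random
import statistics


def cauchy(rng):
    # Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2)).
    return math.tan(math.pi * (rng.random() - 0.5))


def iqr_of_sample_means(n, reps, rng):
    # Spread (interquartile range) of the sample mean across many
    # independent samples of size n.
    means = [statistics.fmean([cauchy(rng) for _ in range(n)])
             for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1


rng = random.Random(2)
spread_small = iqr_of_sample_means(10, 400, rng)     # samples of 10
spread_large = iqr_of_sample_means(1000, 400, rng)   # samples of 1000
sample_median = statistics.median([cauchy(rng) for _ in range(100_001)])

# For a finite-variance variable the spread would shrink like 1/sqrt(n);
# here it stays about the same at both sample sizes.
print(round(spread_small, 2), round(spread_large, 2), round(sample_median, 3))
```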

PT, Nov 4, 2011
8. ### Rich UlrichGuest

[snip, much, concerning collection of unemployment statistics]
RU >>
[snip, rest]

You might ask, "For everyone who is unemployed today
(on the day of the phone call of the survey), how long have
they been unemployed?" That is the question that you have
been answering. In epidemiology, that sort of question
speaks to the "prevalence" of a disease - How many people
have a current diagnosis for a disease, and how long have they had it?

For diseases, the population for whom we seek a cure is
apt to include everyone, because the long-term ill are still of
interest. For jobs, the people who are out of the job
market for 5 or 10 or 20 years are more-or-less ignored, or
(at least) are not included in tabulations of duration.

You might ask, "For everyone newly laid-off today (or on
some particular day), how long will they stay unemployed?"
That is the question that leads to the answer for, "What is the
average length of unemployment for someone newly laid-off?"
In epidemiology, this speaks to "incidence" -- the description
of new cases. For acute, short-term diseases like the flu,
where everyone is ill for a similar, brief period, you get
similar results if you count incidence or prevalence, except
at the beginning or end of an epidemic.
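
In symbols (the notation is assumed here, not taken from the thread): let $\lambda$ be the rate of new cases per day, $L$ the length of a case with density $f$, mean $\mu$ and variance $\sigma^2$, $N$ the number of current cases, and $L^{*}$ the length of a case picked uniformly from the current ones. In a steady state, the standard relations are:

```latex
\begin{align*}
N &= \lambda \mu
  && \text{(prevalence} = \text{incidence} \times \text{mean duration)} \\
f^{*}(\ell) &= \frac{\ell \, f(\ell)}{\mu}
  && \text{(density of case length among current cases)} \\
\mathbb{E}[L^{*}] &= \frac{\mathbb{E}[L^{2}]}{\mu} = \mu + \frac{\sigma^{2}}{\mu}
  && \text{(mean case length among current cases)}
\end{align*}
```

When every case lasts about the same time, $\sigma^2$ is near zero and the two means agree, which matches the flu example. The more the durations vary, the further the current-case mean sits above $\mu$.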

For AIDS, which has no cure, the prevalence rate shows the
overall magnitude of one problem, but it is the rate of new
cases that shows the effectiveness of preventive policies.

The bottom line is, Prevalence (or a sample that uniformly
selects among current cases, as counted for Prevalence) can