asking for comments on this use of Bayesian stat

Discussion in 'Scientific Statistics Math' started by Art Kendall, Oct 26, 2010.

  1. Art Kendall

    Art Kendall Guest

    Art Kendall, Oct 26, 2010

  2. Bruce Weaver Guest

    Hi Art. What I see here is the same thing that Altman and Bland
    discuss in this BMJ Statistics Note on predictive values of diagnostic
    tests. Notice how the predictive values are related to prevalence of
    disease. The formulae are much more readable in the free PDF.

    Bruce Weaver, Oct 26, 2010

  3. Paul Guest

    I'm not a Bayesian statistician (in fact, I'm not a statistician, just
    a user), but as I read the post I failed to see any estimates of a
    cost function. The author also seemed to be treating the NSA
    monitoring in isolation -- either it identifies me as a terrorist or
    it doesn't -- whereas I suspect that the results of the monitoring are
    combined with other sources of information to identify suspects. So
    while I neither defend nor condemn phone monitoring, I find the
    author's arguments less than persuasive.

    For that matter, his calculations and conclusions all eventually trace
    back to the needle-in-a-haystack problem -- 1,000 (or fewer)
    terrorists in a population of 300 million. Does that mean that we
    should never look for needles in haystacks? Or at least not unless we
    have tests with zero false positive/false negative rates? Does that
    mean an end to organ donation?

    Paul, Oct 26, 2010
  4. The argument assumes that there is random screening for terrorists --
    clearly a futile process if there are only 1000 terrorists in 300m
    people. One assumes more intelligent strategies such as infiltration
    of dissident groups, etc.
    Barry W Brown, Oct 26, 2010
  5. Rich Ulrich Guest

    In a slightly paranoid vein -- I could imagine that some law officers
    might use the "1000 terrorists" as an excuse to run the system,
    while happily developing investigations of 100 times as many
    money-launderers, a few million illegal immigrants, and so on.

    By the way, I don't think that Bayesian statisticians have much
    better access to using Bayes's theorem than anyone else with
    a simple understanding of how base-rates affect probability of
    observing an event. Epidemiologists are one large set of people
    with a lot of practice at that.
    Rich Ulrich, Oct 28, 2010
  6. illywhacker Guest

    The same point applies to the medical case. If A is the proposition
    'has abnormality' and T the proposition 'test positive', then it is
    elementary that:

    P(A | T) = P(T | A) P(A) / P(T) ,

    with

    P(T) = P(T | A) P(A) + P(T | ~A) P(~A) ,

    where ~ is negation. Let P(T | A) = p, and P(T | ~A) = r, then

    P(A | T) = p P(A) / (p P(A) + r P(~A)) .

    If P(A) is interpreted as frequency of occurrence in the population,
    then even if p = 1 and r is small, the probability that the patient
    has the abnormality given a positive test may be small if the
    abnormality is rare.
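
    To put numbers on the formula above, here is a minimal sketch in
    Python. The prevalences and the false-positive rate are invented for
    illustration and do not come from any study; the function name
    `posterior` is likewise just a label chosen here.

```python
def posterior(prior, p, r):
    """P(A | T) from Bayes' theorem.

    prior = P(A), p = P(T | A), r = P(T | ~A).
    """
    evidence = p * prior + r * (1.0 - prior)  # P(T)
    return p * prior / evidence


# Illustrative numbers only: a test that never misses (p = 1) with a 1%
# false-positive rate (r = 0.01), applied at two assumed prevalences.
rare = posterior(prior=1e-4, p=1.0, r=0.01)    # 1 in 10,000
common = posterior(prior=0.2, p=1.0, r=0.01)   # 1 in 5

print(f"rare abnormality:   P(A | T) = {rare:.4f}")
print(f"common abnormality: P(A | T) = {common:.4f}")
```

    With these assumed inputs the rare case comes out just under 1%,
    even though the test itself never misses a true case.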

    However, as always (and this is one of the major strengths of the
    Bayesian approach), we need to indicate *all* our prior knowledge in
    the notation or we will make errors. We have some prior knowledge -
    all of medicine for a start. Denote all this knowledge by K. K will
    include the fact that the patient came to see the doctor; that they
    have other symptoms; and so on. Bayes' theorem now becomes:

    P(A | T, K) = P(T | A, K) P(A | K) / [P(T | A, K) P(A | K) + P(T | ~A, K) P(~A | K)] .

    It may be reasonable that P(T | A, K) and P(T | ~A, K) are independent
    of K, but P(A | K) is certainly dependent on it. It is certainly not
    just the frequency in the population.
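
    One way to see how much P(A | K) matters is the odds form of the
    same theorem: posterior odds = (p / r) * prior odds. The sketch below
    holds p and r fixed (that is, assumes they do not depend on K) and
    changes only the prior. All numbers and labels are hypothetical, and
    r is assumed to be greater than zero.

```python
def posterior_from_odds(prior, p, r):
    """P(A | T, K) via posterior odds = likelihood ratio * prior odds.

    prior = P(A | K), p = P(T | A), r = P(T | ~A) with r > 0.
    """
    prior_odds = prior / (1.0 - prior)
    post_odds = (p / r) * prior_odds
    return post_odds / (1.0 + post_odds)


# Hypothetical test characteristics, assumed not to depend on K.
p, r = 0.95, 0.05

# Two hypothetical states of prior knowledge K.
priors = {
    "population frequency only": 0.001,
    "patient referred with symptoms": 0.30,
}

results = {label: posterior_from_odds(pr, p, r) for label, pr in priors.items()}
for label, value in results.items():
    print(f"{label}: P(A | T, K) = {value:.3f}")
```

    The test is the same in both rows; only the prior knowledge differs,
    and the conclusion drawn from a positive result changes with it.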

    A similar flaw infects the analysis at the beginning of the paper
    cited by Bruce, although their hearts are in the right place. The idea
    here is to estimate the accuracy of the test (the numbers p and r
    above). These are not really probabilities, but (very) succinct
    summaries of biological and medical circumstances. They must be
    estimated, or marginalized out, if we wish to apply the results of the
    study to compute P(A | T). As well as raising the issue of our
    uncertainty about p and r given the results of the study, which
    naturally affects our certainty about the results of the test, this
    also brings in other questions about our prior knowledge. In
    particular, it requires us to think about

    P({A_{i}}) ,

    where {A_{i}} is the set of propositions about abnormality pertaining
    to the individuals in the population. It is normally assumed without
    even thinking about it, that

    P({A_{i}}) = \prod_{i} P(A_{i}) ,

    where P(A_{i}) does not depend on i. This may be reasonable, but it
    may well not be.
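
    As a rough illustration of carrying the uncertainty about p and r
    through to P(A | T), here is a sketch under strong assumptions: the
    study counts are invented, p and r get independent uniform Beta(1, 1)
    priors, the prior P(A | K) is taken as known, and the individuals in
    the study are treated as independent with a common P(A_{i}), which is
    exactly the product assumption questioned above. Under those
    assumptions, marginalizing p and r out amounts to plugging in their
    posterior means, while sampling shows how far P(A | T) could move if
    p and r were known exactly.

```python
import random


def posterior(prior, p, r):
    """P(A | T) for given prior = P(A | K), p = P(T | A), r = P(T | ~A)."""
    return p * prior / (p * prior + r * (1.0 - prior))


# Hypothetical validation-study counts (invented for illustration).
tp, fn = 45, 5      # subjects with the abnormality: test positive / negative
fp, tn = 10, 190    # subjects without it: test positive / negative
prior = 0.05        # assumed P(A | K) for the new patient

# Beta(1, 1) priors give Beta(tp+1, fn+1) and Beta(fp+1, tn+1) posteriors.
# Marginalizing p and r out of P(A, T) leaves their posterior means.
p_mean = (tp + 1) / (tp + fn + 2)
r_mean = (fp + 1) / (fp + tn + 2)
marginal = posterior(prior, p_mean, r_mean)

# Spread of P(A | T) across plausible (p, r) pairs, by simple Monte Carlo.
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = sorted(
    posterior(prior,
              rng.betavariate(tp + 1, fn + 1),
              rng.betavariate(fp + 1, tn + 1))
    for _ in range(10_000)
)
low, high = draws[250], draws[9749]  # central 95% of the draws

print(f"marginal P(A | T) = {marginal:.3f}")
print(f"95% range over (p, r) = [{low:.3f}, {high:.3f}]")
```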

    illywhacker, Nov 3, 2010
  7. Herman Rubin Guest

    The problem is that you are getting into it in the middle.

    The general decision problem under uncertainty requires that one
    look at all the aspects of the problem in all states of nature.
    It can be shown that this corresponds to minimizing the loss-prior
    combination by the use of Bayes' theorem, so this is the foundation
    for rational Bayesian behavior. The mathematics is the easy part.
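
    A toy version of that decision step, as a sketch only: given a
    posterior probability for the state of nature A and a loss for each
    (action, state) pair, pick the action with the smallest expected
    loss. The two actions, the loss values, and the name `bayes_action`
    are all hypothetical, chosen here purely for illustration.

```python
def bayes_action(posterior_a, loss):
    """Return (best action, expected losses) under posterior P(A) = posterior_a.

    loss[action] = (loss if A is true, loss if A is false).
    """
    expected = {
        action: l_true * posterior_a + l_false * (1.0 - posterior_a)
        for action, (l_true, l_false) in loss.items()
    }
    best = min(expected, key=expected.get)
    return best, expected


# Hypothetical losses in arbitrary units.
loss = {
    "follow up": (1.0, 1.0),      # fixed cost whether or not A holds
    "do nothing": (100.0, 0.0),   # costly only if A is in fact true
}

action_low, losses_low = bayes_action(0.005, loss)
action_high, losses_high = bayes_action(0.05, loss)

print(f"P(A) = 0.005 -> {action_low}  {losses_low}")
print(f"P(A) = 0.05  -> {action_high}  {losses_high}")
```

    With these made-up losses the break-even posterior is 0.01, so the
    preferred action depends on the loss table as much as on the
    probability.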

    Herman Rubin, Dec 8, 2010