Is a most-likely probability 'better' depending on the size of the next-most-likely?

Discussion in 'Scientific Statistics Math' started by Steve, Feb 2, 2010.

  1. Steve

    Steve Guest

    Hi,

    I'm working on an algorithm to guess the correct English word within
    text in which some words have become illegible.
    It boils down to creating a list of candidate words, along with their
    probabilities, and choosing the most likely.
    By alternating between training data and new test data, I can
    establish that the probability estimates are fairly accurate. (Though
    to be useful, the algorithm needs to provide a shorter candidate list
    in the first place!)
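
    As a minimal sketch, the selection step is essentially the following
    (the words and probabilities here are invented, just to show the
    shape of it):

        # Pick the most likely candidate from a list of
        # (word, probability) pairs. Values are made up for illustration.
        candidates = [("cents", 0.51), ("pence", 0.30), ("rupees", 0.19)]

        best_word, best_prob = max(candidates, key=lambda pair: pair[1])
        print(best_word, best_prob)  # -> cents 0.51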

    Suppose I have two cases:
    A) There are 2 candidate words with probabilities 0.51 and 0.49.
    B) There are 101 candidate words: one with P = 0.51, and a hundred
    others each with P = 0.0049.

    One of the approaches the algorithm takes is based on the N recent
    known words prior to the unknown word (its Ngram), so there are
    inevitably situations when the Ngram contains words that have
    themselves been corrected in a prior step. If this is the case, I need
    to know how much I can rely on that previous result.
    Is there any basis for believing that in case B) the result is more
    trustworthy? After all, the choice with P=0.51 is more than 100 times
    more likely than the next best word. But in case A) there's virtually
    nothing to choose between them.
    Rightly or wrongly, that's how I intuitively feel about the choices,
    but then I remember... both 'best choices' will be wrong 49% of the
    time, so it doesn't make any difference!

    Is there a measure for this, or is it totally irrelevant?
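
    For concreteness, here is one way to put numbers on the difference
    between the two cases: summarise each candidate list by the ratio of
    the winner to the runner-up, or by its entropy. (Just a sketch using
    the probabilities from cases A and B above, not necessarily the right
    measure.)

        import math

        # Case A: two candidates at 0.51 and 0.49.
        # Case B: one candidate at 0.51 and a hundred at 0.0049 each.
        case_a = [0.51, 0.49]
        case_b = [0.51] + [0.0049] * 100

        def summarise(probs):
            top, runner_up = sorted(probs, reverse=True)[:2]
            entropy = -sum(p * math.log2(p) for p in probs)
            return top, top / runner_up, entropy

        print(summarise(case_a))  # top 0.51, ratio ~1.04, entropy ~1.0 bits
        print(summarise(case_b))  # top 0.51, ratio ~104,  entropy ~4.3 bits

        # Either way, choosing the top word is wrong 49% of the time.
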
    ------------

    Eventually the goal is to have a much higher confidence than 0.51 in a
    single choice, but there will occasionally be situations with these
    borderline results. In these cases I'll offer the user a drop-down
    replacement list with all the choices and their probabilities, for
    them to pick from.
    Talking to non-maths friends about this, I find most of them feel the
    same way: they would be more confident making a choice in case B) than
    in case A).

    Any thoughts?... Is this a bit of a Monty Hall problem?

    Thanks

    Steve
     
    Steve, Feb 2, 2010
    #1

  2. Steve

    David Jones Guest

    Have you thought of involving a cost function? This would assign a value/cost/utility to choosing word B when word A is actually correct. Some aspects of your problem would then generalise to comparing a single alternative, with its cost, against lots of small probabilities, each having a different cost. In such a case, you might prefer the second option if a lot of the small probabilities are associated with small costs and only a few with high costs.

    Here "cost" might be used to distinguish similar words with similar meanings from similar words with different meanings.
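
    A minimal sketch of how that might look, assuming a hypothetical
    cost(chosen, actual) function that the algorithm would have to supply
    (the words, probabilities and cost values below are made up):

        # Choose the candidate that minimises expected cost instead of
        # the one that maximises probability. cost(chosen, actual) says
        # how bad it is to print `chosen` when the illegible word was
        # really `actual`.
        def expected_cost(chosen, candidates, cost):
            return sum(p * cost(chosen, actual) for actual, p in candidates)

        def pick_by_expected_cost(candidates, cost):
            return min(candidates,
                       key=lambda c: expected_cost(c[0], candidates, cost))

        # Toy cost: near-synonyms are a minor error, anything else counts
        # as a full error.
        SYNONYM_GROUPS = [{"big", "large"}]

        def toy_cost(chosen, actual):
            if chosen == actual:
                return 0.0
            if any(chosen in g and actual in g for g in SYNONYM_GROUPS):
                return 0.1
            return 1.0

        candidates = [("bag", 0.40), ("big", 0.35), ("large", 0.25)]
        print(pick_by_expected_cost(candidates, toy_cost))
        # -> ('big', 0.35): it wins on expected cost even though 'bag' is
        #    the single most likely word, because its likely mistakes are
        #    cheap ones.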

    David Jones
     
    David Jones, Feb 2, 2010
    #2

  3. Steve

    Steve Guest

    Thanks David, I hadn't thought of that idea. There are a lot of
    parameters linked to each candidate, such as meaning, part-of-speech,
    usage-frequency, collocation-frequency, context likelihood, etc., so I
    could certainly shape some kind of cost for going against the grain of
    these.

    I'm still wondering if there's some simple heuristic involved with
    cases like these though.

    An example might be if I collected millions of usenet postings and
    found a significant number of examples like these:

    "Just my two * worth", where * = cents 51% of the time and each of 10
    variations (rupees, yen, etc.) 4.9% of the time.

    and

    "I married my * in a church" with * = wife 51% and husband 49%.

    Even if you were sure the probabilities were very accurate, the
    'cents' example just seems a safer bet because each alternative is
    quite unlikely.
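
    Putting rough numbers on that cost idea for these two examples (the
    extra currency names and the cost values are made up purely to
    illustrate):

        # "Just my two * worth": the alternatives are all currencies, so
        # a wrong guess barely changes the meaning; give such mix-ups a
        # small cost.
        cents_candidates = [("cents", 0.51)] + [(c, 0.049) for c in [
            "rupees", "yen", "euros", "pence", "dollars",
            "francs", "pesos", "rand", "kroner", "baht"]]

        # "I married my * in a church": the wrong word changes the meaning
        # entirely, so give that mistake the full cost.
        marriage_candidates = [("wife", 0.51), ("husband", 0.49)]

        def expected_cost_of_top_choice(candidates, wrong_cost):
            # Candidates are sorted most likely first; cost is 0 when the
            # top choice is right and wrong_cost for every other outcome.
            return sum(p * wrong_cost for _, p in candidates[1:])

        print(expected_cost_of_top_choice(cents_candidates, 0.1))     # ~0.049
        print(expected_cost_of_top_choice(marriage_candidates, 1.0))  # 0.49

        # The chance of being wrong is 0.49 either way; what differs is
        # how much a wrong guess would matter.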

    Steve
     
    Steve, Feb 2, 2010
    #3
  4. Steve

    David Jones Guest

    The cost approach has the potential that, with an extremely large amount of work, you could do a thorough application of decision theory to tell you what to choose in any given case. But it can also help just in thinking about the problem. One of the essential ingredients is probabilities like that of "being wrong if I choose this one", as this is what would weight the cost of the choice. If you think about these, rather than the probability that "this one is right", it helps to justify your feeling about the interpretation to be made when you have lots of small probabilities: these convert into lots of instances where the probability of being wrong is high.
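
    In concrete terms, for the 101-candidate case from the first post:

        # One candidate at 0.51 and a hundred at 0.0049 each.
        probs = [0.51] + [0.0049] * 100

        # Probability of being wrong for each word you could choose.
        wrong_if_chosen = [1 - p for p in probs]

        print(round(wrong_if_chosen[0], 4))  # 0.49   -> the top word
        print(round(wrong_if_chosen[1], 4))  # 0.9951 -> any of the others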

    David Jones
     
    David Jones, Feb 3, 2010
    #4
  5. Steve

    Steve Guest

    It's definitely given me a new perspective on weighing these kinds of
    choices. There's no shortage of test data to try the cost approach and
    see how it performs compared to simple probability alone.

    Thanks

    Steve
     
    Steve, Feb 3, 2010
    #5