Definition of the similarity in a set of integers

Discussion in 'Mathematica' started by Ryan Markley, Feb 12, 2009.

  1. Ryan Markley

    Ryan Markley Guest

    Hello, I have two sets of integers, e.g.

    S1 = (25,14,32,45) and S2 = (26,12,31,48)

    I want to define an operation, similar to the variance, that tells me how
    similar the two sets are. In the example above, the results for the two
    sets should be close because the sets themselves are similar.

    The problem with the variance is this:

    S1 = (25,1,1,1) and S2 = (1,1,25,1). These two sets have the same
    variance but they are completely different. What mathematical operation
    can I use to do what I am looking for?

    Thanks a lot in advance.
     
    Ryan Markley, Feb 12, 2009
    #1

  2. dh

    dh Guest

    Hi Ryan,

    What about the difference of the ordered sets?
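
    One possible reading of this suggestion, sketched under the assumption
    that "ordered" means the lists are kept in their given order and
    subtracted element by element (the names a1, b1, a2, b2 are only
    illustrative and not part of the original post):

    ```mathematica
    (* the two pairs of lists from the question *)
    a1 = {25, 14, 32, 45}; b1 = {26, 12, 31, 48};
    a2 = {25, 1, 1, 1};    b2 = {1, 1, 25, 1};

    (* element-wise differences *)
    a1 - b1    (* {-1, 2, 1, -3} *)
    a2 - b2    (* {24, 0, -24, 0} *)

    (* one way to reduce each difference to a single number:
       the sum of the absolute differences *)
    Total[Abs[a1 - b1]]    (* 7 *)
    Total[Abs[a2 - b2]]    (* 48 *)
    ```

    With this reading, the similar pair gives a small total and the
    rearranged pair gives a large one.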

    Daniel



     
    dh, Feb 13, 2009
    #2

  3. Note that what you call "sets" are not sets as usually defined in
    mathematics: a collection of *distinct* objects. That is, S1 = (25,1,1,1)
    as a set is {1, 25} and S2 = (1,1,25,1) as a set is also {1, 25}, so
    the two sets are equal. On the other hand, the sets S1 =
    {25,14,32,45} and S2 = {26,12,31,48} may be deemed very dissimilar
    since they have no element in common. I think the objects you are
    dealing with are better described as vectors or ordered lists of integers.

    Now, assuming you are comparing only vectors of equal length, you could
    use the correlation or the cosine distance, among many others available
    in Mathematica. See "Distance and Similarity Measures" at

    http://reference.wolfram.com/mathematica/guide/DistanceAndSimilarityMeasures.html


    For instance,

    In[1]:= S1 = {25, 14, 32, 45};
    S2 = {26, 12, 31, 48};

    CorrelationDistance[S1, S2] // N
    CosineDistance[S1, S2] // N

    Out[3]= 0.00361843

    Out[4]= 0.00152087

    In[5]:= S1 = {25, 1, 1, 1};
    S2 = {1, 1, 25, 1};

    CorrelationDistance[S1, S2] // N
    CosineDistance[S1, S2] // N

    Out[7]= 1.33333

    Out[8]= 0.917197

    In[9]:= S1 = {24, 1};
    S2 = {25, 2};

    CorrelationDistance[S1, S2] // N
    CosineDistance[S1, S2] // N

    Out[11]= 0.

    Out[12]= 0.00072905

    In[13]:= S1 = {25, 1};
    S2 = {1, 25};

    CorrelationDistance[S1, S2] // N
    CosineDistance[S1, S2] // N

    Out[15]= 2.

    Out[16]= 0.920128
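
    For readers who want to see where these numbers come from, the sketch
    below recomputes the first pair of results by hand. It assumes the usual
    definitions: the cosine distance is one minus the cosine of the angle
    between the vectors, and the correlation distance is the same quantity
    after subtracting each vector's mean. The names u, v, uc and vc are only
    illustrative.

    ```mathematica
    u = {25, 14, 32, 45}; v = {26, 12, 31, 48};

    (* cosine distance: 1 - u.v / (|u| |v|) *)
    1 - u.v/(Norm[u] Norm[v]) // N        (* 0.00152087 *)

    (* correlation distance: same formula on the mean-centered vectors *)
    uc = u - Mean[u]; vc = v - Mean[v];
    1 - uc.vc/(Norm[uc] Norm[vc]) // N    (* 0.00361843 *)
    ```

    Both measures are 0 for vectors pointing in the same direction. The
    correlation distance also ignores a constant offset, which is why
    {24, 1} and {25, 2} give exactly 0 above.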

    Regards,
    --Jean-Marc
     
    Jean-Marc Gulliet, Feb 13, 2009
    #3
  4. Assuming S1 and S2 contain the same number of integers:

    EuclideanDistance[S1,S2]^2/Length[S1]

    or Correlation[S1,S2]
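
    As a quick check, here are both expressions applied to the two pairs
    from the original question (the names a1, b1, a2, b2 are only
    illustrative):

    ```mathematica
    a1 = {25, 14, 32, 45}; b1 = {26, 12, 31, 48};   (* similar pair *)
    a2 = {25, 1, 1, 1};    b2 = {1, 1, 25, 1};      (* rearranged pair *)

    (* mean squared difference: small when the lists are close *)
    EuclideanDistance[a1, b1]^2/Length[a1]    (* 15/4 *)
    EuclideanDistance[a2, b2]^2/Length[a2]    (* 288 *)

    (* correlation: near 1 when the lists rise and fall together *)
    Correlation[a1, b1] // N    (* 0.996382 *)
    Correlation[a2, b2] // N    (* -0.333333 *)
    ```

    The first expression is a dissimilarity (larger means less alike),
    while the correlation is a similarity (larger means more alike).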

    Cheers -- Sjoerd
     
    Sjoerd C. de Vries, Feb 13, 2009
    #4
