What's the easy way to refer to the member variables in a matrix?

Discussion in 'SAS (Statistical Analysis Software)' started by Fred, Nov 19, 2004.

  1. Fred

    Fred Guest

    Hi, all

    I have always been confused by array operations in SAS.

    Suppose I have two very large data sets, each containing more than 20
    different variables.
    Therefore, it is impossible to write their exact names explicitly in code.
    Rather, we have to refer to each distinct variable by its index or order
    in the data set.
    Data set 1 named D1:
    var1_1 var1_2 var1_3 var1_4 .... var1_m
    .....
    ....

    Data set 2 named D2:
    var2_1 var2_2 var2_3 var2_4 .... var2_n
    .....
    ......

    For example, I need to compute the correlation between one variable in D1
    and a specific one in D2.
    That is, compute correlation(var1_i, var2_j) for s1 <= i <= m, s1+1 <= j <= n.
    Note: The numbers m and n are not pre-defined. s1 is a fixed number.

    Has anyone had similar experience with a matrix-referencing problem like the one above?

    Thanks a lot in advance.

    Fred
     
    Fred, Nov 19, 2004
    #1

  2. In SAS, 'arrays' mean something else. Just think of these as rows and
    columns in a data table. A small data table. 20 different variables is
    a small table. If you had 20,000 variables, that would be big.

    Now then. Why can't you write the 'exact names' in code? Using
    pseudonyms like var1_1 ... var1_20 is just asking for mis-labeling and
    misunderstandings.
    Use the SAS features to give the variables meaningful names and ALSO
    labels that explain the units, the source, whatever crucial meta-data
    you need.

    If your variables are named var1_1 to var1_25, say, then you can refer
    to all of them using one of the convenient notations covered in the SAS
    Online Docs.

    var1:            refers to all the variable names starting with 'var1'
    var1_1-var1_25   refers to var1_1, var1_2, ..., var1_25 in order
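
    As a minimal sketch of those two notations, assuming D1 really does hold
    numeric variables named var1_1 through var1_25 (25 is only the example
    count used above), either list can go wherever SAS accepts a variable
    list, for instance in a VAR statement. PROC MEANS is just a convenient
    procedure to demonstrate with:

```sas
/* Numbered range list: var1_1, var1_2, ..., var1_25.           */
/* Every variable in the range must exist in D1.                */
proc means data=D1;
    var var1_1-var1_25;
run;

/* Name prefix list: every variable whose name starts with      */
/* var1_, however many there are. Handy when the count is not   */
/* known in advance.                                            */
proc means data=D1;
    var var1_: ;
run;
```

    The prefix form is probably the closer fit when the number of
    variables is not fixed ahead of time.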

    Assuming you define the variables with better names, you can still
    use special features. Let's assume you named the variables as follows:

    foo bar baz quux alpha baker charlie tango sacco vanzetti merrill lynch

    If the names are in that order in the data set, then you can refer to
    ALL of them like this:

    foo--lynch
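
    A short sketch of the double-dash list, assuming a hypothetical data set
    called survey that holds the variables above. Because this list depends
    on the position of the variables in the data set, not on their names,
    it can be worth checking the stored order first:

```sas
/* List the variables in their stored (positional) order.       */
proc contents data=survey varnum;
run;

/* Name range list: every variable positioned from foo through  */
/* lynch, inclusive.                                            */
proc print data=survey(obs=5);
    var foo--lynch;
run;

/* Variant: only the numeric variables in that same span.       */
proc means data=survey;
    var foo-numeric-lynch;
run;
```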

    Now you can assign meaningful names and use them conveniently.


    NEXT PROBLEM:
    You need to merge the two data sets. You *have* to have a meaningful
    ID variable or set of ID variables so that when you put the two data
    sets together, things line up properly. This is crucial!

    Let's assume you have a variable IDnum with the relevant information.
    Then you can do a simple merge in a data step to handle the problem.
    Just merge by the IDnum. Without any sort of IDnum, you have no
    guarantee that the records are lined up right. And if they're not,
    your 'correlations' will be random noise.
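
    Here is one way the merge and the correlations might look. This is only
    a sketch: it assumes both D1 and D2 carry IDnum, that the analysis
    variables keep the var1_ and var2_ prefixes from the original question,
    and the output data set name both is made up for the example.

```sas
/* A BY-merge expects both inputs to be sorted by the key.      */
proc sort data=D1; by IDnum; run;
proc sort data=D2; by IDnum; run;

data both;
    merge D1(in=in1) D2(in=in2);
    by IDnum;
    if in1 and in2;   /* keep only IDs present in both data sets */
run;

/* Correlate each var2_ variable (WITH, the rows) against each  */
/* var1_ variable (VAR, the columns).                           */
proc corr data=both;
    var  var1_: ;
    with var2_: ;
run;
```

    If only a sub-range of the variables is wanted and the bounds are
    known, a numbered range list such as var1_5-var1_20 (hypothetical
    bounds) can go in the same VAR or WITH statement.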

    David
     
    David L. Cassell, Nov 20, 2004
    #2
