Merging Variables

Discussion in 'SPSS' started by Aoi Lifeaftram, Jan 5, 2012.

  1. Okay, so I have a dataset that I am supposed to do analysis on that
    was horribly designed. Here is my situation:

    There are three variables that have selection values assigned by this
    survey software (select1, select2, select3). Then I have a single
    question (how do you like [select1]), but it's repeated 3 times, once
    for each of the variables. So essentially I have (select1_like,
    select2_like, select3_like). The problem is that I am getting a
    restricted sample size because it is possible that the participant was
    asked questions about the same selection value in any of those three
    variables (select1_like, select2_like, select3_like). So I need to be
    able to combine those three variables (select1_like, select2_like,
    select3_like) into a single variable when each of the original
    selection variables equals a certain value (e.g. select1 = 1, select2
    = 1, select3 = 1).

    I thought I might be able to do this with 'Recode Into Different
    Variables', but SPSS won't let me use the same new variable name to
    recode into.

    Does this make sense? Is this even possible?
    Aoi Lifeaftram, Jan 5, 2012

  2. Aoi Lifeaftram

    Rich Ulrich Guest

    No, this does not make clear sense. Here is my guess
    about what you are describing -- If I'm all wrong, you can
    try again with a description that is less shy about details.

    As I read it (possibly) -
    Subjects are asked to list their 1st, 2nd, and 3rd choice
    for [whatever]. Then they are asked to give detailed
    responses on their 1st choice; their 2nd choice; and their
    3rd choice. If "vanilla" is one of the choices, you want
    to have a set of questions that are about vanilla, regardless
    of whether Vanilla was ranked 1, 2, or 3.

    - Doing further statistics requires that variables be created
    for "vanilla" and for each of the other high-count choices; and
    (maybe) another variable that holds the responses for the
    best-rated choice that is not individually tabulated.

    - Assuming the above is right -
    If the choices are numerous, you might systematically create
    a slew of vars by clever use of Cases-to-vars and Vars-to-cases.
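
    As a rough sketch of that restructuring idea, using the variable
    names from the first post: stack the three slots into one row per
    respondent and slot, then spread the rows back out with one column
    per product code. The respondent identifier id is an assumption (it
    is not mentioned in the thread), and the sketch assumes no
    respondent has the same product in two slots.

    ```spss
    * Assumes a unique respondent identifier named id (hypothetical).
    * Stack the three slots: one row per respondent and slot.
    VARSTOCASES
      /MAKE product FROM select1 select2 select3
      /MAKE like FROM select1_like select2_like select3_like
      /INDEX = slot(3)
      /KEEP = id.

    * Spread back out: one "like" column per product code.
    SORT CASES BY id product.
    CASESTOVARS
      /ID = id
      /INDEX = product
      /DROP = slot.
    ```

    With the default separator, the new columns should come out named
    like.1, like.2, and so on, one per product code found in the data.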

    Yes, using ranks is generally a bad way to design surveys,
    whether because it creates complications like this one or
    because a rank gives no anchor for "How good" and only says
    "better than the others."
    Rich Ulrich, Jan 5, 2012

  3. Aoi Lifeaftram

    Drew Guest

    Yeah ... I described it poorly. lol

    Participants are asked at the beginning of the survey what "products"
    they are aware of (simple checkbox selection of items). From there, 3
    are randomly selected, and become (select1, select2, select3). Because
    it is random, if you selected shampoo, it could show up for any of
    those three selections. They designed it this way in order to minimize
    ordering effects (bad way to do it, I know). Hence I may answer my
    liking of shampoo in select1_like, while you may answer your like of
    shampoo in select2_like. You can see my issue now, as when I am
    running analyses if I only use one of the three variables, I end up
    with a restricted sample size as your rating of liking shampoo could
    be in any of those three variables. Hence I need to combine those
    three variables into a single variable, so I have a large sample size
    in each cell.

    So basically I want to combine select1_like, select2_like, and
    select3_like into a single variable (let's call it shampoo_like). This
    would require not only some kind of combination command, but also
    using a filter to only list the answers in each of the three variables
    that ask about shampoo (basically 'if select1 = shampoo').

    I know this is a horrible survey design .... trust me, I've complained
    up and down about it! This study was not designed by me, I am only
    helping in the analysis. Basically I don't want to have to hand recode
    4K plus responses in order to run analyses, so I am hoping that SPSS
    can do this.

    Is this more clear?

    Thanks so much for your help!!
    Drew, Jan 5, 2012
  4. Aoi Lifeaftram

    peter m sopp Guest

    There are several more or less elegant ways to do it. Here is an
    easy-to-read example:

    * like_product# = liking of product number #.
    DO IF select1 = 1.
    + COMPUTE like_product1 = select1_like.
    + ELSE IF select2 = 1.
    + COMPUTE like_product1 = select2_like.
    + ELSE IF select3 = 1.
    + COMPUTE like_product1 = select3_like.
    END IF.

    * Now the same with product 2 and so on ...

    The problem here: you have to do it as many times as there are different
    products. But it's simple.

    Hope it's helpful to find the right way for you.


    On 05.01.2012 03:53, Drew wrote:
    peter m sopp, Jan 5, 2012
  5. Aoi Lifeaftram

    Rich Ulrich Guest

    *Here's the extended version for 6 products, changing "1" to #
    *and like_product1 to like_product.

    numeric shampoo towel umbrella VCR Wii Xerox.
    recode shampoo to Xerox (ELSE=SYSMIS). /* initialize; good practice.

    * like_product = liking of product number #.
    DO REPEAT #= 1,2,3,4,5,6
     /like_product= shampoo towel umbrella VCR Wii Xerox.
    DO IF select1 = #.
    + COMPUTE like_product = select1_like.
    + ELSE IF select2 = #.
    + COMPUTE like_product = select2_like.
    + ELSE IF select3 = #.
    + COMPUTE like_product = select3_like.
    END IF.
    END REPEAT.
    EXECUTE.


    If there were multiple variables to be set, instead of only LIKE,
    it would be neater to define the new vars as vectors; and use
    the DO IF / ELSE IF/ ... to find an index for the vectors to be set;
    then, if the index is not null, list the Computes just once, using the
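
    A minimal sketch of the vector idea for more than one question per
    slot. The second question set (select1_buy to select3_buy) is
    hypothetical and only there for illustration; the sketch also
    assumes that each variable set is contiguous in the file and that
    the product codes run from 1 to 6.

    ```spss
    * Hypothetical second question set: select1_buy TO select3_buy.
    * Assumes contiguous variable sets and product codes 1 to 6.
    VECTOR sel = select1 TO select3.
    VECTOR lik = select1_like TO select3_like.
    VECTOR buy = select1_buy TO select3_buy.
    VECTOR like_p(6).
    VECTOR buy_p(6).
    LOOP #i = 1 TO 3.
    + DO IF NOT MISSING(sel(#i)).
    +   COMPUTE like_p(sel(#i)) = lik(#i).
    +   COMPUTE buy_p(sel(#i)) = buy(#i).
    + END IF.
    END LOOP.
    EXECUTE.
    ```

    The product code in each slot serves as the index, so every answer
    lands in like_p1 to like_p6 (or buy_p1 to buy_p6) according to the
    product it refers to, and each Compute is written only once.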

    [snip, previous]
    Rich Ulrich, Jan 5, 2012
  6. Aoi Lifeaftram

    David Guest

    *ALWAYS* important to note whether these fields are string or numeric.
    The following assumes NUMERIC!
    If they are strings, run AUTORECODE with the GROUP subcommand first
    and follow it with the syntax below (omitting everything before
    VECTOR, of course, and using your own variable names).
    If for some reason your study is even more badly designed than
    indicated and your variable sets are *NOT* contiguous, you will need
    to fix that prior to running the syntax.
    DATA LIST FREE / select_1 select_2 select_3 q_1 q_2 q_3.
    BEGIN DATA
    1 2 3 2 4 5
    4 2 1 2 3 1
    4 2 3 4 6 5
    5 6 7 2 1 3
    1 4 2 1 3 4
    4 5 3 4 6 5
    2 1 3 2 4 3
    END DATA.
    VECTOR sel=select_1 TO select_3 / q=q_1 TO q_3 / sel_q (7).
    LOOP #=1 TO 3.
    + COMPUTE sel_q(sel(#))=q(#).
    END LOOP.
    FORMATS ALL(F1.0).
    LIST.

    SELECT_1 SELECT_2 SELECT_3 Q_1 Q_2 Q_3 SEL_Q1 SEL_Q2 SEL_Q3 SEL_Q4 SEL_Q5 SEL_Q6 SEL_Q7

           1        2        3   2   4   5      2      4      5      .      .      .      .
           4        2        1   2   3   1      1      3      .      2      .      .      .
           4        2        3   4   6   5      .      6      5      4      .      .      .
           5        6        7   2   1   3      .      .      .      .      2      1      3
           1        4        2   1   3   4      1      4      .      3      .      .      .
           4        5        3   4   6   5      .      .      5      4      6      .      .
           2        1        3   2   4   3      4      2      3      .      .      .      .

    Number of cases read: 7 Number of cases listed: 7
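
    For the string case mentioned at the top of this post, a minimal
    sketch of the AUTORECODE step might look like this. It assumes the
    original fields are strings named select1 to select3 and recodes
    them into the numeric select_1 to select_3 used in the example
    above; the names are illustrative.

    ```spss
    * Assumes select1 to select3 are strings holding product names.
    * GROUP applies one shared coding scheme to all three variables.
    AUTORECODE VARIABLES = select1 select2 select3
      /INTO select_1 select_2 select_3
      /GROUP
      /PRINT.
    ```

    GROUP matters here because the same product has to receive the same
    numeric code no matter which of the three slots it appears in.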
    David, Jan 5, 2012