Define line color in a line graph by a value other than that of the stacking variable?

Discussion in 'SPSS' started by shysong, Dec 15, 2011.

  1. shysong

    shysong Guest

    Hello,

    I am trying to create a multiple line graph that shows the scores over
    multiple quarters on 17 different variables; each of these variables
    additionally has a rating of difficulty, which is invariant over time
    (it is a property of the measure). In other words, my data set looks
    like this:

    measureno difficulty quarter score
    1 1 1 10
    2 5 1 20
    3 6 1 70
    ....
    1 1 4 30
    2 5 4 50
    3 6 4 90

    I would like to plot a graph that shows time on the X axis, score on
    the Y axis, and a separate line for each measureno (that's the easy
    part -- I can do this quite simply with the multiple line graph
    option). However, I would like the color of each of the 17 lines to
    be determined not by measure number, but rather by difficulty (i.e.,
    lines for measures that share the same difficulty values should be
    shaded the same color). How do I accomplish this in SPSS? The
    closest thing I've been able to do is row or column panels (i.e.,
    panel by difficulty value), but I really want all the lines on one
    graph. Any help would be appreciated.

    Thank you!
     
    shysong, Dec 15, 2011
    #1

  2. shysong

    Andy W Guest

    Here is one potential (but hardly ideal) solution that I can come up
    with. It amounts to creating a separate line element for each of the
    17 measure numbers (a pain in the butt, I know) and mapping color to
    the difficulty level.

    *********************************************************.
    input program.
    loop #measure = 1 to 17.
    loop #time = 1 to 10.
    compute time = #time.
    compute measure_num = #measure.
    end case.
    end loop.
    end loop.
    end file.
    end input program.
    execute.
    dataset name lines_color.

    if time = 1 difficulty = RND(RV.UNIFORM(0.5,4.5)).
    compute score = RV.NORMAL(100,10).
    execute.

    AGGREGATE
    /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
    /BREAK=measure_num
    /difficulty=FIRST(difficulty).
    execute.
    *does your data file look like this?.


    *now casestovars to get the data in the right shape.
    sort cases by time.
    casestovars
    /id = time
    /SEPARATOR = "_".
    *GGRAPH doesnt like periods in names.

    *Make it for 1 line element.
    GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=time[LEVEL=SCALE]
    score_1 difficulty_1
    MISSING=LISTWISE REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: time=col(source(s), name("time"))
    DATA: score_1=col(source(s), name("score_1"))
    DATA: difficulty_1=col(source(s), name("difficulty_1"))
    GUIDE: axis(dim(1), label("time"))
    GUIDE: axis(dim(2), label("score_1"))
    ELEMENT: line(position(time*score_1), color.interior(difficulty_1))
    END GPL.

    *Now just repeat for all 17 elements, yuck.
    GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=time[LEVEL=SCALE]
    score_1 difficulty_1
    score_2 difficulty_2 score_3 difficulty_3 score_4 difficulty_4
    score_5 difficulty_5
    MISSING=LISTWISE REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: time=col(source(s), name("time"))
    DATA: score_1=col(source(s), name("score_1"))
    DATA: difficulty_1=col(source(s), name("difficulty_1"),
    unit.category())
    DATA: score_2=col(source(s), name("score_2"))
    DATA: difficulty_2=col(source(s), name("difficulty_2"),
    unit.category())
    DATA: score_3=col(source(s), name("score_3"))
    DATA: difficulty_3=col(source(s), name("difficulty_3"),
    unit.category())
    DATA: score_4=col(source(s), name("score_4"))
    DATA: difficulty_4=col(source(s), name("difficulty_4"),
    unit.category())
    DATA: score_5=col(source(s), name("score_5"))
    DATA: difficulty_5=col(source(s), name("difficulty_5"),
    unit.category())
    GUIDE: axis(dim(1), label("time"))
    GUIDE: axis(dim(2), label("score"))
    ELEMENT: line(position(time*score_1), color.interior(difficulty_1))
    ELEMENT: line(position(time*score_2), color.interior(difficulty_2))
    ELEMENT: line(position(time*score_3), color.interior(difficulty_3))
    ELEMENT: line(position(time*score_4), color.interior(difficulty_4))
    ELEMENT: line(position(time*score_5), color.interior(difficulty_5))
    END GPL.
    ********************************************************************.

    I'm not quite sure, but I think there is also a way to get the
    multiple lines drawn by inserting missing cases in the right places.
    Write back if you come up with a better solution; I would be
    interested to see it.

    Andy W
     
    Andy W, Dec 16, 2011
    #2

  3. shysong

    Jon Peck Guest

    The difficulty is that color is being used to distinguish the lines, so it is structural. If you set color to distinguish by difficulty, it will merge the lines that have the same difficulty, and you will thus get only one line per difficulty. It seems that you need two different aspects here - one that identifies the measure and one that identifies the difficulty.

    The solution is to use different attributes or to superimpose points on the line and assign their attribute by difficulty. Here is an example that sets the line color by measure and the point shape by difficulty:
    GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=quarter MEAN(score)[name="MEAN_score"] measure difficulty
    MISSING=LISTWISE REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: quarter=col(source(s), name("quarter"), unit.category())
    DATA: MEAN_score=col(source(s), name("MEAN_score"))
    DATA: measure=col(source(s), name("measure"), unit.category())
    DATA: difficulty=col(source(s), name("difficulty"), unit.category())
    GUIDE: axis(dim(1), label("quarter"))
    GUIDE: axis(dim(2), label("Mean score"))
    GUIDE: legend(aesthetic(aesthetic.color.interior), label("measure"))
    SCALE: linear(dim(2), include(0))
    ELEMENT: line(position(quarter*MEAN_score), color.interior(measure), missing.wings())
    ELEMENT: point(position(quarter*MEAN_score), shape(difficulty))
    END GPL.

    This will give you two legends: one for the measure and one for the difficulty.

    HTH,
    Jon Peck
     
    Jon Peck, Dec 16, 2011
    #3
  4. shysong

    Andy W Guest

    Jon and I are apparently making this more difficult than it needs to
    be! As of version 19, ELEMENT statements accept an optional function
    named "split". It does not map to an aesthetic; it just splits up the
    elements. See the updated example below (neither reshaping the data
    nor specifying separate ELEMENT statements is necessary).

    *********************************************************.
    input program.
    loop #measure = 1 to 17.
    loop #time = 1 to 10.
    compute time = #time.
    compute measure_num = #measure.
    end case.
    end loop.
    end loop.
    end file.
    end input program.
    execute.
    dataset name lines_color.

    if time = 1 difficulty = RND(RV.UNIFORM(0.5,4.5)).
    compute score = RV.NORMAL(100,10).
    execute.

    AGGREGATE
    /OUTFILE=* MODE=ADDVARIABLES OVERWRITE = YES
    /BREAK=measure_num
    /difficulty=FIRST(difficulty).
    execute.
    *does your data file look like this?.

    DATASET ACTIVATE lines_color.
    * Chart Builder.
    GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=time[LEVEL=SCALE] score
    difficulty measure_num MISSING=LISTWISE
    REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: time=col(source(s), name("time"))
    DATA: score=col(source(s), name("score"))
    DATA: difficulty=col(source(s), name("difficulty"), unit.category())
    DATA: measure_num=col(source(s), name("measure_num"),
    unit.category())
    GUIDE: axis(dim(1), label("time"))
    GUIDE: axis(dim(2), label("score"))
    GUIDE: legend(aesthetic(aesthetic.color.interior),
    label("difficulty"))
    ELEMENT: line(position(time*score), color.interior(difficulty),
    split(measure_num))
    END GPL.
    ************************************************************************.
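
    Adapted to the variable names in the original question (measureno,
    difficulty, quarter, score), the same idea might look like the sketch
    below. It has not been run against the poster's actual file and
    assumes the data are in long format, one row per measure per quarter,
    with difficulty constant within each measure.

    ```spss
    * Sketch only, assuming long-format data named as in the original question.
    * Color is mapped to difficulty and split() keeps one line per measureno.
    GGRAPH
    /GRAPHDATASET NAME="graphdataset" VARIABLES=quarter[LEVEL=SCALE] score
    difficulty measureno MISSING=LISTWISE REPORTMISSING=NO
    /GRAPHSPEC SOURCE=INLINE.
    BEGIN GPL
    SOURCE: s=userSource(id("graphdataset"))
    DATA: quarter=col(source(s), name("quarter"))
    DATA: score=col(source(s), name("score"))
    DATA: difficulty=col(source(s), name("difficulty"), unit.category())
    DATA: measureno=col(source(s), name("measureno"), unit.category())
    GUIDE: axis(dim(1), label("quarter"))
    GUIDE: axis(dim(2), label("score"))
    GUIDE: legend(aesthetic(aesthetic.color.interior), label("difficulty"))
    ELEMENT: line(position(quarter*score), color.interior(difficulty),
    split(measureno))
    END GPL.
    ```

    Because split() is not an aesthetic, it adds no legend of its own, so
    the chart should show all 17 lines with a single legend for
    difficulty.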

    I'm glad I found this. Here come some choropleth maps in SPSS
    graphics!

    Andy
     
    Andy W, Dec 16, 2011
    #4
  5. shysong

    Art Kendall Guest

    Maps are back in version 20!



    Art Kendall
    Social Research Consultants

     
    Art Kendall, Dec 16, 2011
    #5
  6. shysong

    Jon Peck Guest

    Better, maps have an entirely new implementation in V20 and are way better than they ever were with the old feature. And now mapping is part of the Base.
     
    Jon Peck, Dec 16, 2011
    #6