Example Data

Overview of Sample Data

Suppose a researcher collected the following data during a study of computer anxiety in middle school children. The data were collected from 40 ninth graders in three different school systems. The information collected on each student is: identification number, gender, school system, previous computer experience, scores on a 10-item Likert-type computer anxiety scale, scores on a 10-item Likert-type mathematics anxiety scale, math scores for a given testing period, and computer test scores for the same testing period. With this information in hand, the researcher wants to write a SAS program that performs both descriptive and inferential analyses of the data.

Let's look at the steps involved in creating a SAS program for this data analysis. The first task is to arrange the data in an orderly form so that SAS can read and analyze them. Several variables are involved in this research. In SAS Version 8, a variable name can be up to 32 characters long, must begin with a letter or an underscore, and can contain only letters, digits, and underscores. Let us name the variables according to these conventions:

  • ID: student identification number
  • SEX: gender of the student
  • EXP: previous computer experience in months/yrs
  • SCHOOL: name of school system
  • C1 through C10: the 10 item scores on the computer anxiety scale
  • M1 through M10: the 10 item scores on the math anxiety scale
  • COMPSCOR: computer test score for a given testing period
  • MATHSCOR: math score for the same testing period

Once the variables are named according to SAS conventions, the next task is to prepare a code book that documents the data layout. The following code book describes the data for this study.


  VARIABLE NAME   WIDTH   COLUMNS   VALUE LABELS

  ID              2       1-2       none
  SEX             1       3         M=male, F=female
  EXP             1       4         1=1 yr or less, 2=2 yrs, 3=3 yrs
  SCHOOL          1       5         1=rural, 2=city, 3=suburban
  C1              1       6         1=strongly agree, 2=agree, 3=undecided,
                                    4=disagree, 5=strongly disagree
  C2              1       7         "
  C3              1       8         "
  C4              1       9         "
  C5              1       10        "
  C6              1       11        "
  C7              1       12        "
  C8              1       13        "
  C9              1       14        "
  C10             1       15        "
  M1              1       16        "
  M2              1       17        "
  M3              1       18        "
  M4              1       19        "
  M5              1       20        "
  M6              1       21        "
  M7              1       22        "
  M8              1       23        "
  M9              1       24        "
  M10             1       25        "
  MATHSCOR        2       26-27     none
  COMPSCOR        2       28-29     none

In the above code book, VARIABLE NAME is the name of the variable in the data, and WIDTH is the number of columns (character positions) that the variable occupies. For example, the variable ID needs two columns because the highest ID number is 40, while EXP needs only one column. COLUMNS gives the column number(s) on a line where SAS can find the value of each variable. VALUE LABELS describes what each coded value of a variable means. For example, within the variable SEX, M represents male and F represents female students. Within the variable SCHOOL, 1, 2, and 3 represent rural, city, and suburban schools, respectively.
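The value labels in the code book can be written down in SAS with PROC FORMAT. The following is only a sketch: the format names ($SEXFMT, EXPFMT, SCHFMT, LIKERT) are hypothetical names chosen for this example, and it assumes that 5 on the Likert scales stands for "strongly disagree".

```sas
/* Sketch: the code book's value labels as user-defined formats.   */
/* The format names are hypothetical, chosen for this example.     */
proc format;
  value $sexfmt 'M' = 'male'
                'F' = 'female';
  value expfmt  1 = '1 yr or less'
                2 = '2 yrs'
                3 = '3 yrs';
  value schfmt  1 = 'rural'
                2 = 'city'
                3 = 'suburban';
  /* Assumption: 5 means strongly disagree on both anxiety scales. */
  value likert  1 = 'strongly agree'
                2 = 'agree'
                3 = 'undecided'
                4 = 'disagree'
                5 = 'strongly disagree';
run;
```

Defining the formats does not change any data. A later step would attach them with a FORMAT statement, for example `format sex $sexfmt. school schfmt.;`, so that printed output shows the labels instead of the codes.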

Now let us examine how the data layout will look on a coding sheet or on a computer terminal. The variable values are copied from the questionnaires filled in by the students and placed into the appropriate columns according to the code book prepared earlier.


01M12123112245222113541213944

02F22325445211233445422212526

03F11211551141121122155114845
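As a rough sketch of how this layout could be described to SAS, the DATA step below reads the three records with column input, using the column positions from the code book. The data set name CLAS is an assumption made for this example, and ID is read as a number, so 01 is stored as 1.

```sas
/* Sketch: read the three sample records by column position.       */
/* The data set name CLAS is an assumption made for this example.  */
data clas;
  input id       1-2
        sex    $ 3
        exp      4
        school   5
        (c1-c10) (1.)     /* columns 6-15, one digit per item  */
        (m1-m10) (1.)     /* columns 16-25, one digit per item */
        mathscor 26-27
        compscor 28-29;
  datalines;
01M12123112245222113541213944
02F22325445211233445422212526
03F11211551141121122155114845
;
run;

/* Print a few variables to check that the columns line up. */
proc print data=clas;
  var id sex exp school c1 m1 mathscor compscor;
run;
```

For the first record, this should give SEX=M, EXP=1, SCHOOL=2, C1=1, M1=2, MATHSCOR=39, and COMPSCOR=44.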

Note that on every line a given variable appears in the same column(s). For example, the variable SEX appears in column 3 of every line. In the records shown above, no blank space is left between variables. You may instead choose to leave a blank space after each variable, as follows:

 

01 M 1 2 1 2 3 1 1 2 2 4 5 2 2 2 1 1 3 5 4 1 2 1 39 44

02 F 2 2 3 2 5 4 4 5 2 1 1 2 3 3 4 4 5 4 2 2 2 1 25 26

03 F 1 1 2 1 1 5 5 1 1 4 1 1 2 1 1 2 2 1 5 5 1 1 48 45

Whichever style (format) you choose, the analysis is not affected as long as you describe the format correctly to SAS. The layout above shows only three lines of data; each line is one observation (the information about one student). Note that each subject has only one line (record) of data. In other situations you may have more than one record per subject/observation.
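For the blank-separated layout, one possible way to convey the format is list input, where SAS uses the blanks rather than column positions to tell the values apart. This is a minimal sketch; the data set name CLAS_LIST is an assumption made for this example.

```sas
/* Sketch: read the blank-separated records with list input.           */
/* The data set name CLAS_LIST is an assumption made for this example. */
data clas_list;
  /* Values are taken in order, one per blank-separated field. */
  input id sex $ exp school c1-c10 m1-m10 mathscor compscor;
  datalines;
01 M 1 2 1 2 3 1 1 2 2 4 5 2 2 2 1 1 3 5 4 1 2 1 39 44
02 F 2 2 3 2 5 4 4 5 2 1 1 2 3 3 4 4 5 4 2 2 2 1 25 26
03 F 1 1 2 1 1 5 5 1 1 4 1 1 2 1 1 2 2 1 5 5 1 1 48 45
;
run;

proc print data=clas_list;
run;
```

List input needs no column numbers, but every value must be present and separated by at least one blank. Column input is the only option for the layout without blanks.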

Suppose these data are stored in a file in your directory under the name clas.dat. The data can be entered directly in a Unix environment using an editor (e.g., vi, emacs, pico), or they can be typed onto a floppy diskette on a microcomputer and then uploaded to the Unix environment using FTP (File Transfer Protocol) or any other appropriate communications package.
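Once the file exists, a DATA step can point to it with an INFILE statement instead of carrying the records inside the program. The sketch below assumes that clas.dat sits in the directory from which SAS is run and that it uses the layout without blanks; adjust the path to match your own account.

```sas
/* Sketch: read the records from the external file clas.dat.        */
/* Assumption: the file is in the current working directory and     */
/* follows the column layout given in the code book.                */
data clas;
  infile 'clas.dat';
  input id 1-2 sex $ 3 exp 4 school 5
        (c1-c10) (1.) (m1-m10) (1.)
        mathscor 26-27 compscor 28-29;
run;

/* Look at the first five observations only. */
proc print data=clas (obs=5);
run;
```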

Downloading Sample Data

If you are interested in obtaining a copy of this data file you may copy it from the Stat/Math website (http://www.indiana.edu/~statmath).

To obtain a copy of the sample files:

  1. Click Sample program file (http://www.indiana.edu/~statmath/stat/sas/CLAS.SAS) and follow the instructions in the pop-up window.
  2. Then click Sample data file (http://www.indiana.edu/~statmath/stat/sas/CLAS.DAT).
  3. Transfer these files to your Unix account.

Contact a UITS consultant if you need assistance.

