Further Data Analysis

So far, we have worked through simple examples to develop a basic idea of how SPSS for Macintosh works. The next step is to examine a few other data analysis techniques (CORRELATIONS, REGRESSION, T-TEST, ANOVA). All the statistical procedures available under a minicomputer or mainframe version of SPSS are also available in SPSS for Macintosh. Refer to the vendor documentation for the most complete information.

Sample Data Set

Now we will turn to another data set with more variables and cases. In this example, you will read an ASCII data file, clas.dat, into the SPSS session. The file was created with a word processor and saved as a text file. The data, collected from 40 middle school students, contain 26 variables, including the following:

  • id (student identification number)
  • sex (gender of the student)
  • exp (previous computer experience in months/yrs)
  • school (type of school system)
  • C1 thru C10 (10 scores on the computer anxiety scale)
  • M1 thru M10 (10 scores on the math anxiety scale)
  • mathscor (math score for the same testing period)
  • compscor (computer test score for a given testing period)

The first four variables (id, sex, exp, school) are background variables. The variable sex has two levels (M=male, F=female). The variable exp (prior computer experience) has three levels (1=less than one year, 2=1-2 years, 3=more than 2 years), and school (type of school system) has three levels (1=rural, 2=suburban, 3=urban). The next 20 variables (c1 to c10, m1 to m10) are Likert-type responses to the computer opinion and math anxiety surveys. The remaining variables (mathscor, compscor) are scores on the math test and the computer test.

A copy of the sample data file is available from the Stat/Math Web home page (http://www.indiana.edu/~statmath). To obtain a copy of the file:

  • Launch a Web browser (e.g. Netscape or Internet Explorer)
  • Go to the URL: http://www.indiana.edu/~statmath
  • Select Software Title then SPSS
  • Select SPSS Sample Data and Commands then clas.dat
  • Save the file to a diskette as a text file (e.g., in Netscape, go to File/Save As... and change Save File as Type to text).

Creating a Program to Read the Data File

Let us assume that the data file, clas.dat, is on a diskette (named untitled). At this point, the fastest way to read the data into SPSS for Macintosh is through the Syntax window. You may either open a Syntax Editor window (File/New/Syntax) and type in the following lines, or create a command file containing these lines with a word processor or editor and then read it into the Syntax Editor window (File/Open/Syntax, then click the filename in the Open File dialog box). Suppose the following command lines are stored (in text format) in a file, clas.sps, on a diskette.

  DATA LIST FILE='untitled:clas.dat'
    /id 1-2 sex 3 (A) exp 4 school 5 c1 to c10 6-15 m1 to m10 16-25 mathscor 26-27 compscor 28-29.
  MISSING VALUES mathscor compscor (99).
  RECODE c3 c5 c6 c10 m3 m7 m8 m9 (1=5) (2=4) (3=3) (4=2) (5=1).
  RECODE sex ('M'=1) ('F'=2) INTO nsex. /* char var into numeric var
  COMPUTE compopi=SUM(c1 TO c10). /* sum of the 10 items using the SUM function
  COMPUTE mathatti=m1+m2+m3+m4+m5+m6+m7+m8+m9+m10. /* adding each item
  VARIABLE LABELS id 'Student Identification' sex 'Student Gender'
    exp 'Yrs of Comp Experience' school 'School Representing'
    mathscor 'Score in Mathematics' compscor 'Score in Computer Science'
    compopi 'Total for Comp Survey' mathatti 'Total for Math Atti Scale'.
  VALUE LABELS sex 'M' 'Male' 'F' 'Female'/
    exp 1 'Less than 1 yr' 2 '1-2 yrs' 3 'More than 2 yrs'/
    school 1 'Rural' 2 'Suburban' 3 'Urban'/
    c1 TO c10 1 'Strongly Disagree' 2 'Disagree'
              3 'Undecided' 4 'Agree' 5 'Strongly Agree'/
    m1 TO m10 1 'Strongly Disagree' 2 'Disagree'
              3 'Undecided' 4 'Agree' 5 'Strongly Agree'/
    nsex 1 'Male' 2 'Female'.

  EXECUTE.
    

Use the mouse to highlight the command lines and click Run. The command lines will be executed and an active SPSS file will be created. Select Window/Untitled - SPSS Data Editor to see the data file you just read in. Save the data file as an SPSS system file to a diskette or to a hard drive.

  • Select File/Save
  • Type in a filename (e.g., clas.sav)
  • A copy of the file will now be saved in SPSS format. Now you are ready for further data analysis.
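
The save step can also be written as syntax. The lines below are a minimal sketch, assuming the system file is written to the current folder under the name clas.sav (add a volume or folder name as needed). GET FILE reloads the saved file in a later session, and LIST prints the first five cases as a quick check that the columns were read as intended.

  * Save the active file as an SPSS system file (location is an assumption).
  SAVE OUTFILE='clas.sav'.
  * In a later session, reload the system file instead of rerunning DATA LIST.
  GET FILE='clas.sav'.
  * Print a few cases to confirm that the variables look right.
  LIST VARIABLES=id sex nsex exp school mathscor compscor
    /CASES=FROM 1 TO 5.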

Correlation Analysis

A correlation analysis is performed to quantify the strength of association between two numeric variables. In the following task we will perform a Pearson correlation analysis. The variables used in the analysis are compopi, mathatti, mathscor and compscor.

  • Select Analyze/Correlate/Bivariate... This opens the Bivariate Correlations dialog box. The numeric variables in your data file appear on the source list on the left side of the screen.
  • Select compopi, compscor, mathatti and mathscor from the list and click the arrow box. The variables will be pasted into the selection box. The options Pearson and Two-tailed are selected by default.
    [Figure: Correlation dialog box]
  • Click OK

A symmetric matrix of Pearson correlation coefficients will be displayed on the screen, as shown below. Along with Pearson's r, the number of cases and the probability values are also displayed.

[Figure: Correlation results]
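
The menu selections above can also be run from a Syntax Editor window. The following is a sketch of a corresponding CORRELATIONS command, assuming the variable names created by clas.sps. The MISSING subcommand is written out only to make explicit how the cases coded 99 on mathscor or compscor are handled: each coefficient is computed from all cases with valid values on that pair of variables.

  * Pearson correlations among the two survey totals and the two test scores.
  CORRELATIONS
    /VARIABLES=compopi compscor mathatti mathscor
    /MISSING=PAIRWISE.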

Simple Linear Regression

A correlation coefficient tells you that some sort of relation exists between the variables, but it does not tell you much more than that. For example, a correlation of 1.0 means that all points fall exactly on a straight line, but it says nothing about the form of the relation between the variables. When the observations are not perfectly correlated, many different lines may be drawn through the data. To select a line that describes the data, as close as possible to the points, you employ regression analysis, which is based on the least-squares principle. In the following task you will perform a simple regression analysis with compscor as the dependent variable and mathscor as the independent variable.

  • Choose Analyze/Regression/Linear... The Linear Regression dialog box appears.
  • Choose compscor as the dependent variable
  • Choose mathscor as the independent variable
    [Figure: Linear Regression dialog box]
  • Click OK

The output will now be displayed on the screen as shown below:

[Figure: Variables and Model Summary]

[Figure: ANOVA and Coefficients]
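
For reference, the same analysis can be sketched as a REGRESSION command. This is a minimal version, assuming the variable names from clas.sps. The STATISTICS subcommand requests the model summary (R), the ANOVA table, and the coefficients, which are the tables described above. The fitted line has the form compscor = b0 + b1 * mathscor, where b0 and b1 are the least-squares estimates reported in the coefficients table.

  * Simple linear regression of compscor on mathscor.
  REGRESSION
    /STATISTICS=COEFF R ANOVA
    /DEPENDENT=compscor
    /METHOD=ENTER mathscor.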

T-test

The t-test is a data analysis procedure for testing the hypothesis that two population means are equal. SPSS can compute independent (unrelated) and dependent (related) t-tests. For independent t-tests, you must have a grouping variable with exactly two values (e.g., male and female, pass and fail). The variable may be either numeric or character. If your grouping variable has more than two categories, you may use the RECODE command (Transform/Recode) to collapse the categories into two groups. For example, the variable exp has three categories. To collapse these into two categories (1 = less than one year of experience, 2 = one or more years) and create a new variable, newexp, the syntax is:

  RECODE exp (1 = 1) (2,3 = 2) INTO newexp.
  EXECUTE.

RECODE is a powerful SPSS command for data transformation with both numeric and string variables.
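
After a recode it is worth confirming that the new variable came out as intended. The lines below are one possible check, assuming newexp was created as shown above; the label wording is illustrative and not part of the original data definition.

  * Label the two collapsed groups (label wording is illustrative).
  VALUE LABELS newexp 1 'Less than 1 yr' 2 '1 yr or more'.
  * Cross-tabulate the old codes against the new ones to confirm the recode.
  CROSSTABS /TABLES=exp BY newexp.

In the resulting table, every case with exp equal to 1 should fall under newexp 1, and every case with exp equal to 2 or 3 under newexp 2.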

In the following task, we will perform an independent t-test. The test variables are mathscor and compscor, and the grouping variable is newexp.

  • Select Analyze/Compare Means/Independent-Samples T-Test...
  • Select compscor and mathscor as the Test Variables
  • Select newexp as the Grouping Variable.
    [Figure: Independent-Samples T-Test dialog box]
  • Click on Define Groups...
  • Type 1 for Group 1, and 2 for Group 2.
    [Figure: Define Groups dialog box]
  • Click Continue
  • Click OK

The output will now be displayed on the screen as shown below:

[Figure: Group Statistics]

[Figure: Independent Samples Test]
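
The dialog box steps for the independent t-test can be sketched in syntax as follows, assuming newexp has already been created with the RECODE command. The two values in parentheses play the same role as the Define Groups entries.

  * Independent-samples t-tests comparing the two experience groups.
  T-TEST GROUPS=newexp(1 2)
    /VARIABLES=compscor mathscor.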

A t-test with two related variables is performed using Paired-Samples T-Test from the Analyze/Compare Means menu. The paired t-test is applicable to data collected in a pre-post (before and after) kind of situation.
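
As a rough illustration of the paired form, the command might look like the sketch below. The variables pretest and posttest are hypothetical and are not part of clas.dat; substitute the names of your own before and after measurements.

  * pretest and posttest are hypothetical variables, not part of clas.dat.
  T-TEST PAIRS=pretest WITH posttest (PAIRED).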

One-way Analysis of Variance

The statistical technique used to test the null hypothesis that several population means are equal is called analysis of variance. It is so called because it examines the variability in the sample and, based on that variability, determines whether there is reason to believe the population means are not equal. The test of the null hypothesis that all of the groups have the same population mean is based on the ratio of the between-groups variability estimate to the within-groups variability estimate, called the F statistic. A significant F value tells you only that the population means are probably not all equal. It does not tell you which pairs of groups appear to have different means. To pinpoint exactly where the differences are, multiple comparisons may be performed.

In the following exercise you will perform a One-Way ANOVA with compopi as the dependent variable, and exp as the factor variable.

  • Select Analyze/Compare Means/One-Way ANOVA...
  • Select compopi for the dependent variable
  • Select exp for the factor variable
    [Figure: One-Way ANOVA dialog box]
  • Click Post Hoc...
  • Select LSD (least-significant difference)
    [Figure: Post Hoc Multiple Comparisons dialog box]
  • Click Continue
  • Click Options...
  • Select Descriptive
  • Click Continue
  • Click OK

The output will be displayed on the screen as shown below:

[Figure: Descriptives and ANOVA]

[Figure: Multiple Comparisons]
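
The one-way analysis above can also be sketched as a ONEWAY command, assuming the variable names from clas.sps. The STATISTICS subcommand corresponds to the Descriptive option, and POSTHOC requests the LSD comparisons; the 0.05 significance level is written out here as an explicit assumption.

  * One-way ANOVA of compopi across the three experience levels.
  ONEWAY compopi BY exp
    /STATISTICS=DESCRIPTIVES
    /POSTHOC=LSD ALPHA(.05).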

