# Further Data Analysis

So far, we have covered the basics of how SPSS for Mac OS X works. The next step is to examine a few other data analysis techniques (CORRELATIONS, REGRESSION, T-TEST, ANOVA). Refer to the vendor documentation for the most complete information.

## Sample Data Set

Now we will turn to another data set with more variables and cases. In this example, you will read an ASCII data file, clas.dat, into the SPSS session. The file was created with a word processor and saved as a text file. The data were collected from 40 middle school students and contain 26 variables, including the following:

• id (student identification number)
• sex (gender of the student)
• exp (previous computer experience in months/yrs)
• school (name of school system)
• C1 thru C10 (10 scores on the computer anxiety scale)
• M1 thru M10 (10 scores on the math anxiety scale)
• mathscor (math score for the same testing period)
• compscor (computer test score for a given testing period)

The first four variables (id, sex, exp, school) are background variables. The variable sex has two levels (M=male, F=female). The variable exp (prior computer experience) has three levels (1=less than one year, 2=1-2 years, 3=more than 2 years), and school (type of school system) has three levels (1=rural, 2=suburban school, 3=urban school). The next 20 variables (C1 to C10, M1 to M10) are Likert-type responses to the computer opinion and math anxiety surveys. The remaining two variables (mathscor, compscor) are scores on the math test and the computer test.

A copy of the sample data file is available from the Stat/Math Web home page.

## Creating a Program to Read the Data File

Let us assume that the data file, clas.dat, is on your Desktop. The fastest way to read these data into SPSS for Mac OS X is through the Syntax window. Open a Syntax Editor window (File/New/Syntax) and type in the following lines, replacing the user name in the file path with your own.

```
DATA LIST FILE='/Users/myuser/Desktop/clas.dat'
  /id 1-2 sex 3 (A) exp 4 school 5 c1 TO c10 6-15 m1 TO m10 16-25 mathscor 26-27 compscor 28-29.
MISSING VALUES mathscor compscor (99).
RECODE c3 c5 c6 c10 m3 m7 m8 m9 (1=5) (2=4) (3=3) (4=2) (5=1).
RECODE sex ('M'=1) ('F'=2) INTO nsex. /* char var into numeric var
COMPUTE compopi=SUM(c1 TO c10). /* find sum of 10 items using sum function
COMPUTE mathatti=m1+m2+m3+m4+m5+m6+m7+m8+m9+m10. /* adding each item
VARIABLE LABELS id 'Student Identification' sex 'Student Gender'
  exp 'Yrs of Comp Experience' school 'School Representing'
  mathscor 'Score in Mathematics' compscor 'Score in Computer Science'
  compopi 'Total for Comp Survey' mathatti 'Total for Math Atti Scale'.
VALUE LABELS sex 'M' 'Male' 'F' 'Female'/
  exp 1 'Up to 1 yr' 2 '2 years' 3 '3 or more'/
  school 1 'Rural' 2 'City' 3 'Suburban'/
  c1 TO c10 1 'Strongly Disagree' 2 'Disagree' 3 'Undecided' 4 'Agree' 5 'Strongly Agree'/
  m1 TO m10 1 'Strongly Disagree' 2 'Disagree' 3 'Undecided' 4 'Agree' 5 'Strongly Agree'/
  nsex 1 'Male' 2 'Female'.
EXECUTE.
```
Use the mouse to highlight the command lines and click Run. The command lines will be executed and an active SPSS file will be created. Select Window/Untitled - IBM SPSS Statistics Data Editor to see the data file you just read in. Save the data file as an SPSS system file to a USB drive or to a hard drive.
• Select File/Save
• Type in a filename (e.g., clas.sav)
• A copy of the file will now be saved in SPSS format. Now you are ready for further data analysis.
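
The same save can also be done from the Syntax window. The line below is a minimal sketch using the SAVE command; the folder and the user name `yourname` are placeholders assumed for illustration, so substitute the location where you want the file written.

```
* Save the active data set as an SPSS system file (placeholder path).
SAVE OUTFILE='/Users/yourname/Desktop/clas.sav'.
```

In a later session, a file saved this way can be reopened with `GET FILE=` followed by the same quoted path.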

## Correlation Analysis

A correlation analysis is performed to quantify the strength of association between two numeric variables. In the following task we will perform Pearson correlation analysis. The variables used in the analysis are compopi, mathatti, mathscor and compscor.

• Select Analyze/Correlate/Bivariate… This opens the Bivariate Correlations dialog box. The numeric variables in your data file appear on the source list on the left side of the screen.
• Select compopi, compscor, mathatti and mathscor from the list and click the arrow box. The variables will be pasted into the selection box. The options Pearson and Two-tailed are selected by default.

• Click OK
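
If you prefer the Syntax window, the menu steps above correspond roughly to the CORRELATIONS command sketched here. The subcommands shown are the usual defaults for a two-tailed Pearson analysis; check them against what the Paste button produces in your version of SPSS.

```
* Pearson correlations among the two survey totals and the two test scores.
CORRELATIONS
  /VARIABLES=compopi compscor mathatti mathscor
  /PRINT=TWOTAIL NOSIG
  /MISSING=PAIRWISE.
```

With `/MISSING=PAIRWISE`, each coefficient uses every case that has valid values on that pair of variables, so the number of cases can differ from one cell of the matrix to another.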

A symmetric matrix of Pearson correlations will be displayed on the screen. Along with each Pearson r, the number of cases and the probability value are also displayed.

## Simple Linear Regression

A correlation coefficient tells you that some sort of relation exists between the variables, but it does not tell you much more than that. For example, a correlation of 1.0 means that all points fall exactly on a straight line, but it says nothing about the form of the relation between the variables, such as the slope of that line. When the observations are not perfectly correlated, many different lines may be drawn through the data. To select the line that lies as close as possible to the points, you use regression analysis, which is based on the least-squares principle. In the following task you will perform a simple regression analysis with compscor as the dependent variable and mathscor as the independent variable.

• Choose Analyze/Regression/Linear… The Linear Regression dialog box appears.
• Choose compscor as the dependent variable
• Choose mathscor as the independent variable

• Click OK
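
The same analysis can be run from the Syntax window. The command below is a sketch of a REGRESSION call for this model; the statistics requested are an assumption about what you want to see, and the dialog box may paste additional default subcommands.

```
* Simple linear regression of compscor on mathscor.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF R ANOVA
  /DEPENDENT compscor
  /METHOD=ENTER mathscor.
```

`/METHOD=ENTER` enters mathscor as the single predictor, so the fitted line has one intercept and one slope.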

The output will now be displayed on the screen, including the Model Summary, ANOVA, and Coefficients tables.

## T-test

The t-test is a data analysis procedure for testing the hypothesis that two population means are equal. SPSS can compute independent (not related) and dependent (related) t-tests. For independent t-tests, you must have a grouping variable with exactly two values (e.g., male and female, pass and fail). The variable may be either numeric or character. If your grouping variable has more than two categories, you may use the RECODE command to collapse the categories into two groups. For example, the variable exp has three categories. Suppose you want to collapse these into two categories (1 = less than one year of experience, 2 = one year or more) and create a new variable, newexp. The syntax is:

```
RECODE exp (1 = 1) (2,3 = 2) INTO newexp.
EXECUTE.
```

RECODE is a powerful SPSS command for data transformation with both numeric and string variables. In the following task, we will perform an independent t-test. The test variables are mathscor and compscor, and the grouping variable is newexp.

• Select Analyze/Compare Means/Independent-Samples T Test…
• Select compscor and mathscor as the Test Variables
• Select newexp as the Grouping Variable.

• Click on Define Groups…
• Type 1 for Group 1, and 2 for Group 2.

• Click Continue
• Click OK
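
As a syntax alternative to the dialog box, the steps above can be sketched with the T-TEST command. It assumes newexp has already been created with the RECODE command shown earlier; the confidence level of 95% is the usual default and is written out here only for clarity.

```
* Independent-samples t-test comparing the two experience groups.
T-TEST GROUPS=newexp(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=compscor mathscor
  /CRITERIA=CI(.95).
```

The values in parentheses after newexp play the same role as the Group 1 and Group 2 entries in the Define Groups dialog box.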

The output will now be displayed on the screen, including the Group Statistics and Independent Samples Test tables.

## One-way Analysis of Variance

Analysis of variance is the statistical technique used to test the null hypothesis that several population means are equal. It takes its name from the way it works: it examines the variability in the sample and, based on that variability, determines whether there is reason to believe the population means are not equal. The test of the null hypothesis that all of the groups have the same mean in the population is based on the F statistic, which is the ratio of the between-group variability estimate to the within-group variability estimate. A significant F value tells you only that the population means are probably not all equal. It does not tell you which pairs of groups appear to have different means. To pinpoint exactly where the differences are, multiple comparisons may be performed.

In the following exercise you will perform a One-Way ANOVA with compopi as the dependent variable, and exp as the factor variable.

• Select Analyze/Compare Means/One-Way ANOVA…
• Select compopi for the dependent variable
• Select exp for the factor variable

• Click Post Hoc…
• Select LSD (Least-significant difference)

• Click Continue
• Click Options…
• Select Descriptive
• Click Continue
• Click OK
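
The menu selections above can also be expressed in the Syntax window. The ONEWAY command below is a sketch of the equivalent request; the alpha level of 0.05 for the LSD comparisons is an assumed default, so adjust it if you chose a different significance level in the Post Hoc dialog box.

```
* One-way ANOVA of compopi across the three levels of exp.
ONEWAY compopi BY exp
  /STATISTICS DESCRIPTIVES
  /MISSING ANALYSIS
  /POSTHOC=LSD ALPHA(0.05).
```

`/STATISTICS DESCRIPTIVES` matches the Descriptive option, and `/POSTHOC=LSD` requests the pairwise comparisons that show which groups differ.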

The output will be displayed on the screen, including the Descriptives, ANOVA, and Multiple Comparisons tables.