## Further Data Analysis

So far, we have used SPSS to develop a basic idea of how SPSS for Windows works. The next step is to examine a few other data analysis techniques (CORRELATIONS, REGRESSION, T-TEST, ANOVA). Refer to the vendor documentation for the most complete information.

### Sample Data Set

Now we will turn to another data set with more variables and cases. In this example, you will read an ASCII data file, clas.dat, into the SPSS session. The file was created with a word processor and saved as a text file. The data, collected from 40 middle school students, contain 26 variables, including the following:

• id (student identification number)
• sex (gender of the student)
• exp (previous computer experience in months/yrs)
• school (name of school system)
• C1 thru C10 (10 scores on the computer anxiety scale)
• M1 thru M10 (10 scores on the math anxiety scale)
• mathscor (math score for the same testing period)
• compscor (computer test score for a given testing period)

The first four variables (id, sex, exp, school) are background variables. The variable sex has two levels (M=male, F=female). The variable exp (prior computer experience) has three levels (1=less than one year, 2=1-2 years, 3=more than 2 years), and school (type of school system) has three levels (1=rural, 2=suburban, 3=urban). The next 20 variables (C1..C10, M1..M10) are Likert-type responses to the computer opinion and math anxiety surveys. The remaining variables (mathscor, compscor) are scores on the math test and the computer test.

A copy of the sample data file is available from the Stat/Math Web home page (http://www.indiana.edu/~statmath). To obtain a copy of the file:

• Launch a Web browser (e.g. Internet Explorer or Mozilla Firefox)
• Go to the URL: http://www.indiana.edu/~statmath
• Select SPSS under Software Support
• Select Sample Data, then clas.dat (data)
• Save this as a text file to a flash drive (e.g., in Mozilla Firefox, right-click clas.dat and choose Save Link As...).

### Creating a Program to Read the Data File

Let us assume that the data file, clas.dat, is stored in the folder C:\TEMP on drive C. At this point, the fastest way to read the data into SPSS for Windows is through the Syntax window. You can either open a Syntax Editor window (File → New → Syntax) and type in the following lines, or create a command file containing them with a word processor or editor and then open it in the Syntax Editor (File → Open, choosing Syntax(*.sps) as the file type in the Open File dialog box). Suppose the following command lines are stored (in text format) in a file, clas.sps, on drive C.

```
DATA LIST FILE='C:\TEMP\clas.dat'
/id 1-2 sex 3 (A) exp 4 school 5 c1 to c10 6-15 m1 to m10 16-25 mathscor 26-27 compscor 28-29.
MISSING VALUES mathscor compscor (99).
RECODE c3 c5 c6 c10 m3 m7 m8 m9 (1=5) (2=4) (3=3) (4=2) (5=1).
RECODE sex ('M'=1) ('F'=2) INTO nsex. /* char var into numeric var
COMPUTE compopi=SUM (c1 TO c10). /* find sum of 10 items using sum function
COMPUTE mathatti=m1+m2+m3+m4+m5+m6+m7+m8+m9+m10. /* adding each item
VARIABLE LABELS id 'Student Identification' sex 'Student Gender'
exp 'Yrs of Comp Experience' school 'School Representing'
mathscor 'Score in Mathematics' compscor 'Score in Computer Science'
compopi 'Total for Comp Survey' mathatti 'Total for Math Atti Scale'.
VALUE LABELS sex 'M' 'Male' 'F' 'Female'/
exp 1 'Up to 1 yr' 2 '2 years' 3 '3 or more'/
school 1 'Rural' 2 'City' 3 'Suburban'/
c1 TO c10 1 'Strongly Disagree' 2 'Disagree'
3 'Undecided' 4 'Agree' 5 'Strongly Agree'/
m1 TO m10 1 'Strongly Disagree' 2 'Disagree'
3 'Undecided' 4 'Agree' 5 'Strongly Agree'/
nsex 1 'Male' 2 'Female'.
EXECUTE.
```

Use the mouse to highlight the command lines and click Run. The command lines will be executed and an active SPSS file will be created. Select Window → Untitled - SPSS Data Editor to see the data you just read in. Then save the data as an SPSS system file on drive C or on another drive.

• Select File → Save
• Type in a filename (e.g., clas.sav)
• A copy of the file will now be saved in SPSS format. Now you are ready for further data analysis.
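
The same save can be done from the Syntax Editor. The following is a minimal sketch, assuming the file should go to C:\TEMP (the folder used in the DATA LIST command above). The LIST and FREQUENCIES commands are optional checks added here for illustration and are not part of the original steps.

```spss
* Optional check: list a few variables for the first five cases.
LIST VARIABLES=id sex nsex exp school compopi mathatti
  /CASES=FROM 1 TO 5.
* Optional check: frequency tables for the background variables.
FREQUENCIES VARIABLES=sex exp school.
* Save the active data in SPSS format (the path is an assumption).
SAVE OUTFILE='C:\TEMP\clas.sav'.
```

Checking a few cases before saving makes it easier to catch a column misalignment in the DATA LIST specification early.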

### Correlation analysis

A correlation analysis is performed to quantify the strength of association between two numeric variables. In the following task we will perform Pearson correlation analysis. The variables used in the analysis are mathscor (Score in Mathematics), compscor (Score in Computer Science), compopi (Total for Comp Survey), and mathatti (Total for Math Atti Scale).

• Select Analyze → Correlate → Bivariate... This opens the Bivariate Correlations dialog box. The numeric variables in your data file appear on the source list on the left side of the screen.
• Select compopi, compscor, mathatti and mathscor from the list and click the arrow box. The variables will be pasted into the selection box. The options Pearson and Two-tailed are selected by default.
• Click OK

SPSS displays a symmetric matrix of Pearson correlations. Along with each Pearson correlation coefficient r, the p-value and the number of cases are also displayed.

[Output table: Correlations]
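
The dialog box choices for the correlation analysis correspond to a short CORRELATIONS command that can be run from the Syntax Editor. This is a sketch of the equivalent syntax, assuming the active data set already contains compopi and mathatti as computed earlier. The /MISSING=PAIRWISE setting is written out here as an assumption about the intended missing-value handling.

```spss
* Pearson correlations with two-tailed significance tests.
CORRELATIONS
  /VARIABLES=compopi compscor mathatti mathscor
  /PRINT=TWOTAIL
  /MISSING=PAIRWISE.
```

With pairwise deletion, each coefficient uses all cases that are valid for that pair of variables, so the number of cases can differ from cell to cell.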

### Simple Linear Regression

A correlation coefficient tells you that some sort of relation exists between the variables, but it does not tell you much more than that. For example, a correlation of 1.0 means that there exists a perfect positive linear relationship between the two variables, but it does not tell you the form of that relationship, such as the slope or intercept of the line. When the observations are not perfectly correlated, many different lines may be drawn through the data. Linear regression, usually estimated by ordinary least squares (OLS), explores the relationship between a dependent variable and one or more independent variables in a systematic way, choosing the line that minimizes the sum of squared residuals. In the following task you will perform a simple regression analysis with compscor as the dependent variable and mathscor as the independent variable.

• Choose Analyze → Regression → Linear... The Linear Regression dialog box appears.
• Choose compscor (Score in Computer Science), as the dependent variable
• Choose mathscor (Score in Mathematics), as the independent variable
• Click OK

The output will now be displayed on the screen:

[Output tables: Regression]
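
The regression can also be requested with the REGRESSION command. The sketch below is one way to write it, assuming the variable names defined earlier. The /STATISTICS keywords shown are a deliberate choice for this example, not necessarily what the dialog box would paste.

```spss
* Simple linear regression of compscor on mathscor.
REGRESSION
  /STATISTICS=COEFF R ANOVA
  /DEPENDENT=compscor
  /METHOD=ENTER mathscor.
```

COEFF prints the intercept and slope, R prints the model summary (including R square), and ANOVA prints the overall F test for the model.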

### T-test

The t-test is a data analysis procedure for testing the hypothesis that two population means are equal. SPSS can compute independent (not related) and dependent (related) t-tests. For independent t-tests, you must have a grouping variable with exactly two values (e.g., male and female, pass and fail). The variable may be either numeric or string. If your grouping variable has more than two categories, you can use the RECODE command (Transform → Recode) to collapse the categories into two groups. For example, the variable exp has three categories. To collapse these into two categories (1 = less than 1 year of experience, 2 = one or more years) and create a new variable, newexp, the syntax is:

```
recode exp (1 = 1) (2,3 = 2) into newexp.
execute.
```

RECODE is a powerful SPSS command for data transformation with both numeric and string variables.
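
After a recode, it is worth confirming that the old categories map to the new ones as intended. The following sketch, assuming newexp was created by the RECODE above, cross-tabulates the original and recoded variables. The value labels for newexp are illustrative wording added here, not taken from the original data definition.

```spss
* Label the collapsed categories (label text is illustrative).
VALUE LABELS newexp 1 'Less than 1 yr' 2 '1 yr or more'.
* Check the mapping: every exp category should fall in one newexp column.
CROSSTABS
  /TABLES=exp BY newexp.
```

In the resulting table, exp=1 should appear only under newexp=1, and exp=2 and exp=3 only under newexp=2.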

In the following task, we will perform an independent t-test. The test variables are mathscor (Score in Mathematics), and compscor (Score in Computer Science), and the grouping variable is newexp.

• Select Analyze → Compare Means → Independent-Samples T-test...
• Select compscor and mathscor as the Test Variables
• Select newexp as the Grouping Variable.
• Click on Define Groups...
• Type 1 for Group 1, and 2 for Group 2.
• Click Continue
• Click OK

The output will now be displayed on the screen:

[Output tables: T-Test]
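
The independent-samples test can be written as a single T-TEST command. This is a sketch of the equivalent syntax, assuming newexp has already been created with the RECODE command shown earlier.

```spss
* Independent-samples t-test comparing newexp groups 1 and 2.
T-TEST GROUPS=newexp(1 2)
  /VARIABLES=compscor mathscor.
```

The two values in parentheses play the same role as the Group 1 and Group 2 entries in the Define Groups dialog box.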

A t-test with two related variables is performed using Paired-Samples T-Test from the Analyze → Compare Means menu. The paired t-test applies to data in which each case is measured twice, as in a pre-post (before and after) design.
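
In syntax, a paired test uses the PAIRS subcommand of T-TEST. The sample data set has no before-and-after measurements, so the variable names pretest and posttest below are hypothetical placeholders used only to show the form of the command.

```spss
* Paired-samples t-test (pretest and posttest are hypothetical variables).
T-TEST PAIRS=pretest WITH posttest (PAIRED).
```

Replace the two placeholder names with the pair of related variables in your own data.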

### One-way Analysis of Variance

The statistical technique used to test the null hypothesis that several population means are equal is called analysis of variance. It is called that because it examines the variability in the sample and, based on that variability, determines whether there is reason to believe the population means are not equal. The statistical test for the null hypothesis that all of the groups have the same mean in the population is based on the ratio of the between-group variability estimate to the within-group variability estimate, called the F statistic. A significant F value only tells you that the population means are probably not all equal. It does not tell you which pairs of groups appear to have different means. To pinpoint exactly where the differences are, multiple comparisons may be performed.

In the following exercise you will perform a One-Way ANOVA with compopi (Total for Comp Survey) as the dependent variable, and exp (Yrs of Comp Experience) as the factor variable.

• Select Analyze → Compare Means → One-Way ANOVA...
• Select compopi for the dependent variable
• Select exp for the factor variable
• Click Post Hoc...
• Select LSD (Least-significant difference)
• Click Continue
• Click Options...
• Select Descriptive
• Click Continue
• Click OK

The output will be displayed on the screen:

[Output tables: Oneway, Post Hoc Tests]
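
The one-way ANOVA steps correspond to the ONEWAY command. The sketch below assumes the variables defined earlier. The alpha level of 0.05 for the post hoc comparisons is written out explicitly as an assumption.

```spss
* One-way ANOVA of compopi by exp, with descriptives and LSD comparisons.
ONEWAY compopi BY exp
  /STATISTICS=DESCRIPTIVES
  /POSTHOC=LSD ALPHA(0.05).
```

The LSD comparisons are only meaningful when the overall F test is significant, because LSD applies no adjustment for the number of comparisons made.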