Evaluative Summary of an article on
SPF-pr.q--Split Plot Factorial Design
1.
Background Information
Hess, C. M., & Shrigley, R. (1981). A study of the effect of three modes of
teaching on metric knowledge and attitude. Science Education, 65(2), 131-138.
2.
Abstract
The purpose of this study was to examine the effectiveness of three instructional
modes for teaching metric knowledge and raising the level of positive metric
attitudes of preservice elementary teachers who include both high and low
math achievers.
A 3 x 2 x 2 split-plot factorial design (SPF-32.2) was employed with 3
Instructional Modes (expository, modular, and gaming) and 2 levels of Math
Ability (high math achiever and low math achiever) as between-subjects factors,
and Time of Testing (pretest and posttest) as the within-subjects factor. A
total of 82 preservice teachers (N = 82), 41 high math achievers and 41 low
math achievers, participated in the study. They were identified from 6 classes
of preservice teachers (n = 141) enrolled in a math education course at an
eastern state college and were randomly assigned to the expository, modular,
or gaming instruction group. Twenty-six of the 82 preservice teachers were in
the expository group, 30 in the modular group, and 26 in the gaming group.
According to the researchers, there was no significant difference in preservice
teachers’ metric knowledge whether they were taught by expository, modular,
or gaming methods. Second, high math achievers and low math achievers gained
similarly in metric knowledge. In addition, preservice teachers achieved similar
scores in metric attitudes whether they were taught by expository, modular
or gaming methods. Finally, high math achievers and low math achievers gained
similarly in metric attitudes.
3.
Null hypotheses, alpha (or p) level, and sample size per group
As stated by the authors, there were 4 null hypotheses examined in this design.
The null hypotheses are as follows:
Ho1: There is no significant difference in the mean scores on metric knowledge
whether subjects are taught with the expository, modular, or gaming treatments.
Ho2: There is no significant difference in the mean gain scores on metric
knowledge between high math achievers and low math achievers.
Ho3: There is no significant difference in the mean scores on metric attitudes
whether subjects are taught with the expository, modular, or gaming treatments.
Ho4: There is no significant difference in the mean gain scores on metric
attitudes between high math achievers and low math achievers.
Since the authors were interested in investigating the effectiveness of three
instructional modes for teaching metric knowledge and raising the level of
positive metric attitudes of preservice elementary teachers, including both
high and low math achievers, the hypotheses would be clearer if they were
revised as follows:
Ho1: There is no significant difference in the preservice teachers’ gain scores
on metric knowledge whether they are taught with the expository, modular, or
gaming methods.
Ho2: There is no significant difference in the preservice teachers’ gain scores
on metric knowledge between high math achievers and low math achievers.
Ho3: There is no significant difference in the preservice teachers’ gain scores
on metric attitudes whether they are taught with the expository, modular, or
gaming treatments.
Ho4: There is no significant difference in the preservice teachers’ gain scores
on metric attitudes between high math achievers and low math achievers.
If the SPF-32.2 design were appropriate, 8 further hypotheses concerning
interactions among Instructional Modes, Math Ability, and Time of Testing on
preservice teachers’ metric knowledge and metric attitudes would need to be
considered. However, because the SPF-32.2 design was not appropriate for the
purpose of this research, those hypotheses are not listed here. If the
appropriate design, a two-way CRF-32 design (the reasons for using it are
given under item 8), had been employed instead, two further hypotheses
concerning the interaction between Instructional Modes and Math Ability on
preservice teachers’ metric knowledge and metric attitudes would need to be
considered. These two hypotheses are stated below.
Ho5: There is no significant interaction between Instructional Modes and Math
Ability on preservice teachers’ metric knowledge.
Ho6: There is no significant interaction between Instructional Modes and Math
Ability on preservice teachers’ metric attitudes.
For the F tests, the α level was not stated, but p levels were reported in the
ANOVA tables. In the ANOVA of Metric Knowledge, two significant F ratios, the
main effect of Math Ability and the main effect of Time of Testing (pre-post
testing), were reported at the p < .01 level. In the ANOVA of Metric Attitude,
two F ratios were reported as significant: the main effect of Time of Testing
(pre-post testing) at p < .01 and the Time of Testing x Instructional Modes
interaction at p < .05. Although the authors did not specify the α level for
the F tests, they mentioned that if an interaction between factors was
significant, the simple main effects would be tested at the α = .05 level.
This study included 82 preservice teachers. The 82 subjects were identified
from 6 classes of preservice teachers (n = 141) enrolled in a math education
course at an eastern state college. They were randomly assigned to expository,
modular, or gaming instruction and were further classified as high math
achievers (n = 41) or low math achievers (n = 41). Twenty-six of the 82
preservice teachers received the expository method, 30 the modular method, and
26 the gaming method. From the information given in the article, it is not
clear how many subjects were in each cell (combination of Instructional Mode
and Math Ability).
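Although the cell sizes are not reported, the totals that are given can be
checked against the error degrees of freedom in the reported F ratios. The
short sketch below assumes the usual split-plot bookkeeping (between-subjects
error df = N minus the number of cells; within-subjects error df = that value
times the number of testing occasions minus one). The variable names are
illustrative only.

```python
# Group sizes reported in the article for the three instructional modes.
group_sizes = {"expository": 26, "modular": 30, "gaming": 26}
n_total = sum(group_sizes.values())

modes = 3        # levels of Instructional Modes
abilities = 2    # levels of Math Ability
occasions = 2    # levels of Time of Testing (pretest, posttest)

cells = modes * abilities                       # 6 between-subjects cells
df_error_between = n_total - cells              # subjects within cells
df_error_within = df_error_between * (occasions - 1)

print(n_total, df_error_between, df_error_within)  # prints: 82 76 76
```

Both error terms come out at 76, which agrees with the denominator df of the
F ratios reported by the authors, for example F(2, 76).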
4.
Independent and Dependent Variables
If the SPF-32.2 design were correct, there would be three independent
variables: Instructional Modes and Math Ability as the between-subjects factors
and Time of Testing as the within-subjects factor. The first independent
variable, Instructional Modes, has three levels: expository method, modular
method, and gaming method. The second independent variable, Math Ability, has
two levels: high math achievers and low math achievers. The third independent
variable, Time of Testing, has two levels: pretest and posttest of metric
knowledge and metric attitudes.
There were two dependent variables, preservice teachers’ Metric Knowledge,
and preservice teachers’ Metric Attitudes.
5.
Instrument, comment on its reliability and validity
For the independent variable, Math Ability, the authors used preservice teachers’
scores on the Sequential Tests of Educational Progress II, Form A, “Mathematics
Concepts” to identify high math achievers and low math achievers. However,
no reliability and validity evidence was provided regarding this measurement.
For the dependent variables, the Szabo-Trueblood Test of Metric Knowledge
(STMK) and the Shrigley-Trueblood Metric Measurement Attitudes Scale (SMAS)
were administered to measure the preservice teachers’ Metric Knowledge and
Metric Attitudes, respectively. The STMK consists of 50 multiple-choice items
that measure mastery of (1) knowledge of quantities within the metric system,
(2) comparison of metric measurements to common objects or standard measures,
and (3) conversion between units of metric measurement. According to the
authors, the STMK had a reliability of .93 when administered to other
preservice teachers. Because the test items had been closely compared with the
metric content being taught, the authors claimed that the STMK has content
validity.
The SMAS consists of 22 statements, 10 negative and 12 positive. The SMAS
has a reliability of .92 when administered to other preservice teachers in
1977. For validity, the authors claimed the SMAS was validated by a factor
analysis and Likert analysis as well as Edwards’ set of 13 criteria for attitude
scale construction.
6.
Experimental Procedure
This study used an SPF-32.2 design. Eighty-two preservice teachers, 41 high
math achievers and 41 low math achievers, were identified from 6 classes of
preservice teachers (n = 141) enrolled in a math education course at an
eastern state college and were randomly assigned to the expository, modular,
and gaming instruction groups.
The three instructional groups met simultaneously with three different
instructors. Because it was impossible for a single instructor to teach all
three groups, and only two equally qualified instructors were available, the
researchers took steps such as controlling the teaching materials and the
roles played by the instructors in order to limit instructor bias and
personality differences.
Before the formal treatment, two classes of nonmetric pretreatment activities
were held to familiarize all subjects with the gaming and modular modes of
learning. Following these two classes, each of the three instructional groups
was taught by a different instructor for three class hours of metric
instruction.
7.
Statistical Analysis and Conclusions
An SPF-32.2 analysis of variance with Instructional Modes and Math Ability
as between-subjects factors and Time of Testing as the within-subjects factor
was employed. According to the authors, for metric knowledge the main effect
of Instructional Modes was not statistically significant, F(2, 76) = 2.47,
p > .05, indicating that type of instruction did not influence preservice
teachers’ metric knowledge. The Math Ability x Time of Testing interaction was
also not significant, F(1, 76) = .00, p > .05, indicating no significant
difference in gain scores in metric knowledge between high math achievers and
low math achievers.
For metric attitudes, the main effect of Instructional Modes was not
statistically significant, F(2, 76) = 1.79, p > .05, indicating that
instructional mode did not influence preservice teachers’ metric attitudes. In
addition, the Math Ability x Time of Testing interaction was not significant,
F(1, 76) = 1.38, p > .05, indicating no significant difference in gain scores
in metric attitudes between high math achievers and low math achievers. The
results also showed a significant interaction between Instructional Modes and
Time of Testing, F(2, 76) = 3.68, p < .05. (The authors used tests of simple
main effects to follow up this interaction, but the reported result does not
make sense.)
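The probability statements attached to the F ratios with 2 numerator degrees
of freedom can be checked by hand, because in that special case the upper-tail
probability of the F distribution has a closed form, p = (1 + 2F/df2)^(-df2/2).
The sketch below is only a consistency check on the published F values, not a
reanalysis of the data; the function name is made up for this illustration.

```python
def p_value_df1_equals_2(f_ratio, df_error):
    """Upper-tail p for an F ratio with exactly 2 numerator df (closed form)."""
    return (1.0 + 2.0 * f_ratio / df_error) ** (-df_error / 2.0)

# F ratios with df = (2, 76) as reported in the article.
reported = {
    "knowledge: Instructional Modes": 2.47,
    "attitudes: Instructional Modes": 1.79,
    "attitudes: Modes x Time of Testing": 3.68,
}

p_values = {label: p_value_df1_equals_2(f, 76) for label, f in reported.items()}
for label, p in p_values.items():
    print(f"{label}: F(2, 76) = {reported[label]:.2f}, p = {p:.3f}")
```

The two main effects come out above .05 and the interaction below .05, which
is in line with the significance statements the authors report.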
Even if the SPF-32.2 design were correct, the conclusion concerning metric
attitudes would be questionable because a significant interaction between Time
of Testing and Instructional Modes was found. It was therefore inappropriate
to conclude from the nonsignificant main effect that Instructional Modes had
no effect; the researchers should have considered the interaction first.
Moreover, since the authors used an incorrect research design, SPF-32.2, to
analyze the data, the findings and conclusions might be incorrect.
8.
If you were the researcher, how would you improve the study?
First, the design should match the purpose of the study, which was to compare
the effectiveness of three instructional modes on the metric knowledge and
metric attitudes of preservice teachers, including both high math achievers
and low math achievers. Four research questions were involved. To answer
these questions, it is inappropriate to test the marginal means of the three
methods (here, marginal means refers to the means of pretest and posttest
combined for each of the three instructional groups). Those means average over
pretest and posttest, so they do not reflect the change from pretest to
posttest. The same argument applies to the questions concerning math ability.
It would be more appropriate to use a two-way CRF design with gain scores
(the difference between posttest and pretest) on metric knowledge and metric
attitudes as dependent variables.
Furthermore, the three instructional modes might affect metric knowledge and
metric attitudes differently for high math achievers and low math achievers,
so the interaction between instructional modes and math ability should also
be tested. A 3 (Instructional Modes: expository, modular, and gaming) x 2
(Math Ability: high math achievers and low math achievers) analysis of
variance with gain scores (the difference between posttest and pretest) on
metric knowledge and metric attitudes as dependent variables is appropriate
for answering these questions.
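The proposed analysis can be sketched as follows. The scores are made-up
numbers for illustration only, not data from the study, and the balanced
layout with n = 3 per cell is an assumption chosen to keep the arithmetic
simple; the function name crf_anova is hypothetical. With the study's unequal
cell sizes, an unweighted-means or regression-based analysis would be needed
instead of these balanced-design formulas.

```python
from statistics import mean

# Made-up (pretest, posttest) pairs for illustration only; NOT the study's data.
# Balanced layout with n = 3 per cell is an assumption.
scores = {
    ("expository", "high"): [(30, 41), (28, 40), (33, 45)],
    ("expository", "low"):  [(20, 30), (22, 35), (19, 28)],
    ("modular", "high"):    [(31, 44), (29, 39), (32, 47)],
    ("modular", "low"):     [(21, 33), (18, 31), (23, 34)],
    ("gaming", "high"):     [(27, 42), (30, 43), (34, 44)],
    ("gaming", "low"):      [(19, 32), (24, 33), (20, 35)],
}
modes = ["expository", "modular", "gaming"]
abilities = ["high", "low"]

# Dependent variable: gain score = posttest - pretest.
gains = {cell: [post - pre for pre, post in pairs]
         for cell, pairs in scores.items()}

def crf_anova(cells, a_levels, b_levels):
    """Balanced two-way fixed-effects ANOVA; returns (SS, df, F) dicts."""
    p, q = len(a_levels), len(b_levels)
    n = len(cells[(a_levels[0], b_levels[0])])
    all_y = [y for ys in cells.values() for y in ys]
    grand = mean(all_y)
    a_mean = {a: mean([y for b in b_levels for y in cells[(a, b)]])
              for a in a_levels}
    b_mean = {b: mean([y for a in a_levels for y in cells[(a, b)]])
              for b in b_levels}
    cell_mean = {k: mean(ys) for k, ys in cells.items()}

    ss = {
        "A": n * q * sum((a_mean[a] - grand) ** 2 for a in a_levels),
        "B": n * p * sum((b_mean[b] - grand) ** 2 for b in b_levels),
        "error": sum((y - cell_mean[k]) ** 2
                     for k, ys in cells.items() for y in ys),
        "total": sum((y - grand) ** 2 for y in all_y),
    }
    # Interaction SS = between-cells SS minus the two main-effect SS.
    ss_cells = n * sum((cell_mean[k] - grand) ** 2 for k in cells)
    ss["AB"] = ss_cells - ss["A"] - ss["B"]

    df = {"A": p - 1, "B": q - 1, "AB": (p - 1) * (q - 1),
          "error": p * q * (n - 1)}
    ms_error = ss["error"] / df["error"]
    f = {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AB")}
    return ss, df, f

ss, df, f = crf_anova(gains, modes, abilities)
for effect, name in [("A", "Instructional Modes"), ("B", "Math Ability"),
                     ("AB", "Modes x Ability")]:
    print(f"{name}: F({df[effect]}, {df['error']}) = {f[effect]:.2f}")
```

Because the gain score already carries the pretest-to-posttest change, the
main effects and the interaction in this layout correspond directly to the
revised hypotheses Ho1 through Ho6, one set per dependent variable.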
Second, as described by the authors, the 3 groups were taught by 3 different
instructors, and only 2 of the 3 instructors were equally qualified. Although
the authors made many efforts to control instructor bias and personality
differences, it was still possible that the instructors had an impact on the
preservice teachers’ metric knowledge and metric attitudes. Given that it was
impossible either for one instructor to teach all three methods or for three
equally qualified instructors to teach the three instruction groups
simultaneously, if I were the researcher I would choose a hierarchical design
to control the impact of the instructors.
Furthermore, if we consider the relationship between the two dependent
variables, Metric Knowledge and Metric Attitudes, we might want to use a
two-way MANOVA instead of separate two-way ANOVAs to avoid an inflated Type I
error rate.
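The size of the inflation can be illustrated with a small calculation. The
sketch below makes the simplifying assumption that the two analyses are
independent tests, each run at α = .05; since knowledge and attitude scores
are likely correlated, the actual rate would differ somewhat. The function
name is illustrative.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one Type I error across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# One ANOVA per dependent variable (knowledge, attitudes), each at alpha = .05.
rate = familywise_error(0.05, 2)
print(f"{rate:.4f}")  # prints: 0.0975
```

Under that assumption, running one analysis per dependent variable nearly
doubles the chance of at least one false rejection compared with a single
test.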
There were also other weaknesses in this study. If I were the researcher,
the following would also be taken into account.