Evaluative Summary of Article on Split-Plot Factorial Design (SPF p · qrt)
1. Background Information
Authors: Watts, G. H., and Anderson, R. C.
Title: Effects of three types of inserted questions on learning from prose.
Source: Journal of Educational Psychology, 62(5), 387-394.
Year: (1971).
2. Abstract
Using an experimental design, the current study examined the effects of questions that require students to apply what they have read to some new situation. More specifically, performance on a posttest was studied as a function of the type of question inserted into the instructional sequence. High school seniors answered an inserted question after reading each of five passages explaining a psychological principle. The inserted questions varied in that some involved identifying an example of the principle, whereas others involved identifying the name of the psychologist associated with that principle. The type of question inserted was a between-subject factor with five levels (RE1, RE2, A1, A2, N). After reading the passages and answering the inserted question, students took a posttest which consisted of 25 questions (5 questions from each passage). In addition to examining the effect of question type on a posttest, three other factors were also examined as having potential effects. These factors were within-subject factors and included subject matter (five principles), position of a given passage in the instructional sequence (1 to 5), and question type on the posttest (five types). The results of this study implied that inserted application questions induced students to process the text more thoroughly (i.e., students who received inserted application questions scored highest on the posttest). Furthermore, the within-subject factors of position of instructional passage, question type, and subject matter were all significant. Additionally, there were significant interactions between type of inserted question and posttest question type as well as subject matter and posttest question type.
3. Null hypothesis, alpha (or p) level, and sample size per group
Eleven null hypotheses were tested in this research study. (Note: this list is not exhaustive of the design; rather, these were the only hypotheses examined.)
Ho1: The population means of type of inserted question (T) are equal. In other words, inserted question type has no effect on posttest performance. Posttest performance will be the same regardless of inserted question type.
Ho1: μ RE1 = μ RE2 = μ A1 = μ A2 = μ N (μ = population mean)
Ho2: The population means of position of instructional passage (P) are equal. In other words, the position of the instructional passage has no impact on posttest scores.
Ho2: μ P1 = μ P2 = μ P3 = μ P4 = μ P5
Ho3: The population means of question type (Q) are equal. In other words, the question type on the posttest has no effect on posttest scores.
Ho3: μ Q1 = μ Q2 = μ Q3 = μ Q4 = μ Q5
Ho4: The population means of subject matter (SM) are equal. In other words, the subject matter of the passages has no effect on posttest scores.
Ho4: μ SM1 = μ SM2 = μ SM3 = μ SM4 = μ SM5
Ho5: There is no interaction between inserted question type (T) and position of instructional passage (P). In other words, posttest scores will not differ because of the combination of inserted question type and position of instructional passage. The effect of T does not depend on the levels of P. Further, the effect of P does not depend on the levels of T.
Ho5: there is no interaction between T and P
Ho6: There is no interaction between inserted question type (T) and posttest question type (Q). In other words, the effect of T does not depend on the levels of Q. Further, the effect of Q does not depend on the levels of T.
Ho6: there is no interaction between T and Q
Ho7: There is no interaction between inserted question type (T) and subject matter (SM). In other words, the effect of T does not depend on the levels of SM. Further, the effect of SM does not depend on T.
Ho7: there is no interaction between T and SM
Ho8: There is no interaction between position of instructional passage (P) and posttest question type (Q). In other words, the effect of P does not depend on the levels of Q. Further, the effect of Q does not depend on the levels of P.
Ho8: there is no interaction between P and Q
Ho9: There is no interaction between inserted question type (T), position of instructional passage (P), and posttest question type (Q).
Ho9: there is no interaction between T and P and Q
Ho10: There is no interaction between subject matter (SM) and posttest question type (Q).
Ho10: there is no interaction between SM and Q
Ho11: There is no interaction between inserted question type (T), subject matter (SM), and posttest question type (Q).
Ho11: there is no interaction between T and SM and Q
The alpha level was not pre-stated. The ANOVA F-ratios were reported as significant at the p < .01 level for all effects and interactions.
Sample Size Per Group:
300 high school seniors participated in the current study
50 subjects were in each of the six between-subjects groups (the five inserted question types plus the control group)
4. Independent and dependent variables
Independent
1. Inserted question type, with five levels (between factor):
i. Repeated example questions in which the correct answer exactly reproduced a situation described in the opening paragraph of the passage (RE1)
ii. Repeated example questions in which the correct answer exactly reproduced a situation described in the second paragraph of the passage (RE2)
iii. Application questions in which the correct answer described a new example of the concept described in the opening paragraph of the passage (A1)
iv. Application questions in which the correct answer described a new example of the concept described in the second paragraph of the passage (A2)
v. Name questions in which the correct answer was the name of the psychologist identified with the principle in the first sentence of the second paragraph of each passage (N)
2. Subject matter, with five levels (within factor):
i. Classical conditioning passage
ii. Extroversion-introversion passage
iii. Intermittent reinforcement passage
iv. Displacement passage
v. Drive reduction passage
3. Position of a given passage in the instructional sequence, with five levels (within factor):
i. First passage
ii. Second passage
iii. Third passage
iv. Fourth passage
v. Fifth passage
4. Question type on the posttest, with five levels (within factor)
i. Questions from passage 1
ii. Questions from passage 2
iii. Questions from passage 3
iv. Questions from passage 4
v. Questions from passage 5
Dependent: Student performance on posttest, as measured by the mean percentage correct for each treatment on the five question types answered on the posttest.
5. Instrument, comment on reliability and validity
Five passages were used in this study. Each passage discussed a psychological principle. The first paragraph of each passage presented a nontechnical account of some situation illustrating the principle. The second paragraph presented the name of a psychologist commonly associated with the principle, a general explanation of the principle, and a second illustration. The third paragraph introduced an additional technical term related to the principle and a concluding summary sentence. The passages were about 450 words in length.
The inserted questions were written in a multiple-choice format and conformed to one of the above formats (RE1, RE2, A1, A2, N). The RE and A questions offered four choices, while the N questions offered five choices.
The posttest consisted of 25 questions, a compilation of the five questions prepared for each of the five passages.
The questions appear valid in that they assessed knowledge directly related to the passages. No reliability tests were applied.
6. Experimental procedure
Participants were 300 high school seniors. Students from classes designated as below average in achievement level were not used and no student who had taken a psychology course was included in the study. The study was run in the school auditorium in five shifts of approximately 60 students. There were approximately equal numbers of subjects from each treatment group in each shift (i.e., approximately 10 students from each treatment group were present during each shift). Note: a control group was included in which students did not answer any inserted questions. The students were randomly assigned to treatment groups (RE1, RE2, A1, A2, N, Control) simply by distributing the booklets which had been previously arranged in random order. Students were instructed to read the passages once only. No time limit was imposed but the students recorded the time. The inserted question was answered in the booklet after each passage. Students were not permitted to examine the passage while answering a question. The posttest was administered as soon as the passages were completed.
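The booklet-based random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the seed, the function name assign_booklets, and the idea that the shuffled stack was handed out in consecutive blocks of 60 are hypothetical and do not come from the article.

```python
import random
from collections import Counter

GROUPS = ["RE1", "RE2", "A1", "A2", "N", "Control"]
PER_GROUP = 50    # 300 students / 6 groups, as reported in the summary
SHIFT_SIZE = 60   # five shifts of approximately 60 students

def assign_booklets(seed=0):
    """Shuffle 300 booklets and split them into five shifts (hypothetical helper)."""
    rng = random.Random(seed)  # arbitrary seed, used only for reproducibility
    booklets = [group for group in GROUPS for _ in range(PER_GROUP)]
    rng.shuffle(booklets)      # booklets "previously arranged in random order"
    return [booklets[i:i + SHIFT_SIZE]
            for i in range(0, len(booklets), SHIFT_SIZE)]

shifts = assign_booklets()
for number, shift in enumerate(shifts, start=1):
    # Per-shift counts hover around 10 per group but are not forced to equal 10.
    print(number, dict(Counter(shift)))
```

A plain shuffle guarantees 50 students per group overall but only approximately equal group sizes within each shift, which is consistent with the "approximately 10 students" wording above.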
7. Statistical analysis and conclusion
A four-way mixed analysis of variance (ANOVA), with one between-subjects factor and repeated measures on three factors, was performed for all experimental groups on the posttest (i.e., an analysis of posttest variance). All main effects (inserted question type, position of instructional passage, posttest question type, and subject matter) were significant beyond the .01 level.
The mean posttest percentages for the six treatment groups were analyzed. Only the descriptive statistics (means) were presented in the text, although the researchers claimed that there was a significant difference among inserted question types in that the groups answering inserted application questions (A1 and A2) performed significantly better than all other groups (alpha = .05) on the posttest.
In terms of the subject matter main effect, performance on questions related to the classical conditioning topic differed significantly (alpha = .01) from performance on questions related to the other four topics: students scored higher on the classical conditioning questions (80%).
The significant Inserted question x Posttest question type interaction (p < .01) revealed that inserted application questions fostered better performance on the posttest, particularly on the application-type posttest questions, than did the name or repeated example questions.
Interestingly, there was a significant Subject matter x Posttest question type interaction, although the researchers claimed that it was not relevant and did not discuss it.
An analysis of performance on inserted questions revealed a significant main effect for treatment, F(4, 245) = 31.81, p < .01, but not for position of passage, F(4, 980) = 1.78.
Since a large number of multiple comparisons were anticipated, the researchers were interested in controlling Type I error rate. For that reason, they used the Tukey B procedure to control experimentwise error rates. The post-hoc tests revealed that all differences in inserted question type answers were significant (alpha = .01), except between groups RE1 and RE2.
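To see why controlling the experimentwise error rate matters here, the following rough sketch counts the pairwise comparisons among the five groups that answered inserted questions and estimates the chance of at least one false positive if each comparison were tested at alpha = .01 with no correction. It assumes the comparisons are independent, which pairwise comparisons are not, so the figure is only an approximation; the function name is hypothetical, and the code does not implement the Tukey procedure itself.

```python
from math import comb

def familywise_error(n_groups, alpha_per_test):
    """Return (number of pairwise comparisons, approximate familywise error rate).

    Assumes independent tests, so the rate is an approximation only.
    """
    n_pairs = comb(n_groups, 2)
    return n_pairs, 1 - (1 - alpha_per_test) ** n_pairs

# Five groups answered inserted questions (RE1, RE2, A1, A2, N).
pairs, fwe = familywise_error(5, 0.01)
print(pairs, round(fwe, 4))  # 10 0.0956
```

Under these assumptions, ten uncorrected comparisons at .01 each would carry roughly a 10% chance of at least one Type I error, which is the inflation a procedure such as Tukey's is meant to prevent.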
Although it was not stated in the original hypotheses, the researchers did examine differences in the time spent answering inserted questions. They found that this time differed significantly across groups, F(4, 245) = 94.12, p < .01, in that the name questions required the least mean time and the application questions required the longest mean time.
The researchers also examined the time spent on each passage and found a significant decrease in the time spent on each passage over the instructional sequence, F(4, 1176) = 17.72, p < .01. The Treatment group x Position of passage interaction was significant, F(20, 1176) = 1.93, p < .01, in that the Name group decreased its time more rapidly over the final three passages.
Finally, the researchers examined the time spent on the posttest and found significant differences between groups, F(5, 294) = 7.28, p < .01. The control group and the name group took the longest mean times, followed by the two repeated-example groups, while the two application groups took the shortest time on the posttest.
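The degrees of freedom reported in this section can be checked against the group sizes. The sketch below assumes each secondary analysis is a simple one-between, one-within split-plot layout (SPF p · q) with 50 subjects per group and five passage positions; that layout is an inference from the reported values, not something stated in the article, and the function name is hypothetical.

```python
def split_plot_df(n_groups, n_per_group, n_within_levels):
    """Degrees of freedom for an SPF p.q design with equal group sizes."""
    between = n_groups - 1                      # treatment (between-subjects)
    error_between = n_groups * (n_per_group - 1)  # subjects within groups
    within = n_within_levels - 1                # repeated-measures factor
    return {
        "between": between,
        "error_between": error_between,
        "within": within,
        "interaction": between * within,
        "error_within": error_between * within,
    }

# Five groups answered inserted questions (the control group did not).
inserted = split_plot_df(n_groups=5, n_per_group=50, n_within_levels=5)
# All six groups read the passages and took the posttest.
timing = split_plot_df(n_groups=6, n_per_group=50, n_within_levels=5)

print(inserted)
print(timing)
```

With five groups the sketch gives 4 and 245 for treatment and 4 and 980 for position, matching F(4, 245) and F(4, 980). With six groups it gives 5 and 294 between subjects, 4 and 1176 for position, and 20 and 1176 for the interaction, matching F(5, 294), F(4, 1176), and F(20, 1176).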
In sum, the main purpose of the current study was to examine performance on a posttest as a function of the type of question inserted into the instructional sequence. Students who answered the application questions during instruction demonstrated the best overall performance.
8. If you were the researcher, how would you improve the study?
This was an extremely complicated and, at times, confusing study. Perhaps it was unnecessarily complex in that the researchers never fully justified their reasons for using an SPF p · qrt design. The researchers claimed to be studying the effect of inserted question type on posttest performance, and this seems like a reasonable aim. However, they then added the factors of subject matter, position of a given passage in the instructional sequence, and question type on the posttest without explaining why. Are these factors necessarily related to inserted question type and/or posttest scores?
Further, a split-plot factorial design is a good choice if the researcher’s primary interests involve the repeated-measures factors (within-subject factors). In this particular study, the researchers’ main interests involved the between-subject factor (inserted question type), so the SPF design may not have been optimal.
I wonder if a two-treatment split-plot factorial design may have been more reasonable. The researchers could have studied the effects of inserted question type (between factor) and subject matter (within subject). Again, the researchers did not discuss their reasoning for including the three within-factor variables, and consequently, it is hard to understand why they chose this model.
Also, why did they use within-subject variables? Subject matter, position of a given passage in instructional sequence, and question type on the posttest could have easily been between-subject variables. Then there would be no concern for carry-over effects. However, I do realize that a within-subjects design does help the researcher increase statistical sensitivity, or power.
If I were the researcher, I would have simplified the study. I do believe that it is important to examine the effects of inserted questions on posttest scores, because it is pertinent to find ways to facilitate students’ processing of information. If certain questions help foster a greater understanding of information (e.g., name questions or application questions), then instructors can include more of those during learning tasks. Since this is such an important area to study, it is unfortunate to complicate it with other, less relevant, factors such as subject matter and posttest question type. Again, the authors claimed to have been interested in the effect of inserted questions on students’ performance on a related posttest. So, why is it necessary to examine the type of question on the posttest?
My final concern is the thoroughness of the study. The authors did use the SPF p · qrt design, but not all related hypotheses were tested, so the analysis was not exhaustive. For example, they did not examine the T x P x SM x Q interaction, among others.
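To make the gap concrete, the sketch below enumerates every main effect and interaction that a fully crossed four-factor design could in principle test and subtracts the eleven hypotheses listed in section 3. It assumes all four factors are fully crossed; whether every term is actually estimable depends on how position and subject matter were counterbalanced, which this summary does not describe.

```python
from itertools import combinations

FACTORS = ["T", "P", "SM", "Q"]  # inserted question, position, subject matter, posttest question

# The eleven effects listed as Ho1 through Ho11 in section 3.
TESTED = {
    ("T",), ("P",), ("Q",), ("SM",),
    ("T", "P"), ("T", "Q"), ("T", "SM"), ("P", "Q"), ("SM", "Q"),
    ("T", "P", "Q"), ("T", "SM", "Q"),
}

# Every non-empty subset of the factors is a potential effect: 2**4 - 1 = 15.
all_effects = [
    combo
    for size in range(1, len(FACTORS) + 1)
    for combo in combinations(FACTORS, size)
]

untested = [" x ".join(effect) for effect in all_effects if effect not in TESTED]
print(len(all_effects), untested)
```

Under that assumption, four of the fifteen possible effects are missing from the list: P x SM, T x P x SM, P x SM x Q, and T x P x SM x Q.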
Again, this was an extremely complex study. This made it very hard to accurately critique some basic statistical issues, since it was hard enough to follow the authors’ arguments and reasoning. Unfortunately, their original hypotheses were never fully addressed in the discussion.
One factor that may have added to the richness of this study is grade level. The authors only tested high school seniors. It seems plausible that the effects of inserted questions might be different for a different age group. In that case, perhaps an SPF pr · q design would be appropriate. There would be two between-subject variables (type of inserted question and grade level) and one within-subject variable (subject matter).