From Carey, A Beginner's Guide to Scientific Method
So far in our discussion of
causal experiments, we have considered only examples designed
by selecting a number of subjects (none of whom have the suspected causal agent), dividing them into two groups, and
administering the suspected
causal agent to members of one of the two groups. These are called
randomized
causal experiments. But there are two other types of causal experiment,
neither of
which begins with randomly selected subjects that have not yet
been exposed to the
suspected causal factor: prospective and retrospective causal
experiments or, as they
are often called, causal studies. Prospective and retrospective
studies typically provide less evidence of causal links than do
randomized experiments, but in some
situations, for reasons we discuss later, randomized experiments
would be difficult if not impossible to undertake. Following is a brief description of
the three basic types of causal experiment along with a summary
of both the advantages and limits of each.
1. Randomized Causal Experiments
A randomized causal experiment
is the very sort of experiment we have been working with. The
subjects used in the experiment are selected and randomly divided
into two groups prior to administering the suspected causal agent.
Randomized experiments are capable of providing strong evidence
precisely because they enable us to control
quite effectively for other possible causal factors. That subjects
are selected prior to being exposed to the suspected cause, coupled
with being randomly divided into experimental and control groups,
goes a long way toward controlling for extraneous causal factors.
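
The random division described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomize`, the subject labels, and the fixed seed are hypothetical and are not part of the original discussion.

```python
import random

def randomize(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups.

    Hypothetical helper: because every subject is equally likely to land in
    either group, other causal factors tend to be spread evenly between them.
    """
    rng = random.Random(seed)      # seeded only so the example is repeatable
    shuffled = list(subjects)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Illustrative data: twenty subjects, none yet exposed to the suspected cause.
subjects = [f"subject_{i}" for i in range(20)]
experimental, control = randomize(subjects, seed=0)
```

Only after this split would the suspected causal agent be administered to the experimental group.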
Randomized experiments, however, have a number of disadvantages.
They tend to be quite expensive and time-consuming to carry out,
particularly if it is necessary to work with large groups of subjects.
Unless the suspected effect follows reasonably immediately upon
exposure to the causal agent, randomized experiments may take
a great deal of time. Does exercise have an influence on longevity?
Though we might design a randomized test of the possible link
between the two, it would take years to complete. Finally, we
would have grave reservations, to say the least, about carrying
out randomized experiments dealing with many suspected causal
links. Do high rates of cholesterol in the blood cause heart disease?
Imagine what a randomized experiment might involve. We might begin,
for example, with a large number of small children, divide them
at random into two groups, and train one group to eat and drink
lots of fatty, starchy, and generally unhealthy foods of the sort
we suspect may be associated with high levels of cholesterol.
You can see the problem. Not coincidentally, much medical research
is carried out on laboratory animals precisely because we tend
to have much less hesitation about administering potentially hazardous substances to members of
nonhuman species.
2. Prospective Causal Experiments
In prospective causal
experiments we begin with two groups of subjects, one of which
- the experimental group - already has the suspected causal factor
while the other group does not. During the
course of the experiment, we wait to see whether a difference in the level
of the effect emerges between the two groups. Consider, for example, how
we might carry out a prospective experiment to investigate the
link between class attendance and test performance. We might begin
by selecting a large number of students at random. Next we must
find some way of accurately determining their patterns of class
attendance. We might simply observe them for, say, the first ten weeks of my course. Then we divide
them into two groups: those who attend class regularly (we might
define "regularly" as missing fewer than 5% of
all classes) and those who do not. The former become our experimental
group and the latter our control group. If we find that more than
half of our subjects are in one group or the other, we can pare
down the size of the larger group by randomly excluding subjects
from it. Now we track them and await the results of the final
exam. Such experiments are called prospective because they are
future-oriented; they use subjects who already have the suspected
cause and wait to see what happens with respect to the effect.
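
The grouping step just described might look like the following sketch. The 5% threshold comes from the text; the function name `form_prospective_groups`, the attendance figures, and the seed are assumptions made up for illustration.

```python
import random

def form_prospective_groups(missed_fraction, threshold=0.05, seed=None):
    """Split students by observed attendance, then pare down the larger group.

    missed_fraction maps each student to the fraction of classes missed
    during the observation period (hypothetical data layout).
    """
    rng = random.Random(seed)
    regular = [s for s, missed in missed_fraction.items() if missed < threshold]
    irregular = [s for s, missed in missed_fraction.items() if missed >= threshold]
    # Randomly exclude subjects from the larger group so both are the same size.
    size = min(len(regular), len(irregular))
    experimental = rng.sample(regular, size)
    control = rng.sample(irregular, size)
    return experimental, control

# Illustrative data: 14 regular attenders and 6 irregular ones.
missed_fraction = {f"student_{i}": (0.02 if i < 14 else 0.20) for i in range(20)}
experimental, control = form_prospective_groups(missed_fraction, seed=1)
```

Note that nothing here is assigned at random except which subjects are dropped; group membership itself is fixed by each student's own attendance.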
To see the primary limitation of prospective experiments, imagine
that we actually carry out the experiment just described and discover
a statistically significant difference in levels of test performance
between the two groups: the members of the experimental group
score much higher on the final on average than the members of
the control group. But this may not show a link between attendance
and test performance. In selecting individuals for membership
in our experimental and control groups we were guided by a single
consideration: class attendance. Yet other factors clearly might
influence test performance, one of which we discussed earlier:
the amount one studies. Undoubtedly there are more, such as how
effectively one studies, how motivated one is to achieve outstanding
grades, and how much one already knows about the subject matter
of the course. By concentrating on a single causal factor in our
selection process, we leave open the possibility that whatever
difference in levels of effect we observe in our two groups may
be due to other factors. This, of course, is precisely where prospective
experiments differ from randomized experiments. By randomly dividing
subjects into experimental and control groups before administering
the suspected cause, we greatly decrease the chance that other
factors will account for differences in level of effect. In prospective
experiments it is always possible that other factors will come
into play, precisely because we begin with subjects already having
the suspected cause.
Matching can be used to control for potentially troublesome causal
factors in prospective experiments. Suppose, for example, we discover
that about 50% of our experimental subjects study five or more
hours per week per course but only 35% of our control subjects
study at this rate. We can easily subtract some subjects from
our experimental group or add some to the control group to achieve
similar percentages of this obvious causal factor. It is not an
oversimplification to say that the reliability of a prospective
experiment is in direct proportion to the degree such matching
is successful. Thus in assessing the results of a prospective
experiment we need to know what factors have been controlled for
via matching. In addition, it is always wise to be on the lookout
for other factors that might influence the experiment's
outcome yet have not been controlled for. In general, a properly
done prospective study can strongly indicate a causal link, though
unfortunately not as strongly as can a randomized experiment.
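
As a rough sketch of the matching step, the code below removes randomly chosen heavy studiers from the experimental group until its rate is close to the control group's 35%. The 50% and 35% figures come from the example above; the group size of 100, the record layout, and the function name `match_on_factor` are hypothetical.

```python
import random

def match_on_factor(group, has_factor, target_rate, seed=None):
    """Randomly drop members that have the factor until the group's rate
    of the factor is as close as possible to target_rate (must be < 1)."""
    rng = random.Random(seed)
    with_factor = [m for m in group if has_factor(m)]
    without_factor = [m for m in group if not has_factor(m)]
    n, h = len(group), len(with_factor)
    # Solve (h - k) / (n - k) = target_rate for k, the number to remove.
    k = round((h - target_rate * n) / (1 - target_rate))
    k = max(0, min(k, h))          # cannot remove fewer than 0 or more than h
    kept = rng.sample(with_factor, h - k)
    return kept + without_factor

# Illustrative experimental group: 50 of 100 study five or more hours a week.
experimental = [{"id": i, "study_hours": 6 if i < 50 else 2} for i in range(100)]
studies_heavily = lambda subject: subject["study_hours"] >= 5
matched = match_on_factor(experimental, studies_heavily, target_rate=0.35, seed=2)
matched_rate = sum(studies_heavily(s) for s in matched) / len(matched)
```

With these made-up numbers, 23 heavy studiers are dropped, leaving 27 of 77, or about 35%. Matching of this kind balances only the factors we thought to measure.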
In some respects, prospective experiments offer advantages over
randomized causal experiments. For one thing, they require much
less direct manipulation of experimental subjects and thus tend
to be easier and less expensive to carry out and to occasion fewer
ethical objections. Their principal advantage, however, is that they enable us to work with very
large groups. And as we have discovered, causal factors often
result in differences in level of effect that are so small as
to require large samples to detect. Moreover, greater size alone
increases the chances that our samples will be representative
with respect to other causal factors. This is crucial when an
effect is associated with several causal factors. If a number
of factors cause B in Cs, we increase our chances of accurately
representing the levels of these other factors in our two groups
as we increase their size. In addition, prospective experiments
allow us to study potential causal links we cannot make the subject
of randomized experiments. As pointed out earlier, we would have
serious reservations about a randomized experiment dealing with
cholesterol and heart disease - in human beings, at any rate.
However, we should have no similar moral reservations about a
study that involves nothing more than tracking people with preexisting
high levels of cholesterol.
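
The earlier point that small differences in level of effect need large samples to detect can be illustrated numerically. The sketch below uses the standard two-proportion z statistic with a pooled estimate, which is a swapped-in textbook technique rather than anything the passage specifies; the rates of 55% and 50% and the group sizes are invented for the example.

```python
import math

def two_proportion_z(p1, p2, n_per_group):
    """z statistic for the difference between two observed proportions,
    assuming two groups of equal size and a pooled standard error."""
    pooled = (p1 + p2) / 2         # pooled proportion when group sizes are equal
    se = math.sqrt(pooled * (1 - pooled) * 2 / n_per_group)
    return (p1 - p2) / se

# The same five-point difference, observed in small and in large groups.
z_small = two_proportion_z(0.55, 0.50, n_per_group=100)
z_large = two_proportion_z(0.55, 0.50, n_per_group=2000)
```

With 100 subjects per group the statistic is roughly 0.7, well short of the conventional 1.96 cutoff; with 2000 per group it is roughly 3.2, so the same difference becomes statistically significant.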
3. Retrospective Causal Experiments
Retrospective experiments
or studies begin with two groups, our familiar experimental and
control groups, but the two are composed of subjects who do and
do not have the effect in question. Remember, in randomized and
prospective studies subjects do not have the effect being tested
for prior to the beginning of the study. By contrast, retrospective
studies look to the past in an attempt to discover differences
in the level of potential causal factors.
To carry out a retrospective study of the link between class attendance
and test performance, we need only look at records of past classes.
We might begin by looking for students who have done well on my
final, perhaps those who scored 85% or higher. They become the
experimental group; those who scored lower are our control group.
Fortunately, I have kept detailed attendance records for all past
classes; so we look at them to find our two groups. If there is
a link between attendance and test performance we would expect
to find significantly better rates of attendance among the students in
our experimental group.
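
A minimal sketch of this retrospective comparison follows. The 85% cutoff is taken from the text; the 5% attendance threshold is borrowed from the prospective example, and the function name `retrospective_comparison` and the records themselves are hypothetical.

```python
def retrospective_comparison(records, cutoff=85, missed_threshold=0.05):
    """Group past students by final score, then compare how often each group
    attended regularly.

    records is a list of (final_score, fraction_of_classes_missed) pairs.
    Returns (rate in the high-scoring group, rate in the lower-scoring group).
    """
    high = [missed for score, missed in records if score >= cutoff]
    low = [missed for score, missed in records if score < cutoff]
    if not high or not low:
        raise ValueError("both groups need at least one student")

    def regular_rate(group):
        # Share of the group that missed fewer classes than the threshold.
        return sum(missed < missed_threshold for missed in group) / len(group)

    return regular_rate(high), regular_rate(low)

# Illustrative records from past classes (invented for the example).
records = [(92, 0.00), (88, 0.03), (85, 0.10), (90, 0.02),
           (70, 0.20), (60, 0.04), (78, 0.15), (82, 0.30)]
experimental_rate, control_rate = retrospective_comparison(records)
```

Here the groups are defined by the effect (the score) and the comparison looks backward at the suspected cause (attendance), the reverse of the prospective design.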
Even the best of retrospective studies provide only weak evidence
for a causal link, because it is exceedingly difficult to control
for other potential causal factors. Subjects are selected because
they either do or do not have the effect in question, so potential
causal factors other than the one tested for may automatically
be built into our two groups. A kind of backward matching is possible
in retrospective studies. Suppose that in our study of the link
between class attendance and test performance we discover that
50% of our experimental group spends five hours or more per week
preparing for each of their classes while only 20% of our control
group does so. It may be possible to do some matching here by
eliminating subjects from one group or adding more to the other
and then looking to see if the difference in levels of the suspected
cause in the two groups remains the same. However, even if by
the process of backward matching we are able to configure our
two groups so that they exhibit similar levels of other suspected
causes, we have at most very tentative evidence for the causal
link in question.
All we are in a position to conclude from a retrospective study is that we have looked into the background of subjects who have a particular effect and found that a suspected cause occurs more frequently than in subjects who do not have the effect. Whether the effect is due to the suspected cause is difficult to say even when pains are taken to control for other potential causal factors, for in manipulating them we may well disturb some combination of responsible factors. That our two groups now appear to be alike with respect to other causal factors is thus largely because they are contrived to appear that way.
One final limitation of retrospective studies is that they provide no way of estimating the level of difference of the effect being studied. The very design of retrospective studies ensures that 100% of the experimental group, but none of the control group, will have the effect. Due to their limitations, retrospective studies are best regarded as a tool for uncovering potential causal links. We discover that a number of people have contracted effect B. Comparing them with a group of people who do not have B, we find a significant difference in the level of some factor A. It would seem that A may well be a cause of B. To determine more about the potential link between A and B, we would be well advised to undertake a more careful prospective or randomized experiment.
The advantages to retrospective studies, in contrast to randomized or prospective studies, are that they can be carried out quickly and inexpensively; they involve little more than careful analysis of data that is already available. And sometimes alacrity is of the essence. Imagine, for example, that we have discovered that Guernsey cows are dying at an alarming rate from unknown causes. Before we can do much of anything, we need some sense of what might be causing the problem. A quick search for factors in the background of infected cows that are absent at a significant level in the background of noninfected cows might turn up just the clue we need.
