Contextualizing The Meaning of Probabilities

C. Y. Joanne Peng, Anne Buu, and Bernard Flury

Introduction

When communicating the likelihood of uncertain information, do people more often use words than numbers to present and explain that information? Are these words used alone or within a context? In a survey by Wallsten, Budescu, Zwick, and Kemp (1993), 77% of subjects thought that most people preferred using words instead of numbers to communicate uncertainty in everyday life. In addition, probability words were preferred because they were more natural and personal and were judged to convey imprecise information better. Previous research has examined primarily the numeric meanings of probability expressions (words and phrases) without using a context (see Peng & Bolte, 1992). However, after reviewing empirical studies, Clark (1990), Tanur (1990), and Wallsten and Budescu (1990) concluded that the numeric meanings of probability expressions changed significantly with the context in which the expressions were embedded. Therefore, this mapping between probability expressions and numbers should not be researched without a context. The importance of contexts in shaping meanings of probability expressions was further supported by a review conducted by Peng and Bolte (1992). Of all the studies reviewed, three included both lexical-to-numeric and numeric-to-lexical mappings without a context and showed that these mappings were equivalent (Budescu, Weinberg, & Wallsten, 1988; Reagan, Mosteller, & Youtz, 1989; Wallsten, Budescu, Rapoport, Zwick, & Forsyth, 1986). In other words, contexts shifted meanings of numeric probabilities; without a context, these numbers were interpreted into words in predictable ways. The same review also found that the numeric-to-lexical mapping had not yet been fully explored across a variety of contexts. Specifically, it remained to be studied how teachers interpreted numeric probabilities used in test manuals, textbooks, clinical reports, news, research reports, daily conversations with parents, etc.
To fill this void in the literature, Peng and Bolte (1992) conducted an empirical study to examine the numeric-to-lexical mapping within educational contexts.

Since the Peng and Bolte study in 1992, a review of the literature found three additional studies that followed the mainstream research of lexical-to-numeric mapping (Boettcher, 1995; Ness, 1995; Tavana, Kennedy, & Mohebbi, 1997). Of these studies, only the first supplied a context for subjects. Two methodological considerations might explain this research trend of context-free lexical-to-numeric mapping. First, it was more straightforward to quantify the meaning of probability expressions in the lexical-to-numeric mapping process than in the numeric-to-lexical mapping process. In the lexical-to-numeric mapping process, researchers typically presented lexical expressions to subjects and asked them to supply numeric equivalents. Once the data were collected, the group mean, median, percentile, or range of each probability expression was calculated. Derived in this way, the estimates (i.e., mean, median, percentile, and range) have been stable over 20 years of research (Mosteller & Youtz, 1990; Reagan et al., 1989) and are well accepted in the field of psychology. On the other hand, the data collected on numeric-to-lexical mapping usually consist of frequencies of verbal restatement or word selection. To analyze this type of data, a researcher needs more sophisticated quantitative methods. The three studies of numeric-to-lexical mapping reviewed by Peng and Bolte (1992) all employed more advanced models than those used by researchers studying the lexical-to-numeric mapping to unveil the underlying meaning of numeric probabilities (Budescu et al., 1988; Reagan et al., 1989; Wallsten et al., 1986).
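
As a rough illustration of the lexical-to-numeric summaries described above, the following sketch computes the group mean, median, quartiles, and range for each probability expression. The response values, the expression labels, and the function name summarize are invented for illustration and are not data or procedures from any of the cited studies.

```python
from statistics import mean, median, quantiles

# Hypothetical data: numeric equivalents (in percent) that five subjects
# supplied for each probability expression. These values are invented.
responses = {
    "likely": [70, 75, 80, 65, 85],
    "unlikely": [10, 20, 15, 25, 5],
}

def summarize(values):
    """Return the descriptive statistics typically reported per expression."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4)
    return {
        "mean": mean(values),
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "range": (min(values), max(values)),
    }

# One summary per expression, keyed by the expression itself.
summary = {expr: summarize(vals) for expr, vals in responses.items()}

for expr, stats in summary.items():
    print(expr, stats)
```

Frequencies of word selection, by contrast, do not reduce to a single number per expression in this way, which is one reason the numeric-to-lexical direction calls for the more advanced models mentioned above.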

The second methodological concern is that embedding probability information within a context is procedurally more complex than the context-free mapping (Clark, 1990). Arguing that the meaning of a probability expression could never be obtained directly but only derived from its use on a particular occasion, Clark (1990) proposed a two-step approach to determining the meaning of an expression. Clark's proposal started with step one: examining the possible uses of the expression and establishing how it is interpreted on each occasion. This was to be followed by step two: reconstructing the meaning of the expression from the invariances in these interpretations (1990, p. 14). According to Clark's theory, if a probability expression was presented to a subject without any context, she/he had to imagine a more or less concrete situation in which the expression would be used; researchers then had to estimate the numeric probability it would denote in that situation. Since subjects could imagine different situations and therefore yield different numerical equivalents for the same expression, a point or interval estimate of that expression was merely an aggregation over a set of unknown situations. Therefore, he suggested that the "no context" condition was really the "unknown context" condition. Clark (1990) further demonstrated, using empirical evidence from the psychology literature, that the numeric meanings of probability expressions changed drastically with the contexts in which the expressions were embedded. For example, the word "frequently" was judged to represent about 75% of the time in the context, "At a recent press conference, Miss Sweden said she felt that in real life men frequently found her attractive." The same word represented about 28% of the time in the context, "The New York Daily News reported that in the U.S.A. during February 1966, commercial passenger planes frequently crashed."
Because the word was identical in both statements, the difference between 75% and 28% was attributed to the context effect. Substantial context effects on the meanings of probability expressions, due to the severity of the disease symptoms, the familiarity of the intended audience with the issues discussed, etc., were also found in the other reviews by Tanur (1990) and Wallsten and Budescu (1990).

Peng and Bolte (1992) investigated prospective teachers' interpretation of numeric probabilities within educationally related statements in terms of 11 commonly used probability expressions. They conducted two pilot studies to develop their instrument. Two context effects were manipulated in their study: the subject's expectation of the real probability of the event and the magnitude of numbers in the fractional restatement (e.g., 36% was restated as either 36 of every 100 or 36000 of every 100000). In their actual study, 188 undergraduates enrolled in education courses at a large Midwestern university were asked to participate. The descriptive frequency analyses of the data were reported in their paper. The 1992 data were further analyzed in this paper to investigate the context effects and the numeric meanings of probability expressions. The same instrument used in the 1992 study was administered to 100 volunteer undergraduates registered in the Spring semester of 1998. The results from these data were compared to the results from the 1992 data in order to extend the limited believability and generalizability of one-time findings (Robinson & Levin, 1997). We hope that findings reported in this paper will address these research questions: (1) What is the effect of the magnitude of numbers in the fractional restatement on the use of probability expressions by prospective teachers? (2) To what degree does prospective teachers' expectation of the real probability of the event influence their use of probability expressions? (3) What numeric meanings can be derived from probability expressions in the numeric-to-lexical mappings within educational contexts? (4) What are the differences in scaled values of probability expressions estimated by the least squares method versus those estimated by logit modeling? and (5) Are there any inconsistencies in results between the 1992 and the 1998 studies?
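
To make the fractional restatement manipulation concrete, the sketch below restates a percentage with a small and a large base, mirroring the 36% example given above. The function name restate and its error handling are assumptions made for illustration; they are not part of the instrument used in either study.

```python
from fractions import Fraction

def restate(percent, base):
    """Restate a whole-number percentage as 'k of every base' (hypothetical helper)."""
    # Exact rational arithmetic avoids floating-point rounding.
    count = Fraction(percent, 100) * base
    if count.denominator != 1:
        raise ValueError("base too small to restate this percentage exactly")
    return f"{count.numerator} of every {base}"

small = restate(36, 100)       # small-magnitude restatement
large = restate(36, 100_000)   # large-magnitude restatement

print(small)
print(large)
```

Both strings denote the same probability; only the magnitude of the numbers differs, which is the manipulation addressed by research question (1).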