MW CogSci Invited Plenary Address

Mark Steyvers, Ph.D.
University of California, Irvine

Combining Human Judgments in General Knowledge and Forecasting Tasks

In this research, we draw on ideas from cognitive science, crowdsourcing, and Bayesian statistics to develop aggregation models that combine human judgments in general knowledge and prediction tasks. We propose that successful aggregation of human judgment requires a cognitive modeling framework that explains how individuals produce their answers and that allows for individual differences in participants' skill and expertise. In addition, we argue that it is essential to correct for any systematic distortions in human judgment when aggregating judgments across individuals. We present two case studies that illustrate our overall approach.
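
As a minimal illustration of the general idea (not of the models themselves), the Python sketch below contrasts an unweighted average of judgments with a skill-weighted one. The weights stand in for the per-individual expertise that the models in the case studies infer from data; here they are simply assumed.

```python
import numpy as np

def aggregate(judgments, skill_weights=None):
    """Combine one judgment per individual into a single estimate.

    judgments: array of shape (n_individuals,)
    skill_weights: optional nonnegative weights, one per individual
    (assumed known here; the models discussed below infer them)
    """
    judgments = np.asarray(judgments, dtype=float)
    if skill_weights is None:
        return judgments.mean()                     # baseline: treat everyone equally
    w = np.asarray(skill_weights, dtype=float)
    return (w * judgments).sum() / w.sum()          # upweight more skilled individuals

# Example: three judges, the third treated as more reliable
print(aggregate([0.2, 0.3, 0.7]))                   # 0.4
print(aggregate([0.2, 0.3, 0.7], [1.0, 1.0, 3.0]))  # 0.52
```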

In our first case study, we present preliminary results from the Aggregative Contingent Estimation System (ACES), part of a project funded by the Intelligence Advanced Research Projects Activity. The goal is to develop new methods for collecting and combining the forecasts of many widely dispersed individuals in order to increase the predictive accuracy of the aggregated forecast. To date, the project has enrolled over 2,000 users from the general public, who have provided forecasts on over 225 prediction problems (the site can be accessed at http://www.forecastingace.com/aces/). An important consideration for any aggregation approach is the presence of systematic biases that distort subjective probability estimates: the probability of rare events is often overestimated, while the probability of common events is underestimated. We construct a series of Bayesian models that first estimate the bias inherent in subjects' forecasts and then correct and aggregate those forecasts. The models differ both in the extent to which they allow for individual differences and in the point at which the bias correction takes place.
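
A minimal sketch of this correct-and-aggregate idea, assuming a Karmarkar-style transform p^g / (p^g + (1 - p)^g) as the bias correction with a fixed, illustrative g; the actual models estimate the bias from the data, and they differ on whether correction happens before or after averaging, which the correct_first flag below only mimics.

```python
import numpy as np

def recalibrate(p, gamma):
    """Karmarkar-style transform p**g / (p**g + (1-p)**g).
    gamma > 1 pushes probabilities toward the extremes, undoing the
    typical pattern of overestimated rare events and underestimated
    common events; gamma < 1 does the reverse. The value of gamma is
    assumed here, not inferred."""
    p = np.asarray(p, dtype=float)
    return p**gamma / (p**gamma + (1.0 - p)**gamma)

def aggregate_forecasts(forecasts, gamma=2.0, correct_first=True):
    """Combine several forecasts of one event; illustrates that the
    bias correction can be applied either before or after averaging."""
    forecasts = np.asarray(forecasts, dtype=float)
    if correct_first:
        return recalibrate(forecasts, gamma).mean()   # correct, then average
    return recalibrate(forecasts.mean(), gamma)       # average, then correct

probs = [0.6, 0.7, 0.8]    # three forecasters, same event
print(aggregate_forecasts(probs, correct_first=True))   # ~0.826
print(aggregate_forecasts(probs, correct_first=False))  # ~0.845
```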

In the second case study, we apply a cognitive modeling approach to the problem of measuring expertise on ranking tasks involving general knowledge (e.g., ordering American holidays through the calendar year) and forecasting (e.g., predicting the order in which football teams will finish the season). Using a Bayesian model of behavior on this task that allows for individual differences in knowledge, we are able to infer people's expertise directly from the rankings they provide, without any knowledge of the true answer. We show that this model-based measure of expertise outperforms self-report measures of expertise, collected both before and after the ordering of the items, in terms of correlation with the actual accuracy of the answers.
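
The sketch below is a deliberately simplified, non-Bayesian analogue of this inference, intended only to show why expertise is recoverable from rankings alone: accurate rankers tend to agree with one another, so agreement with a re-weighted consensus ranking can serve as an expertise signal. The iterative weighting scheme and all names are illustrative assumptions, not the Bayesian model used in the talk.

```python
import numpy as np
from scipy.stats import kendalltau

def infer_expertise(rankings, n_iters=10):
    """rankings: (n_people, n_items) array; each row is one person's
    ranking of the items (rank positions). Returns a weight per person,
    inferred without any ground truth: people who agree with the
    weighted consensus are treated as more expert, and the consensus
    is then re-estimated with those weights."""
    rankings = np.asarray(rankings, dtype=float)
    n_people = rankings.shape[0]
    weights = np.ones(n_people) / n_people
    for _ in range(n_iters):
        # weighted mean rank per item defines the current consensus
        consensus = (weights[:, None] * rankings).sum(axis=0)
        # agreement of each person with the consensus ordering
        taus = np.array([kendalltau(r, consensus)[0] for r in rankings])
        weights = np.clip(taus, 1e-6, None)   # keep weights nonnegative
        weights /= weights.sum()
    return weights, np.argsort(consensus)    # expertise weights, consensus order

# Example: two accurate rankers and one noisy one, five items
rankings = np.array([
    [1, 2, 3, 4, 5],
    [1, 2, 4, 3, 5],
    [5, 3, 1, 4, 2],
])
weights, order = infer_expertise(rankings)
print(weights)   # the noisy third ranker receives the lowest weight
print(order)     # consensus ordering of the items
```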