(This is a two-part post on communicating about probability and uncertainty in climate change. Read Part I.)
In my previous post I attempted to provide an overview of the IPCC 2007 report’s approach to communicating about uncertainties regarding climate change and its impacts. This time I want to focus on how the report dealt with probabilistic uncertainty. It is this kind of uncertainty that the report treats most systematically. I mentioned in my previous post that Budescu et al.’s (2009) empirical investigation of how laypeople interpret verbal probability expressions (PEs, e.g., “very likely”) in the IPCC report revealed several problematic aspects, and a paper I have co-authored with Budescu’s team (Smithson, et al., 2011) yielded additional insights.
The approach adopted by the IPCC is one that has been used in other contexts, namely identifying probability intervals with verbal PEs. Their guidelines are as follows:
Virtually certain >99%; extremely likely >95%; very likely >90%; likely >66%; more likely than not > 50%; about as likely as not 33% to 66%; unlikely <33%; very unlikely <10%; extremely unlikely <5%; exceptionally unlikely <1%.
One unusual aspect of these guidelines is their overlapping intervals. For instance, “likely” takes the interval [.66,1] and thus contains the interval [.90,1] for “very likely,” and so on. The only interval that doesn’t overlap with others is “as likely as not.” Other interval-to-PE guidelines I am aware of use non-overlapping intervals. An early example is Sherman Kent’s attempt to standardize the meanings of verbal PEs in the American intelligence community.
Attempts to translate verbal PEs into numbers have a long and checkered history. Since the earliest days of probability theory, the legal profession has steadfastly refused to quantify its burdens of proof (“balance of probabilities” or “reasonable doubt”) despite the fact that they seem to explicitly refer to probabilities or at least degrees of belief. Weather forecasters debated the pros and cons of verbal versus numerical PEs for decades, with mixed results. A National Weather Service report on a 1997 survey of Juneau, Alaska residents found that although the rank-ordering of the mean numerical probabilities residents assigned to verbal PE’s reasonably agreed with those assumed by the organization, the residents’ probabilities tended to be less extreme than the organization’s assignments. For instance, “likely” had a mean of 62.5% whereas the organization’s assignments for this PE were 80-100%.
And thus we see a problem arising that has been long noted about individual differences in the interpretation of PEs but largely ignored when it comes to organizations. Since at least the 1960’s empirical studies have demonstrated that people vary widely in the numerical probabilities they associate with a verbal PE such as “likely.” It was this difficulty that doomed Sherman Kent’s attempt at standardization for intelligence analysts. Well, here we have the NWS associating it with 80-100% whereas the IPCC assigns it 66-100%. A failure of organizations and agencies to agree on number-to-PE translations leaves the public with an impossible brief. I’m reminded of the introduction of the now widely-used cyclone (hurricane) category 1-5 scheme (higher numerals meaning more dangerous storms) at a time when zoning for cyclone danger where I was living also had a 1-5 numbering system that went in the opposite direction (higher numerals indicating safer zones).
Another interesting aspect is the frequency of the PEs in the report itself. There are a total of 63 PEs therein. “Likely” occurs 36 times (more than half), and “very likely” 17 times. The remaining 10 occurrences are “very unlikely” (5 times), “virtually certain” (twice), “more likely than not” (twice), and “extremely unlikely” (once). There is a clear bias towards fairly extreme positively-worded PEs, perhaps because much of the IPCC report’s content is oriented towards presenting what is known and largely agreed on about climate change by climate scientists. As we shall see, the bias towards positively-worded PEs (e.g., “likely” rather than “unlikely”) may have served the IPCC well, whether intentionally or not.
In Budescu et al.’s experiment, subjects were assigned to one of four conditions. Subjects in the control group were not given any guidelines for interpreting the PEs, as would be the case for readers unaware of the report’s guidelines. Subjects in a “translation” condition had access to the guidelines given by the IPCC, at any time during the experiment. Finally, subjects in two “verbal-numerical translation” conditions saw a range of numerical values next to each PE in each sentence. One verbal-numerical group was shown the IPCC intervals and the other was shown narrower intervals (with widths of 10% and 5%).
Subjects were asked to provide lower, upper and “best” estimates of the probabilities they associated with each PE. As might be expected, these figures were most likely to be consistent with the IPCC guidelines in the verbal- numerical translation conditions, less likely in the translation condition, and least likely in the control condition. They were also less likely to be IPCC-consistent the more extreme the PE was (e.g., less consistent foro “very likely” than for “likely”). Consistency rates were generally low, and for the extremal PEs the deviations from the IPCC guidelines were regressive (i.e., subjects’ estimates were not extreme enough, thereby echoing the 1997 National Weather Service report findings).
One of the ironic claims by the Budescu group is that the IPCC 2007 report’s verbal probability expressions may convey excessive levels of imprecision and that some probabilities may be interpreted as less extreme than intended by the report authors. As I remarked in my earlier post, intervals do not distinguish between consensual imprecision and sharp disagreement. In the IPCC framework, the statement “The probability of event X is between .1 and .9 could mean “All experts regard this probability as being anywhere between .1 and .9” or “Some experts regard the probability as .1 and others as .9.” Budescu et al. realize this, but they also have this to say:
“However, we suspect that the variability in the interpretation of the forecasts exceeds the level of disagreement among the authors in many cases. Consider, for example, the statement that ‘‘average Northern Hemisphere temperatures during the second half of the 20th century were very likely higher than during any other 50-year period in the last 500 years’’ (IPCC, 2007, p. 8). It is hard to believe that the authors had in mind probabilities lower than 70%, yet this is how 25% of our subjects interpreted the term very likely!” (pg. 8).
One thing I’d noticed about the Budescu article was that their graphs suggested the variability in subjects’ estimates for negatively-worded PEs (e.g., “unlikely”) seemed greater than for positively worded PEs (e.g., “likely”). That is, subjects seemed to have less of a consensus about the meaning of the negatively-worded PEs. On reanalyzing their data, I focused on the six sentences that used the PE “very likely” or “very unlikely”. My statistical analyses of subjects’ lower, “best” and upper probability estimates revealed a less regressive mean and less dispersion for positive than for negative wording in all three estimates. Negative wording therefore resulted in more regressive estimates and less consensus regardless of experimental condition. You can see this in the box-plots below.
In this graph, the negative PEs’ estimates have been reverse-scored so that we can compare them directly with the positive PEs’ estimates. The “boxes” (the blue rectangles) contain the middle 50% of subjects’ estimates and these boxes are consistently longer for the negative PEs, regardless of experimental condition. The medians (i.e., the score below which 50% of the estimates fall) are the black dots, and these are fairly similar for positive and (reverse-scored) negative PEs. However, due to the negative PE boxes’ greater lengths, the mean estimates for the negative PEs end up being pulled further away from their positive PE counterparts.
There’s another effect that we confirmed statistically but also is clear from the box-plots. The difference between the lower and upper estimates is, on average, greater for the negatively-worded PEs. One implication of this finding is that the impact of negative wording is greatest on the lower estimates—And these are the subjects’ translations of the very thresholds specified in the IPCC guidelines.
If anything, these results suggest the picture is worse even than Budescu et al.’s assessment. They noted that 25% of the subjects interpreted “very likely” as having a “best” probability below 70%. The boxplots show that in three of the four experimental conditions at least 25% of the subjects provided a lower probability of less than 50% for “very likely”. If we turn to “very unlikely” the picture is worse still. In three of the four experimental conditions about 25% of the subjects returned an upper probability for “very unlikely” greater than 80%!
So, it seems that negatively-worded PEs are best avoided where possible. This recommendation sounds simple, but it could open a can of syntactical worms. Consider the statement “It is very unlikely that the MOC will undergo a large abrupt transition during the 21st century.” Would it be accurate to equate it with “It is very likely that the MOC will not undergo a large abrupt transition during the 21st century?” Perhaps not, despite the IPCC guidelines’ insistence otherwise. Moreover, turning the PE positive entails turning the event into a negative. In principle, we could have a mixture of negatively- and positively-worded PE’s and events (“It is (un)likely that A will (not) occur”). It is unclear at this point whether negative PEs or negative events are the more confusing, but inspection of the Budescu et al. data suggested that double-negatives were decidedly more confusing than any other combination.
As I write this, David Budescu is spearheading a multi-national study of laypeople’s interpretations of the IPCC probability expressions (I’ll be coordinating the Australian component). We’ll be able to compare these interpretations across languages and cultures. More anon!
Budescu, D.V., Broomell, S. and Por, H.-H. (2009) Improving the communication of uncertainty in the reports of the Intergovernmental panel on climate change. Psychological Science, 20, 299–308.
Intergovernmental Panel on Climate Change (2007). Summary for policymakers: Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Retrieved May 2010 from http://www.ipcc.ch/pdf/assessment-report/ar4/wg1/ar4-wg1-spm.pdf.
Smithson, M., Budescu, D.V., Broomell, S. and Por, H.-H. (2011) Never Say “Not:” Impact of Negative Wording in Probability Phrases on Imprecise Probability Judgments. Accepted for presentation at the Seventh International Symposium on Imprecise Probability: Theories and Applications, Innsbruck, Austria, 25-28 July 2011.