Fat Chance: Writing about Probability

 

Since the COVID-19 pandemic began, almost every choice we have made in our day-to-day lives has required careful consideration of the odds. How dangerous is going to the supermarket at peak time? Is it safe to see friends after getting one vaccine shot? Will children get sick, or spread the virus to others, if they go back to school?

Just the quantity of decisions can be exhausting. But there’s something else making all of these choices so difficult: People, by and large, are bad at probability. We turn high odds into certainties, as when we assume that the accuracy of PCR tests for COVID means they’ll always yield correct results. And we treat relatively unlikely events, like catching the virus after being fully vaccinated, as impossible. In between 100 percent certainty and zero chance, the way we interpret any given number can change radically based on how that probability is expressed.

Such glitches in thinking aren’t surprising, says journalist and statistician Regina Nuzzo, a professor of statistics at Gallaudet University in Washington, DC, and a senior advisor for statistics communication and media innovation at the American Statistical Association. “Human brains hate probability, they hate ambiguity, they hate the uncertainty,” she says. “We’re just not wired to deal with this sort of thing very well.”

But for science writers, avoiding uncertainty isn’t an option. “Everything [in science] is quantified by the likelihood or lack of likelihood of it happening,” says science journalist Tara Haelle, who has written extensively about covering statistics for the Association of Health Care Journalists (AHCJ). But as Haelle says, “Humans don’t make decisions that way.” People tend to be far more comfortable with a definitive yes or no, she says, than with the uncharted space in between.

That means science writers have to work hard to coax readers away from those two extremes and toward a more nuanced understanding. “Probability information is crucial to making a really informed decision,” says psychologist Vivianne Visschers, who studies risk communication and decision making at the University of Applied Sciences and Arts Northwestern Switzerland. Without a good understanding of such information, readers may ease up on pandemic social-distancing precautions too early, or avoid a medical examination that they need to have.

There’s no single, straightforward way to write about probability. The very concept can be ambiguous and tricky. But strategies like using analogies, creating visuals, and making careful use of language can help to convey just what a 15 percent chance of an extreme weather event, or an 80 percent chance of recovering from a disease, really means.

 

Acknowledge Uncertainty

Because uncertainty is so hard to communicate effectively, it can be tempting to act like the science in any given area is more definitive than it really is. But uncertainty is an intrinsic part of science, and ignoring it does readers a disservice. “The journalists that I’ve seen that do this very well talk about why they’re not completely, 100 percent certain about anything,” Nuzzo says. “They give a little peek into the fact that science is not perfect, that there are going to be some uncertainties.”

Many scientific predictions, whether about the chance of rain tomorrow, or of catastrophic wildfires in California next fall, or of a bridge collapsing in the next decade, come from computational models—mathematical tools that scientists use to make predictions based on large volumes of data. As sophisticated as these models are, they are ultimately only simplifications of a complex, uncertain world. Explaining just what makes a model imprecise—like inaccuracies in the underlying data or the omission of an important variable—can be difficult. But, says Nuzzo, we should try to give readers some insight into why models can’t give definitive predictions.

One major source of uncertainty in any model is the set of assumptions it’s based on. For a climate change model, that might mean an assumption about what types of environmental policies governments will implement and how those policies might affect climate change. For COVID models, assumptions could include how many people in a given state will follow a mask order and how that will affect the spread of the virus.

The probabilities coming from models can only ever be as good as the assumptions that go into them, notes psychologist Gerd Gigerenzer, director of the Harding Center for Risk Literacy at the University of Potsdam in Germany.

In a January 2021 article for Quanta about why COVID modeling has been so difficult, Jordana Cepelewicz discussed how a model built by University of Illinois researchers to predict the risk of resuming in-person instruction went wrong. The model assumed that students who tested positive and were ordered to quarantine would strictly follow those directions. In reality, some infected students went to parties, and hundreds of students ended up contracting COVID—an outbreak that the model failed to anticipate. “This [assumption] turned out to be critical,” Cepelewicz writes. “Given how COVID-19 spreads, even if only a few students went against the rules, the infection rate could explode.”

Since bad assumptions can have a substantial impact, it’s important for journalists covering predictions based on modeling to explain how assumptions factor into the uncertainty of those predictions. As Gigerenzer points out, “Part of the art of risk communication [is] to say, ‘We don’t know.’”

 

Turn Probabilities into Concrete Numbers

Scientific studies generally communicate probabilities as percentages. One paper, for example, finds that there’s a 90 percent chance the world’s poorest countries will see their economies shrink by between 2 and 20 percent over the next century because of climate change. But that doesn’t necessarily mean journalists should do so. Instead, writers can communicate 90 percent as 9 out of 10.

In numerous studies, Gigerenzer has found that people can better use probabilities when they are presented as concrete numbers, even for probabilities that might seem easy to understand. One reason may be that concrete numbers are simpler to visualize.

Concrete numbers can also make statistics feel more personally relevant. A 0.5 percent risk of developing a particular kind of cancer may seem minuscule. But if a reader went to a high school with 1,000 students, they may find it more impactful to hear that five of their classmates, on average, will develop the disease. In a March 2021 story, American Public Media used concrete numbers rather than percentages to communicate race disparities in COVID deaths. They reported that 1 of every 390 Indigenous Americans had died of COVID.
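For writers who want to double-check this kind of conversion, the arithmetic can be sketched in a few lines of Python. This is a minimal illustration rather than a method described by any of the sources quoted here, and the function names `simplest_ratio` and `out_of` are made up for the example.

```python
from fractions import Fraction


def simplest_ratio(percent):
    """Express a percentage as the simplest 'X out of Y' ratio.

    Hypothetical helper for illustration only.
    """
    # Going through str() keeps decimals like 0.5 exact.
    frac = Fraction(str(percent)) / 100
    return frac.numerator, frac.denominator


def out_of(percent, group_size):
    """Expected number of people affected in a group of a given size."""
    return Fraction(str(percent)) / 100 * group_size


print(simplest_ratio(90))    # (9, 10): "9 out of 10"
print(simplest_ratio(0.5))   # (1, 200): "1 in 200"
print(out_of(0.5, 1000))     # 5: five classmates in a school of 1,000
```

The group size passed to `out_of` is an editorial choice: a number that matches something in the reader's own life, such as a school or a neighborhood, is likely to land better than an arbitrary one.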

Journalists and researchers also recommend conveying probabilities that describe risk in terms of absolute risk rather than relative risk. Language like “the preventative exam reduced the risk of serious illness by 50 percent” can mislead readers. Using absolute percentages gives a more accurate picture: “Without the exam, 2 percent of people experienced serious illness. With it, only 1 percent did.” Relative risk makes the difference seem enormous—but absolute risk makes clear that the exam only changed the outcome for 1 out of 100 people.
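The gap between the two framings is simple arithmetic, and it can help to work it out before choosing which number to print. The sketch below is one possible way to do that in Python, using the hypothetical exam figures from the paragraph above; the function name `compare_risks` and its return values are assumptions made for this illustration.

```python
from fractions import Fraction


def compare_risks(risk_without, risk_with):
    """Return the absolute reduction, the relative reduction, and the
    number of people per one changed outcome.

    Illustrative helper; assumes risk_without is larger than risk_with.
    """
    if risk_without <= risk_with:
        raise ValueError("expected the risk to be lower with the intervention")
    absolute_reduction = risk_without - risk_with
    relative_reduction = absolute_reduction / risk_without
    people_per_changed_outcome = 1 / absolute_reduction
    return absolute_reduction, relative_reduction, people_per_changed_outcome


# 2 percent experienced serious illness without the exam, 1 percent with it.
absolute, relative, one_in = compare_risks(Fraction(2, 100), Fraction(1, 100))

print(f"Relative reduction: {float(relative):.0%}")  # 50%
print(f"Absolute reduction: {float(absolute):.0%}")  # 1%
print(f"Outcome changed for 1 in {one_in} people")   # 1 in 100
```

Both percentages describe the same result. The relative figure depends entirely on how small the baseline was, which is why reporting the baseline alongside it matters.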

In her extensive reporting on cancer screenings, journalist Christie Aschwanden uses absolute risk to give readers as accurate a view as possible of the benefits, and costs, of screening. When writing about a study on the effectiveness of mammograms, for example, Aschwanden could have reported that the incidence of large breast tumors decreased by about 20 percent over about 30 years, a finding that might suggest mammograms caught many tumors while they were still small. But Aschwanden instead focused on the absolute numbers and wrote that only 30 out of every 100,000 women were actually protected from those large tumors, because the rate of incidence was low in the first place.

“It’s always a really good idea to give readers a sense of the base rate of any problem when you’re talking about risk, so that they know [if it’s] a huge problem to begin with,” she says. “People often tend to overestimate their baseline chances of getting something, of having the bad outcome. And so that skews how they’re thinking.”

 

Be Careful with Imprecise Language

The advantage of including percentages and concrete numbers in a story is that they offer precise backup for claims. The disadvantage is that cramming too many numbers into a single article can turn off readers. It’s tempting to lighten readers’ load by using probability phrases such as “unlikely,” “possible,” and “very likely.” However, people tend to interpret such phrases in different ways. One person might read “possible” and think “5 percent,” while another person might think “65 percent.”

Such ambiguity means that journalists should consider anchoring probability phrases with precise numbers. That’s what Vox’s Susannah Locke did in a 2014 story about the genetics of sleep. In describing a new study, she paired a probability phrase with a concrete number, writing that it was “very, very unlikely” her reader could get by on six or fewer hours of sleep per night. Then, she added that the study found only 5 percent of people could function on that amount of sleep.

As Locke’s story illustrates, probability phrases work best when they’re deployed in tandem with other components of the story. And those components can go beyond just a single sentence. “You can communicate urgency in the frame of your article, without necessarily having most of the attention be on whether it’s ‘extremely likely’ or ‘very likely’,” says Shannon Osaka, a climate reporter at Grist. This was her approach when she wrote an article about how environmental changes like deforestation had made pandemics more likely. Though she included no statistics about precisely how much more likely scientists think pandemics have become, she left little room for readers to ignore that risk. “We need to recognize that we’re playing with fire,” she quoted one researcher as saying.

 

Communicate through Visuals

Visuals provide a unique benefit when it comes to communicating probability: They can achieve precision using few, or no, numbers. Sometimes even a very simple visual can do a lot of work. For example, you could communicate a 27 percent risk of being affected by a certain disease by showing a grid depicting 100 figures, with 27 of them colored in.
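A grid like this does not require design software to prototype. As a rough sketch, the Python below prints a text version of a 100-figure grid with 27 figures marked. The function name `icon_array` and the choice of characters are assumptions for this illustration, and a published graphic would of course use proper icons.

```python
def icon_array(affected, total=100, per_row=10, filled="X", empty="."):
    """Build a text grid with `affected` of `total` figures marked.

    Illustrative helper; expects 0 <= affected <= total.
    """
    if not 0 <= affected <= total:
        raise ValueError("affected must be between 0 and total")
    icons = [filled] * affected + [empty] * (total - affected)
    # Break the flat list of icons into rows of `per_row` figures each.
    rows = [" ".join(icons[i:i + per_row]) for i in range(0, total, per_row)]
    return "\n".join(rows)


# A 27 percent risk shown as 27 marked figures out of 100.
print(icon_array(27))
```

Filling the figures in reading order, as this sketch does, makes the proportion easy to judge at a glance. Scattering them across the grid is another option, and which works better for a given audience is a judgment call.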

Of course, different kinds of visuals are more effective for some circumstances than others. For example, as Visschers cautions, when an outcome is very rare, a simple visual showing the number out of 100 people who are affected might not work. In the U.S., for instance, 0.17 percent of the population has died of COVID-19 (more than 570,000 people by May 2021, an enormous toll). In an image containing 100 people, none of them would be colored in. So visualizing such information demands a different approach. In May 2020, for example, when total U.S. COVID deaths had reached 100,000, The New York Times highlighted the pandemic’s human impact by depicting those who had died of COVID in the U.S. as individuals, rather than as a fraction of the population.

Other types of visuals can help make probability statistics feel personally relevant. To communicate how long someone of a given age can expect to live, statistician Nathan Yau, who runs the data-visualization blog FlowingData, created a visualization in which dots representing individual people move along a curved horizontal line above an axis that extends from age zero to age 120. At some point, each dot falls off the curve and “dies”—some earlier, some later. The fallen dots convey the abstract concept of life expectancy in concrete form. And by allowing his reader to change the sex and age of the simulated people, Yau keeps his visualization relevant to the reader.

Beyond just communicating the numbers and their personal or societal implications, visuals can dig into the sources of uncertainty surrounding a given issue. In “Why It’s So Freaking Hard to Make a Good COVID-19 Model,” published in March 2020, FiveThirtyEight’s Maggie Koerth, Laura Bronner, and Jasmine Mithani assembled a series of flowcharts to get across just how many factors go into predicting COVID cases and deaths—33 factors, according to the colored boxes they drew—and how much we can’t be sure of in the process. By drawing 37 lines among these boxes, they demonstrate the entangled linkages among the factors far more clearly than they could have in text.

 

Use Comparisons Effectively

Writers can also communicate probability by comparing an unfamiliar statistic with something more familiar. The morning of the 2020 presidential election, for example, FiveThirtyEight compared Donald Trump’s 10 percent chance of victory to the chances of rain in downtown Los Angeles on any given day. While few people likely know the precise probability of rain in Los Angeles, most have a general sense of about how often it rains there (rarely, but not never).

Before deploying a specific analogy, Aschwanden says, it’s a good idea to run it by an expert source. “The best possible scenario is that the researcher proposes the analogy in the first place,” she says. “But if it’s one that I’ve come up with, I almost always will want to run it by them and make sure that it’s an apt analogy.” When Aschwanden was interviewing experts for an article about a misleading skin cancer prevention ad, one of them described skin cancers as “turtles” (which are too slow to ever cause problems), “birds” (which have already escaped the possibility of treatment), and “bears” (which can be stopped). Aschwanden brought that analogy into her article to help readers grasp why only a fraction of cancer deaths can be prevented with screenings.

Analogies can also be helpful in establishing the range in which a given probability lies. Nuzzo points to an example from the Ebola epidemic, when NPR reporter Michaeleen Doucleff compared a new Ebola vaccine—with an efficacy between 70 percent and 100 percent—with the most recent influenza vaccine, which had an efficacy of around 50 percent. In doing so, Nuzzo says, Doucleff efficiently conveyed just how impressive the Ebola vaccine’s efficacy really was.

Still, analogies are not without their drawbacks. Such comparisons are effective when they link numbers to the reader’s own background and experiences. But different readers might not always interpret those analogies in the same way. For example, says Visschers, comparing a particular health risk with the dangers of smoking may not have the intended impact on someone whose grandmother lived into her 90s smoking a pack a day.

Ultimately, these various strategies may work best when used together. “You’re going to have diverse thinkers” in your audience, says Haelle, and different ways of communicating about probability may work best for different people. But, she adds, be judicious: Throwing every strategy at the wall to see what sticks will “weigh down your piece, especially if you have multiple things to get across.”

 

Using Probabilities with Purpose

Even apparently subtle decisions made during the writing process, such as when to present concrete numbers instead of percentages or whether to use one analogy versus another, can have a big impact on how readers interpret a statistic. That means that communicating probabilities effectively is about much more than making numbers digestible: It can have a real impact on what readers take away from your article and the choices they make. “Too often, because people don’t understand numbers, they’re prone to making decisions that don’t align with their values,” Aschwanden says.

But journalists disagree about whether writers should think explicitly about these decisions when they write about probability. Aschwanden draws a stark line between her role—to communicate the numbers—and the ultimate judgments that readers will make. “My job as a journalist is to help people understand the numbers and how these numbers apply to them,” she says. “And then they can make the decisions about whether this [medical] test is important, whether this environmental thing is harmful or not, what sort of risk they’re willing to take on.”

When she writes about mammograms, for example, Aschwanden discusses both the risk of dying of breast cancer and the (much higher) risk of stressing out over a false-positive test result. But she doesn’t tell readers how to weigh these risks against each other. “Amid all this conflicting advice, patients have to make their own decisions,” she writes in an article for FiveThirtyEight. “After more than 15 years reporting on this issue, I’ve decided to skip mammograms altogether, [but] a smart friend of mine has examined the same evidence and come to the opposite conclusion.”

Haelle, on the other hand, cautions that it’s not entirely possible to separate probability communication from the way readers make their decisions. In a column for AHCJ, she explains that covering polls about whether people intend to get a COVID vaccine—which may not accurately predict vaccine uptake—can actually impact people’s willingness to do so by establishing new social norms. “The more people question vaccines, the more people question vaccines,” she wrote. “Vaccine hesitancy is contagious.” The very decision about what stories to cover is inevitably a decision that can influence people’s behavior.

Haelle says it’s not her job to promote public health, but she doesn’t want to undermine it either. “People are always going to take away something that we say in a certain way,” she says. “The question is, are we going to be deliberate about it or not?”

 

 

Grace Huckins is a freelance science writer and a doctoral candidate in neuroscience at Stanford University. Her work has appeared in Wired, Scientific American, and Popular Science, among other publications. In 2020, she worked as a AAAS Mass Media Fellow at Wired. She holds master’s degrees in neuroscience and gender studies from the University of Oxford, where she studied with the support of a Rhodes Scholarship. Follow her on Twitter @grace_huckins.
