Problems with Preprints: Covering Rough-Draft Manuscripts Responsibly

At the end of January, as the new coronavirus raged in Wuhan, China, and cases cropped up in other countries, a rough-draft version of a scientific manuscript went online in which researchers posited that the new pathogen might have acquired portions of its RNA sequence from HIV, perhaps making it more infectious. A Reuters analysis found that the study was tweeted more than 17,000 times and was picked up by at least 25 news outlets in the days that followed. But the takeaway from this manuscript was not related to its conclusions, which proved unfounded. The real lesson was about how reporters should be wary of such unvetted scientific reports: Just two days after the preliminary paper went online—after other scientists noted fundamental flaws—it was withdrawn.

The episode was an early indication that rough-draft scientific manuscripts, known as preprints, would feature heavily in this pandemic, as would concerns about their rigor. In a more recent example, an unvetted genetic analysis suggesting that a mutant version of the coronavirus had evolved to become more contagious was found to be overblown, and journalism watchdogs questioned its coverage. And in yet another case, a study using antibody tests on volunteers in Santa Clara County in California claimed that 50 to 85 times more people in the area had been infected with SARS-CoV-2 than previously thought, suggesting a lower risk of death from the virus. But the study, which had been posted as a preprint, was quickly panned by many epidemiologists who pointed to significant problems, including how the researchers had recruited individuals for testing. A revised version of the manuscript appeared on medRxiv two weeks later. It has yet to appear in a scientific journal.

Preprints are manuscripts that are made publicly accessible online after experiments are done but before undergoing a coordinated review by experts in the relevant field of study to ensure that the conclusions are supported by the data. The first preprint server, called arXiv and devoted to physics papers, was launched in 1991 and remains in use today. The rise of preprints in biology and medicine is more recent; the well-known life-sciences preprint servers bioRxiv and medRxiv were launched in 2013 and 2019, respectively. A paper published in April, still in preprint form itself, found 44 platforms for preprints that are biomedical and medical in scope. “The world is moving to a preprint-first world,” says Ivan Oransky, co-founder of Retraction Watch.

The rise of preprints in biology and health is staggering. Even though only around 2.3 percent of published papers indexed in the PubMed database appear in preprint form before publication, the number of preprints in the biological sciences has gone up more than threefold in the last two years, from around 2,500 per month to around 8,000 a month.

The COVID-19 pandemic has further boosted the visibility of preliminary, unvetted manuscripts. In a newly posted manuscript—itself a preprint that the researchers have not yet submitted for review—Jonny Coates, a postdoctoral researcher in immunology at Cambridge University, and his colleagues compared traffic metrics for all studies posted on bioRxiv and medRxiv between September 2019 and April 2020. Preprints pertaining to COVID-19, all of which were posted this year, were viewed more than 15 times as often as non-COVID-19 preprints. (Some preprint servers, including bioRxiv, flag manuscripts related to SARS-CoV-2 with a note reminding users that preprints are preliminary and should not be treated as conclusive.)

Media coverage of preprints has also risen markedly since the emergence of SARS-CoV-2. That increase is not surprising, says Lauren Morello, deputy health care editor at Politico. “The pandemic is accelerating a trend, but this was coming already,” she says.

But press coverage of preprints is fraught. Hastily conducted and reported scientific studies are an unfortunate hallmark of the current pandemic, as journalist Christie Aschwanden wrote recently in Wired. She says journalists should have their guard up. “Where I’ve seen reporters go wrong on this is when they sort of grab these [preprints] because they want to be first,” Aschwanden says. “We do have pressure to be first and to break news, and I think preprints seem to offer this shiny opportunity to do that. But most of the time that’s not really going to pan out.” Reporters, she says, might even benefit by assuming any given preprint is “probably a false lead.”

A Skeptical Eye

Perhaps the most central question for journalists is whether or not to even cover a preprint in the first place. In some cases, it might be worth covering a low-quality preprint that’s going viral on social media in order to shine a sobering light on its flaws. However, journalists need to be cautious about putting preprints in the spotlight. “Even if you are putting all of these qualifiers in and [saying,] ‘This is preliminary,’ people don’t always pay enough attention to that,” Aschwanden says. “By virtue of even covering it and making it a story you’re giving it attention that it wouldn’t otherwise have.” If a study is so weak that it warrants only critical coverage, she says, “then it might be a good decision not to give it more attention with a story.”

Reporting on preprints requires that journalists take special precautions to vet the research they present, says Camille Carlisle, science editor at Sky & Telescope magazine. She says journalists should check whether the preprint document indicates that it has been submitted to a journal. “If there’s no comment indicating that the paper is in the process of being published, then remember that it might be because the results aren’t worth publishing,” she says. “There are also scientists for whom the arXiv is essentially a brainstorming blackboard. Proceed with skepticism.”

That advice may hold especially true for preprints in the biomedical realm because of their potential consequences for public health, suggests Matt Davenport, a reporter at Chemical & Engineering News. Whereas he solicits comment from about three independent experts for stories about peer-reviewed papers and proceeds if he gets comment from one of them, Davenport’s bar is higher for preprints. “For a preprint, maybe I double that and then follow up with folks who don’t get back to me to find out exactly why,” he says.

Likewise, Aschwanden says she seeks comment from multiple independent sources who can assess whether a preprint has a reasonable study design and can confirm that the experiment’s goalposts were not moved during the course of the study. She also checks whether an unusually large number of participants dropped out of a clinical trial while it was in progress—and if so, why. Finally, Aschwanden recommends asking outside commenters to weigh in on the manuscript’s statistical robustness. She notes that the American Statistical Association can be an excellent resource for science reporters.

It’s also crucial to get context for a preprint—and when writing, to give readers a sense of the broader research landscape. That includes getting context about the scientists involved in the study. Anyone writing about preprints should “look at previous research the authors have published—both to check their credentials, and also to see how those findings may contribute to their latest work,” says Melody Schreiber, a freelance journalist and editor of a forthcoming nonfiction book on premature birth called What We Didn’t Expect.

Although scanning social media doesn’t replace speaking directly with independent sources, doing so can provide an initial picture of how other scientists have responded to a new preprint. (Some preprint platforms, including bioRxiv and medRxiv, even display, on each manuscript’s page, a rundown of how it has been mentioned by blogs and in social media posts.) But Aschwanden cautions against relying on online discussion to guide decisions about whether and how to cover preprints. “You have to be careful because there are so many competing interests there,” she says. When looking for outside comment on a paper, she says, “you want people who don’t have a dog in the fight.”

There’s no magic number of independent sources to consult on preprints before proceeding with a story. Oftentimes, it depends on how big a finding the manuscript describes. “The higher the stakes of the paper, the more sources you contact,” says Ed Yong, a staff writer at The Atlantic.

Sometimes, independent sources may have differing opinions when asked to comment about the value of a preprint. Reporters and editors have to decide how to handle such situations on a case-by-case basis, Morello says. In some cases, she says, one or two outside commenters may flag an important caveat to a study or even share serious reservations, but the study might still have news value for the general public and be worth covering.

Other times, those caveats might be deal-breakers. “Sometimes after hearing a bunch of folks say a study is good,” Morello says, “you talk to somebody who brings up a statistical objection and then you ask other people about it and they say, ‘Oh wait … that … eek!’ And then you might not end up writing the story.” Ultimately, covering contentious preprint studies requires journalists and their editors to make careful decisions on how to proceed—there’s no easy rule of thumb.

Communicating Caveats

Just as important as the decision about whether to cover a preprint manuscript is how the research is described to readers, both in the story itself and on social media. Any coverage of a preprint should clearly convey that the study hasn’t been vetted in the same way published papers have been. But Oransky cautions against simply saying that a study hasn’t undergone “peer review” because lay readers might not be familiar with that phrase.

“As shorthand, it’s better than nothing, but what I much prefer is to say no one has formally reviewed and critiqued this paper in [the] way a scientific journal might,” says Oransky, who also teaches medical journalism at New York University. Similarly, Aschwanden recommends using straightforward, accessible language, such as by writing that the work “has not yet been checked for errors.”

New Tools Put Spotlight on Preprints

Recently, some groups have tried creating resources to help nonscientists gain more context about preprints. In 2018, Coates and a group of about 100 volunteer collaborators, all early-career scientists, established a highlight service called Prelights to spotlight interesting preprints in the life sciences. The group has now added coronavirus research to its purview. In April, they debuted a website in which COVID-19 preprints with major shortcomings, such as inadequate sample sizes, are flagged in yellow. (A limitation of the project, as Coates notes, is that the team does not yet include any epidemiologists or infectious disease specialists.)

There are other similar efforts underway. As Hannah Thomasy reported for Undark, a new system for curating preprints, called Outbreak Science Rapid PREreview, allows academics to review outbreak-related preprints. The project, which is funded by the British charitable foundation the Wellcome Trust, has collected more than 60 reviews.

Even seemingly small wording choices make a difference. For example, describing a preprint document as a “manuscript” rather than a “paper” and referring to it as having been “posted” rather than “published” online helps underscore, for readers, that the document is still in a preliminary stage.

The same caution should apply to headlines. Morello suggests that, as with stories about peer-reviewed but preliminary studies, media outlets should “be careful that the headline matches the caution of the story.” For example, titles might include words like “preliminary analysis” and “suggests.”

The most careful reporting can be undermined by an incautious tweet, so how stories are framed on social media is also important. As Morello observes, some people don’t actually click on the stories they come across on social media. That may be why, as she’s noticed, some reporters make a habit of promoting preprint stories not in a single tweet, but in threads that provide context and note caveats that might otherwise be found only in the article text.

It can also be risky for reporters to tweet about preprints they haven’t reported on themselves, Yong says. That’s why he cautions journalists against tweeting about a preprint manuscript (or even a published paper) that they’ve found if they haven’t yet done reporting on the work. “To do so is functionally equivalent to writing a story without talking to anyone, which I think we can all agree is a bad idea,” he says. “There may have been a time when this practice was acceptable, and when journalists could use Twitter as a means of reporting—as a way of canvassing opinion, or testing ideas. But this is no longer that time.” Yong says that the stakes have become too high and the public is more vulnerable than ever to poorly vetted claims.

Peer-Reviewed Studies Also Require Caution

Amid the concern about unvetted preprints getting undue media attention, it would be easy to overcorrect, treating peer-reviewed papers as if they are flawless. As freelance journalist Wudan Yan wrote in a recent New York Times article on media coverage of preprints, published journal articles also sometimes receive exaggerated coverage. And Coates suggests that regarding peer review as the gold standard may cause some journalists to let their guard down and not critique published journal articles as carefully as they should. For that reason, he says, “published papers are potentially more dangerous.”

Penny Sarchet, news editor of New Scientist, holds a similar view. “We don’t automatically treat preprints like they’re that much lower in quality than peer-reviewed published papers,” she says. “We don’t believe peer-review to be a gold-standard guarantee that science holds up.” As she notes, many articles published in scientific journals don’t ultimately hold up when others try to replicate them. And, she says, there have also been “really shoddy” studies published on COVID-19 since the pandemic began.

How much manuscripts change as they go through the peer-review process remains uncertain. In March, Jeffrey Brainard reported in Science that one recent examination of 76 preprint papers, mostly in genetics and neuroscience, found that of the 56 manuscripts that were ultimately published, most underwent relatively few changes after peer review. But the study was small and is itself still only in preprint form.

Still, its findings mirror the experience of London-based geneticist and writer Adam Rutherford. Many of the papers he used in the research for one of his books were preprints about paleogenomics that had been posted on bioRxiv. “It occurred to me that all of them were subsequently published in mainstream journals, so I checked to see how much they had changed, and the answer was almost not at all,” Rutherford says.

There’s an important bias inherent in such analyses, though: They only consider preprints that eventually get published in peer-reviewed journals. But not every preprint makes it to publication. A 2019 analysis found that around two-thirds of preprints posted between 2013 and 2017 were later published. And some studies are even retracted while still in the preprint stage.

If a study does change considerably between the preprint stage and publication after peer review, Oransky says, news outlets should include “an update at the top to explain what changed.” And if a study is retracted, he says, “I’d add an update at the top noting that, and saying why it was retracted.”

Sarchet says the stakes are obviously very high at the moment for preprints, given the global concern about the current pandemic. “We’re being a lot more careful with the COVID crisis because there’s obviously much more potential there for people to take them [the preprints] very seriously and apply them to their lives,” she says. But, she says, it’s also important to share preliminary findings with readers if those discoveries are robust and intriguing, to give them a sense of the evolving nature of knowledge. Understanding how scientists “are exploring new ideas and testing those out” is an important part of covering science, she says. “That’s why we’ve always covered preprints.”

Roxanne Khamsi is a science journalist based in Montreal, Canada. Her work has appeared in publications such as Scientific American, Wired, The Economist, and The New York Times Magazine. Follow her on Twitter at @rkhamsi.