“One of the big things I hear from editors is this fear: ‘Oh my God, this is data journalism, I’ve never done it, I don’t know what to do, the story would take too long.’” This comment, from data journalism trainer MaryJo Webster, reflects a common dilemma, especially in smaller newsrooms, where some younger reporters have data skills but collide with a “fear of rows-and-columns” at the assignments desk.
Webster, data editor at the Minneapolis Star Tribune (in the northern US state of Minnesota), has developed a guide designed to show these editors an easy way forward. It makes three points: first, that they don’t need advanced skills to champion data-driven coverage; second, that simple techniques can open a new world of unique stories; third, and most important, that there’s no need for fear – because, she says, if you know what you need from human sources, then you’ll know everything you need from data sources too.
While Webster acknowledges that “data is bad for the ‘why?’ questions,” she says editors without data skills can otherwise apply their traditional source approaches in almost every case – right up to treating data just like human quotes in story copy. (Editors know not to stack too many quotes together; to check them for accuracy; to paraphrase when the comment is complex – and it’s exactly the same, she says, for numbers.)
In a session at the NICAR22 data journalism conference – organized by Investigative Reporters & Editors (IRE) – Webster laid out steps to get past the “mystification” of the craft and develop a new way to think about datasets as unique, friendly sources.
NICAR stands for the National Institute for Computer-Assisted Reporting, IRE’s center that began in 1989 and played a key role in introducing data journalism to the world.
“We need every editor thinking: ‘Hey, maybe there’s a data opportunity here,’” said Webster. She also pointed out that data-novice editors’ unfamiliarity with numbers can be an advantage for the story, as they know how to present the evidence for audiences likely to have the same layperson’s understanding.
She says any editor can think of the story value of data in three ways:
- As a guide for where to send your reporters and photographers. Webster says basic data – such as municipal data on a COVID-19 spike in a certain neighborhood, or the proportion of minority-owned shops damaged in riots – is an excellent and underused staff deployment guide for assignment editors. “‘Who should we interview? What part of the city should we focus on?’ – even if that’s all that data does for a story, that’s good,” she noted.
- As the spine of a traditionally-reported investigation. “This is the kind that holds up all the other traditional reporting,” she said. As an example, she pointed to evidence from the Star Tribune’s 2018 sexual assault justice investigation, in which a self-built dataset proved that police and prosecutors in the state had handled sex assault cases poorly, or not at all.
- For context – where data defines the problem, or helps the audience understand the issue. “This is probably the most common use, where the story could probably be done without data, but you get something more: the examples, the detail,” she noted. “Data is good for the who, what, when, and where.” Visualizations are particularly important in this role. For instance, listing different emergency response times between city neighborhoods – or even using graphs – is unlikely to show the true policy failure, which could be discrimination against minority communities. Instead, a map of the city, color-coded for the data ranges, will likely show what’s really going on.
The Story Assignment Process
Webster outlines several optional steps in her guide – from conducting your own surveys to asking reporters to bulk-send freedom of information (FOI) requests – but, at the NICAR session, she emphasized an even more important step.
Brainstorm data opportunities early – right at the story conception stage. “This is especially important when meeting with beat reporters, and brainstorm[ing] is the time to think about those opportunities, and the datasets to ask for,” she said. “Some stories just seem like natural fits. Other times, you realize the information you need is not readily available elsewhere.” She identified several story types from those brainstorm sessions that always warrant a data search and analysis request for reporters:
- Stories that need to measure something. Ask reporters to seek datasets if audiences will want to know how big a deal an issue is, or if questions arise at the news meeting, like “How does the phenomenon compare to other places? How has it changed over time? Where does it happen most often?” For instance, one Star Tribune data story found that injuries from car accidents involving pedestrians were more severe in suburbs, and that only 25 drivers were charged with crimes in more than 3,000 pedestrian crashes. The reporter tracked down driver names from police report codes, then searched for those names in court records, and built an in-house database of pedestrian crashes.
- The larger story behind a single or “gee-whiz” number that reporters mention. For example, if a reporter discovers from a source that building fires are “up 25%” over the previous year, editors can ask them to seek datasets that answer questions like: “Were these mainly house fires? Was there a cause, like smoking or space heaters, driving the increase?”
- Stories that test whether a public promise was kept, or if a program worked. “Money was spent on something. There should be data to show whether it went where it was supposed to, or was wisely spent,” Webster explained.
- “Is it really true?” and he-said-she-said stories. Data can test popular myths or theories. “For example, we asked, ‘Is the claim true that schools in Minneapolis and St. Paul had re-segregated, after extensive integration efforts in the ‘90s?’” she said. “The answer was: Yes, they did.”
- Serious ongoing stories. “When you have that issue that just seems on repeat, all the time – at some point, you have to step back and do the big picture story,” Webster noted. “The tough part about these is you’ll have a tough time getting real-time data, but data does help move beyond the [everyday] stories.”
Webster also cautions editors against asking reporters to find a single datapoint “to just drop into a story or graphic” as a kind of token statistic. “The better use is to ask: ‘What are the questions we can get from data that we maybe can’t get elsewhere?’, or to support what people are telling us,” she said.
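As a rough illustration of looking behind a single “gee-whiz” number, the sketch below breaks an overall “up 25%” figure into categories to see which one drives the increase. The fire counts, the category names, and the `percent_change` helper are all hypothetical, invented for this example; they do not come from Webster’s guide or from any real dataset.

```python
def percent_change(previous, current):
    """Return the percent change from a previous count to a current count."""
    if previous == 0:
        raise ValueError("percent change is undefined when the previous value is 0")
    return (current - previous) * 100 / previous


# Hypothetical building-fire counts by type: (last year, this year).
# These numbers are invented for illustration only.
fires = {
    "house": (200, 270),
    "apartment": (120, 126),
    "commercial": (80, 104),
}

total_previous = sum(prev for prev, _ in fires.values())
total_current = sum(curr for _, curr in fires.values())

# The headline number a source might quote.
overall_change = percent_change(total_previous, total_current)

# How much each category changed, and its share of the total increase.
total_increase = total_current - total_previous
breakdown = {
    kind: {
        "change_pct": percent_change(prev, curr),
        "share_of_increase_pct": (curr - prev) * 100 / total_increase,
    }
    for kind, (prev, curr) in fires.items()
}

print(f"All building fires: up {overall_change:.0f}%")
for kind, row in breakdown.items():
    print(
        f"{kind}: up {row['change_pct']:.0f}%, "
        f"{row['share_of_increase_pct']:.0f}% of the total increase"
    )
```

In this made-up data, house fires account for most of the increase, which is the kind of follow-up question an editor might then send a reporter to answer with real records.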
Checklist: Is a Data-driven Story Worth the Time?
“Many editors love to say: ‘Oh, but it’s going to take so much time!’” Webster said, referencing the extra reporting often needed for data stories. “But maybe it’s worth it. There are questions to check.”
- Is it a watchdog story with the potential to make an impact or drive change?
- Can we write a quick news story now, and run the data-driven piece later?
- What are the likely time horizons for access to the key information and documents?
- Is it a pressing issue on your reporter’s beat?
- Is it likely to be a mid-size or large enterprise piece?
- Is the analysis already being done by someone else – so you could save time?
The Data Is In. Now What?
- Editors should start the vetting and bullet-proofing process as soon as the data starts arriving – not when the story is filed.
- Watch out for, and triple-check, too-good-to-be-true data points.
- Treat each dataset like a whistleblower: “What do we know about it? How do we know it’s reliable? How did it come to us?”
- Ask whether the data findings differ from evidence from traditional reporting. The data could be wrong.
- Identify the “star” finding: the newsiest takeaway.
- Ask the data analysis reporter to simplify and explain the findings to you as a layperson.
Tips for Editing with Numbers
- Refresh your knowledge of understandable numbers and ratios with a good guide, like Sarah Cohen’s “Numbers in the Newsroom.”
- Treat numbers like quotes in copy – never in bunches, only where needed, and paraphrasing where necessary.
- Do what you ask your reporters to do: Don’t be afraid to ask dumb questions about the numbers you’re presented with.
- Avoid decimals – especially for things that can’t be divided in reality, like a person.
- Don’t make your readers do the math.
- Use familiar analogies where possible.
- Use number formats that audiences can comprehend – like “7 out of 10,000” rather than “0.07%”, or use spending per person, rather than aggregate numbers for an entire country.
- Use visualizations like bar graphs when there are too many numbers to list.
- Pick a “star” number, and focus on that high in the story.
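Two of the tips above, on using formats like “7 out of 10,000” and spending per person, come down to simple arithmetic. The sketch below is one possible way to do it, not a method from Webster’s guide: the helper names `as_out_of` and `per_person` are hypothetical, and the budget and population figures are invented for illustration.

```python
from fractions import Fraction


def as_out_of(share, max_base=1_000_000):
    """Phrase a share between 0 and 1 as 'N out of B', where B is a power of ten."""
    # Going through str() keeps 0.0007 as exactly 7/10000 rather than
    # its slightly-off binary floating-point value.
    exact = Fraction(str(share))
    if not 0 < exact <= 1:
        raise ValueError("share must be greater than 0 and at most 1")
    base = 10
    while base <= max_base:
        count = exact * base
        if count.denominator == 1:  # a whole number of cases
            return f"{count.numerator:,} out of {base:,}"
        base *= 10
    # No whole number found: round at the largest base and say so.
    return f"about {round(exact * max_base):,} out of {max_base:,}"


def per_person(total, population):
    """Return an aggregate amount divided by the number of people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total / population


# 0.07% written as a share is 0.0007.
print(as_out_of(0.0007))

# Hypothetical budget and population, invented for illustration only.
print(f"${per_person(50_000_000, 400_000):,.0f} per person")
```

A percentage has to be divided by 100 before it is passed to `as_out_of`, so 0.07% goes in as 0.0007.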
“Equating data journalism with traditional journalism makes it less like this magical thing going on in the corner of the newsroom, and makes it something they already know,” Webster explained. “And editors do already know exactly how to judge the human sources their reporters find. It’s exactly the same with data.”
Rowan Philp is a reporter for GIJN. Rowan was formerly chief reporter for South Africa’s Sunday Times. As a foreign correspondent, he has reported on news, politics, corruption, and conflict from more than two dozen countries around the world.