Editor’s Note: When Computer-Assisted Reporting was first published in 1995, data journalism was still in its relative infancy. But a growing legion of journalists were discovering the importance of using computing power to gather, analyze and present stories. Author Brant Houston was among the pioneers, doing data-fueled reporting projects at U.S. newspapers during the 1980s and early ’90s. Houston joined Investigative Reporters and Editors in 1994 to direct what became the National Institute of Computer-Assisted Reporting, expanding NICAR from a handful of boot camps to 50 workshops a year. Interest in CAR (computer-assisted reporting) exploded, first in the United States and then overseas. Take the latest NICAR conference, for example: nearly 1,000 attendees went to the annual gathering devoted to data journalism (check out the slides, links, and tutorials here).
Houston likes to call investigative reporting the R&D department of journalism. The media’s now widespread embrace of data journalism has certainly proved him right on that point. Over 20 years, Computer-Assisted Reporting has helped train and educate thousands of reporters while helping usher in an extraordinary new era of data-driven journalism. With this newly revised, fourth edition, Houston has now expanded on his previous work. We at GIJN are pleased to reprint the introduction to this latest look at how to use the tools of the trade.
It is in computer-assisted reporting where the real revolution is taking place, not only on the big analytical projects, but also in nuts-and-bolts newsgathering. New tools and techniques have made it possible for journalists to dig up vital information on deadline, to quickly add depth and context.
—“We’re All Nerds Now,” Joel Simon and Carol Napolitano, The Columbia Journalism Review (1999)
CAR gives journalists the opportunity to dig for truth in data, and the comparative analysis that a computer can do often reveals pertinent questions. What reporters are able to learn from using CAR provides readers with knowledge and insights that can cut through the clutter of opinionated noise and celebrity obsession. It also can allow even relatively small news operations to delve into problems affecting the global community, yet speak to readers and viewers right around the block.
—“The Benefits of Computer-Assisted Reporting,” Jason Method, Nieman Reports (2008)
The words in the first quote were written as the 20th Century was coming to an end, but they remain true as we move deeper into the 21st Century. The words in the second quote, nearly a decade later, show how crucial computer-assisted reporting has become in creating credibility and in recognizing the globalization of news. But there is still a revolution going on in journalism when it comes to data, both at basic and extraordinarily high levels.
In the past decade, software for analysis has continued to become much simpler to use. An overwhelming amount of data is now online and easy to download. Storage space is immense on hard drives, flash drives, and in the “Cloud.” The computing power on a laptop, tablet, or mobile phone dwarfs the power available only a few years ago. The ability to visualize data for better understanding and analysis has become pro forma. Furthermore, a new generation of computer programmers has joined traditional journalists to tackle the problems of capturing data from the Web, cleaning and organizing it, and creating fascinating presentations to be shared with the public and to encourage citizen participation and analysis.
At the same time, many fundamental truths remain the same. Databases are still created by people, and thus they naturally have omissions and errors that people have made and that must be noted and corrected. Every database also is a slice in time and thus is outdated the moment it is acquired and used.
Also remember that a database alone is not a story. Instead, it is a field of information that needs to be harvested carefully with insight and caution. It needs to be compared with and augmented with observation and interviews.
More important than ever is determining the accuracy of a database before using it. Equally important is careful analysis of the data, since one small error can result in monstrously wrong conclusions. The idea of uploading data to the Web and hoping the public or volunteers will consistently produce reliable analysis of it has not held up. In fact, journalists, not advocates, are needed more than ever to deliver a well-researched understanding of information and data, and to tell a compelling story using data. Yet, despite changes in technology and the availability of mega-data, some scenarios have not changed.
As you will learn in this book, the techniques described in these scenarios are known as computer-assisted reporting, also referred to as CAR, and they are a part of everyday journalism. Journalists use these and other techniques for daily reporting, reporting on the beat, and for the large projects that win Pulitzer Prizes.
Computer-assisted reporting does not refer to journalists sitting at a keyboard writing stories or surfing the Web. It refers to downloading databases and doing data analysis that can provide context and depth to daily stories. It refers to techniques of producing tips that launch more complex stories from a broader perspective and with a better understanding of the issues. A journalist beginning a story with the knowledge of the patterns gleaned from 150,000 court records is way ahead of a reporter who sees only a handful of court cases each week.
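To make the court-records example concrete, here is a minimal sketch in Python of the kind of pattern-finding described above. Everything in it is an assumption made for illustration: the column names (judge, charge, outcome), the six sample rows, and the function name dismissal_rates are hypothetical and do not come from any real court database. A real analysis would read a downloaded file with many thousands of rows.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a downloaded court-records file.
# Column names and values are invented for illustration only.
SAMPLE_CSV = """case_id,judge,charge,outcome
1,Judge A,theft,dismissed
2,Judge A,theft,convicted
3,Judge B,theft,convicted
4,Judge B,assault,convicted
5,Judge A,assault,dismissed
6,Judge B,theft,convicted
"""


def dismissal_rates(csv_text):
    """Return a dict mapping each judge to the share of cases dismissed."""
    totals = Counter()
    dismissed = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        judge = row["judge"].strip()
        totals[judge] += 1
        # Normalize case and whitespace, since hand-entered data is rarely uniform.
        if row["outcome"].strip().lower() == "dismissed":
            dismissed[judge] += 1
    return {judge: dismissed[judge] / totals[judge] for judge in totals}


rates = dismissal_rates(SAMPLE_CSV)
for judge, rate in sorted(rates.items()):
    print(f"{judge}: {rate:.0%} of cases dismissed")
```

A gap between judges in a tally like this is a tip rather than a finding. It still has to be checked against the underlying records and tested in interviews before it becomes a story.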
Computer-assisted reporting doesn’t replace proven journalistic practices. It has become a part of them. It also requires greater responsibility and vigilance. The old standard of “verify, verify, verify” that one learns in basic reporting classes becomes ever more critical. “Healthy skepticism” becomes ever more important. The idea of interviewing multiple sources and cross-referencing them becomes ever more crucial.
“Computers don’t make a bad reporter into a good reporter. What they do is make a good reporter better,” Elliot Jaspin, one of the pioneers in computer-assisted reporting, warned three decades ago. Many practicing journalists have sought training in the past 20 years and have become proficient in the basic skills of computer-assisted reporting. They have overcome computer and math phobia, and they now put these skills to use on a daily basis. And this has led to more precision and sophistication in their reporting.
To quote Philip Meyer, a pioneer in database analysis for news stories, “They are raising the ante on what it takes to be a journalist.” Aiding in the progress and acceptance of these skills has been the proliferation of the Web and social media, the development of inexpensive and easy-to-use computers and software, and the increased attention to the value of data and techniques of analysis in newsrooms.
Computer-assisted reporting is no longer a sidebar to mainstream journalism. It is essential to surviving as a journalist in the twenty-first century. The tools of computer-assisted reporting won’t replace a good journalist’s imagination, ability to conduct revealing interviews, or talent to develop sources. But a journalist who knows how to use computers in day-to-day and long-term work will gather and analyze information more quickly, and develop and deliver a deeper understanding. The journalist will be better prepared for interviews and be able to write with more authority. That journalist also will see potential stories that would have never occurred to him or her.
The journalist also will achieve parity with politicians, bureaucrats, and businessmen who have enjoyed many advantages over journalists simply because they had the money and knowledge to use databases and digital information before journalists did. Government officials and workers have long been comfortable entering information into computers and then retrieving and analyzing it. Businesses, small and large, routinely use spreadsheet and database software. Advocacy groups frequently employ databases to push their agendas.
Without a rudimentary knowledge of the advantages and disadvantages of data analysis, it is difficult for the contemporary journalist to understand and report on how the world now works. And it is far more difficult for a journalist to do meaningful public service journalism or to perform the necessary watchdog role.
For years, journalists were like animals in a zoo, waiting to be fed pellets of information by keepers who were happy for journalists to stay in their Luddite cages. But a good journalist always wants to see original information, because every time other people select or sort that information, they can add “spin” or bias, which can be tough to detect. Computer-assisted reporting can help prevent that from happening.
Many journalists and journalism students now learn the basic tools of computer-assisted reporting because they realize that it is the best way to get to the information, since most governmental and commercial records are now stored electronically. Despite security concerns, there is still a mind-boggling number of databases on government and international Websites. So without the ability to deal with electronic data, a journalist is cut off from some of the best and least tainted information. The old-fashioned journalist will never get to the information on time or, worse, will be brutally trampled by the competing media.
For a journalist or journalism student, this knowledge also is crucial in the competition to get a good job. At many news organizations, an applicant who has these skills, which go far beyond the ability to surf the Web, gets his or her résumé moved to the top of the stack.
A journalist does not have to be a programmer or someone who knows software code, although that also can make a huge difference. A journalist who can use a spreadsheet or database manager is free to thoroughly explore information, reexamine it, and reconsider what it means in relation to interviews and observations in the field. The journalist can take the spin off the information and get closer to the truth. A journalist may not be a statistician, but a good journalist knows enough about statistics to know how easy it is to manipulate them or lie with them. In the same way, if a journalist understands how data can be manipulated, he or she can better judge a bureaucrat’s spin on the facts or a government’s misuse of a database.
Journalists have found, too, that if they let a person whose job is only to process data do the analysis, nuances or potential pitfalls of the data may be missed. A data programmer also does not necessarily think like a journalist; what may be significant for the journalist may seem unimportant to the programmer. Using a data programmer to do all the work is like asking someone else to read a book for you.
The conscientious journalist also does not want to fall into a cycle of asking for a report in some frozen digital format, studying the report, coming up with more questions, and then asking for another report. Why get into a lengthy back-and-forth when you can engage in a rapid, multidimensional conversation with the data on your computer screen?
Most important, computer-assisted reporting is at the heart of public service journalism and of vigilant daily reporting. This is true whether writing about education, business, government, environmental issues, or any other topic.
This excerpt is from the just-released fourth edition of Computer-Assisted Reporting: A Practical Guide by Brant Houston, and is reprinted by permission of Routledge.
Brant Houston (@branthouston) is the Knight Chair in Investigative Reporting at the University of Illinois at Urbana-Champaign. He is board chair of the Global Investigative Journalism Network and oversees the community news project CU-CitizenAccess.org. From 1997 to 2007, he served as executive director of IRE. He is author of the newly revised Computer-Assisted Reporting.