The use of data has become an integral part of investigative journalism. Increasingly, reporters need to know how to obtain, clean and analyze the growing archive of digitized information.
Here is a list of resources to get you started, but we want to keep updating our community with the best resources available. Do you know of a great data tutorial we haven’t listed, perhaps in a language other than English? Help us keep this resource guide comprehensive by sending your favorite resource to: firstname.lastname@example.org. ¿Habla español? For resources in Spanish, click here.
Where It Came From: To know where we’re going, it helps to know where we’ve come from. Here’s a great history of data journalism, Fifty Years of Journalism and Data: A Brief History, tracing the field’s origins from the use of big mainframe computers in the 1960s to computer-assisted reporting in the ’90s to the current boom in data journalism. Written by GIJN’s own Brant Houston, author of Computer-Assisted Reporting: A Practical Guide, now in its fourth edition.
Best Practices for Data Journalism is a 2018 guide written by Kuang Keng Kuek Ser, an award-winning digital journalist, and produced by the Media Development Investment Fund. It covers setting up and using data teams as well as tools, techniques and presentation of data journalism.
The National Institute for Computer-Assisted Reporting, a project of Investigative Reporters and Editors, launched in 1989 to train reporters around the world on how to use data for investigations. In addition to “boot camps” and in-office training, NICAR offers analysis services and a data library, and hosts the original annual conference on computer-assisted reporting.
DataViz.tools is “a curated guide to the best tools, resources and technologies for data visualization,” with 21 categories that include mining, cleaning, scraping, and interactive story-telling.
The Poderomedia Foundation, in partnership with the University Alberto Hurtado, published the Manual de Periodismo de Datos Iberoamericano (Latin American Handbook of Data Journalism), with tips and tutorials on data mining, deep web searches, data visualization, and more.
The International Consortium of Investigative Journalists provides a selection of video tutorials on basic Excel functions, as well as how to background a person or company, or find federal court documents in the U.S.
Hacks/Hackers is a global movement bringing together computer programmers and investigative journalists to tell powerful data-driven stories. Trainings are offered through regional chapters.
The Investigative Dashboard provides a collection of the most useful public data sources on corporate ownership and more.
The Data Journalism Handbook is an international, collaborative effort involving dozens of data journalism experts. The free guide is available for download in English, French, Georgian, Russian, and Spanish.
The Open Data Handbook discusses the legal, social and technical aspects of open data, with case studies and handy tips.
Codementor offers online tutorials, for a fee, on data science techniques in languages such as Python and R.
KDnuggets offers a wide variety of tutorials focusing on data mining, analytics and data science, including 3 Viable Ways to Extract Data from the Open Web, 4 lessons for Brilliant Data Visualisation, Mining Twitter Data with Python and Text Mining 101: Topic Modeling.
Chandoo is a blog started in 2007 that aims “to make you awesome in excel and charting”.
Visualization & Mapping
Flowing Data is run by statistician Nathan Yau, author of Data Points: Visualization that Means Something and Visualize This: The FlowingData Guide to Design, Visualization, and Statistics.
Visualisingdata.com offers a directory of compelling infographics, how-to info, and more.
Your Friendly Guide to Colors in Data Visualisation is a 2018 blog post by Lisa Charlotte addressing color choice, including links to useful tools.
Gustavo Faleiros created JEO, a WordPress theme for launching geodata-based sites. It allows news organizations, bloggers and NGOs to publish news stories as layers of information on digital maps.
The Data Visualisation Catalogue is an ongoing project to “help you find the right data visualization method for your data”.
Data Viz Done Right, a blog that highlights data viz best practices around the web.
Google Maps Mania, a good blog for following the development of digital cartography (not only Google products).
Visual Loop, a website that displays “the world’s best infographics and data visualizations.”
Helpmeviz aims to help people with everyday data visualizations and is designed to facilitate discussion, debate, and collaboration within the data visualization community.
Coursera offers a number of online statistics courses including:
- Passion-Driven Statistics through Wesleyan University
- Statistics: Making Sense of Data, offered through the University of Toronto
- Statistics One, offered through Princeton University
- Introduction to Statistics, offered through the University of California Berkeley
Recommended Books on Statistics:
- Damned Lies and Statistics, Joel Best
- Data Analysis for Politics and Policy, Edward Tufte
- Designing Social Inquiry, by King, Keohane, and Verba
- The Drunkard’s Walk: How Randomness Rules Our Lives, Leonard Mlodinow
- How to Lie with Statistics, Darrell Huff
- Naked Statistics: Stripping the Dread from the Data, Charles Wheelan
- The Signal and the Noise, Nate Silver
- Thinking, Fast and Slow, by Daniel Kahneman
Data & Technology Blogs
Data Blog, the Guardian’s blog on computer-assisted reporting
Nacion Data, Spanish-language data journalism blog of the Argentinian daily La Nación.
Online Journalism Blog, by the UK’s Paul Bradshaw, covers data journalism, citizen journalism, blogging, vlogging, and more.
Open Knowledge Foundation, a global movement to open up knowledge around the world and see it used and useful.
Toledol, a Portuguese-language blog about computer-assisted reporting.
Computational Reporting, all about data mining.
Dajore, data journalism research.
Driven by Data, how data journalism is sifting through the facts.
Vis4.net, random thoughts on information visualization and data journalism.
Reporter’s Lab, Duke University’s blog on tools, techniques and research for public affairs reporting.
Tow Center for Digital Journalism, Columbia’s blog on how technology is changing journalism, its practice and its consumption.
FiveThirtyEight, founded by renowned statistician Nate Silver.
The Upshot, a data journalism site by the New York Times dedicated to politics, policy and economic analysis.
Washington Post Information Graphics, a blog that gives an overview of the data journalism articles produced by the newspaper.
NPR Visuals Team, a blog that focuses on the methodology behind data journalism projects and that also shares open tools.
Source blog, a Mozilla/Open News project that offers guides, tutorials and regular features by top data journalists.
Storybench, a collaboration between the Media Innovation track at Northeastern University’s School of Journalism and Esquire magazine.
Getting Started in Data Journalism is a manual published by the Balkan Investigative Reporting Network in Albania which aims to introduce journalists to data-driven reporting techniques that are essential to contemporary investigative journalism.
Computer-Assisted Reporting: A Comprehensive Primer, by Fred Vallance-Jones and David McKie
Computer-Assisted Reporting: A Practical Guide, the E-version by Brant Houston
Computer-Assisted Research: Information Strategies and Tools for Journalists, by Nora Paul and Kathleen A. Hansen
The Data Journalism Handbook is an international, collaborative effort involving dozens of data journalism experts. The free guide is available for download in Arabic, English, French, Georgian, Russian, and Spanish.
Finding Stories in Spreadsheets, by Paul Bradshaw
Mapping for Stories: A Computer-Assisted Reporting Guide, by Jennifer LaFleur and Andy Lehren
Scraping for Journalists (second edition), by Paul Bradshaw. This book introduces a range of scraping techniques.
NICAR hosts the original annual conference on computer-assisted reporting, which is attended by hundreds, and also puts on data-specific boot camps.
The International Journalism Festival in Perugia, Italy, includes a School of Data Journalism training.
The Global Investigative Journalism Conference, held every two years, hosts a broad range of data-specific trainings.
Ghana Databootcamp trains participants in Ghana on how to locate, obtain and analyze public data on the extractive industries.
Data Journalism UK is a new annual conference organized by Birmingham City University’s Paul Bradshaw.
The European Data and Computational Journalism Conference aims to bring together industry, practitioners and academics in the fields of journalism and news production and information, data, social and computer sciences.