Data Journalism
The use of data has become an integral part of investigative journalism. Increasingly, reporters need to know how to obtain, clean and analyze the growing archive of digitized information.
See the presentations on data journalism made at the IJAsia18 conference here.
Here is a list of resources to get you started, but we want to keep updating our community with the best resources available. Do you know of a great data tutorial we haven’t listed, perhaps in a language other than English? Help us keep this resource guide comprehensive by sending your favorite resource to: hello@gijn.org. ¿Habla español? For resources in Spanish, click here.
Key Resources
The Data Journalism Handbook 2, revised and expanded, was published in 2019. It is available online in English and will be translated into Spanish, French and German. Edited by Jonathan Gray and Liliana Bounegru of the Public Data Lab, the 10 chapters focus on: “What is data journalism? What is it for? What might it do? What opportunities and limitations does it present? Who and what is involved in making and making sense of it?” Update notices come via email. The first Data Journalism Handbook, issued in 2012 and still available online, includes more how-to material. It is available in Arabic (PDF), Azerbaijani, Chinese, English (PDF), French, Greek, Japanese, Spanish and Ukrainian.
Where It Came From: To know where we’re going, it helps to know where we’ve come from. Here’s a great history of data journalism, Fifty Years of Journalism and Data: A Brief History, tracing the field’s origins from the use of big mainframe computers in the 1960s to computer-assisted reporting in the ’90s to the current boom in data journalism. It was written by GIJN’s own Brant Houston, author of Data for Journalists: A Practical Guide for Computer-Assisted Reporting (5th Edition).
Best Practices for Data Journalism is a 2018 guide written by Kuang Keng Kuek Ser, an award-winning digital journalist, and produced by the Media Development Investment Fund. It covers setting up and using data teams as well as tools, techniques and presentation of data journalism.
The National Institute for Computer-Assisted Reporting (NICAR), a project of Investigative Reporters and Editors, was launched in 1989 to train reporters on how to use data for investigations. It holds frequent bootcamps and other training sessions on data journalism, as well as an annual conference. NICAR’s website has a collection of video tutorials on mapping, visualization, data and other online journalism tools, a description of Data Wrangling and Analysis tools, and a library of US databases. For members, there are tipsheets and additional materials, including practice datasets.
The Poynter Institute offers a Digital Tools section featuring a Digital Tools Catalog and a one-hour digital tools webinar tutorial. Sign up for a digital tools newsletter.
Getting Started with Data Journalism, a video with Aron Pilhofer (2017).
Quick Guide to Data Journalism, a 2016 “8-step guide to becoming a data journalist, complete with tools, resources, and tips” by Karlijn Willems of Data Camp.
Nils Mulvad, a co-founder and board member of the Global Investigative Journalism Network, wrote this article for GIJN in February 2015 about free or nearly free data tools.
Datajournalism.com offers a collection of resources for journalists who want to use data.
DataViz.tools is “a curated guide to the best tools, resources and technologies for data visualization,” with 21 categories that include mining, cleaning, scraping, and interactive story-telling.
Periodismo de Base de Datos provides tutorials and resources on data journalism for Spanish-speaking reporters.
The Poderomedia Foundation, in partnership with the University Alberto Hurtado, published the Manual de Periodismo de Datos Iberoamericano (Latin American Handbook of Data Journalism), with tips and tutorials on data mining, deep web searches, data visualization and more.
Arab Reporters for Investigative Journalism offers this brief introduction to data journalism (2019) (in Arabic and English).
The International Consortium of Investigative Journalists provides a selection of video tutorials on basic Excel functions, as well as how to background a person or company, or find federal court documents in the U.S.
The International Journalists’ Network maintains a blog of the latest trainings, tools, and resources for data journalists.
Hacks/Hackers is a global movement bringing together computer programmers and investigative journalists to tell powerful data-driven stories. Trainings are offered through regional chapters.
The Investigative Dashboard provides a collection of the most useful public data sources, on corporate ownership and more.
Open Data in Europe and Central Asia produced a Data Journalism Manual (in English and Russian) with modules on understanding data, data visualization and data-driven stories.
Training
Data Journalism, a series of training sessions from the Google News Initiative.
Getting started with data journalism, a video tutorial series by Alastair Otter on Media Hack (2017).
Introduction to Data Journalism, a series of tutorials by Workbench (Also see related tab “Tutorials.”)
Datajournalism.com has several courses available, on Python, R, and “When to Hire a Data Journalist.”
Codecademy offers a series of free interactive trainings on the basics of HTML, CSS, JavaScript, Python, Ruby, and PHP.
Massachusetts Institute of Technology offers a series of free online courses in computer programming with Python, Java, and C++.
Michael Hartl publishes an open-source textbook on how to program with Ruby on Rails.
ProPublica practices data journalism, and teaches it. See videos of five of the lessons from ProPublica’s Data Institute: Introduction to Code, How Websites Work, HTML, Basic CSS and CSS Classes. Also posted is a detailed outline of the curriculum for the Institute. ProPublica also sells data at its Data Store.
Online Journalism published an introduction to using QuickCode to obtain data from the web.
Codementor offers online tutorials, for a fee, on data science techniques in languages such as Python and R.
Analytics Vidhya offers a tutorial to learn the basics of R programming for data science, which covers data analysis and data manipulation.
KDnuggets offers a wide variety of tutorials focusing on data mining, analytics and data science, including 3 Viable Ways to Extract Data from the Open Web, 4 Lessons for Brilliant Data Visualisation, Mining Twitter Data with Python and Text Mining 101: Topic Modeling.
Data Analysis
School of Data offers a series of tutorials – from finding datasets, to basic Excel skills and using the results to tell a story.
Mr. Excel is a site dedicated to tips and tricks for Excel.
Sandhya Kambhampati shares some key recommendations to start creating your own database.
Troy Thibodeaux, a data editor at the Associated Press, offers a “Gentle Introduction to SQL.”
Visualization & Mapping
Edward Tufte’s books and courses are industry standards for visualizing data.
Flowing Data is run by statistician Nathan Yau, author of Data Points: Visualization that Means Something and Visualize This: The FlowingData Guide to Design, Visualization, and Statistics.
Fundamentals of Data Visualization by Claus O. Wilke is being published in 2019 by O’Reilly Media, Inc. It “is meant as a guide to making visualizations that accurately reflect the data, tell a story, and look professional.” See online preview.
Visualisingdata.com offers a directory of compelling infographics, how-to info, and more.
Your Friendly Guide to Colors in Data Visualisation, a 2018 blog post by Lisa Charlotte Rost addressing color choice, including links to useful tools.
Esri offers a series of free online courses for those interested in mapping with ArcGIS.
Gustavo Faleiros created JEO, a WordPress theme for launching geodata-based sites. It allows news organizations, bloggers and NGOs to publish news stories as layers of information on digital maps.
Peter Aldhous put together a primer on using Excel’s free social network plugin, NodeXL.
The Data Visualisation Catalogue is an ongoing project to “help you find the right data visualization method for your data.”
Data Viz Done Right, a blog that highlights data viz best practices around the web.
Visual Loop, a website that displays “the world’s best infographics and data visualizations.”
Helpmeviz aims to help people with everyday data visualizations, designed to facilitate discussion, debate, and collaboration from the data visualization community.
Here you can check out a list of visualization tools from IJAsia 2016.
Statistics
OpenIntro hosts this free textbook on statistics.
Knight Digital Media Center provides free, two-day online courses.
Coursera offers a number of online statistics courses including:
- Passion-Driven Statistics through Wesleyan University
- Statistics: Making Sense of Data, offered through the University of Toronto
- Statistics One, offered through Princeton University
- Introduction to Statistics, offered through the University of California Berkeley
Recommended Books on Statistics:
- Damned Lies and Statistics, Joel Best
- Data Analysis for Politics and Policy, Edward Tufte
- Designing Social Inquiry, by King, Keohane, and Verba
- The Drunkard’s Walk: How Randomness Rules Our Lives, Leonard Mlodinow
- How to Lie with Statistics, Darrell Huff
- Naked Statistics: Stripping the Dread from the Data, Charles Wheelan
- The Signal and the Noise, Nate Silver
- Thinking, Fast and Slow, by Daniel Kahneman
- Precision Journalism, by Philip Meyer
- Statistics with R: A Beginner’s Guide, by Robert Stinerock
News Data & Technology Sites
ProPublica Nerd Blog, secrets of data journalists and newsroom developers
Data Blog, the Guardian’s blog on computer-assisted reporting
Nacion Data, Spanish-language data journalism blog of the Argentinian daily La Nación.
Online Journalism Blog, by the UK’s Paul Bradshaw, covers data journalism, citizen journalism, blogging, vlogging, and more.
Open Knowledge Foundation, from a “community of civic hackers, data wranglers and ordinary citizens intrigued and excited by the possibilities of combining technology and information for good.”
Computational Reporting, all about data mining.
Vis4.net, random thoughts on information visualization and data journalism.
Tow Center for Digital Journalism, Columbia’s blog on how technology is changing journalism, its practice and its consumption.
FiveThirtyEight, a data-driven site founded by renowned statistician Nate Silver.
The Upshot, a data-driven site by The New York Times dedicated to politics, policy and economic analysis.
Washington Post Information Graphics, a collection of projects by the Post’s graphics team.
NPR Visuals Team, a collection of projects by NPR’s visuals team.
Source, a project of OpenNews that offers guides, tutorials and regular features by top data journalists.
Books
The Data Journalism Handbook 2, edited by Jonathan Gray and Liliana Bounegru of the Public Data Lab (2019), and the first Data Journalism Handbook (2012), which includes more how-to material.
The Data Journalist, by Fred Vallance-Jones and David McKie (2017).
Data Journalism: Past, Present and Future, by Richard Lance Keeble, John Mair and Megan Lucero (2017).
Getting Started in Data Journalism is a manual published in 2018 by Lawrence Marzouk and Crina Boros of the Balkan Investigative Reporting Network in Albania, which aims to introduce journalists to data-driven reporting techniques that are essential to contemporary investigative journalism.
Best Practices for Data Journalism, by Kuang Keng Kuek Ser for the Media Development Investment Fund (2018).
Getting Started with Data Journalism, by Claire Miller, is subtitled “Writing data stories in any size newsroom.”
Data for Journalists: A Practical Guide for Computer-Assisted Reporting (5th Edition), by Brant Houston (2018). Includes step-by-step instructions on how to do basic data analysis.
Finding Stories in Spreadsheets, by Paul Bradshaw (updated 2016).
Mapping for Stories: A Computer-Assisted Reporting Guide, by Jennifer LaFleur, David Herzog and Charles Minshew (updated 2017).
The Curious Journalist’s Guide to Data Journalism, by Jonathan Stray (2016).
Precision Journalism: a Reporter’s Introduction to Social Science Methods, by Philip Meyer (updated 2002 – but still a must-read for data journalists).
Scraping for Journalists (second edition), by Paul Bradshaw. This book introduces a range of scraping techniques.
Computer-Assisted Reporting: A Comprehensive Primer, by Fred Vallance-Jones and David McKie (2009).
Conferences
NICAR, a project of Investigative Reporters and Editors, hosts the original annual conference on data journalism as well as periodic training sessions.
Data Harvest is held in conjunction with the European Investigative Journalism Conference.
The International Journalism Festival in Perugia, Italy, includes a School of Data Journalism.
The Global Investigative Journalism Conference, held every two years, hosts a broad range of data-specific trainings.