Drilling Down: A Quick Guide to Free and Inexpensive Data Tools

Newsrooms don’t need large budgets to analyze data: basic data tools that are free or inexpensive are easy to access. The summary below is based on a five-day training session at Delo, the leading daily newspaper in Slovenia. Anuška Delić, journalist and project leader of DeloData at the paper, initiated the training with the aim of getting her team to work on data stories with easily available tools and a lot of new data.

“At first it seemed that not all of the 11 participants, who had no or almost no prior knowledge of this exciting field of journalism, would ‘catch the bug’ of data-driven thinking about stories, but soon it became obvious” once the training commenced, said Delić.

Introducing Data Tools

In addition to demonstrating basic Internet searches (see below), advanced Excel, Google Fusion, OpenRefine, and Helium Scraper, which I also included in trainings at the European Data Journalism Conference (Data Harvest), I offered training in PDF-extraction with CometDocs, DocumentCloud, Datawrapper, and CartoDB.

It turns out there is a lot of good data in Slovenia that can be used for stories, for example from the statistical office. Such data can even be sorted by municipality, which is also the case in other European Union countries.
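As a minimal sketch of what sorting by municipality looks like outside a spreadsheet, the snippet below reads a small CSV and orders its rows by one column. The column names and all figures are invented placeholders, not real statistics, and the helper `rows_sorted_by` is hypothetical.

```python
import csv
import io

# Invented placeholder data; a real file would be downloaded from a statistical office.
SAMPLE_CSV = """municipality,year,value
Maribor,2013,300
Ljubljana,2013,900
Koper,2013,100
"""

def rows_sorted_by(csv_text, column, numeric=False, descending=False):
    """Parse CSV text and return its rows as dicts, sorted by one column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if numeric:
        key = lambda row: float(row[column])
    else:
        key = lambda row: row[column]
    return sorted(rows, key=key, reverse=descending)

by_name = rows_sorted_by(SAMPLE_CSV, "municipality")
by_value = rows_sorted_by(SAMPLE_CSV, "value", numeric=True, descending=True)

print([row["municipality"] for row in by_name])
print([row["municipality"] for row in by_value])
```

The same two sorts are what Excel's sort dialog does; scripting them mainly helps when the file is refreshed regularly.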

Internet Search Tips

Paul Myers’ Research Clinic
Henk van Ess on Facebook search

Google Tools

Two-step verification
Google Offline
Table Capture for Chrome

Importing PDFs

We extracted data from PDFs using CometDocs and OnlineOCR.net. See also this overview of good tools for importing PDFs. CometDocs handles most PDF-extraction needs and also recognizes the special characters used in different national alphabets. For members of Investigative Reporters and Editors (IRE), CometDocs is free.

DocumentCloud

DocumentCloud is free to use. It’s a good tool for embedding notes in a document, giving readers an opportunity to review the entire document.

OpenRefine

OpenRefine (formerly Google Refine) is a free, powerful tool for working with messy data: cleaning it and transforming it from one format into another. Here is a good tutorial on OpenRefine.
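To give a feel for the kind of cleaning involved, here is a rough sketch of key-collision clustering, the idea behind OpenRefine's "fingerprint" clustering method. It is a simplified approximation written for illustration, not OpenRefine's own code, and the function names `fingerprint` and `cluster` are my own.

```python
import string
import unicodedata
from collections import defaultdict

def fingerprint(value):
    """Reduce a string to a normalized key so that near-duplicates collide."""
    # Decompose accented letters and drop the combining marks (š -> s).
    decomposed = unicodedata.normalize("NFKD", value)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    lowered = unaccented.strip().lower()
    # Remove ASCII punctuation, then sort the unique words.
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(no_punct.split())))

def cluster(values):
    """Group values that share a fingerprint; keep only groups with duplicates."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]

names = ["Škofja Loka", "Skofja Loka", "skofja  loka.", "Koper"]
print(cluster(names))
```

In OpenRefine itself you would then pick one spelling per cluster and merge the rest; the sketch stops at finding the candidates.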

Data Scraping

Helium Scraper is a good tool, and its basic version costs US$100. I think it is the easiest way to begin scraping. It works on PCs, but not on Macs.

Here you can also find other tools for scraping data from the web.
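For readers curious about what a scraper does under the hood, the sketch below pulls the rows out of an HTML table using only Python's standard library. It is not how Helium Scraper works internally; the class `TableParser`, the helper `scrape_table`, and the sample page are all made up for illustration.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

# Made-up page fragment; a real scraper would first download the HTML.
SAMPLE_HTML = """
<table>
  <tr><th>Municipality</th><th>Population</th></tr>
  <tr><td>Example A</td><td>1000</td></tr>
  <tr><td>Example B</td><td>2500</td></tr>
</table>
"""

def scrape_table(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows

rows = scrape_table(SAMPLE_HTML)
print(rows)
```

Real pages are messier (nested tables, pagination, logins), which is where a dedicated tool earns its price.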

Google Fusion

Google Fusion is a great mapping tool and free to use in most cases. It’s important to try to get the right version of the map of municipalities in your country and import it as a standard-map into Google Fusion. Below are some good links for working with Fusion:

Search for fusion tables
Your Google drive
List of icons
http://www.diva-gis.org/
Converting shape-files
http://www.december.com/html/spec/colorsafe.html
http://colorbrewer2.org/
Layer Builder
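When coloring a municipality map, the values have to be split into a handful of classes, each with its own color. The sketch below shows one common approach, quantile classes, using the five-class "Blues" palette from ColorBrewer. The function names and the sample values are hypothetical, and mapping tools have their own bucket settings that may work differently.

```python
import math

# Five-class sequential "Blues" palette from ColorBrewer, light to dark.
BLUES_5 = ["#eff3ff", "#bdd7e7", "#6baed6", "#3182bd", "#08519c"]

def quantile_breaks(values, classes):
    """Return the upper bound of each class so classes hold roughly equal counts."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(i * n / classes) - 1] for i in range(1, classes + 1)]

def color_for(value, breaks, palette):
    """Pick the color of the first class whose upper bound covers the value."""
    for bound, color in zip(breaks, palette):
        if value <= bound:
            return color
    return palette[-1]  # values above the last bound get the darkest color

# Invented values for ten hypothetical municipalities.
sample_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
breaks = quantile_breaks(sample_values, len(BLUES_5))

print(breaks)
print(color_for(5, breaks, BLUES_5))
```

Quantiles keep every color in use even when a few municipalities have extreme values; equal-width classes are the usual alternative.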

Datawrapper

Datawrapper is a very easy tool for making good interactive graphs, but embedding the graphs from the company’s server requires payment.

Instead, you can run them on your own server and use WinSCP as the system for file transfers. WinSCP is free and works on PCs, but not on Macs.

The server can also be used for maps created via Google Fusion, but remember to structure your drives.

CartoDB

CartoDB is a great alternative to Google Fusion with a lot of possibilities to make maps in new ways.

In the free version, it’s possible to upload an unlimited number of maps and tables. However, the total data limit is 50 MB, which is enough in most cases. The free version also offers only limited access to geocoding, so either the geocoding has to be done with another tool or the newsroom has to acquire at least one paid account.

TimelineJS

TimelineJS is a free, open-source tool that enables users to build visually-rich interactive timelines. It’s available in 40 languages. You can easily build the content in a Google Spreadsheet and then import it to TimelineJS.

Happy data drilling!


Nils Mulvad is a co-founder and board member of the Global Investigative Journalism Network, as well as Investigative Reporting Denmark. He is also editor at Kaas & Mulvad, a data journalism consulting firm, and associate professor at the Danish School of Media and Journalism. He was CEO of the Danish International Center for Analytical Reporting from 2001 to 2006 and European journalist of the year in 2006.


Material from GIJN’s website is generally available for republication under a Creative Commons Attribution-NonCommercial 4.0 International license. Images usually are published under a different license, so we advise you to use alternatives or contact us regarding permission. Here are our full terms for republication. You must credit the author, link to the original story, and name GIJN as the first publisher. For any queries or to send us a courtesy republication note, write to hello@gijn.org.
