The past few years have seen an explosion of digital tools that can be used to enhance journalism research and reporting. In this new monthly feature, GIJN’s IT Coordinator Alastair Otter takes a look at some of the best and latest tools and techniques for enhancing investigative and data-driven journalism.
Most journalists shy away from spreadsheets, but in reality almost all investigative journalism projects require some level of data analysis. When it comes to spreadsheets, Microsoft’s Excel and Google Sheets are the usual go-to tools, but what if you want something a little easier to use yet more powerful? Enter Workbench, a new take on traditional spreadsheets that is designed to make it easy to search through datasets and find stories. Workbench also makes it easy to monitor sites for changes over time, as well as scrape data from websites. The tool is free and built with journalists in mind, so it’s pretty easy to get up and running.
If Workbench is not quite what you’re looking for, then it’s worth checking out Airtable. Although also based on the concept of a spreadsheet, Airtable makes it easy to create different views of your data and save those, so you can switch between them as needed. You can, for example, view your data in rows, or as a calendar, or as a grid or gallery. One of Airtable’s great features is the option to create and publish online forms to collect data. When users submit the forms, their entries are automatically stored in your chosen Airtable.
Digital security for journalists is a hot topic at the moment, particularly for journalists working in hostile environments. Fortunately, there is a growing list of resources with excellent information to help journalists secure their work and protect themselves and their sources. For a thorough but easy-to-understand guide to the basics of digital security, visit the Rory Peck Trust’s new digital security guide. The guide covers everything from securing your essential accounts and passwords to assessing your digital risk and getting across borders safely.
For a more detailed review of digital security, the Field Guide to Security Training in the Newsroom is designed as a tool to teach digital security but also offers a wealth of information on securing your work, your devices and your sources.
And, of course, there is GIJN’s own digital security guide which provides links to dozens of digital security resources.
Doing background research can be time-consuming and tedious as you search through dozens of sites looking for mentions of someone or something. IntelTechniques simplifies the process by combining searches across dozens of sites at once. With IntelTechniques, it’s relatively painless to find the social media accounts of specific users, the places they’ve checked into and any updates they’ve posted. The service searches most major search engines and all of the popular social networks, and offers lookups for email addresses, telephone numbers and domains, as well as reverse searches for particular images. Happy searching.
Bulk Map Marker
It’s relatively easy to find the GPS coordinates of any building or place by using Google Maps (tip: the GPS coordinates are in the URL after you’ve done a search or clicked on a place). But what if you have dozens of addresses you want to look up? A relatively new tool, Batch Geocoder for Journalists from Dutch developers LocalFocus, does exactly this. Copy and paste your list of place names or addresses into Batch Geocoder, select a country and a few seconds later you’ll have a list of matching GPS coordinates that you can then copy into a spreadsheet or a mapping tool. From experience the geocoder is pretty reliable, unless you’re searching for really unfamiliar names rather than addresses.
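For readers comfortable with a little Python, the Google Maps tip can be sketched in a few lines. This is a minimal sketch, not an official Google method: it assumes the URL contains the coordinates in the form "@latitude,longitude" (a format that may change), and the function name extract_coordinates and the sample URL are made up for illustration.

```python
import re
from typing import Optional, Tuple

# Assumption: the URL carries the coordinates as "@<latitude>,<longitude>",
# e.g. ".../maps/place/Example/@-26.2041,28.0473,17z". This is not guaranteed.
COORD_PATTERN = re.compile(r"@(-?\d+(?:\.\d+)?),(-?\d+(?:\.\d+)?)")


def extract_coordinates(url: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) found in the URL, or None if absent."""
    match = COORD_PATTERN.search(url)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


if __name__ == "__main__":
    # Hypothetical URL used only to illustrate the assumed format.
    sample = "https://www.google.com/maps/place/Example/@-26.2041,28.0473,17z"
    print(extract_coordinates(sample))
```

Bear in mind that the coordinates in the URL may describe the centre of the map view rather than the exact building, so it is worth checking the result against the map before relying on it.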
Tech Tool: Fuzzy Matching
Imagine you have two datasets with names in common, but the files are too long to compare manually. And to make matters worse, the names are not always spelled consistently. What do you do? You may want to take a look at CSV Match, a command line tool that does exactly what you want: it compares two datasets and finds overlaps between them. It also uses a technique called “fuzzy matching” which will match names that look similar but may be spelled differently in each list. CSV Match is built in Python and has no point-and-click interface, but don’t let that put you off. There is an excellent, detailed tutorial that runs through the entire process of running a fuzzy match.
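To give a feel for the idea behind fuzzy matching, here is a minimal sketch using only Python’s standard library. It is not how CSV Match itself works; it simply scores how similar two names are with difflib.SequenceMatcher and keeps the pairs that score above a cut-off. The function names, the sample names and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def normalise(name: str) -> str:
    """Lower-case the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Score between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def fuzzy_match(left: List[str], right: List[str],
                threshold: float = 0.85) -> List[Tuple[str, str, float]]:
    """For each name in `left`, keep its closest name in `right`
    if the similarity score reaches the (assumed) threshold."""
    matches = []
    for name in left:
        best = max(right, key=lambda other: similarity(name, other), default=None)
        if best is None:
            continue
        score = similarity(name, best)
        if score >= threshold:
            matches.append((name, best, score))
    return matches


if __name__ == "__main__":
    # Made-up names standing in for a column from each dataset.
    list_a = ["Jon Smith", "Maria Garcia"]
    list_b = ["John Smith", "Peter Jones"]
    for name_a, name_b, score in fuzzy_match(list_a, list_b):
        print(f"{name_a} ~ {name_b} ({score:.2f})")
```

A lower threshold catches more spelling variations but also produces more false matches, so any pair flagged this way should be treated as a lead to verify rather than a confirmed link.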
And finally, ever wondered how many people are living in a particular area? This excellent tool does exactly that. When you draw a circle or polygon on the map it returns a population estimate and land area estimate. It’s a great idea well done.
If you have any tools or tips that you think are worth sharing, you can email them to me at firstname.lastname@example.org.
Alastair Otter is GIJN’s IT Coordinator. He is also a managing partner of Media Hack Collective, a data journalism initiative based in Johannesburg, where he programs interactive data visualizations and manages a number of online media sites.