GIJC21 Video Series - Data
Image: GIJN

Video Resources for Data Investigations

For 20 years, GIJN conferences have helped spread data journalism around the world. Our last Global Investigative Journalism Conference — GIJC21, held in November — was no different. GIJN’s first fully online conference featured a full track of data workshops and panels, ranging from analysis with spreadsheets and SQL to programming with R and Python, from tips on scraping and cleaning to data visualization and social network mapping. The sessions were led by a team of all-star trainers from seven countries.

This is the latest installment of GIJC21 videos, which until now have been available only to conference attendees. Other installments of this video series include “Investigating Organized Crime and Corruption” and “Funding Your Investigations and Business Strategies.”

Spreadsheets: Averages, Medians, Percentages, and Ratios

Spreadsheets are the building blocks of data journalism. In this session at GIJC21, data expert Mark Horvit covered calculations in Excel and other spreadsheet programs, including averages, medians, percentages, and ratios.
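
For readers who want to see the arithmetic behind these four calculations, here is a minimal sketch in Python rather than a spreadsheet. It is not taken from the session; every figure and variable name below is hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical salaries (illustrative figures only, not from the session).
salaries = [32000, 35000, 38000, 41000, 250000]

# Average: the single very large salary pulls this number upward.
average_salary = mean(salaries)

# Median: the middle value, which is less sensitive to outliers.
median_salary = median(salaries)

# Percentage change between two hypothetical budget years:
# (new - old) / old * 100
budget_last_year = 400000
budget_this_year = 460000
percent_change = (budget_this_year - budget_last_year) / budget_last_year * 100

# Ratio expressed as a rate: incidents per 1,000 residents (hypothetical counts).
incidents = 120
population = 48000
rate_per_1000 = incidents / population * 1000

print("Average salary:", average_salary)
print("Median salary:", median_salary)
print("Budget change (%):", round(percent_change, 1))
print("Incidents per 1,000 residents:", round(rate_per_1000, 2))
```

The gap between the average and the median in this made-up example shows why reporters often check both before describing a "typical" value.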

Spreadsheets: The Power of Sorting, Filtering, and Pivot Tables

Using spreadsheets today is as basic to journalism as a reporter’s notebook or a smartphone. This session, from NYU Professor Andrew Lehren and digital journalist Sanjit Oberai, introduced the basics of data journalism. It covers spreadsheet functions such as sorting, filtering, cleaning data, and pivot tables, with the aim of helping you better understand your data.
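
The three operations named above can also be sketched outside a spreadsheet. The Python example below is an illustration under stated assumptions, not material from the session: the rows, city names, and amounts are all hypothetical.

```python
from collections import defaultdict

# Hypothetical spending records (illustrative data only).
rows = [
    {"city": "Avalon", "year": 2020, "amount": 1200},
    {"city": "Avalon", "year": 2021, "amount": 1500},
    {"city": "Brook", "year": 2020, "amount": 800},
    {"city": "Brook", "year": 2021, "amount": 950},
    {"city": "Avalon", "year": 2021, "amount": 300},
]

# Sort: largest amounts first, like sorting a column in descending order.
sorted_rows = sorted(rows, key=lambda r: r["amount"], reverse=True)

# Filter: keep only the rows for one year.
rows_2021 = [r for r in rows if r["year"] == 2021]

# Pivot: total amount with cities as rows and years as columns.
pivot = defaultdict(lambda: defaultdict(int))
for r in rows:
    pivot[r["city"]][r["year"]] += r["amount"]

for city in sorted(pivot):
    print(city, dict(pivot[city]))
```

A spreadsheet pivot table does the same grouping and summing, with the row and column fields chosen by dragging them into place.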

What is R, and Why Use It?

Get to know your first programming language using simple verbs and nouns in a powerful language that can handle large public records databases and clean up ugly spreadsheets. The use of R and RStudio for reporting has grown in newsrooms and become one of the standards in the growing trend of reproducible research in data journalism and visualization. This session with Arizona State University Professor Sarah Cohen introduced some of the capabilities of R, and featured tutorials as well as examples to use in your own investigations.

Summarizing with Structured Query Language (SQL)

Ready to take the next step after spreadsheets? Gota Media’s Helena Bengtsson showed how using SQL makes it possible to handle large amounts of data, safely share data with others, and combine tables. In this session and its follow-up, “Joining Tables with Structured Query Language” (see next video), audiences can learn how relational databases work, use SQL to ask database questions, and summarize and filter data. Before viewing the session, it’s recommended to download DB Browser for SQLite (for both PCs and Macs).
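
To give a sense of what summarizing and filtering with SQL looks like, here is a small sketch that runs a query through Python's built-in sqlite3 module. It is not from the session: the `contracts` table, its columns, and all values are hypothetical. The same SELECT statement could be pasted into a tool such as DB Browser for SQLite against a table with those columns.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table of government contracts (illustrative data only).
conn.execute("CREATE TABLE contracts (agency TEXT, vendor TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?)",
    [
        ("Health", "Acme", 5000),
        ("Health", "Bolt", 2500),
        ("Roads", "Acme", 12000),
        ("Roads", "Cargo", 500),
        ("Parks", "Bolt", 900),
    ],
)

# Filter with WHERE, summarize with GROUP BY, then rank with ORDER BY.
summary = conn.execute(
    """
    SELECT agency, COUNT(*) AS n_contracts, SUM(amount) AS total
    FROM contracts
    WHERE amount >= 1000
    GROUP BY agency
    ORDER BY total DESC
    """
).fetchall()

for agency, n_contracts, total in summary:
    print(agency, n_contracts, total)
```

Contracts under 1,000 are dropped before grouping, so the hypothetical Parks agency does not appear in the result at all.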

Joining Tables with Structured Query Language (SQL)

In this companion piece to “Summarizing with Structured Query Language” (see video above), Gota Media’s Helena Bengtsson took a deeper dive into SQL: how relational databases work, how to use SQL to learn from your database, and how to summarize and filter data. Again, it’s helpful to download DB Browser for SQLite (for both PCs and Macs) before the session.
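
As a rough illustration of the joining named in this session's title, the sketch below links two tables on a shared ID column using Python's built-in sqlite3 module. It is an assumption-laden example, not the session's material: the `vendors` and `contracts` tables and every value in them are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables that share a vendor_id column (illustrative data only).
conn.executescript(
    """
    CREATE TABLE vendors (vendor_id INTEGER PRIMARY KEY, name TEXT, country TEXT);
    CREATE TABLE contracts (contract_id INTEGER PRIMARY KEY, vendor_id INTEGER, amount REAL);
    INSERT INTO vendors VALUES (1, 'Acme', 'SE'), (2, 'Bolt', 'NL'), (3, 'Cargo', 'DK');
    INSERT INTO contracts VALUES (10, 1, 5000), (11, 1, 12000), (12, 2, 2500);
    """
)

# LEFT JOIN keeps every vendor, including those with no matching contracts.
# COALESCE turns the NULL total for such vendors into 0.
joined = conn.execute(
    """
    SELECT v.name, COUNT(c.contract_id) AS n_contracts, COALESCE(SUM(c.amount), 0) AS total
    FROM vendors AS v
    LEFT JOIN contracts AS c ON c.vendor_id = v.vendor_id
    GROUP BY v.vendor_id, v.name
    ORDER BY v.name
    """
).fetchall()

for name, n_contracts, total in joined:
    print(name, n_contracts, total)
```

Swapping LEFT JOIN for a plain JOIN would drop the vendor with no contracts, which is a common way for records to go missing unnoticed.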

What Is Python, and Why Use It?

Python is one of the world’s most popular programming languages, and it is used widely among data journalists. Among its helpful uses: cleaning and analyzing data, and scraping websites. In this workshop at GIJC21, Dutch journalist Winny de Jong offered an introduction to the most common applications of Python for data journalists. De Jong is the author of the online course “Python for Journalists.”

Free Google Tools for Investigations

Learn how to use Google News Lab tools in your investigative newsroom. This hands-on GIJC21 workshop with multimedia and data journalist Mike Reilley explored basic tools such as Google Public Data Explorer, Google Dataset Search, and MapChecking.com for crowd size estimates. He also explained how to use the Google Earth Measure tool, Google Earth Engine Timelapse, Google Earth Pro, and Google Earth Studio.

NodeXL for Social Network Analysis

Social network analysis (SNA) is a powerful research method with many practical applications for journalists, especially when applied to social media data. This GIJC21 session, from Digital Space Lab’s Harald Meier, provided an overview of the SNA tool NodeXL Pro, a plugin for Microsoft Office Excel. No prior knowledge of social network analysis or Excel is required. Learn how to identify network data and conduct a social network analysis on any network dataset.

Data Visualization with Google Flourish

Flourish is now one of the most popular data tools. It offers rich, easy-to-use visualization options, especially for newsrooms engaged in data journalism. In this GIJC21 session, GIJN’s Turkish editor Pinar Dağ offered a tutorial on Flourish’s data visualization library, and explained how to make basic charts, projection maps, heatmaps, and hierarchy maps. Case studies were also included in the session. To practice on the data discussed during the session, viewers can create their own Flourish account. You can access the datasets in the presentation, or you can download them from the GitHub page.

Data Analysis Through Visualization

Well-executed data visualizations can often present the findings of an investigation more clearly than paragraphs of text. This GIJC21 session, featuring GIJN technologist Alastair Otter and the Center for Public Integrity’s Jennifer LaFleur, showed how to use the analysis tools in GIS and other software to explore the data for patterns and trends.

Tools to Clean Data

Cleaning your data is an essential part of good data journalism. View this practical GIJC21 session presented by data expert Nils Mulvad on how to use OpenRefine for data cleaning. It focused on sifting and sorting text and dates. To get the most out of the session, we recommend installing OpenRefine on your computer beforehand.
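
OpenRefine is a point-and-click tool, so the sketch below is not how the session works. It is a plain Python illustration of the same two cleaning problems, inconsistent text and inconsistent date formats. The helper names `clean_name` and `parse_date`, the list of accepted date formats, and the sample values are all hypothetical.

```python
from datetime import datetime

def clean_name(value):
    """Collapse extra whitespace, drop trailing periods, and standardize case."""
    return " ".join(value.split()).rstrip(".").title()

# Date layouts this sketch assumes might appear in the data (day before month).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %B %Y")

def parse_date(value):
    """Try each known format in turn; return None if nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date()
        except ValueError:
            continue
    return None

# Hypothetical messy values (illustrative only).
raw_names = ["  ACME corp. ", "Acme Corp", "acme  corp"]
raw_dates = ["2021-11-03", "03/11/2021", "3 November 2021", "not a date"]

cleaned_names = [clean_name(n) for n in raw_names]
cleaned_dates = [parse_date(d) for d in raw_dates]

print(cleaned_names)
print(cleaned_dates)
```

Values that cannot be parsed come back as None rather than being guessed at, so they can be reviewed by hand.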

Web Scraping Made Easier

Web scraping is used to collect large amounts of online information, such as statistics, incidents, and social media activity. It has become an increasingly valuable tool in journalism, and it has never been easier than with Web Scraper, a free extension for Google Chrome and Firefox. In this GIJC21 session, data experts Tommy Kaas and Nils Mulvad demonstrated how to employ the Web Scraper tool in your investigations.

Next week we’ll be releasing GIJC21 videos related to safety and security. Check back then to find more.

Additional Resources

GIJN Resource Center: Data Journalism

Data Journalists Offer Tools for the Future

GIJN’s Top 10 Data Journalism Projects of 2021

Material from GIJN’s website is generally available for republication under a Creative Commons Attribution-NonCommercial 4.0 International license. Images usually are published under a different license, so we advise you to use alternatives or contact us regarding permission. Here are our full terms for republication. You must credit the author, link to the original story, and name GIJN as the first publisher. For any queries or to send us a courtesy republication note, write to hello@gijn.org.
