GIJN Academy - Data Collection for Journalists on-demand course - Pinar Dag

Image: GIJN

LAST TICKETS: Data Collecting for Journalists — Practical Training

Note: On-demand courses are fee-based, allowing us to sustainably support our team and trainers while continuing to provide free, high-quality investigative journalism training opportunities worldwide.

In today’s journalism, data has become one of the most critical elements of reporting. Going beyond traditional news sources and gathering information from publicly available data, websites, or digital documents is no longer just a nice-to-have skill; it is an essential competency for the profession. This training aims to teach journalists the data collection process using practical tools and hands-on techniques.

This program, built around collecting both qualitative and quantitative data, digital research, and information extraction, will enable participants to strengthen their investigations and diversify their sources. Each module is designed with contemporary examples and prioritizes applied learning. Participants are encouraged to join with their personal computers and complete pre-class preparatory tasks to accelerate the learning process.

The training will be led by GIJN’s data trainer and Turkish editor, Pınar Dağ, and will span four days, each day lasting two hours. This modular structure is designed to take participants from beginner to advanced levels. Throughout the program, the techniques taught focus on practical skills that can be directly applied in the newsroom and strengthen data collection processes in journalism.

Date: June 29 – 30 and July 1 – 2, 2026.

Time: 8:00 – 10:00 am EDT

Number of trainees: 20

Trainer’s bio: Pınar Dağ is a leading data journalism educator with over a decade of teaching experience and is widely regarded as one of the top trainers at the Global Investigative Journalism Conference (GIJC). She is a lecturer at Kadir Has University and a co-founder of the Data Literacy Association (DLA), Data Journalism Platform Turkey, and DağMedya. For more than 15 years, her work has focused on data literacy, open data, data visualization, and data journalism. She also serves as both a jury member and jury chair for the Sigma Data Journalism Awards.

Training Modules

Module 1.1 – Data Scraping for Journalists

Description: Fundamentals of automated data collection from the internet and its applications in investigative journalism.

Learning Outcomes:

  • Approaches to extracting structured data from web pages
  • Basic principles of HTML structure
  • Ethical and legal considerations in automated data collection

Goal: Ensure journalists understand how to retrieve the data they need from web pages for their news projects.
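
As a rough illustration of how HTML structure maps to structured data, the sketch below pulls the rows out of an HTML table using only Python's standard library. It is not taken from the course materials: the class name `TableExtractor`, the helper `extract_rows`, and the sample table (with made-up figures) are all hypothetical, and real pages usually need more careful handling.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in an HTML string as lists of strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample page; the names and numbers are invented for the example.
SAMPLE_HTML = """
<table>
  <tr><th>Ministry</th><th>Budget</th></tr>
  <tr><td>Health</td><td>120</td></tr>
  <tr><td>Education</td><td>95</td></tr>
</table>
"""

rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

The first row holds the column headers and the remaining rows hold the values, which is the same shape a spreadsheet or CSV export would expect.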

Module 1.2 – Scraping Data from PDF Files with Tabula

Description: An effective way to extract data from PDFs using Tabula.

Learning Outcomes:

  • Identifying table data in PDF documents
  • Exporting data with the Tabula tool
  • Strategies for handling complex PDF tables

Goal: Convert reports, budgets, tenders, or public data presented as PDFs into accurate datasets quickly.

Module 1.3 – Scraping Data from PDF Files with Tableau Public

Description: Extracting data from PDFs and preparing it for visualization using Tableau Public.

Learning Outcomes:

  • Using PDF data import features
  • Workflow for cleaning and transforming data
  • Preparing datasets for interactive visualizations

Goal: Enable journalists not only to collect data but also to produce visual, analysis-driven content.

Module 1.4 – Scraping Data from the Web Using Google Spreadsheets

Description: Collecting online data using Google Sheets formulas.

Learning Outcomes:

  • Using functions like IMPORTXML, IMPORTHTML, and similar tools
  • Parametric data retrieval and editing
  • Methods for pulling live data from the web

Goal: Teach participants to gather online data quickly without coding.

Module 1.5 – Scraping Websites Using Data Miner

Description: Extracting data from websites using the Data Miner browser extension.

Learning Outcomes:

  • Installing and setting up browser-based data collection extensions
  • Selecting and exporting data
  • Tips for working with different site structures

Goal: Develop flexible data collection skills that work on everything from simple HTML tables to complex web pages.

Training Methodology

  • Live online sessions
  • Hands-on sample data projects
  • Q & A and feedback sessions
  • Applying learned techniques to real-world news scenarios

GIJN offers tailored training programs designed for journalists and newsrooms with specific needs. On-demand courses are fee-based, allowing us to sustainably support our team and trainers while continuing to provide free, high-quality investigative journalism training opportunities worldwide. Email academy@gijn.org to book a session, discuss your project, or design a fully customized training program for your newsroom. For more free trainings, please visit gijn.org/academy.

Tickets

  • Large Organization Ticket: $200.00 (maximum of 15 tickets per purchase; 15 available)
  • Small Organization (below 20 staff): $80.00 (maximum of 10 tickets per purchase; 8 available)
  • Freelancers' Ticket: $50.00 (maximum of 10 tickets per purchase; sold out)
  • Student Tickets: $40.00 (maximum of 2 tickets per purchase; sold out)