Tips for Using AI as a Reporting Tool to Uncover Wrongdoing

by Serdar Vardar • March 27, 2026

Investigative reporters often face a common dilemma: the data exists, sometimes in vast quantities, but it is inaccessible, unstructured, or too large to examine manually. The problem lies not only in discovering wrongdoing, but also in building systems that make patterns visible and tips actionable.

At the 14th Global Investigative Journalism Conference (GIJC25) in Kuala Lumpur, the session “Uncovering Wrongdoing Using AI: Methods, Techniques, and Challenges” brought together newsroom leaders and academics who have built those systems. The panel centered on replicable methods that journalists can apply across industries and countries.

Three approaches stood out: building searchable ownership and financial databases; creating AI-powered “digital democracy” systems that generate reporting tips; and designing AI agent and OCR (optical character recognition) workflows to process massive document archives.

Building a Searchable Fishing and Ownership Database

Fabrizio Palumbo, associate professor of data journalism at OsloMet and founder of the AI Journalism Resource Center, described how his team partnered with Norwegian newsrooms to work with official fisheries data.

Norway publishes detailed records of every registered fishing trip. Each entry includes the vessel, species caught, weight, time, and fishing area. The initial idea was straightforward: cross-check reported catches against quotas to detect underreporting or overreporting.
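A minimal sketch of that kind of cross-check, assuming hypothetical field names (`vessel`, `species`, `weight_kg`) and a quota table keyed by vessel and species. The real Norwegian records and quota rules will differ, and a check like this can only flag reported totals that exceed a quota; it cannot see catches that were never reported.

```python
from collections import defaultdict


def find_quota_overruns(landings, quotas, tolerance=0.0):
    """Sum reported weight per (vessel, species) and compare it with the quota.

    landings:  iterable of dicts with hypothetical keys vessel, species, weight_kg
    quotas:    dict mapping (vessel, species) -> allowed kilograms
    tolerance: fraction above quota that is still ignored (0.05 = 5%)
    Returns a sorted list of (vessel, species, total_kg, quota_kg).
    """
    totals = defaultdict(float)
    for row in landings:
        totals[(row["vessel"], row["species"])] += float(row["weight_kg"])

    overruns = []
    for key, total in totals.items():
        quota = quotas.get(key)
        if quota is None:
            continue  # no quota on file: a question for a reporter, not an overrun
        if total > quota * (1 + tolerance):
            overruns.append((key[0], key[1], total, quota))
    return sorted(overruns)


# Made-up example data, for illustration only.
landings = [
    {"vessel": "V-001", "species": "cod", "weight_kg": 600},
    {"vessel": "V-001", "species": "cod", "weight_kg": 500},
    {"vessel": "V-002", "species": "cod", "weight_kg": 300},
]
quotas = {("V-001", "cod"): 1000, ("V-002", "cod"): 1000}
print(find_quota_overruns(landings, quotas))
```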

After two years of analysis, the team found no clear discrepancies.

Instead of abandoning the project, they pivoted. “We spent more than two years trying to figure it out… and we couldn’t get anything out of it informatively,” Palumbo said. So they built an infrastructure that journalists could use directly.

Collect and Normalize the Data

The team gathered gigabytes of fisheries data and standardized the records into a machine-readable database. The emphasis was on free, open-source tools and GDPR-compliant workflows. Palumbo stressed that any tool must meet “ethical requirements” and be explainable to journalists.
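As an illustration of what "normalizing" can mean in practice, the sketch below loads inconsistent raw records into SQLite using only Python's standard library. The column names and cleaning rules are assumptions made for the example, not the team's actual schema.

```python
import sqlite3


def normalize(raw):
    """Map one raw record (hypothetical column names) onto a clean, typed row."""
    return (
        raw["Vessel Name"].strip().upper(),
        raw["Species"].strip().lower(),
        # Assumes a decimal comma ("1200,5"); thousands separators would need more care.
        float(str(raw["Weight (kg)"]).replace(",", ".")),
        raw["Date"].strip(),
    )


def load(conn, raw_rows):
    """Create the table if needed and insert the normalized rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS landings "
        "(vessel TEXT, species TEXT, weight_kg REAL, landed_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO landings VALUES (?, ?, ?, ?)",
        [normalize(r) for r in raw_rows],
    )
    conn.commit()


# Made-up records showing the kind of inconsistency normalization removes.
conn = sqlite3.connect(":memory:")
load(conn, [
    {"Vessel Name": " Nordlys ", "Species": "Cod", "Weight (kg)": "1200,5", "Date": "2024-03-01"},
    {"Vessel Name": "nordlys", "Species": "COD ", "Weight (kg)": 800, "Date": "2024-03-08"},
])
total = conn.execute(
    "SELECT SUM(weight_kg) FROM landings WHERE vessel = 'NORDLYS' AND species = 'cod'"
).fetchone()[0]
print(total)
```

Once both spellings of the vessel and species collapse to one form, a single query returns the combined total.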

Add Search and Ownership Layers

The database allows reporters to search for individual vessels or companies. Once extracted, the data is mapped into a graph network to visualize ownership links and quota allocations.

This enables reporters to answer questions such as:

Who owns this vessel?
Which companies share ownership?
Are quotas concentrated among related entities?
How does money flow between firms?

The model is replicable across other sectors. Any industry with licensing regimes, quotas, permits, or regulatory filings can be structured in the same way: energy concessions, mining permits, pharmaceutical licenses, or public procurement contracts.
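The ownership questions above come down to finding entities that are connected in a graph. The sketch below is one simple way to do that in plain Python, assuming ownership is available as (owner, owned) pairs; the names and quota figures are invented, and a production system would more likely use a graph database or library.

```python
from collections import defaultdict


def ownership_groups(edges):
    """Group entities that are linked, directly or indirectly, by ownership.

    edges: iterable of (owner, owned) pairs, e.g. company -> vessel.
    Returns a list of sorted lists, one per connected group.
    """
    neighbors = defaultdict(set)
    for owner, owned in edges:
        neighbors[owner].add(owned)
        neighbors[owned].add(owner)

    seen, groups = set(), []
    for start in sorted(neighbors):
        if start in seen:
            continue
        stack, group = [start], []
        seen.add(start)
        while stack:  # depth-first walk over everything reachable from start
            node = stack.pop()
            group.append(node)
            for nxt in neighbors[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Invented example: two companies look separate but share a parent.
edges = [
    ("Holding A", "Company B"),
    ("Holding A", "Company C"),
    ("Company B", "Vessel 1"),
    ("Company C", "Vessel 2"),
    ("Company D", "Vessel 3"),
]
quota_tonnes = {"Vessel 1": 400, "Vessel 2": 350, "Vessel 3": 250}

for group in ownership_groups(edges):
    total = sum(quota_tonnes.get(name, 0) for name in group)
    print(group, total)
```

Summing quotas per group rather than per vessel is what surfaces concentration among related entities.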

Automate Personalized Newsletters

One of the most practical outputs was an automated weekly newsletter powered by a large language model (LLM). Journalists can predefine topics such as cod catches in northern Norway, new vessel registrations, or regulatory changes. The system generates customized updates from the database.

This idea travels easily. A newsroom covering extractive industries could create weekly alerts on new drilling permits. A health reporter could receive alerts on pharmaceutical company filings. The key is not the generative AI itself, but the structured, automatically updated database behind it.
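A sketch of the database side of such an alert, with the LLM step deliberately left out: it selects the past week's records that match a reporter's predefined topics and formats them as plain text. The field names (`date`, `species`, `region`, `summary`) and the topic format are assumptions for illustration.

```python
from datetime import date, timedelta


def weekly_digest(records, topics, today):
    """Collect the last seven days of records that match any predefined topic.

    records: dicts with hypothetical keys date (datetime.date), species, region, summary
    topics:  list of dicts mapping field -> required value, e.g. {"species": "cod"}
    """
    since = today - timedelta(days=7)
    lines = []
    for rec in sorted(records, key=lambda r: r["date"]):
        if not (since < rec["date"] <= today):
            continue  # outside this week's window
        if any(all(rec.get(k) == v for k, v in t.items()) for t in topics):
            lines.append(f"{rec['date'].isoformat()}: {rec['summary']}")
    return "\n".join(lines) if lines else "No matching records this week."


# Made-up records and one reporter's topic filter.
records = [
    {"date": date(2025, 11, 20), "species": "cod", "region": "north", "summary": "Example cod landing record"},
    {"date": date(2025, 11, 10), "species": "cod", "region": "north", "summary": "Older record, outside the window"},
    {"date": date(2025, 11, 22), "species": "herring", "region": "north", "summary": "Non-matching species"},
]
topics = [{"species": "cod", "region": "north"}]
digest = weekly_digest(records, topics, today=date(2025, 11, 24))
print(digest)
```

Any text generation would sit on top of this output, so every line in the newsletter can be traced back to a database record.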

As Palumbo noted, the tool must be understandable and trustworthy: journalists must “be able to understand what you’re doing and… trust so that they will publish what you actually find out.”

Creating AI-Powered ‘Digital Democracy’ Tip Systems

The CalMatters site allows users to filter by key topic, and to dive into data on different state politicians. Image: Screenshot, CalMatters

Sisi Wei, chief impact officer at CalMatters, described Digital Democracy, a system that tracks every word spoken by any California state legislator, every vote, and every campaign contribution.

The project was born from a reporting gap. Many local newsrooms no longer send reporters to the statehouse. Lawmakers were operating with limited scrutiny. “We start with the why,” Wei said. “AI is not always our answer.”

Build a Comprehensive Data Backbone

Digital Democracy draws on:

Transcripts of legislative hearings (via AI transcription)
Facial recognition to identify speakers
Bill sponsorship and voting records
Campaign donations
Gift disclosures

Importantly, generative AI is used only for transcription. Human staff review transcripts daily to correct speaker names and entities.

The heavier analytical work uses machine learning models built and controlled internally. Wei emphasized the difference: with in-house models, “we control all the inputs.”

Train the Model to Generate Story Leads

The core innovation is not public-facing dashboards. It is a password-protected section for journalists that generates story leads from the database. The model weighs variables such as:

Did a legislator vote against the interests of their top donors?
Are there patterns of abstention?
Are financial interests aligned with specific votes?

Initially, the system produced “boring” leads. Journalists provided feedback. The model was retrained based on what editors considered newsworthy.
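The article does not describe how CalMatters' models work internally, so the following is only a toy illustration of a feedback loop: leads are scored as a weighted sum of signals, and editor feedback nudges the weights. The signal names, weights, and update rule are all assumptions.

```python
def score(lead, weights):
    """Weighted sum of a lead's signals; unknown signals count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in lead["signals"].items())


def apply_feedback(weights, lead, newsworthy, rate=0.1):
    """Return new weights, nudged up for signals in leads editors liked, down otherwise."""
    direction = 1 if newsworthy else -1
    return {
        name: w + direction * rate * lead["signals"].get(name, 0.0)
        for name, w in weights.items()
    }


# Hypothetical signals and starting weights.
weights = {"voted_against_top_donor": 1.0, "not_voting_rate": 1.0}
lead_a = {"id": "A", "signals": {"voted_against_top_donor": 1.0, "not_voting_rate": 0.0}}
lead_b = {"id": "B", "signals": {"voted_against_top_donor": 0.0, "not_voting_rate": 0.9}}

before = sorted([lead_a, lead_b], key=lambda l: score(l, weights), reverse=True)

# Editors call lead A boring and lead B newsworthy.
weights = apply_feedback(weights, lead_a, newsworthy=False)
weights = apply_feedback(weights, lead_b, newsworthy=True)

after = sorted([lead_a, lead_b], key=lambda l: score(l, weights), reverse=True)
print([l["id"] for l in before], [l["id"] for l in after])
```

Because the weights stay human-readable, a reporter can see why one lead outranks another, which fits the explainability both speakers emphasized.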

One example revealed legislators who avoided voting “no” on controversial bills by refusing to vote at all, allowing bills to fail without public accountability. The model detected this pattern, flagged it, and reporters investigated it for months. The resulting story changed legislative behavior.
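The pattern described here can be approximated with a simple count: for each legislator, what share of recorded votes was "not voting"? The sketch below assumes votes arrive as (legislator, vote) pairs with hypothetical labels; it is not the model Digital Democracy uses, and a high rate is a prompt for reporting, not a finding.

```python
from collections import Counter


def not_voting_rates(votes, min_votes=3):
    """Rank legislators by the share of recorded votes on which they did not vote.

    votes: iterable of (legislator, vote), vote in {"aye", "no", "not_voting"}
           (hypothetical labels).
    Legislators with fewer than min_votes records are left out as too thin to judge.
    """
    recorded = Counter()
    skipped = Counter()
    for legislator, vote in votes:
        recorded[legislator] += 1
        if vote == "not_voting":
            skipped[legislator] += 1
    rates = {name: skipped[name] / n for name, n in recorded.items() if n >= min_votes}
    # Highest rate first; ties broken alphabetically for stable output.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


# Invented vote records.
votes = [
    ("Legislator A", "aye"), ("Legislator A", "not_voting"),
    ("Legislator A", "not_voting"), ("Legislator A", "not_voting"),
    ("Legislator B", "aye"), ("Legislator B", "no"), ("Legislator B", "aye"),
    ("Legislator C", "not_voting"),
]
print(not_voting_rates(votes))
```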

Offer Personalized Newsletters

Digital Democracy also generates weekly personalized newsletters for citizens, summarizing what their representatives did.

This model is globally transferable. Any country with parliamentary transcripts, voting records, and campaign finance disclosures can replicate it. The technical stack may vary, but the method remains the same:

Centralize legislative data.
Build explainable machine learning models.
Use journalist feedback loops.
Deliver actionable leads, not abstract dashboards.

Wei cautioned that AI is not always the best tool. In one project digitizing disclosure forms, generative AI failed to extract reliable structured data. The newsroom reverted to PDF Plumber, an open-source Python library built by a journalist. The lesson: use the simplest tool that solves the problem.
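For readers who have not used it, a minimal sketch of table extraction with the library (published on PyPI as `pdfplumber`) might look like the following. It assumes each page holds one table whose first row is a header, which real disclosure forms may not satisfy; the file name in the usage note is hypothetical.

```python
def rows_to_records(table):
    """Turn a table (list of rows, first row = header) into a list of dicts."""
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        cells = [(cell or "").strip() for cell in row]
        if any(cells):  # skip rows that are entirely empty
            records.append(dict(zip(header, cells)))
    return records


def extract_records(pdf_path):
    """Read every page of a PDF and collect the rows of the table found on each."""
    import pdfplumber  # third-party: pip install pdfplumber

    records = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            table = page.extract_table()  # None when no table is detected
            records.extend(rows_to_records(table))
    return records
```

Calling `extract_records("disclosure_form.pdf")` would return one dictionary per table row. The output is deterministic, so the same PDF always yields the same rows, which makes spot-checking against the source document straightforward.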

Practical Takeaways

These two cases share several principles:

Start with the reporting problem, not the tool.
Build structured, searchable databases first.
Use machine learning for pattern detection, not final conclusions.
Integrate feedback loops from journalists.
Keep detailed documentation of every step.
Publish methodology when possible.
Maintain human oversight at every stage.

As Wei put it, “AI is simply a tool.”

The most transferable lesson is that using AI effectively in investigative journalism is not about adopting the newest model but about building systems that turn complex data into reproducible reporting leads. Whether tracking fishing quotas or legislative votes, the workflow remains the same: collect, structure, test, verify, and report.

Serdar Vardar is an investigative journalist at Deutsche Welle’s Environment Desk, specializing in cross-border environmental crimes, climate crisis coverage, corruption, and tax evasion. Winner of the EU Investigative Journalism Awards in the Western Balkans and Turkey, he has uncovered significant stories including the Qatargate scandal, Turkish corporate propaganda in the Balkans, and China’s Belt and Road environmental impacts in Peru and Colombia. Vardar has also worked on major global investigations including ICIJ’s Pandora Papers, Shadow Diplomats, and Deforestation Inc., with his work appearing in Deutsche Welle, Al Jazeera, and through collaborations with ICIJ and OCCRP.

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License

