

Image: Shutterstock


Journalism-First Tools to Search Spreadsheet Text, Track Website Changes, and Mine Investigative Resources Through the Years


In this latest installment of GIJN Toolbox, we highlight three powerful but easy-to-use new investigative tools that were developed with the needs of journalists in mind.

These were showcased at the recent NICAR26 data journalism summit in the US.

Meaningfully


The Meaningfully tool can be used to do contextual text searches in spreadsheets. Image: Screenshot from NICAR26

What do you do if you have a vast spreadsheet of government data and want to find references, for instance, to employees being fired? The problem is that government agencies use all kinds of synonyms and euphemisms when they fire people, not to mention typos. The same applies to many other searches: references to contracts, for example, might be hidden behind terms such as “tenders” or in descriptions of bids.

Keyword or ‘Control-F’ searches won’t work effectively in these situations, and using AI chatbots carries the risk of errors and hallucinations. A good “middle-ground” solution for this type of analysis involves journalist-created, open source tools that identify the meaning of data entries.

One of these is a free semantic search tool called Meaningfully, created by Jeremy Merrill, a data reporter at The Washington Post. Another free, journalist-made tool called Semantra performs a similar function.

At the recent NICAR data journalism conference, Merrill demonstrated the firings example by uploading a dataset from a securities filing and simply typing “he got fired” into his tool’s search bar. The tool found data cells such as the following: “[Person X] will not receive severance and will forfeit all equity that was not vested as of his termination date.” There was no direct reference to “dismissed” or “fired,” but Meaningfully recognized that the passage described a man who had been let go from his job.
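To make the contrast with ‘Control-F’ concrete, here is a minimal Python sketch of ranking spreadsheet cells by closeness of meaning rather than by exact wording. It does not show how Meaningfully works internally, which the presentation did not detail. The sample rows and the hand-built `CONCEPTS` lists are invented for illustration, and the toy `embed` function is only a stand-in for the learned embedding model that a real semantic search tool would use.

```python
import math
import re

# Invented spreadsheet cells, for illustration only.
ROWS = [
    "Mr. Doe will not receive severance and will forfeit all unvested equity as of his termination date.",
    "The board approved a quarterly dividend of ten cents per share.",
    "Ms. Roe was separated from the agency for cause.",
]

# Hypothetical hand-built concept lists (an assumption of this sketch).
# A real tool would use a learned embedding model instead.
CONCEPTS = {
    "job_loss": {"fired", "dismissed", "severance", "termination", "separated", "forfeit"},
    "payouts": {"dividend", "share", "equity", "cents"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_search(rows, keyword):
    """Control-F style search: indexes of rows containing the exact word."""
    return [i for i, row in enumerate(rows) if keyword.lower() in tokenize(row)]


def embed(text):
    """Toy 'embedding': count how many words of each concept appear."""
    words = tokenize(text)
    return [sum(w in vocab for w in words) for vocab in CONCEPTS.values()]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_search(rows, query):
    """Rank rows by similarity to the query, dropping rows that score zero."""
    q = embed(query)
    scored = [(i, cosine(q, embed(row))) for i, row in enumerate(rows)]
    hits = [pair for pair in scored if pair[1] > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print("Keyword hits for 'fired':", keyword_search(ROWS, "fired"))
for index, score in semantic_search(ROWS, "he got fired"):
    print(f"{score:.2f}  {ROWS[index]}")
```

The keyword search returns nothing, because no row contains the word “fired,” while the similarity ranking surfaces both the “separated from the agency” row and the severance row and ignores the dividend row.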

Merrill also used Meaningfully to help reveal some of the 3,000 ways that the US federal government is using AI tools, in a recent investigation into the adoption and deployment of this technology under the Trump administration. Wondering whether the US government was using AI to look for whales, he typed “looking for whales” into his tool, and it found eight use cases for counting whales and dolphins. Some were literal matches, like “automated whale-blow detections,” and others were semantic, such as “marine mammal surveys.”

“If you have a spreadsheet with a bunch of text in it, and you also have hypotheses about what might be in that text, Meaningfully is the tool that will help you do that,” he said.

Merrill acknowledged that Meaningfully can be slow to process data, and may not be practical for very large datasets of more than 100,000 rows. Still, he emphasized that it’s simple to upload and search once installed, and that the tool itself is free. Meaningfully can be installed via the steps on this page.

Jeremy Merrill, a data reporter at The Washington Post, used the Meaningfully tool to help him dig into the vast embrace of AI technology inside the Trump administration. Image: Screenshot


Visualping — and its Free Journalist Plan

Hundreds of reporters and researchers around the world have used the free public tier of the Visualping tool to get automated alerts about changes on websites relevant to their investigations.

The AI-powered tool is a timesaver and an online sentinel for journalists: monitoring target sites, and sending alerts on changes and updates that you’d otherwise likely miss.

Better still, this easy-to-use service now has expanded features, such as the ability to monitor only a small part of a website and so avoid unwanted alerts for new ads and routine updates elsewhere on that site. In addition, it now offers a free Journalist Plan.

This enhanced access for reporters, with a commercial value of $120 per year, provides up to 1,000 checks per month, compared with the 150 available on the free public tier, and monitors 25 web pages at a time rather than five. You can tailor alerts for any public site and in any language, whether a politician’s social media page or a government procurement site.
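A quick back-of-the-envelope calculation shows what those quotas could mean in practice. The Python sketch below assumes that checks are spread evenly across every monitored page and that a month has 30 days. Both are simplifying assumptions for illustration, not a description of how Visualping actually schedules or counts checks.

```python
def hours_between_checks(monthly_checks, pages, days_in_month=30):
    """Average gap, in hours, between checks of each page.

    Assumes the monthly quota is split evenly across all pages.
    """
    checks_per_page = monthly_checks / pages
    return days_in_month * 24 / checks_per_page


# Quotas as stated in the article, with every page slot in use.
journalist_gap = hours_between_checks(monthly_checks=1000, pages=25)
public_gap = hours_between_checks(monthly_checks=150, pages=5)

print(f"Journalist Plan: one check per page every {journalist_gap:.0f} hours")  # 18 hours
print(f"Free public tier: one check per page every {public_gap:.0f} hours")     # 24 hours
```

Under these assumptions, the main gain is breadth: five times as many pages can be watched, each somewhat more often. Monitoring fewer pages leaves more checks for each one.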

Kayla Zhu, manager of the service’s Journalist Program, told NICAR attendees that one potentially powerful investigative use is to set alerts for any changes to Big Tech ‘terms of service’ pages — such as to xAI’s terms of service page — to catch new privacy or usage risks.

“I love this application for investigative journalists, because these pages have huge walls of text, and change of a word here or there could have a big impact, and normally no one would notice,” says Zhu.

You can also view exact, color-coded text additions and deletions on the dashboard, using a slider function.
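For readers who want to see what a text comparison of this kind involves, the following Python sketch uses the standard library’s `difflib` module to list the words added and removed between two versions of a passage. It is not Visualping’s own method, which is not described here, and the two sample sentences are invented for illustration.

```python
import difflib


def word_changes(old_text, new_text):
    """Return (added, removed) lists of word runs between two versions."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    added, removed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" contributes to both lists; "equal" contributes to neither.
        if tag in ("replace", "delete"):
            removed.append(" ".join(old_words[i1:i2]))
        if tag in ("replace", "insert"):
            added.append(" ".join(new_words[j1:j2]))
    return added, removed


# Invented sample text, not taken from any real policy.
old_version = "We do not share your data with third parties."
new_version = "We may share your data with trusted third parties."

added, removed = word_changes(old_version, new_version)
print("Added:", added)      # ['may', 'trusted']
print("Removed:", removed)  # ['do not']
```

Even in this tiny example, swapping “do not” for “may” reverses the meaning of the sentence, which is the kind of small edit Zhu describes below.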


A Visualping analysis of changes to OpenAI’s privacy policy. Image: Screenshot Visualping

After signing up for a journalist account, click on “New Job,” enter your target URL, set your search criteria, and, finally, choose how often you’d like the page to be checked.

Zhu also offered detailed tips on how to optimize the tool for investigations, including these:

  • While you can simply select search options from the Suggestions list provided, Zhu recommends that reporters instead type in AI-style prompts to tailor their criteria, such as “Alert me when a new filing regarding [Company X] is made,” or “Alert me when a court filing regarding [Person X] is amended.”
  • In addition to websites of traditional interest, such as procurement, court dockets, and environmental action sites, remember that it also works for any updates to open data portals — a useful verification step on the eve of story publication.
  • Reach out to the Visualping Journalist Program staff for specific project requests.
  • Check the tool’s tips page.

Newly Launched Archive.ire.org Search Engine

GIJN member Investigative Reporters and Editors has amassed tens of thousands of resources over several decades — from digital tipsheets to award entries and conference audio recordings — which are of use to investigative journalists not only in the US but also around the world.

In February of this year, a collaborative archiving project between volunteers and IRE staff launched a remarkable search engine that not only surfaces these resources online, but also uses semantic and AI features to suggest similar materials, which could easily prompt new leads, tool applications, and collaborative contacts for live investigative projects.

For instance, if you type “oligarchs” into the IRE Resource Center’s familiar-looking search box, it offers a tipsheet from a 2019 IRE panel titled “Dirty Money, European Banks and Russian Organized Crime,” as well as IRE awards entries on money laundering in the Baltics from Sweden’s SVT News and on Russian troll factories from CNN. Once you click on that tipsheet, the search engine, which combines semantic features with keyword search, offers five additional resources that may be closer to what you originally hoped to find. In this case, these include a link to a competition entry form from the Organized Crime and Corruption Reporting Project (OCCRP) that describes, in detail, how its reporters investigated Eastern European business agents who offer turnkey solutions for money laundering. This result, in turn, offers further suggestions, including clear audio recordings from an IRE panel called “Inside the Global Offshore Money Maze,” and a presentation on cross-border organized crime by speakers including Mexico’s Jesus Ibarra and GIJN founding director David E. Kaplan.

In announcing the launch at NICAR, Ben Welsh, an editor at Reuters and part of the Archive.ire.org development team, said the database included 33,449 resources, and that the bulk of these were IRE’s 25,000 contest entries over the years. The entry forms that accompany these on the site represent a potential goldmine for reporters, because they include detailed methodologies, explanations of how the entrants developed human sources, the names of all contributors, and the contextualized challenges and opportunities for digging into pressing public interest issues that other reporters may wish to pursue in their own countries.

Data journalist Derek Willis, a co-developer of the archive, added: “From my standpoint, my previous use of IRE resources has largely been around conference tipsheets, but I do think the contest entries can be very valuable in different ways.”


Results from the new IRE Resource Center archive search page. Image: Screenshot, IRE

“About three quarters of the resources do have some type of attachment, be it a PDF, audio file, or something else cool,” Welsh noted. “We tried to create a common index for everything, though it’s not perfect — but they do have title; category; description; and most have authors listed.”

He conceded that the new site has some shortcomings, including a lack of date information for about a quarter of the dataset — and he reminded journalists they must use an IRE member login to access the database. However, the tipsheets and slides include numerous opportunities for collaboration — including valuable contact details for investigative editors — and the contest entries, and even random site browsing, represent a treasure trove of potential investigative story ideas.


Rowan Philp is GIJN’s global reporter and impact editor. He was formerly chief reporter for South Africa’s Sunday Times. As a foreign correspondent, he has reported on news, politics, corruption, and conflict from more than two dozen countries around the world.

 



Material from GIJN’s website is generally available for republication under a Creative Commons Attribution-NonCommercial 4.0 International license. Images usually are published under a different license, so we advise you to use alternatives or contact us regarding permission. Here are our full terms for republication. You must credit the author, link to the original story, and name GIJN as the first publisher. For any queries or to send us a courtesy republication note, write to hello@gijn.org.
