Old-school techniques like persistence, teamwork, and knocking on doors remained the central drivers of investigative projects around the world in 2023. Even The New York Times’ high-tech visual forensic investigation into possible Russian war crimes in Bucha, Ukraine, was dependent on two young reporters — Yousur Al-Hlou and Masha Froliak — going door to door, asking survivors if they had captured phone video from the invasion last March.
But, in many other cases, open source tools provided the leads to chase, and the verification and illustration of evidence found through traditional reporting. In addition, easy-to-use mega-tools were popular, including free digital instruments like Google Earth, the Aleph database, and Boolean search terms. And this past year also saw journalists in Latin America and Africa increasingly using advanced data platforms like PinPoint in investigations such as News24’s Silenced series, which exposed the murder of a South African government whistleblower as a corruption cover-up hit.
For those reporters with advanced computer science skills, 2023 saw a notable expansion of website-digging resources — such as “passive DNS data” history tools like DNSDB Scout and RiskIQ, which can map IP addresses to domains and vice versa.
But this column is about tools that almost any reporter can deploy: easy-to-use digital solutions that are either new in 2023, have new features, or found new applications for investigative reporting. Having flagged the top tips at conferences like NICAR23, IRE23, and the 13th Global Investigative Journalism Conference (GIJC23) in Sweden, and interviewed numerous reporters after award-winning stories, GIJN offers the following 10 user-friendly tools that you might consider in your next investigation.
1. The space bar to find potential whistleblowers on LinkedIn.
This year, online research expert Henk van Ess detailed dozens of advanced tips and “tricks” in his seven-chapter GIJN Online Research Guide for searching major social media platforms. But one technique that is likely to stick with all journalists in their daily work, due to its simplicity and the growing importance of the platform, is Van Ess’s space bar tip for LinkedIn. He revealed that you can bypass LinkedIn’s algorithm and focus searches for people by simply clicking in the blank search field, entering a space, and hitting “Enter.” This brings up an “All filters” menu, where you can then click on the “People” tab that pops up, for instance, and combine it with other filters to drill directly for your target, without being redirected by the algorithm.
As Van Ess points out in his LinkedIn guide chapter: you can also use the “Past Company” function to find former employees who may be willing to talk or share documents, or who can connect you with potential whistleblowers who are currently employed at the company. He also recommends first finding consultants in the relevant field by using LinkedIn’s various “Consulting” tabs.
2. The “super-powered” Telepathy tool for researching Telegram.
The growing importance of Telegram for journalists was illustrated by its central role in both discussions and disinformation related to the Russian invasion of Ukraine. A new, journalist-built tool called Telepathy has quickly become known as “the Swiss army knife of Telegram tools,” because it can not only show how channels are linked, but can also archive entire chats, identify top posters, and collect member lists. While disinformation researchers call it “user-friendly,” the tool does need basic open source computer skills to install and run. It has both free and paid tiers.
At GIJC23, disinformation research expert Jane Lytvynenko explained that Telepathy “is like a super-powered tool built on top of the Telegram API; a great tool for reporters looking to get started on Telegram investigations.” Lytvynenko also suggested that journalists use a website like metadata2go.com to directly dig into videos and images on Telegram. Other trusted free tools for Telegram investigations include Tgstat, and the following shortcut in Google to identify channels of interest: “site:t.me/*”.
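The Google shortcut above can be combined with keywords of your own. The following is a minimal Python sketch that builds such a search link; the function name `telegram_search_url` and the example keywords are hypothetical, and it assumes only Google's ordinary `search?q=` address format and the `site:t.me/*` operator quoted above.

```python
from urllib.parse import quote_plus

def telegram_search_url(keywords: str) -> str:
    """Build a Google search link restricted to public t.me pages.

    `keywords` is whatever you would type after the site: operator,
    including quoted phrases or Boolean terms such as OR.
    """
    query = f"site:t.me/* {keywords}"
    # quote_plus escapes spaces, quotes, and punctuation for use in a URL.
    return "https://www.google.com/search?q=" + quote_plus(query)

# Hypothetical example: paste the printed link into a browser.
print(telegram_search_url('"air defense" OR "air defence"'))
```

Building the link in code is mostly useful when you have a long list of keywords or names to check one after another.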
3. Osint.industries email and phone connections tool.
When asked for “the hottest digging tool you’re using right now,” Lara Dihmis — an enterprise investigator at OCCRP — was unequivocal: “Osint.industries.” This is one of those people-tracking tools that can cause your hair to stand up if you use it to search the internet for yourself. You can enter an email or phone number for a subject and discover the many websites associated with them, as well as real identities behind usernames and hidden digital footprints. “If you have a phone number or email address, it’s amazing for finding any accounts registered with either,” said Dihmis. “I would highly recommend it. And the best part is that it’s free.” Verified journalists can request extra access via email@example.com.
4. The Aleph cross-reference feature.
As most data journalists know, Aleph is a vast, follow-the-money data platform and leaks repository created by OCCRP that independent media routinely mine for story leads and data. It includes 370 million public records — among them, bank statements, sanctions lists, court archives, and corporate emails — from more than 140 countries, as well as a platform for live projects and collaborations.
Now, Aleph has an updated cross-reference tool that allows you to automatically search for names or companies of interest across hundreds of other existing datasets in every corner of Aleph. With a single click — and a minute’s wait for the platform’s computations — you could find hidden connections involving your subject that you never imagined. You do need to clean and properly format your uploaded data for the process to work. But OCCRP has just released a point-by-point checklist to help you use this tool. It also released a detailed new user guide on how to search within datasets.
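As a rough illustration of the kind of cleanup involved, here is a minimal Python sketch that tidies a column of names before upload. It is a generic example, not OCCRP's checklist or Aleph's required format: the column name `name`, the function names, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
import unicodedata

def clean_name(raw: str) -> str:
    """Normalize Unicode look-alikes and collapse stray whitespace."""
    text = unicodedata.normalize("NFKC", raw)
    return " ".join(text.split())

def clean_rows(csv_text: str, column: str = "name") -> list:
    """Return cleaned, de-duplicated names from one column of a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    cleaned = []
    for row in reader:
        name = clean_name(row.get(column) or "")
        key = name.casefold()  # compare names case-insensitively
        if name and key not in seen:
            seen.add(key)
            cleaned.append(name)
    return cleaned

# Hypothetical sample data with extra spaces and a duplicate entry.
sample = "name\n  Acme   Holdings Ltd \nACME Holdings Ltd\nJosé  Pérez\n"
print(clean_rows(sample))
```

Consult the OCCRP checklist for the formatting the cross-reference tool actually expects; this sketch only removes the most common sources of missed matches, such as stray spaces and duplicate rows.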
5. Bellingcat Auto Archiver to preserve video evidence in seconds.
Quickly downloading social media clips about public incidents from various platforms can be complex, time-consuming, and problematic. Some downloading options require coding skills, and some take so long that the target posts could be taken down by the user or the platform before they can be saved. To combat this, Bellingcat’s tech team has created an Auto Archiver system, which allows you to complete the process in seconds: just copy the post’s URL, paste it into a dedicated Google Sheet… and that’s it! The tool automatically chooses the ideal downloading and archiving strategy for each URL, and does the downloading on its own while you continue your search for other evidence. It also uses the Wayback Machine as a backup. Bellingcat investigative tech team lead Johanna Wild said: “We used this for our Ukraine work; you can just copy-paste links of videos and social media posts from Telegram, TikTok, Twitter, and more, and drop it in the sheet, and it gets archived.” The tool only requires computer science skills in the set-up phase. Journalists, or their IT colleagues, can set up the tool by following the steps at the bottom of this page, or watching this video tutorial.
6. Epieos tool to track bad actors from their Google reviews.
Tools that exploit traits of human nature are always a favorite category for GIJN. Craig Silverman’s “Pub/UA” trick to find the hidden owners of websites is a great example: you can right-click on any web page, choose the option to view its source code, then enter “pub” in the Control-F field, and immediately see if there is a Google AdSense identifier in the source code to learn who is receiving the advertising revenue from the site. It works because many bad actors who conceal their ownership of problematic sites just can’t resist taking even small amounts of ad money.
Likewise, it turns out that many sanctioned individuals, oligarchs, and other bad actors seeking to operate below the radar simply can’t stop themselves from posting critical Google reviews of businesses and restaurants they’ve encountered in their personal lives. In response, a new reverse-email search tool called Epieos provides a history of someone’s service reviews in Google Maps form, instantly showing, for instance, where and when they have reviewed restaurants in the past. Tool founder Sylvain Hajri told GIJN that the search engine is particularly attractive for investigative journalists because the user is never alerted, and because the tool deliberately keeps no record of your searches. You do need a subject’s email address for these searches, but you can “guess” them with the hunter.io tool.
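For reporters comfortable with a little scripting, the “Pub/UA” check described above can also be run on saved page source. The following Python sketch is an illustration only, not Silverman's own method: it assumes AdSense publisher IDs take the form “pub-” followed by 16 digits, and that the “UA” half of the trick refers to legacy Google Analytics IDs of the form “UA-number-number”. The function name and the sample HTML are hypothetical.

```python
import re

# Assumption: AdSense publisher IDs look like "pub-" plus 16 digits
# (often written as "ca-pub-..." in the page source).
ADSENSE_RE = re.compile(r"\bpub-\d{16}\b")
# Assumption: legacy Google Analytics IDs look like "UA-<account>-<property>".
ANALYTICS_RE = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")

def find_tracking_ids(page_source: str) -> dict:
    """Return the unique AdSense and Analytics IDs found in page source."""
    return {
        "adsense": sorted(set(ADSENSE_RE.findall(page_source))),
        "analytics": sorted(set(ANALYTICS_RE.findall(page_source))),
    }

# Hypothetical page source, as copied from a browser's "view source" window.
sample_source = (
    '<script data-ad-client="ca-pub-1234567890123456"></script>'
    "<script>ga('create', 'UA-12345678-1', 'auto');</script>"
)
print(find_tracking_ids(sample_source))
```

If the same ID turns up on several sites, that can suggest a shared owner, though it is a lead to verify rather than proof.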
7. Affordable GPS devices to track waste and smuggling routes.
The new availability of reliable, small, and inexpensive GPS tracking devices offers both an exciting opportunity and a sober ethical consideration for investigative journalists. But these devices have emerged as a clear game-changer for tracking goods, and for revealing the routes that organized criminals use to smuggle contraband. This year, Sweden’s premier investigative TV program, Mission Investigate, stepped in to expose a crime pattern that didn’t interest police: to find out what happened to millions of euros of donated toys and clothes that were routinely stolen from locked charity bins. Having sewn Yepzon tracking devices into these items, the team was able to follow the movement of those goods to Eastern Europe via phone apps. At GIJC23, Mission Investigate reporters explained that the “trick” with using these devices is to balance the battery life of the device with your story goal. If you’re only seeking the final destination, they say, then you should either enable a “sleep” mode, or use the app’s “History View” feature to see a summary of the tracker’s past movements.
8. Junkipedia: the tool that hopes to become the ‘CrowdTangle for everything.’
Originally designed only to monitor disinformation and “junk news,” the Junkipedia tool has rapidly expanded its features into a global, all-purpose social media analysis and digging engine. In addition to its shared database of problematic social media content, Junkipedia now allows users to track and build lists of social media accounts from — remarkably — 12 different platforms, including fringe sites like GETTR and Gab, as well as major sites like TikTok, Facebook, and Telegram. However, it does not have the most comprehensive datasets, or access to every public Facebook page, as CrowdTangle does, and it cannot yet mine LinkedIn. But the platform — developed by the Algorithmic Transparency Institute — has been built for journalists, by journalists, and it’s worth exploring its ever-expanding toolkit for unique story leads. For instance, few newsrooms have the time or stomach to listen to dozens of hours of far-right podcasts. Junkipedia provides automatic transcriptions of English-language podcasts, which you can search for frequently used terms. Apply to use the tool here, using your institutional email address.
9. Global Forest Watch and the MapBiomas Alert database for deforestation investigations.
While Global Forest Watch has been growing for the past nine years, investigative journalists beyond the environmental beat are increasingly turning to this open source platform to track global changes in forests in near real-time, and identify any possible misconduct tied to those patterns. It now features an open data portal, and a free MapBuilder tool that allows newsrooms to integrate their own data with the vast datasets on land use available on the site.
Meanwhile, in 2023, freelance investigative journalist Fernanda Wenzel used a powerful satellite-based tool called MapBiomas Alert to expose a hidden form of land-grabbing in the Amazon. Her story for The Intercept Brasil, “Ladrões de Floresta” (“Forest Thieves”), showed how land-grabbers have used a bureaucratic loophole to go after “undesignated public forests” equivalent to the area of Spain. The database also offers deforestation alerts and coordinates that reporters can cross-reference with land claims registries, or use with the EcoCrime Data tool, which includes data on everything from cattle ranching to illegal mining.
10. DocumentCloud add-ons for redactions and PII.
DocumentCloud, the primary source document management platform, recently added a slew of remarkable new features for data journalists.
At NICAR23, Sanjin Ibrahimovic, Open Source Fellow at the MuckRock Foundation, said add-ons to the core functions have been created by the DocumentCloud community themselves — users, fellows, and journalists — to address problems and opportunities they encountered during live projects. Better still, he said: “The idea is that smaller newsrooms can use this without needing programming skills.”
New add-ons include:
- The ability to automatically find and highlight personally identifiable information (PII) scattered throughout huge datasets. These include numerous details you may either need to exclude, or need as story leads, like email addresses, social security numbers, ZIP codes, credit card numbers, and physical addresses in the small print.
- A “Bad Redactions” add-on feature, which helps journalists in two crucial ways. It automatically analyzes and surfaces all the supposedly redacted passages in a single spreadsheet, so you can sometimes reveal what the agency intended to conceal. And it gives you the option to complete the redaction job.
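To give a sense of how automated PII flagging can work in principle, here is a minimal Python sketch based on simple text patterns. It is not the DocumentCloud add-on's code, and its approach may differ entirely; the pattern names, the `find_pii` function, and the sample text are all hypothetical, and the patterns cover only emails, US Social Security numbers, and US ZIP codes.

```python
import re

# Deliberately simple patterns; real tools need far more care.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    # Note: this also flags any other standalone five-digit number.
    "us_zip": re.compile(r"\b\d{5}(?:-\d{4})?\b"),
}

def find_pii(text: str) -> list:
    """Return (label, matched_text, position) tuples in reading order."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(), match.start()))
    return sorted(hits, key=lambda hit: hit[2])

# Hypothetical text extracted from a document.
sample = "Contact jane.doe@example.org, SSN 123-45-6789, ZIP 90210."
for label, value, position in find_pii(sample):
    print(label, value, position)
```

Pattern matching of this kind produces false positives and misses, so flagged passages still need a human read before anything is redacted or followed up as a lead.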
Access to DocumentCloud requires creating an account — ideally, using your institutional email address — which is followed by a quick verification step. Access to the growing library of new features involves clicking on “Add-Ons,” and then “Browse All Add-Ons.”
Rowan Philp is GIJN’s senior reporter. He was formerly chief reporter for South Africa’s Sunday Times. As a foreign correspondent, he has reported on news, politics, corruption, and conflict from more than two dozen countries around the world.