Reporters were given tips on the best tools and latest facial recognition techniques to help their investigations, at a panel at GIJC23. Image: Wiktoria Gruca for GIJN



Beyond Facial Recognition: State-of-the-Art Research Techniques

Investigative journalism thrives on a blend of high-tech tools and traditional research techniques, and in the field’s fast-evolving landscape, having the best tools and techniques can help newsrooms and reporters stay ahead of the curve and tell impactful stories.

In a panel on state-of-the-art research techniques at the 13th Global Investigative Journalism Conference (#GIJC23), moderated by dataLEADS CEO Syed Nazakat, OCCRP research head Karina Shedrofsky and ICIJ training manager Jelena Cosic discussed recent investigations they’ve worked on and their favorite methods and tools, from facial recognition services to document categorization.

Beyond Facial Recognition

Shedrofsky cited one of her latest projects — investigating an alleged cryptocurrency scam among teachers in Russia — to demonstrate how she uses one of her favorite tools. She had a photograph of a potential subject, but nothing else, so she started with PimEyes, a reverse image search service with facial recognition capability.

PimEyes returned many results and links, including an individual’s name and another photograph, possibly of the same person. So she turned to Amazon Rekognition, which compares faces to estimate whether two images show the same person, and it returned a 98% match. But Shedrofsky stressed the importance of verifying facial recognition results, even those with very high confidence rates, because these services can sometimes fail.
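The comparison step can be sketched in Python. This is a minimal illustration, not the workflow Shedrofsky described: it assumes the boto3 library and configured AWS credentials, and the helper names (compare_faces, best_similarity, describe) and the 80% default threshold are hypothetical choices made for this example.

```python
from typing import Optional


def compare_faces(source_path: str, target_path: str, threshold: float = 80.0) -> dict:
    """Call Rekognition CompareFaces on two local image files (needs AWS credentials)."""
    import boto3  # imported here so the helpers below run without boto3 installed

    client = boto3.client("rekognition")
    with open(source_path, "rb") as source, open(target_path, "rb") as target:
        return client.compare_faces(
            SourceImage={"Bytes": source.read()},
            TargetImage={"Bytes": target.read()},
            SimilarityThreshold=threshold,
        )


def best_similarity(response: dict) -> Optional[float]:
    """Return the highest similarity score among the reported matches, if any."""
    scores = [match["Similarity"] for match in response.get("FaceMatches", [])]
    return max(scores) if scores else None


def describe(similarity: Optional[float]) -> str:
    """Phrase the score as a lead to check, never as a confirmed identification."""
    if similarity is None:
        return "no match above threshold"
    return f"possible match ({similarity:.1f}%): verify with independent evidence"
```

A call such as describe(best_similarity(compare_faces("a.jpg", "b.jpg"))) would turn the service's response into a short note for a reporting log. The wording deliberately treats even a high score as a lead rather than proof.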

When in another case PimEyes yielded no results, she turned to search4faces, a service that indexes VK (VKontakte), a popular Russian social media platform. Here, she found what she needed.

The OCCRP’s Karina Shedrofsky discussed a number of her recent investigations using facial recognition technology. Image: Smaranda Tolosano for GIJN

As a third example, she cited a story on a businessman said to be acting as a proxy for a sanctioned Russian oligarch. A reporter asked Shedrofsky for help proving the identity of the businessman’s son — when the reporter had spoken to the son, he had denied that the man in question was his father.

What Shedrofsky knew were their full names, the son’s date of birth, and the son’s place of work — a Russian majority state-owned bank. This time, she used Pipl, a tool that’s very good at combining a person’s physical presence with their online presence; entering an email or a phone number can yield a person’s social media accounts or physical address.

Because the son has a fairly common name, the search returned many results, but one of them had an email with the bank’s domain. Then, she found a Facebook account she thought belonged to him, but it was a private account, with no information. This seemed like a dead end, but she knew an important trick: on private Facebook accounts, you can still search by clicking the three horizontal dots “(…)” in the right corner of the profile page — anything publicly posted on a timeline is searchable.

Shedrofsky searched for birthdays, relevant names, anything that came to mind. She saw that this profile had received birthday messages on the date she knew to be the son’s birthday.

At that point, she was pretty confident that this was the correct profile. Searching for the word “love,” she found the name of his wife. She then ran custom Google searches with their first names and surname, and found the website of the photography company that shot their wedding. To her very welcome surprise, the company had posted the entire wedding album. When she found a photo in the album of the man she suspected was the father, she ran it through PimEyes, which found a match.
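Name-based searches like these can be generated systematically. The sketch below is only an illustration: the names, the extra search terms, and the name_queries and search_url helpers are hypothetical, and it relies on nothing beyond quoted-phrase search and the standard Google search URL.

```python
from urllib.parse import urlencode


def name_queries(first_names: list[str], surname: str, extra_terms: list[str]) -> list[str]:
    """Build one query per extra term, with each full name as an exact phrase."""
    people = " ".join(f'"{name} {surname}"' for name in first_names)
    return [f"{people} {term}" for term in extra_terms]


def search_url(query: str) -> str:
    """Turn a query string into a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Placeholder names and terms, not taken from the investigation.
queries = name_queries(["Alex", "Sam"], "Example", ["wedding", "photographer"])
urls = [search_url(query) for query in queries]
```

Keeping the generated queries in a list makes it easy to log which searches were tried and which produced leads.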

Shedrofsky said that her favorite tools lately, besides OCCRP’s Aleph database, are OpenCorporates and contact information apps such as Truecaller and RocketReach.

Old-School Techniques, Cutting-Edge Tools

Cosic, from the International Consortium of Investigative Journalists, said she uses a combination of old-school techniques with modern tools. She showcased one of ICIJ’s latest projects: Deforestation Inc., a cross-border investigation that exposed companies branded, with the help of environmental certifications, as “sustainable” but which are accused of having contributed to forest destruction and of committing human rights violations. It wasn’t an easy project, and the team had to build its own database from a variety of sources:

  • Certification bodies and auditors.
  • The EU Timber Regulation (EUTR) list of violations by country.
  • Reports on environmental violations from NGOs and country reports.
  • Trade data from ImportGenius.
  • FOIs, corporate documents, marketing materials, and court filings.
  • Parent company data from Orbis and Factiva.

All these datasets had to be harmonized so that the information could be accessed from a single master database. Cosic highlighted the importance of identifying research methodology before starting investigations of this kind.
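As a rough illustration of what harmonizing can involve, the sketch below merges rows from several sources under a normalized company name. It is not ICIJ’s method: the suffix list, the field names, and the sample rows are assumptions made for the example, and real projects typically need manual review of every match.

```python
import re
from collections import defaultdict

# Assumed list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"ltd", "limited", "inc", "llc", "gmbh", "sa", "co", "corp"}


def normalize_company(name: str) -> str:
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    words = re.sub(r"[^\w\s]", " ", name.lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def build_master(sources: dict[str, list[dict]]) -> dict[str, dict]:
    """Group rows from every source under one normalized company key."""
    master = defaultdict(lambda: {"names": set(), "records": []})
    for source_name, rows in sources.items():
        for row in rows:
            key = normalize_company(row["company"])
            master[key]["names"].add(row["company"])
            # Keep the source name so each fact can be traced back later.
            master[key]["records"].append({"source": source_name, **row})
    return dict(master)


# Hypothetical sample rows standing in for two of the sources.
sources = {
    "certifications": [{"company": "Acme Timber Ltd.", "certificate": "valid"}],
    "trade_data": [{"company": "ACME TIMBER LIMITED", "shipments": 3}],
}
master = build_master(sources)
```

Recording the source next to every row mirrors the point about methodology: deciding up front how records will be matched and traced keeps the master database auditable.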

Cosic said her all-time favorite tool is ICIJ’s Datashare — which runs OCR (optical character recognition) technology on uploaded documents to make them searchable. It also automatically detects and filters documents by people, organizations, and locations, making searches more efficient.

‘More Art than Science’

Shedrofsky and Cosic both acknowledged investigative journalism’s many challenges, with the former observing that the field is “more art than science,” and is constantly changing. “Staying on top of evolving crimes is a continuous challenge,” Shedrofsky cautioned.

For her part, Cosic highlighted the difficulty of obtaining information from China, the limitations imposed by GDPR (the EU’s data protection regulation), and the need to navigate the complexities of offshore data and domain registration.

But, the pair pointed out, there are ways to stay ahead of the curve. Here are some of their tips for investigative journalists:

  • Spreadsheets are your best friends — use them to organize your data.
  • Seek guidance from data experts for effective data management.
  • Label and organize downloaded documents into folders.
  • Explore Chrome add-ons for capturing entire web pages and use Wayback Machine add-ons for search history preservation.
  • Structure and tag documents for effective categorization.
  • Collaborate securely using double-encrypted open-source platforms.
  • Recognize the value of diverse skills and backgrounds in investigative journalism.
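
The tips on labeling documents, organizing them into folders, and tracking them in a spreadsheet can be combined in a small script. The sketch below is one possible approach, not something the panelists presented: the file_documents helper, the folder layout, and the manifest columns are all hypothetical, and it does not handle two files with the same name.

```python
import csv
import shutil
from pathlib import Path


def file_documents(inbox: Path, archive: Path, labels: dict[str, str]) -> Path:
    """Move each file in inbox to archive/<label>/ and log it in manifest.csv."""
    archive.mkdir(parents=True, exist_ok=True)
    manifest = archive / "manifest.csv"
    is_new = not manifest.exists()
    with manifest.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            writer.writerow(["filename", "label", "stored_at"])
        for path in sorted(inbox.iterdir()):
            if not path.is_file():
                continue
            # Files without an assigned label go to an "unsorted" folder.
            label = labels.get(path.name, "unsorted")
            target_dir = archive / label
            target_dir.mkdir(exist_ok=True)
            target = target_dir / path.name
            shutil.move(str(path), str(target))
            writer.writerow([path.name, label, str(target.relative_to(archive))])
    return manifest
```

The resulting manifest.csv opens in any spreadsheet program, so the folder structure and the tracking sheet stay in sync.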
