Journalism’s Deep Web: 7 Tips on Using OCCRP Data

The Organized Crime and Corruption Reporting Project (OCCRP) Data Team has developed new features on OCCRP Data in the past six months and brought together more than 200 different datasets. Its new software is now configured to let reporters search all of those at once.

OCCRP Data, part of the Investigative Dashboard, offers journalists a shortcut to the deep web. It now has over 170 public sources and more than 100 million leads for public search – news archives, court documents, leaks and grey literature encompassing UK parliamentary inquiries, companies and procurement databases, NGO reports and even CIA rendition flights, among other choice reading. (All this is publicly available. If you’re associated with OCCRP, you’ll have access to more than 250 million items.)

Uniquely, the database also contains international sanctions lists detailing persons of political or criminal relevance.

The new platform brings searches across diverse types of objects, such as emails, documents and database entries from corporate or land registries, into a unified user experience, with an appropriate way to display each type of data.

Here are seven tips to help you get the most out of OCCRP Data:

Browse Directly on Your Screen

OCCRP Data has emails, PDF and Word documents, contracts, old news archives, even Rudyard Kipling poems (from Wikileaks, to be fair). Its brand new interface makes it easier for you to view documents, search within them and preview them in the browser without having to download and open them, making research a faster and more seamless process.

New Search Filter Options

After you’ve run your search, OCCRP Data lets you filter the results in its left-hand column by source and document type, as well as by emails, phone numbers, addresses, entity names, countries and more.

Highlight Connections

You can explore structured data in new ways because OCCRP Data uses entity extraction on documents and emails to find phone numbers, names of people and companies, addresses, ID numbers and other key linkage details of interest. Just click on an entity and see the “Tags” option in the preview screen.
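To give a feel for what entity extraction does, here is a minimal sketch in Python. It is not OCCRP's implementation: the two regular expressions, the `extract_entities` helper and the sample text are all assumptions made for illustration, and a production extractor would handle far more formats, including names of people and companies.

```python
import re

# Deliberately simple, hypothetical patterns. Real extractors cover many more formats.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+\d[\d\s-]{7,}\d")  # international format only, e.g. +44 20 ...


def extract_entities(text):
    """Return the unique emails and phone numbers found in a piece of text."""
    emails = sorted(set(EMAIL_RE.findall(text)))
    # Strip spaces and hyphens so the same number written two ways is counted once.
    phones = sorted({re.sub(r"[\s-]", "", p) for p in PHONE_RE.findall(text)})
    return {"emails": emails, "phones": phones}


sample = "Contact j.doe@example.org or call +44 20 7946 0958 about the contract."
entities = extract_entities(sample)
print(entities)
```

Once details like these are pulled out of every document, two files that mention the same phone number or email address can be linked, even if they come from unrelated sources.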

Do Bulk Comparisons

OCCRP Data is capable of cross-referencing the information on two lists; it also ranks data that closely matches and lets you compare the information. Click on a source and then click on the “Cross Reference” option to choose another source with which to do the comparison.
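The idea behind cross-referencing can be sketched in a few lines of Python. This is an illustration only, not the matching method OCCRP Data uses: the `normalize` and `cross_reference` helpers, the 0.8 threshold and the sample names are hypothetical, and the similarity score comes from the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case a name and drop commas, periods and extra spaces."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


def cross_reference(list_a, list_b, threshold=0.8):
    """Compare every name in list_a with every name in list_b.

    Returns (name_a, name_b, score) tuples at or above the threshold,
    ranked from the closest match down.
    """
    matches = []
    for a in list_a:
        for b in list_b:
            score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Hypothetical sample lists.
sanctions = ["Acme Holdings Ltd.", "John A. Smith"]
registry = ["ACME Holdings Ltd", "Jon Smith", "Globex Corporation"]
matches = cross_reference(sanctions, registry)
for name_a, name_b, score in matches:
    print(name_a, "<->", name_b, score)
```

Comparing every pair is fine for short lists but slows down quickly as they grow, and close scores are leads to check by hand rather than confirmed matches.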

Monitor Search Terms, Receive Alerts

OCCRP Data now has an alerts feature that lets you monitor a search term, so that you receive a notification when new information matching it is added to the database. Simply switch on the bell icon right next to your search query.

Foreign Language Support

OCCRP Data now supports content in foreign languages. The interface has been translated into Russian and Bosnian-Serbo-Croatian, and search results on the database can also be filtered by language. The data team is working on adding other languages, such as German and Spanish.

Advanced Search Operators

You can use complex search operators to run proximity searches and exact-term searches, account for spelling errors and combine queries.
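As a rough sketch, the Python helpers below build query strings for each of those four cases. They assume the Lucene-style syntax used by many search engines (quotes for exact phrases, `~N` after a phrase for proximity, `~N` after a word for spelling tolerance, and `AND`/`OR` with parentheses for combining); whether OCCRP Data accepts exactly this syntax is an assumption, so check the user manual before relying on it. The function names and search terms are hypothetical.

```python
def exact(phrase):
    """Exact-term search: wrap the phrase in double quotes."""
    return '"' + phrase + '"'


def proximity(words, distance):
    """Proximity search: the words must appear within `distance` words of each other."""
    return '"' + " ".join(words) + '"~' + str(distance)


def fuzzy(term, edits=1):
    """Spelling tolerance: allow up to `edits` character changes in the term."""
    return term + "~" + str(edits)


def combine(operator, *parts):
    """Combine queries with AND or OR, wrapped in parentheses."""
    return "(" + (" " + operator + " ").join(parts) + ")"


query = combine(
    "AND",
    exact("Acme Holdings"),
    combine("OR", fuzzy("Smyth", 2), proximity(["bank", "transfer"], 5)),
)
print(query)
```

The resulting string can be pasted into a search box that understands this syntax. Building it from small functions makes long queries easier to reuse and adjust.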

Any Questions?

Anyone accessing OCCRP Data can check out the Aleph Wiki, where the data team covers its uses, functions and development roadmap. Journalists and technologists alike can read the user manual or contact data@occrp.org to give us feedback.


The Organized Crime and Corruption Reporting Project (OCCRP) is an investigative reporting platform formed by 40 non-profit investigative centers, scores of journalists and several major regional news organizations worldwide. 

Material from GIJN’s website is generally available for republication under a Creative Commons Attribution-NonCommercial 4.0 International license. Images usually are published under a different license, so we advise you to use alternatives or contact us regarding permission. Here are our full terms for republication. You must credit the author, link to the original story, and name GIJN as the first publisher. For any queries or to send us a courtesy republication note, write to hello@gijn.org.
