How Univision Revealed Flaws in Costa Rica’s Judicial System

Four years of work and 8,000 judicial rulings later, the team at Univision Data shows that in Costa Rica a person is more likely to be convicted of a crime if they are assigned a public defense attorney than if they have a private one. Their methodology included web scraping, the R programming language and logistic regression, a statistical method common in the social sciences but practically unexplored in newsrooms.

On the Ethics of Web Scraping and Data Journalism

Web scraping is a way to extract information presented on websites. As I explained in the first installment of this article, web scraping is used by many companies. It’s also a great tool for reporters who know how to code, since more and more public institutions publish their data on their websites.
With web scrapers, which are also called “bots,” it’s possible to gather large amounts of data for stories. But what are the ethical rules that reporters have to follow while web scraping?