How One Mexican Data Team Uncovered the Story of 4,000 Missing Women

Mexican newspaper El Universal has put a face to the 4,534 women who have gone missing in Mexico City and the State of Mexico over the last decade. Its project, Ausencias Ignoradas (Ignored Absences), aims to put pressure on the government and put an end to the disappearances.

Daniela Guazo, from the data journalism team, explains how they gathered the data and presented the information not as numbers but as people:

Breaking Down the Numbers

The Mexican government reported 4,281 missing women from 2005 to 2014, of whom 2,000 are still being sought. The number was there, but nobody had broken it down.

“The Mexican government declares reports and statistics without uploading the data. Therefore, when you want to check the information, there isn’t any document to follow or refer to.”

Scraping the Data

El Universal Data worked with Morlan, a company specializing in data analysis and programming, to gather the information from Odisea and Capea. Both are official websites that hold information on missing people but do not offer it in a downloadable format.

They were able to scrape 1,480 records (pictures and text) from Odisea in a JSON format before the website was closed down in November last year.

However, they could not scrape the data from Capea: the site was so poorly structured that journalists had to transcribe the information by hand into Excel.

By February 2016 the website had 6,787 records, of which 3,054 could be systematized:

“We started reading record by record and filtered them by gender. Once we got all the missing women, we followed the structure from Odisea and started building the dataset for Mexico City.”

Once this process was completed, they matched and cleaned both datasets. This left 4,534 faces and some patterns (such as age, body size, height or eye color), which they brought to the Mexican authorities.
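The filtering, matching and cleaning described above might look roughly like the following sketch. It is not the team's actual code: the field names (`name`, `gender`, `age`, `source`), the sample records and the name-plus-age identity key are all invented for illustration, and the real Odisea and Capea records would have their own structure.

```python
import json

# Hypothetical records: field names and people are invented for illustration
# and do not reflect the real Odisea or Capea schema.
ODISEA_JSON = '''[
  {"name": "Ana Perez", "gender": "F", "age": 17},
  {"name": "Luis Gomez", "gender": "M", "age": 30}
]'''

# Stand-in for rows transcribed by hand into a spreadsheet.
CAPEA_ROWS = [
    {"name": " ana  PEREZ ", "gender": "f", "age": "17"},
    {"name": "Maria Lopez", "gender": "F", "age": "22"},
    {"name": "Jose Ruiz", "gender": "M", "age": "41"},
]


def normalize(record, source):
    """Return a cleaned copy of a record with consistent types and casing."""
    return {
        "name": " ".join(str(record["name"]).split()).title(),
        "gender": str(record["gender"]).strip().upper(),
        "age": int(record["age"]),
        "source": source,
    }


def merge_women(odisea, capea):
    """Clean both sources, keep women only, and drop duplicate people."""
    cleaned = [normalize(r, "odisea") for r in odisea]
    cleaned += [normalize(r, "capea") for r in capea]
    seen = set()
    merged = []
    for rec in cleaned:
        if rec["gender"] != "F":
            continue
        key = (rec["name"], rec["age"])  # naive identity key (an assumption)
        if key in seen:
            continue
        seen.add(key)
        merged.append(rec)
    return merged


women = merge_women(json.loads(ODISEA_JSON), CAPEA_ROWS)
for rec in women:
    print(rec["name"], rec["age"], rec["source"])
```

In practice, matching on name and age alone would likely be too crude, since hand-transcribed records can differ in spelling or carry missing fields, so a real pipeline would need manual review of near-duplicates.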

“When it comes to missing people, there isn’t open data. Authorities don’t want to upload databases with all these details and all you have online is messy data in non-readable formats such as JPGs that have to be scraped or copied by hand.”

Families Waiting for Their Daughters

Although they presented the story using one case as the backbone, they spoke to at least ten families in Mexico City. All complained about the same things:

  • Unhelpful authorities
  • Their daughters would have called the family to say goodbye and to say they were safe
  • They did not pack their suitcases
  • Their mobile phones were disconnected on the same day
  • Families are the ones who search for the missing because the government mainly categorizes these cases as “not located”, “lost” or “absent”, meaning that no crime is recognized.

Data Journalism in Latin America

Daniela has worked as a data journalist for the last six years. She says that several countries, such as Peru and Argentina, are expanding open data and improving their data journalism skills.

However, Mexico isn’t part of that:

“They are now understanding that data journalism is not only about graphics, numbers or statistics. It has a very strong journalism component. But resources are very scarce.”

The El Universal Data team, comprising Lilia Saúl and Daniela Guazo, was able to create Ausencias Ignoradas thanks to the Mike O’Connor Scholarship from the International Center for Journalists (ICFJ).

The story took six months, including planning, gathering and analysing data, taking pictures, talking to families, writing the article, programming and designing. It received a strong response from the audience and major organisations.

The next step is to update the information from 2016 and create another database for missing men in the city.


This story originally appeared on the Online Journalism Blog and is reprinted with permission. 

Maria Crosas is a journalist interested in data journalism and visualizations. She has worked as a visual journalist in Spain and she’s currently finishing an Online Journalism MA at Birmingham City University, where she has been experimenting with the use of virtual reality and bots in journalism. On her blog, she writes about data, journalism, and visualizations inside and outside newsrooms.

