How Periodistas de Datos Aggregated the Profiles of Over 300 Data Journalists
In July, an aggregator of data journalists from Spain and Latin America was launched under the name Periodistas de Datos. Maria Crosas Batista interviewed Félix Arias, project lead with Miguel Carvajal, to find out more about how the project came about — and where they plan to take it next.
The project came about as the result of a specific need of journalists (and professors) driving the Innovation in Journalism MA (MIP) at the Miguel Hernández University in Elche, Spain.
Arias and Carvajal were looking for a tool to use in their lessons that could show the potential of data journalism, as well as outstanding projects, to their students.
Although there was some information about data journalism in Spain and Latin America, they faced two main constraints: information was spread out on several social media channels (presented through Twitter lists, for instance) and it was outdated.
Arias found that journalism's use of data was growing and that there was a need to bring this information together on a single platform.
After scraping Twitter and other social media channels for two months, they gathered information on more than 300 data journalists in the region on a simple but user-friendly website. Today the site includes 356 professionals, and can be filtered by country.
The first step was to create a Twitter list of accounts that used related hashtags (such as #ddj, #periodismodatos and #periodismodedatos); journalists who had added “data journalism” to their Twitter bio; participants in the MIP course; journalists who had participated in at least two or three data journalism projects; and media outlets that use data journalism in their work.
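As a rough illustration of the hashtag and bio criteria above, the check could be sketched in Python as follows. This is not the team's actual code: the profile structure (`bio`, `tweets`) and the function names are hypothetical, and the Spanish bio phrase is an added assumption.

```python
import re

# Hashtags named in the article.
HASHTAGS = {"#ddj", "#periodismodatos", "#periodismodedatos"}

# "data journalism" comes from the article; the Spanish phrase is an
# assumption added for illustration.
BIO_PHRASES = ("data journalism", "periodismo de datos")


def hashtags_in(text):
    """Return the lowercased hashtags found in a piece of text."""
    return {tag.lower() for tag in re.findall(r"#\w+", text)}


def is_candidate(profile):
    """Return True if a (hypothetical) profile dict meets either criterion."""
    bio = profile.get("bio", "").lower()
    if any(phrase in bio for phrase in BIO_PHRASES):
        return True
    # Otherwise look for one of the tracked hashtags in the profile's tweets.
    return any(hashtags_in(tweet) & HASHTAGS for tweet in profile.get("tweets", []))
```

The other criteria (course participation, number of projects) depend on knowledge that a keyword match cannot capture, so they would still need manual review.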
The next step was to scrape data from Twitter, such as profiles and location, as well as to add members of data journalism meetups.
When cleaning the data, apart from removing duplicate entries, they also excluded journalists who weren’t really focusing on data, and developers whose work was far removed from journalism. In an audio clip, Arias also describes how automation is used when getting, analyzing and visualizing the data.
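The cleaning step described above might look something like this minimal Python sketch. It assumes each record is a dict with a `handle` field and a manually assigned `data_focused` flag; both names are hypothetical, and the decision about who counts as a data journalist remains an editorial judgment rather than something the code decides.

```python
def clean(records):
    """Drop duplicate handles and records not flagged as data-focused."""
    seen = set()
    cleaned = []
    for rec in records:
        # Normalize the handle so "@Ana" and "ana " count as the same account.
        handle = rec["handle"].strip().lstrip("@").lower()
        if handle in seen:
            continue  # duplicate entry: the first occurrence wins
        seen.add(handle)
        # Keep only records a reviewer has marked as focused on data journalism.
        if not rec.get("data_focused", False):
            continue
        cleaned.append({**rec, "handle": handle})
    return cleaned
```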
The last step was to sort the data by country and city, and to add the media channel, project, gender and tags.
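Sorting by country and city, and the country filter offered on the site, can be sketched in a few lines of Python. The record fields (`country`, `city`) and function names are assumptions made for illustration, not the project's actual schema.

```python
def sort_by_location(records):
    """Return records ordered by country, then by city (case-insensitive)."""
    return sorted(
        records,
        key=lambda r: (r["country"].casefold(), r["city"].casefold()),
    )


def filter_by_country(records, country):
    """Return only the records whose country matches, ignoring case."""
    wanted = country.casefold()
    return [r for r in records if r["country"].casefold() == wanted]
```

Using a tuple as the sort key keeps the ordering in one pass: records are compared by country first, and city is only consulted when the countries match.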
What’s Next?
Currently, MIP students are consolidating the information that is already on the platform, filtering the data to see what should or shouldn’t be included and increasing the database through a survey.
“This is a niche project for a specific audience,” says Arias.
In the long term, the project, which is still at the idea-forming stage, has two main steps planned:
- In addition to the professional bio, the team plans to add relevant information on the projects that these people are working on. The challenge here is to define what a project is.
- They also plan to turn the site into an aggregator where people can modify the data. To accomplish this, specific profiles (such as developers) and funding are required. The aim is to connect similar profiles and promote collaborations.
The ultimate goal of this project, as part of the MA, is to create a tool that helps professionals throughout their research at university, whether they are writing papers or books.
And as they try to get more people involved in the project, they have already had an early success, with the creator of Chicas Poderosas, Mariana Santos, lined up to share the lessons she has learned as an innovator in digital media with other Latin American women.
This post originally appeared on the Online Journalism Blog and is reproduced here with permission.
Maria Crosas Batista is a journalist interested in data visualization and new technologies. For the past three years, she has been experimenting with chatbots and machine learning. On her blog, she writes about digital journalism inside and outside newsrooms.