The Sigma Awards were set up to celebrate the best in data journalism around the world. Image: An illustration from the awards, partially made using generative AI
Q & A with Sigma Awards Executive Director Marianne Bouchart
After several battleground states in the United States restored the right to vote to former prisoners with felony convictions, journalists from The Marshall Project decided to investigate.
They extracted, cleaned, and merged data from four different states about prison releases and voter registration records, and also used text messaging to conduct a survey with former inmates.
Using machine learning, natural language processing, and programming platforms, they found that only one in four formerly incarcerated people in these states registered to vote for the 2020 presidential election, a rate that’s significantly lower than among the general population, where three in four eligible voters are registered.
They also learned that most ex-prisoners did not know about the voting rights they had recovered – and that states had done little to notify them of the change.
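The general idea of merging release records with voter rolls and computing a registration rate can be illustrated with a minimal sketch. This is not The Marshall Project's actual method: the field names (`name`, `dob`), the matching rule (normalized name plus date of birth), and all records below are invented for illustration. Real record linkage across states is far messier and usually needs fuzzy matching and manual review.

```python
def normalize(name: str) -> str:
    """Lowercase, drop punctuation and digits, and collapse whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalpha() or ch.isspace())
    return " ".join(cleaned.split())


def match_key(record: dict) -> tuple:
    """Hypothetical matching rule: normalized name plus date of birth."""
    return (normalize(record["name"]), record["dob"])


def registration_rate(releases: list, voter_rolls: list) -> float:
    """Share of unique released people who also appear in the voter rolls."""
    registered = {match_key(v) for v in voter_rolls}
    released = {match_key(r) for r in releases}  # set removes duplicate rows
    if not released:
        return 0.0
    return len(released & registered) / len(released)


# Invented toy records, for illustration only.
releases = [
    {"name": "Ana  Lopez", "dob": "1980-02-14"},
    {"name": "John O'Neil", "dob": "1975-07-01"},
    {"name": "Sam Carter", "dob": "1990-11-30"},
    {"name": "Lee Park", "dob": "1988-05-09"},
]
voter_rolls = [
    {"name": "ana lopez", "dob": "1980-02-14"},
    {"name": "Maria Diaz", "dob": "1969-03-22"},
]

rate = registration_rate(releases, voter_rolls)
print(f"{rate:.0%} of released people matched a voter record")
```

Exact matching on a cleaned key is the simplest possible choice. It undercounts when names are misspelled or changed, so any rate produced this way should be treated as a lower bound.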
The Marshall Project story was published in partnership with the Louisville Courier-Journal and USA Today Network, and won a Sigma Award in 2022. The Sigma Awards celebrate the best data journalism from around the world. The jury noted the project empowered community members to use data to hold the powerful accountable, particularly in a population that can be challenging to cover.
That is, in fact, one of the most important characteristics of award-winning projects — the ability to bring value to a community — according to Marianne Bouchart, executive director of the Sigma Awards.
In conversation with GIJN, she said that the latest round of award submissions included 600 entries, all of which will be added to a database of 2,200 high-standard data journalism stories and portfolios. The winners of this year’s competition will be announced in April 2024, during the International Journalism Festival in Perugia, Italy.
We spoke to Bouchart about the role of The Sigma Awards in the open data community, plus its evolution, current trends, and strategies that work well for reporters working in this field.
GIJN: What makes an award-winning data project or portfolio?
Marianne Bouchart: That’s a question we ask ourselves each year, because the selection of winners doesn’t work as a long list of boxes to tick. But I can mention some elements an award-winning data project has. First of all, we see the data work behind the project, whether that’s the data collection process or the analysis of the data. Ideally, we like to see projects that shed light on issues of public interest that would otherwise not be known to the public.
There’s also the impact of the project, but we understand how tricky it is for media teams to measure that. And it’s also difficult for the jury themselves to evaluate that impact, so that’s why the evaluators are people from various countries, so that their varied expertise and culture from different regions help them see what people who live on the other side of the world would not understand.
Another criterion is storytelling and engagement: the use of visuals, interaction, and innovative ways to report on certain topics. When people come up with something that we haven’t seen before, for example, an idea to report on the war in Gaza or some environmental issues in a way that no one else has explored before, of course they’re going to stand out.
Finally, what makes an award-winning data project could be the value of a project to a community. The way a piece of journalism can empower a group of people, a community, or help others understand important information on certain topics.
GIJN: How does the jury balance the assessment so that winning an award does not depend on the budget or on the tools and resources a project can afford?
MB: The mission of the Sigma Awards is to shed light on the best data journalism from around the world, wherever it comes from, whatever country, whatever media organization, no matter how many people worked on it, no matter how many resources have been put into the project. We strive each year to get to the best that has been done in each region.
So it’s a very ambitious vision and that’s why the process is designed this way: first, it’s a free competition. Secondly, we are open to anyone, whether you’re an established media organization, a newcomer, even a blog, radio, TV, or online outlet, whether you’re a freelancer or a team of 10, everyone is welcome to take part. Also, you can apply in eight languages, just to remove the barrier of the English language, which has been there for years and years.
Finally, this year we gathered a jury and a prize committee that could handle entries in multiple languages. So, we have a group of 22 people in the jury, from 16 countries, and we made it so that the eight languages are represented in the jury. They will go through each entry multiple times until we get to a shortlist. And then, we’re going to have a second round of entry processing that will be done by another group called the prize committee. They also represent different parts of the world. They look at the shortlist and decide which of these great outstanding works deserve to be in the lineup of the winners of the competition.
GIJN: One of the purposes of the Sigma Awards is to understand the evolution of data journalism throughout the years. After four editions, how would you describe that evolution and the current state of data journalism?
MB: What we get reminded of each year is that data journalism is really international, truly global, not just affecting one region or a couple of regions. Each year we have new countries represented in the competition. Another thing we’ve learned is that, thankfully, data journalism is getting more accessible and that it has become more crucial for newsrooms covering all types of topics, sometimes under very challenging conditions. So over the past few years, some people who had never worked with Excel before, but were super great day-to-day journalists, had to get new skills on the go and sometimes not in the comfort of their home, but in the field.
We have also seen that small newsrooms, media that don’t have resources to hire big teams of data analysts or developers, still manage to do data reports, which shows that the tools and skills are becoming more accessible. For us, it’s exciting to witness such exponential use of data skills in newsrooms all over the world.
Unfortunately, the news from around the world hasn’t been the most positive over the past few years and some very important issues that matter to a lot of people have become very intricate, difficult, and polarizing. So, we have seen that data helps journalists inform more objectively and bring evidence that makes it easier for people to understand those very difficult issues. That’s when we understand how valuable data journalism can be to some communities.
GIJN: What strategies are data journalism teams using to engage their audiences, in an era when social networks crowd the web and readers report feeling an information overload?
MB: A very effective way that news organizations have used to engage with their audiences on data-loaded subjects is the personalization of the information. They enable people to identify with the topic by creating characters or actually bringing humans to their publications. We use robots a lot in data journalism, but having humans telling the stories in the published work and telling the data is one of the most effective ways to engage.
A good example is a winning project from NRK, Norway’s public broadcasting company, that analyzed TikTok data about the Russia-Ukraine war. That data analysis was quite groundbreaking and the techniques were quite intricate and geeky. It could have been a nightmare explaining all of that process to make people understand. But in the end, they used such great, personalized storytelling that it worked perfectly.
There are lots of other projects all around the world that have come to us that are very engaging. Some of them have this news game feel that we’ve been using for years, that lets you select your journey throughout the data, or makes you feel like you’re a player in the story while learning about the information.
And there are innovative narratives, like another project we honored last year, from Cuestión Pública. The project was called Game of Votes [in reference to the TV series Game of Thrones, to show Colombia’s complex political patronage system]. I admire how Cuestión Pública manages to reach the audience by creating formats that match their likes and bring a very serious topic to the table, for an audience that might not even be used to reading about politics.
So, in the end, telling the data through the humans is the catchier way. No sexy interactive is going to match the impact you can have when you tell data through the story of human beings.
GIJN: Which new techniques are journalists finding useful for collecting, analyzing, and reporting on data?
MB: There are trends that we see evolving each year, like the use of AI in many different ways, mainly over the past two, three years. I think it is great to see that growth, especially in the Global South. Also, the use of more diverse sources of data, like data from social media content.
Satellite imagery is so widespread now that it is fantastic to see its evolution. I’ve been in data journalism for almost 15 years and when I go back to the days we used to play with satellite imagery tools that were really not made for journalists, I remember we tried to create stuff that we knew was not perfectly made but would do the job. Today we do have tools made for journalists to help us produce some incredible satellite imagery-driven formats.
I have also noted the use of illustrations and the way news teams collaborate with artists and graphic designers to create 3D-modeled, comic strip-style illustrations to talk about some difficult topics.
GIJN: How are data journalists taking advantage of AI?
MB: Artificial intelligence is not just the science fiction side that we see on TV, and it is not only for AI-generated content. It’s not a tool to create things that you can rely upon unconditionally. In journalism, we have been using the useful part of AI, which is the one that helps with production, as a little assistant that helps you get to your goal faster.
You can use AI to process and analyze data, to look for patterns and outliers, to summarize documents, reports, to transcribe audio to text or even text to text, and to convert files like scans or PDFs into readable spreadsheets. This is, actually, still a problem in many countries, but the tools have evolved and now we have more powerful, faster solutions.
We all file these possibilities under the keyword AI, but it’s actually about algorithms, machine learning, automation, just computer programs, all of that.
And a very important reality about AI is that not everyone is at the same level. Unfortunately, a lot of the AI tools that big newsrooms around the world are using to make incredible formats are still not accessible to many people. Two clear examples are ChatGPT, which isn’t accessible in all countries, and transcription tools that are only available in a few languages. The question of languages is a big deal, and I hope that the tech people behind the AI tools decide to adapt to less-spoken and local languages. I’ve worked in sub-Saharan Africa and countries in the Sahel, such as Niger, Mali, and Burkina Faso, where there are around five main languages in each country. So, I know that some important information would become really useful if AI tools could handle those languages.
GIJN: Data journalists usually team up with experts and specialists from other fields. What are the most common fields in those partnerships?
MB: Well, I’m going to take the time machine again to go 10 or 15 years back when the data journalist was that guy at the corner of the newsroom, just working on his own. And we have seen how throughout the years collaboration has become key to expand the field and to create even better data journalism. Collaboration started within the same organization and then it opened up further, collaborating between media organizations to achieve international investigations. Then, collaborative work with other organizations that have nothing to do with journalism appeared, because we’ve learned that it’s always better to be surrounded by experts than to try to do it all by ourselves.
We’ve seen great examples of collaborative work among journalists and also with representatives from the environmental and the scientific sector. Actually, the rise of satellite imagery in journalism began thanks to collaborations with NASA and other big satellite experts who helped journalists understand the tech. And, of course, during the COVID outbreak, we had no choice but to talk to scientists and epidemiologists to understand all the reports and data we had in front of us.
GIJN: What’s next for data journalism?
MB: I would like to see more use of data in more places and in more media organizations. There are too many countries where access to data is still a problem, either for political, technical (data that is not digitized), or situational reasons, and as a result, the work of data journalists is quite difficult. So, when you ask me what’s next for data journalism, I hope that it is more data, accessible to more people.
And then, more languages. I’m hoping that in the future, those tools that we take for granted in the West can expand and be as useful to other communities in a language-agnostic fashion.
Additionally, it seems that year in, year out, there are more international issues to cover. So I guess there will be more collaborations across regions to understand difficult, impactful international issues, such as the ongoing wars and conflicts or big outbreaks or environmental problems that touch everyone.
Miriam Forero Ariza is a Colombian freelance investigative and data journalist whose work has been published by VICE, Colombiacheck, and El Espectador. She has more than a decade of experience in collaborative investigations, data analysis, and visualizations. She is co-author of the Iberoamerican Data Journalism Handbook.