
Illustration: Nodjadong Boonprasert for GIJN


Guide to Investigating Social Media Algorithms 

The impacts of social media algorithms are manifold: Algorithmically powered newsfeeds on social media threaten to destabilize countries and governments around the world, deeply impact mental health, and erode our relationships.

Covering social media algorithms can take many forms. Some investigative approaches can take weeks and may require digital forensics and coding skills. But many other investigations can be done with little to no technical know-how, simply by tracing content across different accounts or by running experiments on social platforms to illustrate how they do harm.

Whatever approach you choose, accountability stories on social media algorithms require a bit of creativity. That’s in large part because private companies like Meta or TikTok do not have to disclose how their companies build their algorithms. Outside of the occasional leak of documents, most journalists will need to find ways to reverse engineer how algorithms may operate, or simply prove that they do harm by pushing out hateful or misleading content.

What Are Algorithms and How Do They Work?

Algorithms are sets of rules or calculations made to fulfill a task. In the realm of social media, algorithms evaluate input data — information including what content a user may have liked, who their connections on that platform are, or what posts they may have interacted with — to determine what information the user will see on their news feed, timeline or stream and to predict what these users might be more likely to engage with.
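To make that concrete, here is a deliberately simplified sketch in Python of the scoring-and-ranking step a recommendation feed performs. The signals and weights are invented for illustration; real platforms combine thousands of signals with far more complex machine learning models.

```python
# Toy illustration only, not any platform's actual code: a feed ranker scores
# each candidate post from a handful of user signals, then shows the
# highest-scoring posts first. The signal names and weights are invented.

from dataclasses import dataclass

@dataclass
class Post:
    author_is_followed: bool        # does the user follow or often interact with the author?
    likes_on_similar_posts: int     # how often the user has liked content like this
    predicted_watch_seconds: float  # how long a model expects the user to watch

def engagement_score(post: Post) -> float:
    """Combine a few hypothetical signals into a single ranking score."""
    return (
        2.0 * post.author_is_followed
        + 0.5 * post.likes_on_similar_posts
        + 0.1 * post.predicted_watch_seconds
    )

def rank_feed(candidates: list[Post]) -> list[Post]:
    """Order candidate posts so the highest predicted engagement comes first."""
    return sorted(candidates, key=engagement_score, reverse=True)
```

The point of the sketch is the feedback loop it implies: whatever a user lingers on feeds the signals that decide what they are shown next.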

These algorithms are constantly being changed by their corporate owners and have become increasingly sophisticated over time. Facebook’s algorithm, for instance, took in about 10,000 signals to predict what kinds of posts a user might be most likely to interact with, according to a 2021 story from The Washington Post.


In 2021, The Washington Post did a deep dive into Facebook’s changing algorithm and how it affects what users see on their feeds. Image: Screenshot, The Washington Post

But not only are algorithms on social media deeply complex, the companies that produce them will also not disclose how they work. Because of this, investigative reporters will primarily focus on the harmful output of the algorithms when doing accountability journalism about the platforms. This may include looking at the kinds of content that go viral on these platforms, understanding what kinds of videos may be served to certain vulnerable populations, setting up dummy accounts to test out how the algorithms work, or looking into how pervasive trolling — intentionally trying to provoke or “bait” someone with offensive or inflammatory posts — or false information is on a platform.

Covering stories about the social web also entails discussing the flow of information. When you’re tackling investigations in that realm, it can be helpful to develop a common understanding of terms and concepts since many phenomena that we observe online are unique to social media.

Disinformation, Misinformation, Hate Speech on Social Media: What Are They and How Do They Spread?

While there are a plethora of resources relevant to journalists covering the flow of online information, like the misinformation glossary created by the nonprofit research organization EU DisinfoLab, three terms are the most popular in public discourse: misinformation, disinformation and online hate.

Misinformation. This term is often used as a catch-all to describe a plethora of online issues. But it’s actually much narrower than that: misinformation is information that is false but often believed to be true by those who share and spread it. For instance, during breaking news scenarios, people may mistakenly share photos from a different or past event, thinking they are related to a developing story.

Disinformation is false information intentionally created or shared to mislead or harm. For example, during the 2016 US presidential election, an investigation conducted by the US Senate and House Intelligence Committees in cooperation with Facebook found that Russian agents pretended to be US voters and spread disinformation about fake protests to bring about political division. Russia has denied any election meddling.

Hate speech is another important kind of harmful information that is often spread online. It expresses and stokes violence against a group based on their characteristics, including their race, religion, gender, or sexual orientation. For instance, Myanmar political leaders used Facebook to dehumanize and demonize the Rohingya, a Muslim ethnic minority, which led to violence against the group. Facebook subsequently banned 20 organizations and individuals in Myanmar — in response to a report from the United Nations — to stop “the spread of hate and misinformation.” The country’s military officials rejected the findings of this report.

It is equally important for journalists to understand who the main players are inside any given media ecosystem. Algorithms create personalized online experiences, which means each of us inhabits our own unique little universe of information, often called an “echo chamber” or “filter bubble.”

There are the audience members, or largely passive users who are part of this ecosystem and may share content but don’t necessarily create large amounts of it. Then there are active content creators who produce a lot of information. They tend to have a lot of influence on the kind of information audience members consume and can include government entities, news organizations, and influencers, as well as bad actors who are trying to sow discord. Last but not least there are the algorithms — the equivalent of a set of data-driven rules that determine what content is surfaced on people’s timelines — that help shape the experiences in these media ecosystems.

Three Approaches to Investigating Social Media Algorithms

A few years ago, reporter Jane Lytvynenko and I developed a framework for investigating these issues. The key is to center each story on one of the three entities that make up a media ecosystem: the audience, the content creators, or the algorithm.

1.  Investigating How Audiences Experience Algorithmic Feeds

Focusing on audience members, or social media users, is a good place to start. Often, these are unsuspecting people who are using social media to connect with their friends or consume news and information.

As we discussed earlier, how someone experiences mis- or disinformation is very individualized. To delve into the distorted worldviews that algorithms and newsfeeds create, journalists could consider focusing on the data of one user’s views online — or do a “quantified selfie” of one person’s experience of the social web.

That approach entails asking for access to a person’s data to look into how they experience social media. Thanks to the European Union’s General Data Protection Regulation (GDPR), many social media companies that operate in the EU, including TikTok, Meta, and X, are required to give users more control over their own data, meaning they allow people to access, download, and delete records of their activity online.

In this story for Documented, for instance, reporter Malick Gai gathered TikTok viewing archives from five different Senegalese migrants who were influenced to come to the United States in part based on the information they found on TikTok. The migrants were still heavily dependent on the platform for information in their native language. Through the archives, the reporter was able to see that the migrants were watching a lot of videos giving them inaccurate or misleading information about immigration processes. For instance, some provided incorrect information on how to fill out an asylum application that, if acted upon, could have had negative consequences for their immigration proceedings. Another video falsely stated that the mayor of New York City was distributing US$50 million among migrants. ByteDance, the company that owns TikTok, did not respond to a request for comment.
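For reporters attempting a similar “quantified selfie,” the analysis can start simple: tally how often chosen topics appear in a donated watch-history export. The sketch below is a minimal example; the file name and field names (“watch_history.json,” “video_description”) are hypothetical placeholders, so inspect the structure of the actual export before adapting it.

```python
# Minimal sketch: count how often chosen topics appear in a donated
# watch-history export. The file name and field names are hypothetical;
# check the structure of the real export first.

import json
from collections import Counter

KEYWORDS = ["asylum", "visa", "border"]  # example topics to track

with open("watch_history.json", encoding="utf-8") as f:
    history = json.load(f)

topic_counts = Counter()
for entry in history:
    description = entry.get("video_description", "").lower()
    for keyword in KEYWORDS:
        if keyword in description:
            topic_counts[keyword] += 1

total = len(history)
print(f"{total} videos in the archive")
for keyword, count in topic_counts.most_common():
    print(f"{keyword}: {count} videos ({count / total:.1%} of the archive)")
```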

Similarly, reporters have investigated the devastating impact of social media on the mental health of young women. NBC News reporter Kat Tenbarge found several user groups on X that promoted eating disorders and had amassed between 2,000 and 173,000 users, with some identifying as young as 13 years of age. In a statement provided to NBC News, X said: “X prohibits content that promotes or encourages self-harming behaviors and has zero tolerance for child sexual exploitation. In this case, after a thorough review, we have suspended the Community for violating our Rules.”

The 2025 documentary “Can’t Look Away” features whistleblowers who previously worked at Facebook, TikTok, and Instagram. It explores how algorithms can cause harm in the way that they show content, especially to young people. In a discussion about the documentary and his work at TikTok, Charles Bahr, one of the whistleblowers featured, says that algorithms optimize everything without emotional intelligence, which leads to harm. He spoke about how TikTok specifically showed content that users spent more time on, as opposed to content they searched for. In other words, if a user watches a depressing video, their feed will very quickly become an echo chamber of similar content. TikTok argues that it offers “safeguards for younger audiences, screen-time management tools, safety tools for parents and guardians, and other similar resources.”

Arturo Bejar, another whistleblower who worked at both Facebook and Instagram, said that the companies like to “pretend” that there are ways for users to control content they no longer want in their feed through the “Not Interested” button. However, these controls aren’t effective because the algorithm still mostly shows users content that they spend more time on. Meta, though, claims “it has rolled out more than 30 features to improve the experience of teens on their platforms.”

Bahr also said that if it is easy to replicate the problem that users are experiencing on a platform, especially in terms of disturbing or distressing content, it means the problem is more structural. So journalists can try reverse engineering what their feed looks like if they interact more with a particular kind of content to get an understanding of how it might impact general users as well.

2.  Investigating How Content Creators Take Advantage of Algorithms to Amplify Harmful or False Information

If you choose to focus on problematic content creators, you may want to look at false or misleading viral content and trace it back to its origins.

A lot of bad actors who run disinformation campaigns use false or misleading content and post it again and again on various platforms.

In one story from Code for Africa, reporters found that a network of 16 Facebook accounts, not based in Ghana, systematically spread misinformation about alleged arrests of pro-Russian protesters in Ghana by repeatedly copying and pasting the same post. According to the reporters, the accounts were based in Côte d’Ivoire, Burkina Faso, Mali, France, and Spain.
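One low-tech way to surface this kind of copy-and-paste behavior is to fingerprint each collected post and group identical texts across accounts. The sketch below assumes you have already compiled the posts into a spreadsheet with “account” and “text” columns; the file and column names are placeholders for your own collection.

```python
# Sketch: group verbatim copy-pasted posts across accounts by fingerprinting
# the normalized text. Assumes a posts.csv with "account" and "text" columns
# (placeholders) compiled from your own monitoring.

import csv
import hashlib
import re
from collections import defaultdict

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits don't hide duplicates."""
    return re.sub(r"\s+", " ", text.lower()).strip()

accounts_by_text = defaultdict(set)  # fingerprint -> accounts that posted it
sample_text = {}                     # fingerprint -> one example of the wording

with open("posts.csv", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        fingerprint = hashlib.sha256(normalize(row["text"]).encode()).hexdigest()
        accounts_by_text[fingerprint].add(row["account"])
        sample_text[fingerprint] = row["text"]

# Flag wording that several different accounts posted verbatim.
for fingerprint, accounts in accounts_by_text.items():
    if len(accounts) >= 3:
        print(f"{len(accounts)} accounts posted: {sample_text[fingerprint][:80]}")
```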

For another story, reporters from the Swiss public news organization Schweizer Radio und Fernsehen (SRF) used machine learning to identify fake accounts. They purchased 5,000 fake accounts, trained their machine learning model on the characteristics of those fake accounts, and then identified other fake accounts using this trained model.
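The sketch below shows the general shape of that approach using an off-the-shelf classifier. It is not SRF’s actual pipeline; the account features, file names, and model choice are illustrative assumptions.

```python
# Simplified sketch: train a classifier on accounts known to be fake (plus
# presumed-genuine ones), then score unlabeled accounts for manual review.
# The features, file names, and model choice are illustrative, not SRF's.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

FEATURES = ["followers", "following", "posts_per_day", "account_age_days", "has_profile_photo"]

# Hypothetical columns: the features above plus a 0/1 "is_fake" label.
labeled = pd.read_csv("labeled_accounts.csv")
X_train, X_test, y_train, y_test = train_test_split(
    labeled[FEATURES], labeled["is_fake"], test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))

# Rank unlabeled accounts by how fake-like the model thinks they are.
unlabeled = pd.read_csv("unlabeled_accounts.csv")
unlabeled["fake_probability"] = model.predict_proba(unlabeled[FEATURES])[:, 1]
print(unlabeled.sort_values("fake_probability", ascending=False).head(10))
```

Whatever the model flags should be treated as leads to verify manually, not as proof that an account is fake.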


In an investigation of a pro-Russian influence campaign inside Ghana, Code for Africa found which outside countries the different accounts were actually based in. Image: Code for Africa

3.  Investigating How Algorithms Work

Another approach is to write about the algorithms that power much of social media.

These stories examine the system that governs much of our online experience, which includes the impact of algorithms on the propagation of information or on access to opportunities.

But examining or auditing social media algorithms can be extremely difficult — they are opaque, complex systems that aren’t easy to understand and are inaccessible to the public. According to The Washington Post, for example, the Facebook algorithm takes in more than 10,000 different signals to predict whether a person may interact with a post.

This means that journalists are often left with having to reverse-engineer how some algorithms work based on data collection and experiments with test accounts. For instance, The Wall Street Journal set up 100 TikTok accounts to test what impact different actions — from lingering on a post to rewatching videos — had on the content these accounts would see.
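If you run a similar test-account experiment, the analysis itself can be straightforward once you have logged what each account was shown, whether by hand or with screen recordings. The sketch below assumes a log with “account_id,” “session,” and “video_topic” columns; the file and column names are placeholders for your own records.

```python
# Sketch: track how the topic mix of each test account's feed shifts across
# sessions, using a hand-built log. The feed_log.csv file and its columns
# (account_id, session, video_topic) are placeholders.

import pandas as pd

log = pd.read_csv("feed_log.csv")

# Count videos per topic for each account and session, then convert to shares.
counts = (
    log.groupby(["account_id", "session", "video_topic"])
       .size()
       .unstack("video_topic", fill_value=0)
)
shares = counts.div(counts.sum(axis=1), axis=0)

# Each row shows what fraction of that session's feed each topic made up.
print(shares.round(2))
```

A drift toward one topic across sessions, compared against a control account that behaves neutrally, is the kind of pattern these experiments are designed to document.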

But not everyone needs to use a high-tech approach to investigate technology. As a reporter, you don’t always have to explain systems. Sometimes it’s enough to “poke the system” and show that it does harm.

In this story from ProPublica, reporters set up an account to place Facebook ads and found that the platform would not stop them from placing ads that excluded people based on race, gender, and other personal characteristics. Showing that it was possible to place discriminatory, potentially unlawful ads proved that Facebook had failed to prevent harm. Facebook told ProPublica it prohibits advertisers from using its services in discriminatory ways.

Investigating How Social Media Companies Fail to Mitigate Algorithmic Harm

It can also be helpful to try to dig deeper into the inner workings of social media companies. Journalists can do this by developing relationships with current and former employees of social media companies, by digging into lawsuits launched against them, or by working with whistleblowers. Insider reporter Tekendra Parmar, for instance, investigated the death of Meareg Amare, an Ethiopian professor who was murdered after hateful, false content was spread about him on Facebook. Parmar spoke to several people who worked with Facebook through the company’s Trusted Partner program, a network of civil society organizations with the linguistic and cultural knowledge necessary to identify and help flag hateful information to the company. Six participants in the program told Parmar that Facebook “routinely ignored their warnings of hateful content,” including warnings about disinformation that circulated around Amare. They also allowed Parmar to review their communications with the company. Meta, Facebook’s parent company, told Insider it aimed to review content reported by Trusted Partners as quickly as possible, and that review time may vary depending on the case.

As these examples show, there are various reporting methods you can use for investigations about the social web. These include high-tech approaches like scraping data from the platforms for analysis, or using digital forensics methods to track down bad actors. But reporters can also be successful with low-tech solutions, such as interviewing people who have experienced harm on the platforms or running experiments to prove harmful outputs. In this burgeoning field, the key to successful investigations of social media algorithms is a creative approach.

Further Reading

Tools for Journalists

Research


Lam Thuy Vo is a journalist who marries data analysis with on-the-ground reporting to examine how systems and policies affect individuals. She is currently an investigative reporter working with Documented, an independent, non-profit newsroom dedicated to reporting with and for immigrant communities, and an associate professor of data journalism at the Craig Newmark Graduate School of Journalism. Previously, she was a journalist at The Markup, BuzzFeed News, The Wall Street Journal, Al Jazeera America and NPR’s Planet Money.

