Image: Leon de Korte / De Correspondent
Margot* is not your average white nationalist. She’s about 50, a devout Christian from a rich part of the Netherlands, and she likes opera. Around 2015, she started commenting on YouTube videos about the refugee crisis. She was afraid of an influx of hundreds of thousands of refugees and immigrants from mostly Islamic countries.
That fear was certainly widely shared around that time, but slowly, Margot’s behavior on YouTube became more extreme. After the election of Donald Trump in 2016, she started looking at alt-right video channels, like those of Canadian vlogger Stefan Molyneux (about 1 million subscribers), The Rebel Media, and Sargon of Akkad.
She also started to express herself differently, becoming more aggressive towards feminists, foreigners, and “social justice warriors.” Her comments were now sprinkled with words and phrases usually found on alt-right forums.
For instance, Margot talked about her “redpilling,” a term that refers to a scene in the film The Matrix. Swallow the blue pill and you stay a slave, a battery to power the machine of the Matrix. But swallow the red pill and the truth will be revealed — and in her view, this truth was that the world was being ruined by left-wing liberals.
Around the time that our investigation began, in early 2018, Margot had shifted further to the right. She still watched alt-right channels, but was also interested in white nationalist and antisemitic videos. White culture, she declared in one comment written in Dutch, was being undermined by left-wing “assholes” and feminists. She also blamed Jews for many societal ills.
Margot is just one of the thousands of users we encountered in our research into radicalization on YouTube. This was a collaboration between the Dutch news site De Correspondent, where I run the data desk, and two reporters from the Dutch daily newspaper De Volkskrant, Hassan Bahara and Annieke Kranenberg.
We started by investigating the rise of the alt-right on more obscure forums like 4chan and 8chan as well as on the popular chat app Discord. We were soon struck by the many references extremists made to YouTube videos, and decided to explore this platform.
The amount of right-wing extremist content available on YouTube turned out to be overwhelming: There was racism, antisemitism, antifeminism, white nationalism, and conspiracy theories about “cultural Marxism,” “white genocide,” and “the great replacement” (the idea that white people are slowly being replaced by nonwhites through birth rates and immigration).
Around the same time, researchers began to worry that YouTube’s recommendation algorithm was exacerbating the spread of extremist content by slowly pulling viewers into a rabbit hole of hate. The recommendations that popped up when users were watching videos would slowly become more extreme: A user could start out with a left-leaning video on racism and slowly but surely end up, through a series of recommendations, watching right-wing extremist content.
Researching a platform as huge as YouTube poses some formidable challenges. Each minute, about 400 hours of content is uploaded to YouTube. The recommendation algorithm isn’t a simple recipe that you can reverse engineer: It is a very complex system, a deep neural network that takes many variables into account to deliver an experience that is unique for each user.
Our approach was first to map the prevalence of right-wing content on YouTube and then focus on the dynamics of how this content was recommended and consumed by viewers.
We started out by making lists of known right-wing organizations, named as such in academic papers, in reports by antifascism groups like Hope not Hate, and in a few other sources, like Reddit. For each organization or group, we tried to find their social media accounts, including a YouTube channel, if available. We also looked at some right-wing extremist forums, where we scraped all links to YouTube and collected the names of the channels.
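Harvesting channel and video names from forum scrapes largely comes down to pattern-matching YouTube URLs in the page text. The sketch below is a hypothetical illustration of that step (the regex and helper name are ours, not the project’s published code):

```python
import re

# Match common YouTube URL shapes (watch, channel, user, youtu.be short links)
# and capture the trailing video or channel identifier.
YOUTUBE_LINK = re.compile(
    r'https?://(?:www\.)?(?:youtube\.com/(?:watch\?v=|channel/|user/)|youtu\.be/)'
    r'([\w-]+)'
)

def extract_youtube_ids(text: str) -> list:
    """Return the video/channel identifiers found in a page of forum text."""
    return YOUTUBE_LINK.findall(text)

sample = ('Check this out: https://www.youtube.com/watch?v=abc123XYZ_- '
          'and https://youtu.be/shortID42')
print(extract_youtube_ids(sample))  # ['abc123XYZ_-', 'shortID42']
```

From the collected video IDs, the owning channel of each video can then be resolved through the YouTube API.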
Early on, we decided to repeat this exercise for left-wing channels so that we could later make a comparison. We used this data to get a sense of the relative size of the right-wing media bubble on YouTube, which seems to be much larger and more popular than the left-wing bubble.
One of the first major hurdles was the problem of definitions. What makes a channel extremist (on the right or left)? Who gets to decide? Different countries have different definitions. And channels themselves change over time. Some start out politically centrist, but slowly turn more and more right-wing, while other channels lose their political edge over time.
Our solution was simple and effective: We cast a wide net, from politically right- and left-of-center to the extremes. The benefit of this approach is that we looked at a political spectrum and actually saw people moving over time from the center to the extremes.
In the end, we compiled a list of 1,500 channels, spread about evenly across the left-right spectrum. YouTube offers a very permissive API through which you can query a wealth of metadata. We wrote extensive software (packaged in a reusable Python library) to examine:
- 600,000 videos
- 450,000 transcripts of those videos (by using YouTube’s automatic closed-captioning service, which is not available for all videos)
- 120 million comments on those videos
- 20 million recommendations automatically generated from viewing those videos
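Collecting a corpus like this mostly means paging through YouTube’s Data API v3 endpoint by endpoint. The endpoint and pagination scheme below are real, but the helper functions are our own illustrative sketch, not the project’s actual library:

```python
import json
import urllib.parse
import urllib.request

API = "https://www.googleapis.com/youtube/v3"

def fetch_search_page(api_key: str, channel_id: str, page_token: str = "") -> dict:
    """Fetch one page (up to 50 items) of a channel's videos via search.list.

    Other endpoints used for this kind of research (commentThreads.list,
    videos.list) paginate the same way, by following nextPageToken.
    """
    params = {"key": api_key, "channelId": channel_id, "part": "snippet",
              "type": "video", "order": "date", "maxResults": "50"}
    if page_token:
        params["pageToken"] = page_token
    url = f"{API}/search?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def parse_search_page(page: dict) -> list:
    """Reduce one search.list response page to (video_id, title) pairs."""
    return [(item["id"]["videoId"], item["snippet"]["title"])
            for item in page.get("items", [])]
```

A crawler loops `fetch_search_page` until no `nextPageToken` is returned, storing each parsed page, and repeats this per channel for videos, comments, and captions.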
That’s a lot of data — about 100 GB. But what to do with it?
To make sure we got the best results, we decided to organize two hackathons, and invited readers to join us at our office for a day to work with our data. The aim was to explore a few methodologies.
One of the insights was that we should look at the comments section to find clues on radicalization. Every user has a unique ID, so we could track all comments made by the same person. We then looked at who commented in an approving way under the most extreme content (white nationalist, antisemitic, etc.) and looked at where these commenters had been active before.
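Because every commenter carries a stable ID, this analysis reduces to two set operations: find the IDs that appear under the most extreme content, then pull each ID’s full comment history. A minimal sketch, with hypothetical data (the tuple layout and channel names are ours):

```python
from collections import defaultdict

# Hypothetical records: each comment is (author_id, channel, timestamp).
def commenters_on(comments, channels):
    """Return the set of user IDs who commented on any of the given channels."""
    return {author for author, channel, _ in comments if channel in channels}

def prior_activity(comments, user_ids):
    """Map each flagged user to a chronological list of channels they commented on."""
    history = defaultdict(list)
    for author, channel, ts in sorted(comments, key=lambda c: c[2]):
        if author in user_ids:
            history[author].append(channel)
    return dict(history)

comments = [
    ("u1", "progressive_channel", 1),
    ("u1", "altright_channel", 2),
    ("u1", "white_nationalist_channel", 3),
    ("u2", "opera_channel", 1),
]
flagged = commenters_on(comments, {"white_nationalist_channel"})
print(prior_activity(comments, flagged))
```

Here user `u1` is flagged for commenting under the most extreme channel, and their history shows the earlier stops along the way.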
That still left us with a list of more than 50,000 unique YouTube users who, we could assume, held quite extremist views. We decided to look for the Dutch people on this list — people who were easy for us to find in real life, and with whom we could talk in depth.
To find these people, we simply looked at which of them were active in the comment sections of Dutch channels. We also used the Google Translate API to identify all comments written in Dutch. That resulted in a list of about 200 people we were very confident were Dutch.
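The language-identification step can be approximated without any external service. The sketch below is a crude stopword heuristic standing in for the Translate API’s detection call, purely to show the filtering logic; the marker words and threshold are our own assumptions:

```python
# A handful of very common Dutch function words as crude language markers.
DUTCH_MARKERS = {"de", "het", "een", "niet", "voor", "maar", "wordt", "zijn"}

def looks_dutch(text: str, threshold: int = 2) -> bool:
    """Flag a comment as likely Dutch if it contains enough marker words."""
    words = set(text.lower().split())
    return len(words & DUTCH_MARKERS) >= threshold

comments = [
    "Het is niet waar wat hij zegt",
    "This is clearly an English comment",
]
print([c for c in comments if looks_dutch(c)])
```

A real pipeline would call a proper detection API per comment; the filtering step around it looks the same.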
We could plot over time which channels they commented on and could see some big changes. Many of the people in this group started out commenting under left-wing videos from The Young Turks, a popular progressive channel, but then slowly started to more frequently comment on alt-right channels and outright white nationalist channels.
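Charting that drift starts from a simple tally of comments per channel per time bucket. A small sketch under assumed data (the records and channel names are illustrative, not our actual dataset):

```python
from collections import Counter
from datetime import date

# Hypothetical comment records: (user_id, channel, date of comment).
comments = [
    ("u1", "The Young Turks", date(2016, 3, 1)),
    ("u1", "The Young Turks", date(2016, 6, 1)),
    ("u1", "Stefan Molyneux", date(2017, 2, 1)),
    ("u1", "Stefan Molyneux", date(2017, 9, 1)),
    ("u1", "white_nationalist", date(2018, 1, 15)),
]

def activity_by_year(comments):
    """Count comments per (year, channel), ready to feed into a chart."""
    return dict(Counter((d.year, channel) for _, channel, d in comments))

print(activity_by_year(comments))
```

Plotting these per-year counts as stacked series is what makes the shift from progressive to extremist channels visible at a glance.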
When we analyzed the contents of the comments, we were struck by their increasingly antisemitic nature. At first, hardly a word was said about Jews. A lot of angry comments were about Islam, immigration, feminism, and “social justice warriors.” But over the course of 2017 and 2018, there was a clear rise in antisemitic slurs within this group.
In the end, we selected six people from this group and had in-depth conversations with them in person. How did they use YouTube? How were they radicalized? (They often objected to the term “radicalization,” and saw it as “an awakening.”) How did they explain their shift from left-leaning to right-wing in such a short time span?
These people confirmed the role of the recommendation algorithm in their journey to new content and communities. They described how they often started out on left-wing channels, but were alerted to right-wing content through the recommendations. Slowly, the recommendations became more extreme.
Their anecdotes are very valuable. However, our efforts to thoroughly test the workings of this system by automated means failed.
We wrote a little program that volunteers could run on their computers. The program would query ten search terms — hot-button issues like “feminism,” “immigration,” etc. — and log the recommendations for us.
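The core of such a volunteer client is small: query each term, harvest the recommended video IDs from the resulting pages, and append a log line per query. This is a hypothetical reconstruction of that logging step, not the distributed program itself (the regex, helper names, and search terms are our own):

```python
import json
import re
import time

SEARCH_TERMS = ["feminism", "immigration"]  # hot-button issues, per the study

# YouTube watch pages embed recommended videos as "/watch?v=<11-char id>" links.
WATCH_LINK = re.compile(r'"/watch\?v=([\w-]{11})"')

def extract_recommendations(watch_page_html: str) -> list:
    """Pull recommended video IDs out of a watch page's HTML, de-duplicated."""
    seen, ids = set(), []
    for vid in WATCH_LINK.findall(watch_page_html):
        if vid not in seen:
            seen.add(vid)
            ids.append(vid)
    return ids

def log_entry(term: str, video_ids: list) -> str:
    """One JSON line per query, ready to append to the volunteer's log file."""
    return json.dumps({"term": term, "recommended": video_ids,
                       "ts": int(time.time())})

html = '... "/watch?v=abcdefghij1" ... "/watch?v=abcdefghij1" "/watch?v=klmnopqrst2" ...'
print(extract_recommendations(html))  # ['abcdefghij1', 'klmnopqrst2']
```

The real client would fetch `https://www.youtube.com/results?search_query=...` for each term and then each top result’s watch page before logging.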
The problem was that these recommendations were only semi-personalized. The volunteers weren’t logged into YouTube, because it was too complicated and costly to write code that tracked logged-in users while at the same time protecting their data and privacy. The results therefore weren’t a good representation of natural (and personalized) YouTube usage, so we decided not to use these findings. In the end, we had to make do with dozens of interviews with these volunteers about their experience with the recommendations, as well as with the sparse academic literature on the subject.
All in all, we worked on this story for about seven months, but it was worth it. We managed to show that YouTube is the mothership of online hate, dwarfing obscure forums like 4chan and 8chan in size and influence. There is a staggering amount of extremist content available on the platform. But that is not surprising. YouTube has long had a laissez-faire approach to this kind of content: As long as there is no media outrage, basically anything goes.
And although YouTube vehemently denied to us that the recommendation algorithm had anything to do with the radicalization of its users, the system was recently completely overhauled. It now recommends a lot more mainstream content.
In the end, we only scratched the surface of what is happening on this massive video platform. We hope that our efforts and our code (which is publicly available) will inspire other journalists and researchers to take on this challenge. It’s difficult, but it can and must be done.
*Margot is a pseudonym.
Dimitri Tokmetzis is an investigative journalist for De Correspondent, an ad-free Dutch media platform. He is head of the data desk, which utilizes the latest technologies to investigate and design investigative stories. He is currently working on stories on human trafficking in Europe and on the crypto-economy.