The use of open source tools, user-generated content (UGC), and advanced search filters has allowed reporters to break major stories on the COVID-19 pandemic from home quarantine.
Yet experts warn that even the best open source intelligence research is only half the job, and it is not automated. Painstaking effort is required to learn how to search systems, verify data, and reach the unfamiliar sources who hold the other half of the story.
In the sixth webinar in GIJN’s series, Investigating the Pandemic, investigative researchers shared key insights on the tools and techniques that have unearthed facts and visuals beyond the reach of traditional field reporting.
Gisela Pérez de Acha, a digital verification expert at the Human Rights Center at the University of California at Berkeley, focused on techniques for working around algorithms on major social platforms. A journalist and a human rights lawyer, Pérez de Acha emphasized the need for strong traditional reporting principles for open source work, and for journalists to protect and disguise their digital footprints when pursuing risky stories.
Charlotte Godart, an investigator and trainer for the independent investigative website Bellingcat, told last week’s global webcast audience about using effective search filters for open source content.
Godart says it is important to keep searches narrow enough to handle, and that journalists should not forget the often-overlooked value of everyday tools like Google Search and TweetDeck.
For a great example of a story that combined effective open source research with traditional reporting, Godart cited Craig Silverman’s recent BuzzFeed story on how online rumors and hoaxes were putting nurses in danger. For a story built on advanced open source searches (in this case, satellite imagery and ship tracking data), she commended a recent story in The New York Times on the impact of the coronavirus on sanctioned imports to North Korea.
Four Corners, the Australian Broadcasting Corporation’s investigative news program, also used open source tools to verify UGC from Wuhan to make Coronavirus, their startling documentary. Sean Nicholls, one of the reporters on the story, told the GIJN webinar audience last week that reverse image searches, comparisons of background detail across video clips, and archived social media footage all proved critical in verifying the UGC at the heart of their story.
Key Open Source Tips
1. It can be a mistake to start “cold” with a fistful of open source tools. Instead, develop an online research mindset through practice, and by reading guides and books written specifically for reporters. Silverman’s comprehensive Verification Handbook can help guide reporters when considering open source content.
2. TweetDeck remains one of the best tools for filtering key content on Twitter. Create columns on TweetDeck relevant to the topics you’re investigating, click the filters icon at the top right of each column, and search the specific date ranges during which Twitter users were likely to have been discussing your search terms.
3. Use geocodes to find out what is happening in specific areas.
- If, for example, a riot breaks out, a right-click on that area on Google Maps offers a “What’s here” option which, in turn, provides coordinates.
- These coordinates can then be copied over to the search box on TweetDeck.
- Type “geocode:” and then paste the coordinates, followed by a comma and the radius you need (for example, 1km), to see tweets from that area in real time. Take care to leave no spaces, including the space Google Maps places between the latitude and longitude, before hitting “enter.”
- You can then filter further with keywords, like “coronavirus.”
- Remember that tweets by location only reflect content from those users with their location settings turned on; this represents fewer people than before since Twitter changed to an “opt-in” system.
4. Consider Hunch.ly, a tool for archiving material during extended open source research periods. Hunch.ly even analyzes your open source search history, and helps you retrieve early search data which may become important later in the investigation.
5. Learn how to work around Facebook’s often-confusing and unhelpful search algorithms. One option is “Who Posted What?” This tool gets around Facebook’s prioritization of personal search histories to fetch the most neutral results possible. It generates a user ID from the Facebook page you’re interested in, and allows a search of activity on specific days. Tip: Be sure to use precise search terms and dates.
6. Consider the ethics of joining online chat groups where other users don’t know you’re a reporter. Pérez de Acha said: “From a journalism perspective, I could get in trouble if I don’t disclose I’m a journalist, and I’m just hanging out in a private (chat) group. It could be considered trespassing.” One technique Pérez de Acha employs, when needed, is to use a “burner” social media account strictly to passively gather leads, and then to revert to her personal account to actively reach out to potential sources, with her reporting intentions declared.
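For reporters who run many location searches, the geocode steps in tip 3 can be scripted. What follows is a minimal Python sketch, not an official TweetDeck or Twitter tool. It assumes the commonly used search format of geocode:latitude,longitude,radius with no spaces, which is not spelled out in the webinar, and the function names and the sample coordinates (a point in central Wuhan) are hypothetical and chosen only for illustration.

```python
def parse_coordinates(pasted: str) -> tuple[float, float]:
    """Turn text copied from a map, such as '30.5928, 114.3055', into numbers.

    Assumes the order is latitude first, then longitude, separated by a comma.
    """
    lat_text, lon_text = pasted.split(",")
    lat, lon = float(lat_text.strip()), float(lon_text.strip())
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    return lat, lon


def build_geocode_query(lat: float, lon: float, radius_km: float, keyword: str = "") -> str:
    """Build a search string with no spaces inside the geocode term.

    A minus sign on a coordinate is kept, because it is part of the number.
    """
    if radius_km <= 0:
        raise ValueError("radius must be positive")
    # ':g' drops a trailing '.0', so a radius of 5 is written as '5km'.
    query = f"geocode:{lat},{lon},{radius_km:g}km"
    if keyword:
        # An optional keyword narrows the results, as in the final step of tip 3.
        query = f"{query} {keyword.strip()}"
    return query


if __name__ == "__main__":
    # Illustrative coordinates only; replace with the ones your map gives you.
    lat, lon = parse_coordinates("30.5928, 114.3055")
    print(build_geocode_query(lat, lon, 5, "coronavirus"))
```

The resulting string can be pasted into a TweetDeck search column. As tip 3 notes, it will only surface tweets from users who have location sharing turned on.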
Find all of the past webinars in GIJN’s Investigating the Pandemic series on our YouTube channel. Our next in the series, Africa Reports on COVID-19, will be on Thursday, May 14, 2020 at 9:00 EST. A sign-up link will follow soon.
Rowan Philp is a reporter for GIJN. Rowan was formerly chief reporter for South Africa’s Sunday Times. As a foreign correspondent, he has reported on news, politics, corruption, and conflict from more than two dozen countries around the world.