What’s the global data journalism community tweeting about this week? Our NodeXL #ddj mapping from April 8 to 14 finds a zine focused on machine learning-powered investigative journalism produced by @bxrobertz, a video explainer from @FT on whether big corporations are really generous or just avoiding taxes, @knowtheory and @amandabee reviewing seven optical character recognition tools, and @workbenchdata with a tutorial on visualizing @Twitter data.
AI-Powered Investigative Journalism
Computational journalist Brandon Roberts produced an interesting zine on how machine learning can be applied to investigative journalism. In the first issue, he interviews the Minneapolis Star Tribune’s Chase Davis, profiles a new web scraping tool and dissects a machine learning-powered investigation into local property tax evasion.
Philanthropy or Tax Avoidance?
At the World Economic Forum this year, economic historian Rutger Bregman questioned whether big corporations and their bosses were truly being generous in donating billions to causes, or just avoiding taxes. Financial Times journalists Federica Cocco and John Burn-Murdoch investigate Bregman’s question and explain their findings with sketched charts, Monopoly money and candy.
Extracting Data From PDFs
Trying to get data out of pesky PDFs but not sure which tool to use? Ted Han and Amanda Hickman, from Factful, went through seven optical character recognition tools so that you don’t have to. Here is their side-by-side comparison and review of the tools.
Visualize Twitter Data
Want to analyze and visualize Twitter data in four steps? Workbench prepared a tutorial on how to use the Twitter API to load tweets and associated data from an account and then visualize how often that account tweets a specific word. Data Journalism Turkey translated the tutorial into Turkish here.
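For readers who prefer code, the counting step of such an analysis can be sketched in a few lines of Python. This is not the Workbench tutorial's method, which runs in Workbench's own interface. It is only a minimal illustration that assumes the tweets have already been downloaded. The sample tweets, the `created_at` and `text` field names and the `count_word_by_month` helper are all made up for the example.

```python
import re
from collections import Counter


def count_word_by_month(tweets, word):
    """Count, per month, how many tweets mention `word` as a whole word."""
    # Case-insensitive whole-word match, so "data" does not match "database".
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    counts = Counter()
    for tweet in tweets:
        if pattern.search(tweet["text"]):
            # Assumes dates are ISO strings (YYYY-MM-DD); keep the YYYY-MM part.
            counts[tweet["created_at"][:7]] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample data standing in for tweets loaded from an account.
sample_tweets = [
    {"created_at": "2019-03-02", "text": "New data on housing costs"},
    {"created_at": "2019-03-15", "text": "Data, data everywhere"},
    {"created_at": "2019-03-20", "text": "Our database is back online"},
    {"created_at": "2019-04-01", "text": "Election results are in"},
    {"created_at": "2019-04-09", "text": "Open DATA day recap"},
]

monthly = count_word_by_month(sample_tweets, "data")

# A rough text bar chart: one "#" per matching tweet in that month.
for month, n in monthly.items():
    print(month, "#" * n)
```

The function counts tweets that mention the word, not total occurrences, so a tweet repeating the word is counted once. Swapping the text bar chart for a charting library would be the natural next step.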
Google’s Data Offerings
Google News Initiative is helping to boost the field of data journalism with more training, online resources and tools. What’s coming: free data training for local newsrooms in the United States and Canada, in partnership with Investigative Reporters and Editors, as well as Google tools training in collaboration with the Society of Professional Journalists. Also, data journalism MOOCs will be launched with the Knight Center for Journalism in the Americas this fall.
Python for Journalists Course
Datajournalism.com has released a four-course module on Python for data journalism. Listen to data journalism trainer Winny de Jong teach you how to set up Python on your computer, clean up messy datasets, analyze data and conduct web scraping. Best of all, the course is free!
Unequal Income Distribution
“The more beautiful the view, the higher the income inequality.” SRF Data took a hard look at wage differentials in Switzerland and created an interactive map showing the income distribution in each community. The most uneven income distribution was found in Anières, a municipality in the canton of Geneva. (In German.)
The UK’s Gender Pay Gap
BBC journalists have dug into data reported by British firms on the difference between what they pay men and women. They found that 8,124 companies pay men more while just 1,424 pay women more.
Relationship Between Pharma and Doctors
In Switzerland, pharmaceutical companies inject millions into medical companies that conduct training for doctors. Le Temps dives into the issues surrounding the relationship between big pharma and the medical community, and finds a debate casting doubt on the independence and impartiality of doctors. Here’s how the team investigated this story. (In French.)
Goals Before Half-Time
Does a goal scored just before half-time significantly affect the outcome of the game? Der Spiegel analyzed data from more than 45,000 matches from four European leagues to find out.
Eunice Au is GIJN’s program coordinator. Previously, she was a Malaysia correspondent for Singapore’s The Straits Times, and a journalist at the New Straits Times. She has also written for The Sun, Malaysian Today and Madam Chair.