Data Journalism
Data Cleaning Tools and Techniques for Non-Coders
Understanding, organizing, and validating data directly affects the accuracy of stories. New tools make cleaning accessible to journalists without coding.
What’s the global data journalism community tweeting about this week? Our NodeXL #ddj mapping from February 10 to 16 finds The Guardian US analyzing food expenditure on the Democrats’ campaign trail, The Washington Post’s Steven Rich sharing the pains of cleaning spelling permutations in data, The International Consortium of Investigative Journalists explaining the process of handling the massive #LuandaLeaks records, and the Sigma Awards announcing its shortlist.
What’s the global data journalism community tweeting about this week? Our NodeXL #ddj mapping from June 10 to 16 finds a gorgeous series of data visualizations by National Geographic on the Ring of Fire, Bloomberg analyzing the influence of dynasties in the Philippines’ politics, RStudio’s Hadley Wickham talking about why you shouldn’t assign data cleaning to a “data janitor,” and BR Recherche looking at the complexity of privacy policies.
Being struck by lightning is often used as an example of heavenly retribution because it is so unlikely. Fatalities due to lightning are statistical outliers, since most people struck by lightning survive. So what is the best way to avoid becoming one of these outliers? The following is a step-by-step set of instructions for unpacking a dataset – and being careful about the conclusions we draw.
What’s the global data journalism community tweeting about this week? Our NodeXL #ddj mapping from January 14 to 20 finds a @FinancialTimes exposé of possible electoral fraud in Congo, @sxywu’s beautiful visualization of interaction data between a museum and the public, and @maartenzam’s ultimate data visualization list to end all lists.
What’s the global data journalism community tweeting about this week? Our NodeXL #ddj mapping from September 25 to October 1 has @FT mapping a London Uber driver’s route over a single day, #NICAR18 registration opening up, @Lattif charting Africa’s internet shutdowns using @atlascharts, and loads of German election data visualizations.