Data Biographies: Getting to Know Your Data

by Heather Krause • March 27, 2017

One of the most important takeaways from the NICAR conference — in my opinion — is the understanding that data stories can be simultaneously confusing and exciting. While I was there, I led a presentation on the importance of data biographies, and I’d like to share some of what I talked about with you.

There are many experts out there with years or decades of experience producing fascinating data stories, and there are just as many (okay, probably many more) people still learning how to use data and experimenting with data journalism. When I’m introducing students to the world of data analysis and visualization, I’m often asked what the most important step in working with data is. My answer is always the same: developing data biographies.

Too often, inexperienced data users make the mistake of taking their data at face value — assuming the story they see at first glance is the true (and only) story the data has to tell. I like to encourage people to treat data the way they would a human source. You’d never write a story without researching the person who supplied your information — why treat data any differently?

Getting to Know Your Data

For every piece of data you’re going to include in your story, you need to create a data biography: a record of the background, or origin, of your data. Just as you’d do a background check on a human source before publishing what they told you, you need to understand your data:

Where did it come from?
Who collected it?
How was it collected?
Most importantly, why was it collected?

This task is not always as straightforward as it may look at first blush. But getting to know your data can reveal crucial gaps, bias, misinformation, or overlooked details in your story. Think about it this way: if a doctor told you that you needed more sugar in your diet, you might assume there was some medical reason for his suggestion. If a candy apple salesman told you the same thing, you’d probably perceive the information very differently. Likewise, data isn’t just about the numbers in front of you, but the story behind how those numbers got there in the first place.
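
The four questions above can be kept as a simple structured record for each dataset. The following is only a minimal sketch in Python: the `DataBiography` class, its field names, and the sample entry are hypothetical illustrations, not the layout of the Datassist template. It shows one way to spot which questions are still unanswered before a dataset goes into a story.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class DataBiography:
    """Background record for one dataset or variable (hypothetical structure)."""
    dataset: str
    where: str = ""  # Where did it come from?
    who: str = ""    # Who collected it?
    how: str = ""    # How was it collected?
    why: str = ""    # Why was it collected?

    def unanswered(self) -> List[str]:
        """Return the names of the fields that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative entry only: the answers are placeholders, not findings.
bio = DataBiography(
    dataset="Lifetime intimate partner violence (illustrative entry)",
    where="United Nations statistics download",
)
print(bio.unanswered())  # ['who', 'how', 'why']
```

A record that still lists unanswered questions is a signal to go back to the source documentation before drawing conclusions from the numbers.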

Real World Example: Violence Against Women Stats

A while ago, our team was working on a data story about violence against women. We spent a bit of time searching for data sources and determined that the United Nations was a good starting point. We downloaded the UN’s data on both violence against women and intimate partner violence and started our analysis.

Examining the variable for intimate partner violence over a woman’s lifetime, we did a couple of quick plots to get an idea of what trends within various countries looked like.

Trends in some countries were surprising and indicated unusual changes in the rates of violence against women. We wondered what was happening.

Our logical first step after our quick glance at the data was to create data biographies for each of these points. We needed to know the background of the information we were looking at so that we could better understand the patterns we were seeing.

Data Biography: Where?

In this case, the first thing we noticed in our data was where the information was coming from. Some of the data reflected all women, some reflected only women of a certain age, and some only included women of a specific marital status. All this data was lumped together in the same variable — the same name, the same label, and no hint as to the differences in the data sources.

Data Biography: Who?

Next, we looked at who collected that data. Examining the UN’s documentation to complete our data biography revealed that a wide range of people and organizations had been involved in collecting the data contained in this variable.

Data Biography: How and Why?

Some of the parties collecting the data we were using had gathered it for national statistics purposes; some were advocates making a case; some were testing out new methodologies. All of our data, collected using different methods and for different reasons, was presented in the same table, with the same variable name and the same labels. Had we not taken the time to get to know our data with a data biography, we would never have realized how different all these data points were.

Once we had completed our data biography, it became clear that some of the apparent trends, which had looked like significant changes in violence rates, actually reflected variations in how the data had been collected.
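
A check of this kind can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the rows are invented rather than taken from the UN data, and the `population` and `purpose` fields are hypothetical names for the biography details recorded against each data point. The idea is to flag any country whose series mixes data points with different backgrounds, since an apparent trend there may only reflect a change in collection.

```python
from collections import defaultdict

# Hypothetical rows: values are invented for illustration, not taken from the UN data.
rows = [
    {"country": "A", "year": 2005, "population": "all women", "purpose": "national statistics"},
    {"country": "A", "year": 2010, "population": "ever-married women", "purpose": "advocacy"},
    {"country": "B", "year": 2005, "population": "women aged 15-49", "purpose": "national statistics"},
    {"country": "B", "year": 2010, "population": "women aged 15-49", "purpose": "national statistics"},
]


def inconsistent_series(rows, keys=("population", "purpose")):
    """Map each country to the biography fields that vary across its data points."""
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for key in keys:
            seen[row["country"]][key].add(row[key])
    flagged = {}
    for country, by_key in seen.items():
        varying = sorted(k for k, values in by_key.items() if len(values) > 1)
        if varying:
            flagged[country] = varying
    return flagged


print(inconsistent_series(rows))  # {'A': ['population', 'purpose']}
```

Countries that are not flagged, like country B in this invented example, are the ones where a trend is more likely to reflect a real change rather than a change in who was surveyed or why.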

Using our data biography, we determined that data collection in Rwanda was reasonably consistent across the years. Because we were confident the trends we saw in that data were actually happening, we could move forward in investigating what caused such a dramatic spike in violence against women there.

Interestingly, in the years shown above, Rwanda elected a majority-female parliament and passed the country’s first-ever laws aimed at preventing violence against women. So what did that mean?

Was there a huge backlash against the government changes that drove up violent acts?

Or were more incidents of violence being reported now that women felt they had recourse?

Even with a good data biography, you’ll still have to take care in interpreting your data — we’ll talk more about that in our next post on bulletproofing your data.

Data Isn’t Always Objective

Those of you who participated in the free online data journalism course I led with Alberto Cairo recently may remember this clip explaining how to create a data biography:

Remember, by taking the time to create a data biography, you can tell your story with full confidence that your sources are accurate and trustworthy. Want a shortcut to creating quality data biographies? Download a free copy of our data biography template.

This post originally appeared on the website of Datassist and is cross-posted here with the author’s permission.

Heather Krause is a data scientist. She founded Datassist, an international team of data professionals, which provides data consulting to journalists, non-profits and policy makers worldwide.

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License

