4 Things Data Journalists Need to Know about Standard Deviation

by Denise-Marie Ordway • August 17, 2022

If you’re a journalist who reads academic research, you’ve likely seen the term “standard deviation” many times. If you’re not sure what it means or how to explain it to audiences, keep reading, because we’re going to break it down for you.

Here are four key things you need to know.

1. The standard deviation of a dataset is a number that indicates how much variation there is within the data.

When researchers analyze quantitative data such as birth rates, temperature readings, and student test scores, they typically calculate the standard deviation to gauge how tightly clustered or widely spread the values are. A higher standard deviation means the data are more spread out. The lower the standard deviation, the more closely the data cluster around the average value.
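To see why the average alone can hide this, here is a minimal Python sketch. The two lists of test scores are made-up numbers chosen for illustration, not data from any study; both have the same average, but their standard deviations differ sharply.

```python
import statistics

# Hypothetical test scores for two classes (illustrative numbers only).
class_a = [68, 70, 72, 70, 70]
class_b = [40, 100, 55, 85, 70]

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    mean = statistics.mean(scores)
    # pstdev treats the list as the whole population of interest.
    sd = statistics.pstdev(scores)
    print(f"{name}: average = {mean}, standard deviation = {sd:.2f}")
```

Both classes average 70, yet the scores in the second class are far more spread out, which is exactly the context the standard deviation adds.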

Deborah J. Rumsey, a statistics professor at the Ohio State University, points out in her book Statistics for Dummies that the measure provides critical context.

“Without it, you’re getting only part of the story about the data,” she wrote. “Statisticians like to tell the story about the man who had one foot in a bucket of ice water and the other foot in a bucket of boiling water. He said, on average, he felt just great! But think about the variability in the two temperatures for each of his feet. Closer to home, the average house price, for example, tells you nothing about the range of house prices you may encounter when house-hunting. The average salary may not fully represent what’s really going on in your company, if the salaries are extremely spread out.”

2. Scientists can use standard deviation to make predictions, investigate trends, and answer other key research questions.

The standard deviation of a dataset plays a limited role in many academic studies. Scientists might only note standard deviation values in a table or list, or mention them within the body of an academic article.

Sometimes, however, researchers rely heavily on the measure to help them answer questions central to their studies. For example:

Researchers can make predictions about the weather, voter behavior, tax revenue, healthcare usage, and a host of other things based partly on the standard deviation of data gathered over time.
Equities researchers typically use the standard deviation of stock prices to measure market volatility, with a high standard deviation indicating high volatility.
Researchers examining student test scores can use the standard deviation to determine whether most students perform at or close to the average, or whether test scores are all over the place. The measure also allows researchers to estimate the proportion of students who need more help mastering the material.
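As a rough sketch of the test-score example above, the snippet below estimates the share of students scoring under a cutoff. It assumes the scores follow a bell-shaped (normal) curve, and the average, standard deviation, and cutoff are hypothetical values chosen for illustration. Real analyses may use other methods.

```python
from statistics import NormalDist

# Hypothetical summary statistics for a test (not from any real study).
mean_score = 70
sd_score = 10
cutoff = 60  # hypothetical score below which students may need extra help

# Assumption: scores are approximately normally distributed.
score_distribution = NormalDist(mu=mean_score, sigma=sd_score)

# The cumulative distribution function gives the share of scores below the cutoff.
share_below = score_distribution.cdf(cutoff)
print(f"Estimated share of students scoring below {cutoff}: {share_below:.1%}")
```

With these made-up numbers the cutoff sits one standard deviation below the average, so the estimate comes out at roughly 16% of students.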

Here’s a brief explanation of how to calculate standard deviation.
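For readers who want to see the arithmetic, the calculation can be sketched in a few lines of Python. The function name and the sample readings are illustrative assumptions. The steps are the standard ones: find the average, measure each value's distance from it, square those distances, average them, and take the square root.

```python
import math
import statistics


def standard_deviation(values, sample=True):
    """Return the standard deviation of a list of numbers.

    sample=True divides by n - 1 (usual when the data are a sample);
    sample=False divides by n (when the data are the whole population).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    squared_gaps = [(v - mean) ** 2 for v in values]
    divisor = n - 1 if sample else n
    return math.sqrt(sum(squared_gaps) / divisor)


# Hypothetical readings, chosen so the arithmetic is easy to follow.
readings = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = standard_deviation(readings, sample=False)
sample_sd = standard_deviation(readings)
print(f"Population standard deviation: {population_sd:.3f}")
print(f"Sample standard deviation: {sample_sd:.3f}")

# Cross-check against Python's built-in statistics module.
print(f"statistics.stdev agrees: {math.isclose(sample_sd, statistics.stdev(readings))}")
```

Studies do not always say which version they used, so it is worth checking whether a reported figure is the sample or the population standard deviation.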

3. In some studies, scientists report their findings in terms of standard deviations instead of a unit of measurement, such as inches or pounds.

When datasets have data points with different units, scientists often need to standardize, or rescale, the data before they can draw comparisons and look for relationships. For instance, scientists might want to examine the relationship between orange juice consumption, measured in ounces [or grams], and flu vaccination rates, measured as the number of vaccines administered each month per 100,000 residents.

The process of standardizing data typically involves subtracting the dataset's average from each numerical data point and dividing the result by the dataset's standard deviation. Doing this changes the units of measurement. Instead of being expressed in common units such as ounces, inches, and pounds (or kilograms), findings must be reported in terms of standard deviations.
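A minimal sketch of one common way to standardize data is shown below. It assumes the "z-score" approach, which subtracts the average from each value before dividing by the standard deviation. A given study may standardize differently, and the juice and vaccination figures are hypothetical.

```python
import statistics


def standardize(values):
    """Convert each value to the number of standard deviations from the average."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical monthly figures measured in different units.
juice_ounces = [4, 8, 12, 16, 20]
vaccines_per_100k = [900, 1500, 1200, 2100, 1800]

juice_z = standardize(juice_ounces)
vaccines_z = standardize(vaccines_per_100k)

# After standardizing, both series are unit-free and directly comparable.
print([round(z, 2) for z in juice_z])
print([round(z, 2) for z in vaccines_z])
```

After this step each series has an average of 0 and a standard deviation of 1, which is why results are then described in "standard deviations" rather than ounces or vaccines per 100,000 residents.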

Hypothetically, scientists looking at orange juice consumption and flu vaccination rates could conclude that a one standard deviation increase in juice consumption is associated with a one standard deviation reduction in vaccination rates.

While standardizing datasets can make them easier for researchers to work with, Brian Healy, an associate professor of neurology at Harvard Medical School, notes many people might have difficulty understanding the results. He urges journalists to read these papers closely.

“The problem is, unless you look really closely in the paper, you’ll have no idea what a one standard deviation means,” says Healy, who’s also the lead biostatistician for the Partners Multiple Sclerosis Center at Brigham and Women’s Hospital in Boston.

“Do understand the units that results are being shown in,” he adds. “If there is a number reported, you want to make sure you understand how to interpret the number, and you can’t understand how to interpret the number without knowing the units.”

4. Scientists can use standard deviation to help confirm whether a data point they consider an outlier actually is an outlier.

Outliers are extremely high or low values that can complicate statistical analyses and skew results. Many researchers will remove or alter outliers caused by error, such as a mistake in collecting or entering data.

When you look at a graph of all the data in a dataset, some data points appear to be outliers because they differ so much from the others. Since the standard deviation of a dataset takes into account how far away individual values are from the average, scientists often use it to gauge whether an unusual data point is an outlier. This method works well for datasets that follow the pattern of a symmetrical, bell-shaped curve in which the majority of data converge near the center of the bell, where the average value is located.

After calculating the standard deviation for such a dataset, it's easier to spot outliers. A general rule of thumb for data that follow a bell-shaped curve is that approximately 99.7% of the data will be within three standard deviations of the average. Data outside this boundary are usually deemed outliers.
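The three-standard-deviation rule of thumb can be sketched as follows. The function name, the threshold parameter, and the readings are illustrative assumptions, and the check only makes sense for data that are roughly bell-shaped.

```python
import statistics
from statistics import NormalDist


def flag_outliers(values, threshold=3.0):
    """Return values more than `threshold` standard deviations from the average."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [v for v in values if abs(v - mean) > threshold * sd]


# Hypothetical readings: 19 values near 50, plus one suspicious entry of 120.
readings = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
            51, 50, 52, 48, 50, 49, 51, 50, 50, 120]

print("Flagged as possible outliers:", flag_outliers(readings))

# For a perfectly bell-shaped (normal) curve, the share of data within
# three standard deviations of the average is about 99.7%.
standard_normal = NormalDist()
share_within_three = standard_normal.cdf(3) - standard_normal.cdf(-3)
print(f"Share within three standard deviations: {share_within_three:.1%}")
```

Because an extreme value also inflates the standard deviation itself, a flag like this is a prompt to investigate the data point, not proof that it is an error.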

Although the standard deviation of a dataset is affected by outliers, journalists should not assume a large standard deviation indicates data quality problems. As Rumsey writes in Statistics for Dummies, “a large standard deviation isn’t necessarily a bad thing; it just reflects a large amount of variation in the group that is being studied.”

This post was originally published by The Journalist’s Resource, and is reprinted here via its Creative Commons license. The Journalist’s Resource would like to thank Troy Quast, a professor of health economics at the University of South Florida’s College of Public Health, and Brian Healy, an associate professor of neurology at Harvard Medical School, for their help creating this tipsheet.

Additional Resources

5 Things Journalists Need to Know About Statistical Significance

New Data Tools and Tips for Investigating Climate Change

GIJN Resource Center: Data Journalism

Denise-Marie Ordway is managing editor of The Journalist’s Resource, which she joined in 2015 after working for newspapers and radio stations in the US and Central America. Her work has appeared in USA TODAY, The New York Times, and The Washington Post. She was a 2014-15 Harvard Nieman Fellow.

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License

