Standard Normal Distribution Explained with Real-World Examples
In statistics and data science, few concepts are as widely used as normal and standard normal distributions. Whether you’re analyzing exam scores, quality control data, or stock market returns, the familiar bell-shaped curve shows up repeatedly. Therefore, it is important to understand the standard normal distribution, its properties, and use cases.
This article discusses the normal and standard normal distributions along with their probability density functions and properties. We will also discuss how to calculate probabilities for a variable being less than, equal to, or greater than a given value using the standard normal distribution curve and the z-score table. Finally, we will discuss some real-world use cases for the standard normal distribution.
What is normal distribution?
A normal distribution curve is a bell-shaped and symmetric curve centered around its mean. It describes how the values of a continuous variable are distributed when most data points cluster around the mean, and fewer data points occur as we move away from the mean.
Some examples of normal distribution in real-life datasets include the heights of people, IQ scores, errors in measurement, and blood pressure or cholesterol levels in a healthy population. All these variables follow a normal distribution and have the following characteristics:
- Symmetry: The normal distribution plot is perfectly symmetrical around the mean.
- Bell-shaped curve: The probability of a data point being equal to the mean is the highest and reduces as we move away from the mean on either side. Also, the data points are fairly concentrated near the mean value.
- Same measures of central tendency: A variable with a normal distribution curve has the same mean, median, and mode.
Having discussed the characteristics of a variable with normal distribution, let’s discuss the shape of normal distribution plots and their properties.
Properties of normal distribution plots
For a dataset with normal distribution, the probability density function (PDF) of the data points is given as follows:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
In this formula,
- f(x) is the probability density value at point x.
- μ is the mean of the dataset. It controls the position of the plot on the x-axis.
- σ is the standard deviation of the dataset. It controls the shape of the plot.
If we plot the PDF of a variable with a normal distribution, we get different plots depending on the mean and standard deviation values.
In this image, we have plotted the PDF of four different types of normal distributions. The dotted-blue curve represents the normal distribution curve with a mean of zero and a standard deviation of one. In the image, you can observe the following things:
- The normal distribution plots shift sideways with a change in the mean of the dataset. If we increase the mean, the plots shift to the right. The plots shift to the left upon decreasing the mean.
- The shape of the normal distribution curve changes when the standard deviation of the dataset changes. For a small standard deviation, the data points are concentrated around the mean, and we get a thin and tall bell curve. With an increase in standard deviation, the plots become wider and flatter.
- The total area under the normal distribution curve is always 1, as it represents the total probability.
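The observations above can be checked numerically. The following is a minimal sketch, assuming the usual normal density formula with mean `mu` and standard deviation `sigma`; the helper names `normal_pdf` and `area_under_pdf` are illustrative and not part of the original article.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

def area_under_pdf(mu, sigma, width=10.0, steps=20_000):
    """Approximate the total area with the trapezoidal rule over mu +/- width*sigma."""
    lo, hi = mu - width * sigma, mu + width * sigma
    h = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, mu, sigma)
    return total * h

# The peak sits at the mean; a larger sigma lowers (and widens) the curve.
peak_standard = normal_pdf(0.0, mu=0.0, sigma=1.0)
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)
peak_shifted = normal_pdf(3.0, mu=3.0, sigma=1.0)

print("peak, mean 0, sd 1:", round(peak_standard, 4))
print("peak, mean 0, sd 2:", round(peak_wide, 4))
print("peak, mean 3, sd 1:", round(peak_shifted, 4))
print("area, mean 0, sd 1:", round(area_under_pdf(0.0, 1.0), 6))
print("area, mean 3, sd 2:", round(area_under_pdf(3.0, 2.0), 6))
```

Shifting the mean moves the peak without changing its height, doubling the standard deviation halves the peak, and the area stays at 1 in both cases.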
Having discussed the different types of normal distribution plots, let’s focus more on the standard normal distribution, a special case of normal distribution.
What is standard normal distribution?
The standard normal distribution is a special case of the normal distribution, which has a mean of 0 and a standard deviation of 1. When we replace the mean with 0 and the standard deviation with 1 in the PDF of the normal distribution curve, we get the PDF for the standard normal distribution curve.
A variable with a standard normal distribution is often represented as z. Substituting μ = 0 and σ = 1, we can rewrite the PDF as follows:
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}$$
We use the properties of the standard normal distribution to calculate probabilities and make statistical inferences, especially in hypothesis testing like Z-tests. Let’s discuss finding probabilities using z-score and standard normal distribution plots.
Finding probabilities using the area under standard normal distribution plot
In a continuous distribution, the probability of getting an exact value is zero because there are infinitely many possible values. However, we can calculate the probability of a randomly selected value being less than a value, greater than a value, or falling between two values. For this task, we use the area under the normal distribution curve.
Find probability P(X<a)
To find the probability of a randomly selected value being less than a, we calculate the area on the left side of the line x=a under the standard normal distribution curve. We can calculate this area using the following formula:
$$P(X < a) = \int_{-\infty}^{a} f(x)\, dx$$
Here, f(x) is the PDF of the standard normal distribution curve, and X is the random variable. For example, we can calculate the area under the curve on the left side of the line x=1.5 to find the probability of a value being less than 1.5, as shown in the following image:
Find probability P(X>a)
To find the probability of a randomly selected value being greater than a, we calculate the area on the right side of the line x=a under the standard normal distribution curve.
$$P(X > a) = \int_{a}^{\infty} f(x)\, dx$$
Again, f(x) is the PDF of the standard normal distribution curve. For example, if we want to calculate the probability of a value being greater than 1.5, we can calculate the area under the curve on the right side of the line x=1.5, as shown in the following image:
Find probability P(a<X<b)
To find the probability of a randomly selected value being between a and b, where a < b, we calculate the area under the standard normal distribution curve between the lines x=a and x=b using the following formula:
$$P(a < X < b) = \int_{a}^{b} f(x)\, dx$$
For example, we can find the probability of a value being between -2 and 2 by calculating the area under the curve between the lines x=-2 and x=2.
Finding the probabilities using integrals can be difficult. Hence, we can calculate the probabilities using z-scores and a pre-computed normal distribution table (Z-table).
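To make the three areas concrete, here is a rough sketch that approximates them with a simple trapezoidal rule. It assumes that cutting the infinite tails off at -10 and 10 loses a negligible amount of area; the function names are illustrative only.

```python
import math

def std_normal_pdf(x):
    """PDF of the standard normal distribution (mean 0, standard deviation 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def integrate(f, lo, hi, steps=100_000):
    """Trapezoidal approximation of the integral of f from lo to hi."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

# Assumption: the area beyond -10 and 10 is far too small to matter here.
LOWER, UPPER = -10.0, 10.0

p_less = integrate(std_normal_pdf, LOWER, 1.5)      # P(X < 1.5)
p_greater = integrate(std_normal_pdf, 1.5, UPPER)   # P(X > 1.5)
p_between = integrate(std_normal_pdf, -2.0, 2.0)    # P(-2 < X < 2)

print(f"P(X < 1.5)    ~ {p_less:.5f}")
print(f"P(X > 1.5)    ~ {p_greater:.5f}")
print(f"P(-2 < X < 2) ~ {p_between:.5f}")
```

The first two results add up to 1, which is why a table of left-side areas is enough to answer all three kinds of question.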
Finding probabilities using z-score and standard normal distribution table
To find the probability of a random variable X being less than or greater than a given value, we use the z-table to find the area to the left of the corresponding z-score on the standard normal curve.
The z-table gives us the probability P(Z<z), where Z is a random variable with a standard normal distribution and z is the z-score of the value of X that we are interested in. The z-score of a value of X measures how far that value is from the mean, in units of the standard deviation. For a given normal distribution curve, we can calculate the z-score of a value using the following formula:
$$z = \frac{X - \mu}{\sigma}$$
Here,
- z represents the z-score for the variable X.
- X is the random variable.
- μ is the mean of the dataset.
- σ is the standard deviation.
For the standard normal distribution, the mean is 0, and the standard deviation is 1. Hence, the z-score of a random variable X with standard normal distribution is the same as the variable X.
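The z-score calculation is a one-line function. A minimal sketch follows; the function name `z_score` and the guard against a non-positive standard deviation are illustrative choices, not taken from the article.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations x lies from the mean mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# With mean 50 and standard deviation 2, the value 47 lies 1.5 standard deviations below the mean.
print(z_score(47, mu=50, sigma=2))

# For the standard normal distribution (mean 0, standard deviation 1),
# the z-score of a value is the value itself.
print(z_score(1.5, mu=0, sigma=1))
```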
To understand how z-scores work, let’s use the z-score and the z-table to find probabilities of values being greater than a value, less than a value, and in between two values.
Find probability P(X<a)
To find the probability of a randomly selected value being less than 1.5, i.e., P(X<1.5), we will first calculate the z-score for the value 1.5. The z-score for X=1.5 for standard normal distribution is 1.50.
After obtaining the z-score, we get the probability P(Z<1.5) or P(X<1.5) by getting the value in row 1.5 and column 0.00 in the z-table, which is 0.93319.
Find probability P(X>a)
To find the probability of a randomly selected value being greater than a, we calculate the z-score z for X=a. As the z-table gives the probability P(Z<z), we will look up this probability and take its complement, 1 - P(Z<z), to get P(Z>z), which equals P(X>a). Let’s calculate the probability P(X>1.5) to understand this.
Assuming standard normal distribution, the z-score for X=1.5 is 1.5, and the associated probability P(Z<1.5) is 0.93319. We will calculate P(X>1.5) as follows:
P(Z < 1.5) = 0.93319
P(X > 1.5) = P(Z > 1.5) = 1 - P(Z < 1.5) = 1 - 0.93319 = 0.06681
Hence, the probability P(X>1.5) is 0.06681 for a dataset with standard normal distribution.
Find probability P(a<X<b)
To find the probability P(a<X<b), we will first find the z-scores for the values a and b. Let’s say that the z-score for the value a is za and the z-score for the value b is zb. Using za and zb, we can find the probabilities P(Z<za) and P(Z<zb), which are equal to P(X<a) and P(X<b). Now, we can take the difference P(Z<zb) - P(Z<za) to get the value P(a<X<b).
To understand this, let’s calculate the probability P(-2<X<2). As X and z-scores are equal for the standard normal distribution, we will find the probabilities for z-scores -2 and 2 in the z-table.
- We get the probability P(Z<-2) by getting the value in row -2.0 and column 0.00 in the z-table, which is 0.02275.
- We get the probability P(Z<2) by getting the value in row 2.0 and column 0.00 in the z-table, which is 0.97725.
After getting these values, we can calculate P(-2<X<2), as shown below:
P(-2<X<2) = P(-2<Z<2)
P(-2<Z<2) = P(Z<2) - P(Z<-2)
P(Z<2) = 0.97725 //obtained from the z-table
P(Z<-2) = 0.02275 //obtained from the z-table
P(-2<Z<2) = 0.97725 - 0.02275 = 0.9545
P(-2<X<2) = 0.9545
Hence, we get the probability P(-2<X<2) as 0.9545 for a dataset with standard normal distribution.
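When no printed z-table is at hand, its entries can be reproduced in code. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper `z_table_value` is a hypothetical name that imitates a table lookup by rounding z to two decimals and the result to five places.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def z_table_value(z):
    """Imitate a z-table entry: P(Z < z) for z rounded to two decimals."""
    return round(STANDARD_NORMAL.cdf(round(z, 2)), 5)

p_less = z_table_value(1.5)                            # P(Z < 1.5)
p_greater = 1.0 - z_table_value(1.5)                   # P(Z > 1.5), by complement
p_between = z_table_value(2.0) - z_table_value(-2.0)   # P(-2 < Z < 2), by difference

print(f"P(Z < 1.5)    = {p_less:.5f}")
print(f"P(Z > 1.5)    = {p_greater:.5f}")
print(f"P(-2 < Z < 2) = {p_between:.5f}")
```

The three cases map directly onto a lookup, a complement, and a difference of two lookups.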
When we know the population mean and standard deviation, we can use the above approaches to find probabilities for any normally distributed data, even if the data doesn’t have a standard normal distribution.
For example, suppose that we have a dataset with normal distribution having mean μ=70 and standard deviation σ=15. If we want to find the probability of a random value X being less than 90, i.e., P(X<90), we will first use the z-score formula to convert the value 90 to a standardized value.
z = (X-μ)/σ
z = (90-70)/15
z = 1.33
As we have z=1.33, we will find the probability P(Z<1.33) to find the probability P(X<90). For this, we will look at the value in row 1.3 and column 0.03 in the z-table, which is 0.90824. Hence, the probability P(X<90) is 0.90824.
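The same calculation can be sketched in code, again assuming `statistics.NormalDist` from the Python standard library (3.8+). The snippet also shows the small difference that rounding the z-score to two decimals introduces, since a printed z-table only has entries for two-decimal z values.

```python
from statistics import NormalDist

mu, sigma, x = 70, 15, 90

# Route 1: standardize by hand, round z as the table requires, then look up P(Z < z).
z = (x - mu) / sigma
z_rounded = round(z, 2)
p_table_style = NormalDist().cdf(z_rounded)

# Route 2: let NormalDist handle the non-standard mean and standard deviation directly.
p_direct = NormalDist(mu=mu, sigma=sigma).cdf(x)

print(f"z = {z:.4f}, rounded to {z_rounded}")
print(f"P(X < 90) via rounded z: {p_table_style:.5f}")
print(f"P(X < 90) without rounding: {p_direct:.5f}")
```

The two routes agree to about three decimal places; the remaining gap comes only from rounding z to 1.33.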
Similarly, we can calculate the z-score and find probabilities for other use cases, even if the dataset doesn’t have a standard normal distribution.
When to use standard normal distribution?
We use the standard normal distribution to simplify probability calculations for any normal distribution by converting values into z-scores. We can convert any normal distribution into a standard normal distribution using z-score normalization. This helps us easily calculate probabilities for large datasets with known mean and standard deviation values.
- The z-table approach to calculate probabilities only works with standard normal distribution. Therefore, we need to convert the data from normal to standard normal distribution using the z-score formula.
- We also use standard normal distribution in z-tests. In a z-test, the test statistic is assumed to follow a standard normal distribution, which holds when the data is normally distributed and the population standard deviation is known. Z-score normalization only shifts and rescales the data without changing the shape of its distribution, so standardizing data that isn’t normally distributed does not produce a standard normal distribution.
- When we are dealing with sample means instead of individual data points, and the sample size is greater than 30, we can use the central limit theorem to assume that the distribution of the averages of repeated samples will be approximately normal. Thus, we can use standard normal distribution for the sample means even if the underlying data is not normally distributed. This helps us perform hypothesis testing using z-tests, even with non-normal data.
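The central limit theorem point in the last bullet can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is an exponential distribution with rate 1 (clearly non-normal, with mean 1 and standard deviation 1), the sample size of 50 and the 2,000 repetitions are arbitrary choices, and the seed is fixed so that the run is repeatable.

```python
import random
from statistics import NormalDist

random.seed(42)  # fixed seed so the simulation is repeatable

SAMPLE_SIZE = 50      # larger than 30, as in the rule of thumb above
NUM_SAMPLES = 2000    # number of repeated samples
POP_MEAN = 1.0        # an exponential distribution with rate 1 has mean 1 ...
POP_SD = 1.0          # ... and standard deviation 1

# Standard deviation of the sample mean (the standard error).
standard_error = POP_SD / SAMPLE_SIZE ** 0.5

z_scores = []
for _ in range(NUM_SAMPLES):
    sample = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    sample_mean = sum(sample) / SAMPLE_SIZE
    z_scores.append((sample_mean - POP_MEAN) / standard_error)

# Share of standardized sample means that fall within 1.96 of zero.
share_within = sum(abs(z) < 1.96 for z in z_scores) / NUM_SAMPLES
# Share predicted by the standard normal distribution.
expected_share = NormalDist().cdf(1.96) - NormalDist().cdf(-1.96)

print(f"simulated share within +/-1.96: {share_within:.3f}")
print(f"standard normal prediction:     {expected_share:.3f}")
```

Even though individual exponential values are strongly skewed, the standardized sample means land within 1.96 of zero at a rate close to what the standard normal distribution predicts.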
Now that we have discussed the properties and uses of the standard normal distribution, let’s discuss some real-world examples of how we can use it.
Real-world examples using standard normal distribution
Standard normal distribution is used in various industrial and research studies to find probabilities and test hypotheses. The following are examples of calculating probabilities using the standard normal distribution and z-table.
Example 1: We are given that the average marks of candidates in an entrance test is 75, with a standard deviation of 7.5 marks. What is the probability of a randomly selected candidate having marks less than 90?
To find the probability of a randomly selected candidate having marks less than 90, i.e., P(X<90), we will first find the z-score for the marks value 90 using the z-score formula.
X = 90
μ = 75
σ = 7.5
z = (X-μ)/σ
z = (90-75)/7.5
z = 2.00
So, the z-score for 90 marks is 2.00 for the given data sample.
To calculate P(X<90), we need to find the probability P(Z<2). For this, we will get the value corresponding to z-score 2.00, which is present in row 2.0 and column 0.00 in the z-table. Thus, there is a 0.97725 probability that the marks of a random candidate will be less than 90.
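Example 1 can be cross-checked in a few lines. The sketch assumes the marks are normally distributed and uses `statistics.NormalDist` from the Python standard library (3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

marks = NormalDist(mu=75, sigma=7.5)   # mean and standard deviation from Example 1

z = (90 - 75) / 7.5                    # z-score of 90 marks
p_below_90 = marks.cdf(90)             # P(X < 90), the same as P(Z < 2)

print(f"z-score: {z:.2f}")
print(f"P(X < 90) = {p_below_90:.5f}")
```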
Example 2: The weight of the adult population in the city has a mean of 50 kg with a standard deviation of 15 kg. If we select a person from the city, what is the probability that his weight will be greater than 75 kg?
To find the probability of a randomly selected person’s weight being greater than 75, i.e., P(X>75), we will first find the z-score for the value 75.
X = 75
μ = 50
σ = 15
z = (X-μ)/σ
z = (75-50)/15
z = 1.67
So, the z-score for 75 is 1.67 for the given mean and standard deviation. To calculate P(X>75), we need to find P(Z>1.67). The value corresponding to the z-score of 1.67 in the z-table can be found in row 1.6 and column 0.07, which is 0.95254.
As we know, the z-table gives the probability P(Z<1.67). Hence, the value 0.95254 is the probability of a randomly selected person’s weight being less than 75, i.e., P(X<75). To find P(X>75), we will take the complement of the probability P(X<75).
P(X<75) = 0.95254
P(X>75) = 1 - P(X<75)
P(X>75) = 0.04746
Hence, there is approximately a 0.047 probability that the weight of a randomly selected person will be greater than 75 kg.
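A short sketch of Example 2 in code, assuming the weights are normally distributed and using `statistics.NormalDist` (Python 3.8+). It computes the tail probability twice: once with z rounded to two decimals, as a printed z-table requires, and once with the unrounded z-score, which gives a slightly different result.

```python
from statistics import NormalDist

weights = NormalDist(mu=50, sigma=15)   # kg, values from Example 2

# Table-style answer: round z to two decimals first, then take the complement.
z_rounded = round((75 - 50) / 15, 2)
p_table_style = 1.0 - NormalDist().cdf(z_rounded)

# Answer without rounding z, using the distribution of the weights directly.
p_unrounded = 1.0 - weights.cdf(75)

print(f"z rounded to two decimals: {z_rounded}")
print(f"P(X > 75), table-style:  {p_table_style:.5f}")
print(f"P(X > 75), unrounded z:  {p_unrounded:.5f}")
```

The table-style value is about 0.0475 and the unrounded value about 0.0478, so the choice of rounding affects the third decimal place.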
Example 3: A manufacturing plant produces bolts with lengths that are normally distributed with 50 mm average length and 2 mm standard deviation. A bolt is considered acceptable if its length is between 47 mm and 53 mm. What is the probability that a randomly selected bolt is acceptable?
For a randomly selected bolt to be acceptable, its length should be between 47 mm and 53 mm. Hence, we have to calculate the probability P(47<X<53), where X is a random variable representing the length of the bolt.
P(47<X<53) = P(X<53) - P(X<47)
Hence, if we can find P(X<47) and P(X<53), we can easily get P(47<X<53).
To calculate P(X<47), we will first find the z-score for the value 47 for the given mean and standard deviation.
X = 47
μ = 50
σ = 2
z = (X-μ)/σ
z = (47-50)/2
z = -1.5
So, the z-score for 47 is -1.5 for the given mean and standard deviation. To calculate P(X<47) or P(Z<-1.5), we will look at the value in the row -1.5 and column 0.00 in the z-table, which is 0.06681.
To find P(X<53), we will first find the z-score for the value 53.
X = 53
μ = 50
σ = 2
z = (X-μ)/σ
z = (53-50)/2
z = 1.5
So, the z-score for 53 is 1.5 for the given mean and standard deviation. To calculate the probability P(X<53) or P(Z<1.5), we will look at the value in row 1.5 and column 0.00 in the z-table, which is 0.93319. Now, we can easily calculate P(47<X<53).
P(X<47) = 0.06681
P(X<53) = 0.93319
P(47<X<53) = P(X<53) - P(X<47)
P(47<X<53) = 0.93319 - 0.06681
P(47<X<53) = 0.86638
Hence, there is a 0.87 probability that a randomly selected bolt will be of acceptable length.
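Example 3 follows the "difference of two left-side areas" pattern, which can be sketched as below. The code assumes `statistics.NormalDist` (Python 3.8+); the helper `prob_between` is a hypothetical name introduced for illustration.

```python
from statistics import NormalDist

def prob_between(dist, low, high):
    """P(low < X < high) as the difference of two left-side areas."""
    return dist.cdf(high) - dist.cdf(low)

bolt_lengths = NormalDist(mu=50, sigma=2)   # mm, values from Example 3

p_below_47 = bolt_lengths.cdf(47)           # P(X < 47), z = -1.5
p_below_53 = bolt_lengths.cdf(53)           # P(X < 53), z = 1.5
p_acceptable = prob_between(bolt_lengths, 47, 53)

print(f"P(X < 47) = {p_below_47:.5f}")
print(f"P(X < 53) = {p_below_53:.5f}")
print(f"P(47 < X < 53) = {p_acceptable:.5f}")
```

Because both z-scores here are exact (-1.5 and 1.5), the result agrees with the z-table calculation up to rounding in the last digit.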
Conclusion
The standard normal distribution simplifies the process of calculating probabilities for datasets with normal distribution. Understanding the z-score and using the standard normal table allows us to analyze data more effectively in various real-world scenarios. In this article, we discussed standard normal distribution and its properties. We also discussed calculating probabilities for real-world scenarios using the standard normal distribution curve and the z-score table.
To learn more about statistical concepts, you can take this course on Introduction to Hypothesis Testing. You might also like this course on Hypothesis Testing in Python that discusses how to perform z-tests and t-tests in Python.
FAQs
1. What is the 95% normal distribution value?
The 95% normal distribution value refers to the z-score that captures the middle 95% of the data in a standard normal distribution. The z-score for the 95% normal distribution value is 1.96. This means that 95% of the area under the standard normal curve falls within 1.96 standard deviations of the mean.
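The value 1.96 can be recovered by working backwards from the area. A minimal sketch, assuming `statistics.NormalDist.inv_cdf` from the Python standard library (3.8+): the middle 95% leaves 2.5% in each tail, so the upper cut-off is the z-score with 97.5% of the area to its left.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Middle 95% leaves 2.5% in each tail, so look up the 97.5th percentile.
z_upper = standard_normal.inv_cdf(0.975)

# Check: area between -z_upper and +z_upper.
middle_area = standard_normal.cdf(z_upper) - standard_normal.cdf(-z_upper)

print(f"z for the middle 95%: {z_upper:.2f}")
print(f"area between -z and +z: {middle_area:.4f}")
```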
2. What is the range of z values in a standard normal distribution?
The z-score values in a standard normal distribution range from negative infinity (-∞) to positive infinity (+∞). While the theoretical range is infinite, most of the area under the curve (and thus the probability) falls within the z-score range of -3 to +3.
3. What is the purpose of z-score?
The z-score gives a measure of how many standard deviations a particular data point is away from the mean of its distribution.
4. How to convert the z-score to percentile?
To convert a z-score to a percentile, we can use the standard normal distribution table. After getting the probability from the z-table, we can multiply the value by 100 to get percentile values.
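The conversion described above can be sketched as follows, with the cumulative distribution function of `statistics.NormalDist` (Python 3.8+) standing in for the z-table. The function name `z_to_percentile` is illustrative.

```python
from statistics import NormalDist

def z_to_percentile(z):
    """Percentile of a z-score: the left-side area P(Z < z) times 100."""
    return 100.0 * NormalDist().cdf(z)

for z in (-1.5, 0.0, 1.5):
    print(f"z = {z:+.1f} -> percentile {z_to_percentile(z):.2f}")
```

A z-score of 0 sits at the 50th percentile, and symmetric z-scores such as -1.5 and 1.5 give percentiles that add up to 100.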
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'