My Software & Spreadsheets, Psychometrics, Statistics

Using the truncated normal distribution

The term truncated normal distribution may sound highly technical but it is actually fairly simple and has many practical applications. If the math below is daunting, be assured that it is not necessary to understand the notation and the technical details. I have created a user-friendly spreadsheet that performs all the calculations automatically.

The mean of a truncated normal distribution

Imagine that your school district has a gifted education program. All students in the program have an IQ of 130 or higher. What is the average IQ of this group? Assume that in your school district, IQ is normally distributed with a mean of 100 and a standard deviation of 15.

Truncated Normal

Questions like this one can be answered by calculating the mean of the truncated normal distribution. The truncated normal distribution is a normal distribution in which one or both ends have been sliced off (i.e., truncated). In this case, everything below 130 has been sliced off (and there is no upper bound).

Four parameters determine the properties of the truncated normal distribution:

μ = mean of the normal distribution (before truncation)
σ = standard deviation of the normal distribution (before truncation)
a = the lower bound of the distribution (can be as low as −∞)
b = the upper bound of the distribution (can be as high as +∞)

The formula for the mean of a truncated distribution is a bit of a mess but can be simplified by finding the z-scores associated with the lower and upper bounds of the distribution:

z_a=\dfrac{a-\mu}{\sigma}

z_b=\dfrac{b-\mu}{\sigma}

The expected value of the truncated distribution (i.e., the mean):
E(X)=\mu+\sigma\dfrac{\phi(z_a)-\phi(z_b)}{\Phi(z_b)-\Phi(z_a)}

Where \phi is the probability density function of the standard normal distribution (NORMDIST(z,0,1,FALSE) in Excel, dnorm(z) in R) and \Phi is the cumulative distribution function of the standard normal distribution (NORMSDIST(z) in Excel, pnorm(z) in R).

This spreadsheet calculates the mean (and standard deviation) of a truncated distribution. See the part below the plot that says “Truncated Normal Distribution.”

In R you could make a function to calculate the mean of a truncated distribution like so:

MeanNormalTruncated<-function(mu=0,sigma=1,a=-Inf,b=Inf){
  mu+sigma*(dnorm((a-mu)/sigma)-dnorm((b-mu)/sigma))/(pnorm((b-mu)/sigma)-pnorm((a-mu)/sigma))
}

#Example: Find the mean of a truncated normal distribution with a mu = 100, sigma = 15, and lower bound = 130
MeanNormalTruncated(mu=100,sigma=15,a=130)

The cumulative distribution function of the truncated normal distribution

Suppose that we wish to know the proportion of students in the same gifted education program who score 140 or more. The cumulative truncated normal distribution function tells us the proportion of the distribution that is less than a particular value.

cdf=\dfrac{\Phi(z_x)-\Phi(z_a)}{\Phi(z_b)-\Phi(z_a)}

Where z_x = \dfrac{X-\mu}{\sigma}

In the previously mentioned spreadsheet, the cumulative distribution function is the proportion of the shaded region that is less than the value you specify.

You can create your own cumulative distribution function for the truncated normal distribution in R like so:

cdfNormalTruncated<-function(x=0,mu=0,sigma=1,a=-Inf,b=Inf){
  (pnorm((x-mu)/sigma)-pnorm((a-mu)/sigma))/(pnorm((b-mu)/sigma)-pnorm((a-mu)/sigma))
}
#Example: Find the proportion of the distribution less than 140
cdfNormalTruncated(x=140,mu=100,sigma=15,a=130)

In this case, the cumulative distribution function returns approximately 0.8316. Subtracting from 1, gives the proportion of scores 140 and higher: 0.1684. This means that about 17% of students in the gifted program can be expected to have IQ scores of 140 or more.1

The truncated normal distribution in R

A fuller range of functions related to the truncated normal distribution can be found in the truncnorm package in R, including the expected value (mean), variance, pdf, cdf, quantile, and random number generation functions.

1 In the interest of precision, I need to say that because IQ scores are rounded to the nearest integer, a slight adjustment needs to be made. The true lower bound of the truncated distribution is not 130 but 129.5. Furthermore, we want the proportion of scores 139.5 and higher, not 140 and higher. This means that the expected proportion of students with IQ scores of “140” and higher in the gifted program is about 0.1718 instead of 0.1684. Of course, there is little difference between these estimates and such precision is not usually needed for “back-of-the-envelope” estimates such as this one.
Advertisements
Standard

4 thoughts on “Using the truncated normal distribution

  1. Pingback: Using the multivariate truncated normal distribution | Assessing Psyche, Engaging Gauss, Seeking Sophia

  2. Pingback: time enough | hbd chick

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s