# Using the truncated normal distribution

The term truncated normal distribution may sound highly technical, but it is actually fairly simple and has many practical applications. If the math below looks daunting, rest assured that it is not necessary to understand the notation and technical details: I have created a user-friendly spreadsheet that performs all the calculations automatically.

# The mean of a truncated normal distribution

Imagine that your school district has a gifted education program. All students in the program have an IQ of 130 or higher. What is the average IQ of this group? Assume that in your school district, IQ is normally distributed with a mean of 100 and a standard deviation of 15.

Questions like this one can be answered by calculating the mean of the truncated normal distribution. The truncated normal distribution is a normal distribution in which one or both ends have been sliced off (i.e., truncated). In this case, everything below 130 has been sliced off (and there is no upper bound).

Four parameters determine the properties of the truncated normal distribution:

- μ = the mean of the normal distribution (before truncation)
- σ = the standard deviation of the normal distribution (before truncation)
- a = the lower bound of the distribution (can be as low as −∞)
- b = the upper bound of the distribution (can be as high as +∞)

The formula for the mean of a truncated normal distribution is a bit of a mess, but it can be simplified by first finding the z-scores associated with the lower and upper bounds of the distribution:

$z_a=\dfrac{a-\mu}{\sigma}$

$z_b=\dfrac{b-\mu}{\sigma}$

The expected value (i.e., the mean) of the truncated distribution is:
$E(X)=\mu+\sigma\dfrac{\phi(z_a)-\phi(z_b)}{\Phi(z_b)-\Phi(z_a)}$

Where $\phi$ is the probability density function of the standard normal distribution (NORMDIST(z,0,1,FALSE) in Excel, dnorm(z) in R) and $\Phi$ is the cumulative distribution function of the standard normal distribution (NORMSDIST(z) in Excel, pnorm(z) in R).
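Applied to the gifted education example (μ = 100, σ = 15, a = 130, and no upper bound, so $\phi(z_b)=0$ and $\Phi(z_b)=1$):

$z_a=\dfrac{130-100}{15}=2$

$E(X)=100+15\dfrac{\phi(2)-0}{1-\Phi(2)}=100+15\dfrac{0.05399}{0.02275}\approx 135.6$

That is, the average IQ in the gifted program is about 135.6.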

This spreadsheet calculates the mean (and standard deviation) of a truncated distribution. See the part below the plot that says “Truncated Normal Distribution.”

In R you could make a function to calculate the mean of a truncated distribution like so:

MeanNormalTruncated <- function(mu = 0, sigma = 1, a = -Inf, b = Inf) {
  mu + sigma * (dnorm((a - mu) / sigma) - dnorm((b - mu) / sigma)) /
    (pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma))
}

# Example: Find the mean of a truncated normal distribution
# with mu = 100, sigma = 15, and lower bound a = 130
MeanNormalTruncated(mu = 100, sigma = 15, a = 130)

# The cumulative distribution function of the truncated normal distribution

Suppose that we wish to know the proportion of students in the same gifted education program who score 140 or more. The cumulative truncated normal distribution function tells us the proportion of the distribution that is less than a particular value.

$cdf=\dfrac{\Phi(z_x)-\Phi(z_a)}{\Phi(z_b)-\Phi(z_a)}$

Where $z_x = \dfrac{X-\mu}{\sigma}$

In the previously mentioned spreadsheet, the cumulative distribution function is the proportion of the shaded region that is less than the value you specify.

You can create your own cumulative distribution function for the truncated normal distribution in R like so:

cdfNormalTruncated <- function(x = 0, mu = 0, sigma = 1, a = -Inf, b = Inf) {
  (pnorm((x - mu) / sigma) - pnorm((a - mu) / sigma)) /
    (pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma))
}

# Example: Find the proportion of the distribution less than 140
cdfNormalTruncated(x = 140, mu = 100, sigma = 15, a = 130)

In this case, the cumulative distribution function returns approximately 0.8316. Subtracting from 1 gives the proportion of scores of 140 and higher: 0.1684. This means that about 17% of students in the gifted program can be expected to have IQ scores of 140 or more.1

# The truncated normal distribution in R

A fuller range of functions related to the truncated normal distribution can be found in the truncnorm package in R, including the expected value (mean), variance, pdf, cdf, quantile, and random number generation functions.
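For example, assuming the truncnorm package is installed, the gifted-program results above can be reproduced (and random scores drawn) like so:

```r
library(truncnorm)

# Mean IQ in the gifted program (lower bound 130, no upper bound)
etruncnorm(a = 130, b = Inf, mean = 100, sd = 15)  # approximately 135.6

# Proportion of the gifted program scoring below 140
ptruncnorm(140, a = 130, b = Inf, mean = 100, sd = 15)  # approximately 0.8316

# Simulate 10 IQ scores from the truncated distribution
rtruncnorm(10, a = 130, b = Inf, mean = 100, sd = 15)
```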

1 In the interest of precision, I need to say that because IQ scores are rounded to the nearest integer, a slight adjustment needs to be made. The true lower bound of the truncated distribution is not 130 but 129.5. Furthermore, we want the proportion of scores 139.5 and higher, not 140 and higher. This means that the expected proportion of students with IQ scores of “140” and higher in the gifted program is about 0.1718 instead of 0.1684. Of course, there is little difference between these estimates and such precision is not usually needed for “back-of-the-envelope” estimates such as this one.
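The adjusted estimate can be checked in R with the same cdf function (repeated here so the snippet runs on its own), using the continuity-corrected bounds:

```r
cdfNormalTruncated <- function(x = 0, mu = 0, sigma = 1, a = -Inf, b = Inf) {
  (pnorm((x - mu) / sigma) - pnorm((a - mu) / sigma)) /
    (pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma))
}

# Proportion scoring "140" or higher, using the true bounds 129.5 and 139.5
1 - cdfNormalTruncated(x = 139.5, mu = 100, sigma = 15, a = 129.5)  # approximately 0.172
```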

# An easy way to simulate data according to a specific structural model

I have made an easy-to-use Excel spreadsheet that can simulate data according to a latent structure that you specify. You do not need to know anything about R, but you will need to install it. RStudio is not necessary, but it makes life easier. In this video tutorial, I explain how to use the spreadsheet.

This project is still “in beta” so there may still be errors in it. If you find any, let me know.

If you need something that has more features and is further along in its development cycle, consider simulating data with the R package simsem.
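As a rough illustration of what this kind of simulation involves (a minimal sketch, not the spreadsheet's actual method), observed scores can be generated from a single hypothetical latent factor in R:

```r
set.seed(42)
n <- 10000
latent <- rnorm(n)               # latent factor scores (z-score metric)
loadings <- c(0.8, 0.7, 0.6)     # hypothetical factor loadings

# Each observed variable = loading * latent factor + unique error,
# with error variance chosen so each observed variable has variance 1
observed <- sapply(loadings, function(l) l * latent + rnorm(n, sd = sqrt(1 - l^2)))

# Observed correlations approximate the products of the loadings
# (e.g., r12 should be near 0.8 * 0.7 = 0.56)
round(cor(observed), 2)
```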


# What if we took our models seriously? Slides from my NASP 2014 talk

WISC-IV Five-Factor Model

I was part of a symposium last week at NASP on the factor structure of the WISC-IV and WAIS-IV. It was organized and moderated by Renée Tobin, who edited the special issue of the Journal of Psychoeducational Assessment on the same topic. The other presenters were Larry Weiss, Tim Keith, Gary Canivez, Joe Kush, Dawn Flanagan, and Vinny Alfonso.

The slides for my talk are here. The text for the talk is written in the “Notes” for each slide.

I have spoken at length on this topic in my article on the special issue and in a companion video.

The spreadsheets that accompany the article are here:


# Estimating Latent Scores in Individuals

How to estimate latent scores in individuals when there is a known structural model:

I wrote a commentary in a special issue of the Journal of Psychoeducational Assessment. My article proposes a new way to interpret cognitive profiles. The basic idea is to use the best available latent variable model of the tests and then estimate an individual's latent scores (with confidence intervals around those estimates). I have made two spreadsheets available, one for the WISC-IV and one for the WAIS-IV.
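The general idea can be sketched with the regression (Thurstone) method of factor score estimation. The loadings and scores below are invented for illustration, not taken from the article or from any WISC-IV/WAIS-IV model:

```r
# Hypothetical standardized loadings of 3 subtests on one latent factor
lambda <- c(0.85, 0.80, 0.75)

# Subtest correlation matrix implied by the one-factor model
R <- lambda %*% t(lambda)
diag(R) <- 1

# An individual's subtest scores expressed as z-scores
z <- c(1.0, 0.667, 1.333)

# Regression estimate of the latent score and its standard error
w <- solve(R, lambda)              # regression weights
latent_hat <- sum(w * z)           # estimated latent z-score
r2 <- sum(w * lambda)              # squared validity of the estimate
se <- sqrt(1 - r2)                 # SE in the latent (z-score) metric

latent_hat                          # point estimate
latent_hat + c(-1.96, 1.96) * se    # 95% confidence interval
```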

Five-Factor Model of the WISC-IV

Four-Factor Model of the WAIS-IV

I decided not to provide a spreadsheet for the five-factor model of the WAIS-IV because Gf and g were so highly correlated in that model that it would be nearly impossible to distinguish between Gf and g in individuals. You can think of Gf and g as nearly synonymous (at the latent level).

Schneider, W. J. (2013). What if we took our models seriously? Estimating latent scores in individuals. Journal of Psychoeducational Assessment, 31, 186–201.


# Psychometrics from the Ground Up 9: Standard Scores and Why We Need Them

In this video tutorial, I explain why we have standard scores, why there are so many different kinds of standard scores, and how to convert between any two types of standard scores.
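The conversion logic boils down to passing through the z-score metric. A small R version (using the conventional IQ and T-score metrics as an example):

```r
# Convert a score from one standard score metric to another
# by converting to a z-score and rescaling
convert_standard_score <- function(x, old_mean, old_sd, new_mean, new_sd) {
  z <- (x - old_mean) / old_sd
  new_mean + new_sd * z
}

# An IQ of 115 (mean 100, SD 15) expressed as a T score (mean 50, SD 10)
convert_standard_score(115, old_mean = 100, old_sd = 15, new_mean = 50, new_sd = 10)
# = 60
```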


# TableMaker for Psychological Evaluation Reports

I am proud to announce the release of my new computer program, TableMaker for Psychological Evaluation Reports. It is designed to help providers of psychological assessments organize and present test data in a simple, efficient, and theoretically informed manner. You enter an evaluee's test scores in whatever order is convenient to you, and theoretically organized tables are generated in MS Word like so:

This video tutorial explains how to use the program.

TableMaker is free. For now, unfortunately, it runs on Windows only. Mac users can still use the Excel spreadsheet I made several years ago, which can do much of what the TableMaker does but is less convenient and less flexible.


# The Compositator: New Software for the Woodcock-Johnson III

I am very excited to announce that the Woodcock-Muñoz Foundation’s WMF Press has published my FREE software program with an admittedly silly name: the Compositator.

Its purpose is anything but silly, though. The feature that gives it its name is that you can create custom composite scores from any test in three WJ III batteries (Cognitive, Achievement, and Diagnostic Supplement).

For example, Picture Vocabulary and Academic Knowledge from the WJ III Achievement battery can be combined with Verbal Comprehension and General Information from the WJ III Cognitive battery to form a more reliable and more comprehensive measure of crystallized intelligence.
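The psychometrics behind such a composite can be sketched: a unit-weighted composite of standardized subtests has variance equal to the sum of the entries of their correlation matrix, so the composite can be rescaled to a standard score metric. The correlations and scores below are invented for illustration:

```r
# Hypothetical correlations among 4 crystallized-intelligence subtests
R <- matrix(0.6, nrow = 4, ncol = 4)
diag(R) <- 1

# An individual's subtest z-scores
z <- c(1.2, 0.8, 1.0, 0.9)

# Unit-weighted composite: sum of z-scores, standardized by the
# composite's standard deviation, sqrt(sum of the correlation matrix)
z_composite <- sum(z) / sqrt(sum(R))

# Express on the index score metric (mean 100, SD 15)
100 + 15 * z_composite  # approximately 117.5
```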

The Compositator is a supplement to the scoring software for the WJ III (either the WJ III Compuscore and Profiles Program or the WIIIP). It will not run on a machine unless one of these programs is installed.

The Compositator not only allows you to combine subtests in a psychometrically sound manner; it also allows you to combine statistical information in ways not previously possible. For example, it allows you to create a comprehensive model of reading ability, specifying the relationships among all the various cognitive and academic abilities in the WJ III. From there, you can do things like estimate how much a person’s reading comprehension would improve if auditory processing were remediated. If auditory processing is remediated, what are the simultaneous effects on reading decoding, reading fluency, and reading comprehension?

There is much more that the program can do. I’ve been working on this program for the past 3 years and have been thinking about it since 1999 when I was in graduate school. There is a comprehensive manual and video tutorials to get you started.

Kevin McGrew has been the earliest and most enthusiastic supporter of the program. His generous description of the program is here.

I hope that you find the program useful. I would love to hear from you, especially if you have ideas for improving the program.
