
Using the truncated normal distribution

The term truncated normal distribution may sound highly technical but it is actually fairly simple and has many practical applications. If the math below is daunting, be assured that it is not necessary to understand the notation and the technical details. I have created a user-friendly spreadsheet that performs all the calculations automatically.

The mean of a truncated normal distribution

Imagine that your school district has a gifted education program. All students in the program have an IQ of 130 or higher. What is the average IQ of this group? Assume that in your school district, IQ is normally distributed with a mean of 100 and a standard deviation of 15.

Figure: Truncated normal distribution

Questions like this one can be answered by calculating the mean of the truncated normal distribution. The truncated normal distribution is a normal distribution in which one or both ends have been sliced off (i.e., truncated). In this case, everything below 130 has been sliced off (and there is no upper bound).

Four parameters determine the properties of the truncated normal distribution:

μ = mean of the normal distribution (before truncation)
σ = standard deviation of the normal distribution (before truncation)
a = the lower bound of the distribution (can be as low as −∞)
b = the upper bound of the distribution (can be as high as +∞)

The formula for the mean of a truncated distribution is a bit of a mess but can be simplified by finding the z-scores associated with the lower and upper bounds of the distribution:

z_a=\dfrac{a-\mu}{\sigma}

z_b=\dfrac{b-\mu}{\sigma}

The expected value of the truncated distribution (i.e., the mean):
E(X)=\mu+\sigma\dfrac{\phi(z_a)-\phi(z_b)}{\Phi(z_b)-\Phi(z_a)}

Where \phi is the probability density function of the standard normal distribution (NORMDIST(z,0,1,FALSE) in Excel, dnorm(z) in R) and \Phi is the cumulative distribution function of the standard normal distribution (NORMSDIST(z) in Excel, pnorm(z) in R).

This spreadsheet calculates the mean (and standard deviation) of a truncated distribution. See the part below the plot that says “Truncated Normal Distribution.”

In R you could make a function to calculate the mean of a truncated distribution like so:

# Mean of a normal distribution truncated at lower bound a and upper bound b
MeanNormalTruncated <- function(mu = 0, sigma = 1, a = -Inf, b = Inf) {
  mu + sigma * (dnorm((a - mu) / sigma) - dnorm((b - mu) / sigma)) /
    (pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma))
}

# Example: Find the mean of a truncated normal distribution with mu = 100, sigma = 15, and lower bound a = 130
MeanNormalTruncated(mu = 100, sigma = 15, a = 130)
# Approximately 135.6

The cumulative distribution function of the truncated normal distribution

Suppose that we wish to know the proportion of students in the same gifted education program who score 140 or more. The cumulative truncated normal distribution function tells us the proportion of the distribution that is less than a particular value.

cdf=\dfrac{\Phi(z_x)-\Phi(z_a)}{\Phi(z_b)-\Phi(z_a)}

Where z_x = \dfrac{X-\mu}{\sigma}

In the previously mentioned spreadsheet, the cumulative distribution function is the proportion of the shaded region that is less than the value you specify.

You can create your own cumulative distribution function for the truncated normal distribution in R like so:

# Cumulative distribution function of a normal distribution truncated at a and b
cdfNormalTruncated <- function(x = 0, mu = 0, sigma = 1, a = -Inf, b = Inf) {
  (pnorm((x - mu) / sigma) - pnorm((a - mu) / sigma)) /
    (pnorm((b - mu) / sigma) - pnorm((a - mu) / sigma))
}

# Example: Find the proportion of the truncated distribution that is less than 140
cdfNormalTruncated(x = 140, mu = 100, sigma = 15, a = 130)

In this case, the cumulative distribution function returns approximately 0.8316. Subtracting this value from 1 gives the proportion of scores of 140 and higher: 0.1684. This means that about 17% of students in the gifted program can be expected to have IQ scores of 140 or more.1
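Using the function defined above, that last step looks like this:

# Proportion of the gifted program expected to score 140 or higher
1 - cdfNormalTruncated(x = 140, mu = 100, sigma = 15, a = 130)
# Approximately 0.1684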

The truncated normal distribution in R

A fuller range of functions related to the truncated normal distribution can be found in the truncnorm package in R, including the expected value (mean), variance, pdf, cdf, quantile, and random number generation functions.
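If I remember the package's interface correctly, the gifted-program examples above can be reproduced like this (a quick sketch using truncnorm's argument names as I recall them, so double-check against the package documentation):

# install.packages("truncnorm")  # if not already installed
library(truncnorm)

# Mean of IQ scores truncated below at 130 (should be about 135.6)
etruncnorm(a = 130, b = Inf, mean = 100, sd = 15)

# Proportion of the truncated distribution at 140 or higher (should be about 0.1684)
1 - ptruncnorm(140, a = 130, b = Inf, mean = 100, sd = 15)

# Ten random draws from the truncated distribution
rtruncnorm(10, a = 130, b = Inf, mean = 100, sd = 15)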

1 In the interest of precision, I need to say that because IQ scores are rounded to the nearest integer, a slight adjustment needs to be made. The true lower bound of the truncated distribution is not 130 but 129.5. Furthermore, we want the proportion of scores 139.5 and higher, not 140 and higher. This means that the expected proportion of students with IQ scores of “140” and higher in the gifted program is about 0.1718 instead of 0.1684. Of course, there is little difference between these estimates and such precision is not usually needed for “back-of-the-envelope” estimates such as this one.
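Using the cdfNormalTruncated function defined above, this adjusted estimate can be checked like so:

# Adjusted for rounding: the true lower bound is 129.5 and we want scores of 139.5 and higher
1 - cdfNormalTruncated(x = 139.5, mu = 100, sigma = 15, a = 129.5)
# Approximately 0.1718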

An easy way to simulate data according to a specific structural model.

I have made an easy-to-use Excel spreadsheet that can simulate data according to a latent structure that you specify. You do not need to know anything about R, but you will need to have R installed. RStudio is not necessary, but it makes life easier. In this video tutorial, I explain how to use the spreadsheet.

This project is still “in beta,” so there may be errors in it. If you find any, let me know.

If you need something that has more features and is further along in its development cycle, consider simulating data with the R package simsem.
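For a sense of what simulating data from a latent structure involves, here is a minimal sketch using lavaan’s simulateData() with a made-up one-factor model (the loadings and variable names are invented for the example; this is not the spreadsheet’s own code):

library(lavaan)

# A hypothetical one-factor population model with made-up standardized loadings
# and residual variances chosen so that each indicator has unit variance
popModel <- '
  g =~ 0.8*x1 + 0.7*x2 + 0.6*x3 + 0.5*x4
  x1 ~~ 0.36*x1
  x2 ~~ 0.51*x2
  x3 ~~ 0.64*x3
  x4 ~~ 0.75*x4
'

# Simulate 1,000 cases from this population model
simData <- simulateData(popModel, sample.nobs = 1000)
head(simData)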


What if we took our models seriously? Slides from my NASP 2014 talk

Figure: WISC-IV Five-Factor Model

I was part of a symposium last week at NASP on the factor structure of the WISC-IV and WAIS-IV. It was organized and moderated by Renée Tobin, who edited the special issue of the Journal of Psychoeducational Assessment on the same topic. The other presenters were Larry Weiss, Tim Keith, Gary Canivez, Joe Kush, Dawn Flanagan, and Vinny Alfonso.

The slides for my talk are here. The text for the talk is written in the “Notes” for each slide.

I have spoken at length on this topic in my article on the special issue and in a companion video.

The spreadsheets that accompany the article are here:

WISC-IV Spreadsheet

WAIS-IV Spreadsheet


Estimating Latent Scores in Individuals

How to estimate latent scores in individuals when there is a known structural model:

I wrote a commentary in a special issue of the Journal of Psychoeducational Assessment. My article proposes a new way to interpret cognitive profiles. The basic idea is to use the best available latent variable model of the tests and then estimate an individual’s latent scores (with confidence intervals around those estimates). I have made two spreadsheets available, one for the WISC-IV and one for the WAIS-IV.

Five-Factor Model of the WISC-IV

Four-Factor Model of the WAIS-IV

I decided not to provide a spreadsheet for the five-factor model of the WAIS-IV because Gf and g were so highly correlated in that model that it would be nearly impossible to distinguish between Gf and g in individuals. You can think of Gf and g as nearly synonymous (at the latent level).
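For readers who want a feel for the underlying logic, here is a bare-bones sketch of regression-method factor score estimation for a single factor. The loadings and the examinee’s scores are made up for illustration; this is not the exact procedure or the actual values used in the article or the spreadsheets.

# A minimal sketch with made-up standardized loadings on one factor
lambda <- c(0.85, 0.80, 0.75, 0.70)             # standardized factor loadings
theta  <- diag(1 - lambda^2)                    # residual variances (unit-variance indicators)
Sigma  <- lambda %*% t(lambda) + theta          # model-implied correlation matrix

z <- c(1.2, 0.8, 1.5, 0.4)                      # an examinee's subtest scores as z-scores

w        <- t(lambda) %*% solve(Sigma)          # regression factor-score weights
estimate <- as.numeric(w %*% z)                 # estimated latent score (z-score metric)
se       <- sqrt(1 - as.numeric(w %*% lambda))  # standard error of the estimate

estimate + c(-1.96, 1.96) * se                  # 95% confidence interval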

Schneider, W. J. (2013). What if we took our models seriously? Estimating latent scores in individuals. Journal of Psychoeducational Assessment, 31, 186–201.


Psychometrics from the Ground Up 9: Standard Scores and Why We Need Them

In this video tutorial, I explain why we have standard scores, why there are so many different kinds of standard scores, and how to convert between any two types of standard scores.

Here is my Excel spreadsheet that converts any type of standard score to any other type.
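The conversion itself is nothing more than passing through z-scores. Here is a small sketch in R (the two metrics are conventional ones; the particular score of 115 is just an example):

# Convert a score from one standard-score metric to another by way of z-scores
convertStandardScore <- function(x, old_mean, old_sd, new_mean, new_sd) {
  z <- (x - old_mean) / old_sd   # express the score as a z-score
  new_mean + new_sd * z          # rescale to the new metric
}

# Example: an IQ of 115 (mean = 100, SD = 15) expressed as a T score (mean = 50, SD = 10)
convertStandardScore(115, old_mean = 100, old_sd = 15, new_mean = 50, new_sd = 10)
# Returns 60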


TableMaker for Psychological Evaluation Reports


I am proud to announce the release of my new computer program, TableMaker for Psychological Evaluation Reports. It is designed to help providers of psychological assessments organize and present test data in a simple, efficient, and theoretically informed manner. You enter an evaluee’s test scores in an order that is convenient to you, and theoretically organized tables are generated in MS Word like so:

This video tutorial explains how to use the program.

TableMaker is free. For now, unfortunately, it runs on Windows only. Mac users can still use the Excel spreadsheet I made several years ago, which can do much of what TableMaker does but is less convenient and less flexible.
