Cognitive Assessment, Psychometrics, Statistics

How unusual is it to have multiple scores below a threshold?

In psychological assessment, it is common to specify a threshold at which a score is considered unusual (e.g., 2 standard deviations above or below the mean). If we can assume that the scores are roughly normal, it is easy to estimate the proportion of people with scores below the threshold we have set. If the threshold is 2 standard deviations below the mean, then the Excel function NORMSDIST will tell us the answer:

=NORMSDIST(-2)

=0.023

In R, the pnorm function gives the same answer:

pnorm(-2)

How unusual is it to have multiple scores below the threshold? The answer depends on how correlated the scores are. If we can assume that the scores are multivariate normal, Crawford and colleagues (2007) show us how to obtain reasonable estimates using simulated data. Here is a script in R that depends on the mvtnorm package. Suppose that the 10 subtests of the WAIS-IV have correlations as depicted below. Because the subtests have a mean of 10 and a standard deviation of 3, the scores are unusually low if 4 or lower.

#WAIS-IV subtest names
WAISSubtests <- c("BD", "SI", "DS", "MR", "VO", "AR", "SS", "VP", "IN", "CD")

# WAIS-IV correlations
WAISCor <- rbind(
  c(1.00,0.49,0.45,0.54,0.45,0.50,0.41,0.64,0.44,0.40), #BD
  c(0.49,1.00,0.48,0.51,0.74,0.54,0.35,0.44,0.64,0.41), #SI
  c(0.45,0.48,1.00,0.47,0.50,0.60,0.40,0.40,0.43,0.45), #DS
  c(0.54,0.51,0.47,1.00,0.51,0.52,0.39,0.53,0.49,0.45), #MR
  c(0.45,0.74,0.50,0.51,1.00,0.57,0.34,0.42,0.73,0.41), #VO
  c(0.50,0.54,0.60,0.52,0.57,1.00,0.37,0.48,0.57,0.43), #AR
  c(0.41,0.35,0.40,0.39,0.34,0.37,1.00,0.38,0.34,0.65), #SS
  c(0.64,0.44,0.40,0.53,0.42,0.48,0.38,1.00,0.43,0.37), #VP
  c(0.44,0.64,0.43,0.49,0.73,0.57,0.34,0.43,1.00,0.34), #IN
  c(0.40,0.41,0.45,0.45,0.41,0.43,0.65,0.37,0.34,1.00)) #CD
rownames(WAISCor) <- colnames(WAISCor) <- WAISSubtests

#Means
WAISMeans<-rep(10,length(WAISSubtests))

#Standard deviations
WAISSD<-rep(3,length(WAISSubtests))

#Covariance Matrix
WAISCov<-WAISCor*WAISSD%*%t(WAISSD)

#Sample size
SampleSize<-1000000

#Load mvtnorm package
library(mvtnorm)

#Make simulated data
d<-rmvnorm(n=SampleSize,mean=WAISMeans,sigma=WAISCov)
#To make this more realistic, you can round all scores to the nearest integer (d<-round(d))

#Threshold for abnormality
Threshold<-4

#Which scores are less than or equal to threshold
Abnormal<- d<=Threshold

#Number of scores less than or equal to threshold
nAbnormal<-rowSums(Abnormal)

#Frequency distribution table
p<-c(table(nAbnormal)/SampleSize)

#Plot
barplot(p,axes=F,las=1,
    xlim=c(0,length(p)*1.2),ylim=c(0,1),
    bty="n",pch=16,col="royalblue2",
    xlab="Number of WAIS-IV subtest scores less than or equal to 4",
    ylab="Proportion")
axis(2,at=seq(0,1,0.1),las=1)
text(x=0.7+0:10*1.2,y=p,labels=formatC(p,digits=2),cex=0.7,pos=3,adj=0.5)

The code produces this graph:
Abnormal Scores Simulation

Using the multivariate normal distribution

The simulation method works very well, especially if the sample size is very large. An alternate method that gives more precise numbers is to estimate how much of the multivariate normal distribution is within certain bounds. That is, we find all of the regions of the multivariate normal distribution in which one and only one test is below a threshold and then add up all the probabilities. The process is repeated to find all regions in which two and only two tests are below a threshold. Repeat the process, with 3 tests, 4 tests, and so on. This is tedious to do by hand but only takes a few lines of code do automatically.

AbnormalPrevalance<-function(Cor,Mean=0,SD=1,Threshold){
  require(mvtnorm)
  k<-nrow(Cor)
  p<-rep(0,k)
  zThreshold<-(Threshold-Mean)/SD
  for (n in 1:k){
    combos<-combn(1:k,n)
    ncombos<-ncol(combos)
    for (i in 1:ncombos){
      u<-rep(Inf,k)
      u[combos[,i]]<-zThreshold
      l<-rep(-Inf,k)
      l[seq(1,k)[-combos[,i]]]<-zThreshold 
      p[n]<-p[n]+pmvnorm(lower=l,upper=u,mean=rep(0,k),sigma=Cor)[1]
    }
  }
  p<-c(1-sum(p),p)
  names(p)<-0:k

  barplot(p,axes=F,las=1,xlim=c(0,length(p)*1.2),ylim=c(0,1),
     bty="n",pch=16,col="royalblue2",
     xlab=bquote("Number of scores less than or equal to " * .(Threshold)),
     ylab="Proportion")
  axis(2,at=seq(0,1,0.1),las=1)
  text(x=0.7+0:10*1.2,y=p,labels=formatC(p,digits=2),cex=0.7,pos=3,adj=0.5)
  return(p)
}
Proportions<-AbnormalPrevalance(Cor=WAISCor,Mean=10,SD=3,Threshold=4)

Using this method, the results are nearly the same but slightly more accurate. If the number of tests is large, the code can take a long time to run.

Abnormal Scores Direct Method

Advertisements
Standard

One thought on “How unusual is it to have multiple scores below a threshold?

  1. Pingback: How common is it to have no academic weaknesses? | Assessing Psyche, Engaging Gauss, Seeking Sophia

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s