Conditional normal distributions provide useful information in psychological assessment

Conditional Normal Distribution

Conditional normal distributions are really useful in psychological assessment. We can use them to answer questions like:

  • How unusual is it for someone with a vocabulary score of 120 to have a score of 90 or lower on reading comprehension?
  • If that person also has a score of 80 on a test of working memory capacity, how much does the risk of scoring 90 or lower on reading comprehension increase?

What follows might be mathematically daunting. Just let it wash over you if it becomes confusing. At the end, there is a video in which I will show how to use a spreadsheet that will do all the calculations.

Unconditional Normal Distributions

Suppose that variable Y represents reading comprehension test scores. Here is a fancy way of saying Y is normally distributed with a mean of 100 and a standard deviation of 15:

Y\sim N(100,15^2)

In this notation, “~” means “is distributed as,” and N means normally distributed with a particular mean (μ) and variance (σ²).

If we know literally nothing about a person from this population, our best guess is that the person’s reading comprehension score is at the population mean. One way to say this is that the person’s expected value on reading comprehension is the population mean:

E(Y)=\mu_Y = 100

The 95% confidence interval around this guess is:

95\%\, \text{CI} = \mu_Y \pm z_{95\%} \sigma_Y

95\%\, \text{CI} \approx 100 \pm 1.96*15 = 70.6 \text{ to } 129.4
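The same interval can be reproduced in a few lines of Python. This is only a sketch using the standard library's `statistics.NormalDist`; the variable names are illustrative and not part of the original post.

```python
from statistics import NormalDist

# Population distribution of reading comprehension: Y ~ N(100, 15^2)
mu_y, sigma_y = 100, 15

# z-score that leaves 2.5% in each tail (about 1.96)
z_95 = NormalDist().inv_cdf(0.975)

lower = mu_y - z_95 * sigma_y
upper = mu_y + z_95 * sigma_y
print(f"95% interval: {lower:.1f} to {upper:.1f}")
```

Run as written, this should print an interval of 70.6 to 129.4, matching the hand calculation.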

Unconditional Normal Distribution with 95% CI

Conditional Normal Distributions

Simple Linear Regression

Now, suppose that we know one thing about the person: the person’s score on a vocabulary test. We can let X represent the vocabulary score and its distribution is the same as that of Y:

X\sim N(100,15^2)

If we know that this person scored 120 on vocabulary (X), what is our best guess as to what the person scored on reading comprehension (Y)? This guess is a conditional expected value. It is “conditional” in the sense that the expected value of Y depends on what value X has. The pipe symbol “|” is used to note a condition like so:

E(Y|X=120)

This means, “What is our best guess for Y if X is 120?”

What if we don’t want to be specific about the value of X but want to refer to any particular value of X? Oddly enough, it is traditional to use the lowercase x for that. So, X refers to the variable as a whole and x refers to any particular value of variable X. So if I know that variable X happens to be a particular value x, the expected value of Y is:

E(Y|X=x)=\sigma_Y \rho_{XY}\dfrac{x-\mu_X}{\sigma_X}+\mu_Y

where ρ_XY is the correlation between X and Y.

You might recognize that this is a linear regression formula and that:

E(Y|X=x)=\hat{Y}

where “Y-hat” (Ŷ) is the predicted value of Y when X is known.

Let’s assume that the relationship between X and Y is bivariate normal like in the image at the top of the post:

\begin{bmatrix}X\\Y\end{bmatrix}\sim N\left(\begin{bmatrix}\mu_X\\ \mu_Y\end{bmatrix},\ \begin{bmatrix}\sigma_X^2&\rho_{XY}\sigma_X\sigma_Y\\ \rho_{XY}\sigma_X\sigma_Y&\sigma_Y^2\end{bmatrix}\right)

The first term in the parentheses is the vector of means and the second term (the square matrix in the brackets) is the covariance matrix of X and Y. It is not necessary to understand the notation. The main point is that X and Y are both normal, they have a linear relationship, and the conditional variance of Y at any value of X is the same.

The conditional standard deviation of Y at any particular value of X is:

\sigma_{Y|X=x}=\sigma_Y\sqrt{1-\rho_{XY}^2}

This is the standard deviation of the conditional normal distribution. In the language of regression, it is the standard error of the estimate (σₑ). It is the standard deviation of the residuals (errors). Residuals are simply the amount by which your guesses differ from the actual values.

e = y - E(Y|X=x)=y-\hat{Y}
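The conditional mean, conditional standard deviation, and residual can be bundled into one small Python helper. This is a sketch, not the spreadsheet's method: the function name `conditional_normal` is made up for illustration, and the correlation of 0.5 is simply an assumed value.

```python
from math import sqrt
from statistics import NormalDist


def conditional_normal(x, mu_x, sigma_x, mu_y, sigma_y, rho):
    """Return the distribution of Y given X = x under bivariate normality."""
    # Conditional mean: the simple linear regression prediction (Y-hat)
    mean = sigma_y * rho * (x - mu_x) / sigma_x + mu_y
    # Conditional SD: the standard error of the estimate
    sd = sigma_y * sqrt(1 - rho ** 2)
    return NormalDist(mean, sd)


# Vocabulary = 120, both tests on the index-score metric, assumed correlation 0.5
dist = conditional_normal(x=120, mu_x=100, sigma_x=15, mu_y=100, sigma_y=15, rho=0.5)
print(round(dist.mean, 2), round(dist.stdev, 2))

# Residual for an observed reading comprehension score of 90
residual = 90 - dist.mean
print(residual)
```

Returning a `NormalDist` object means that `dist.cdf(90)` gives the proportion scoring 90 or lower directly, without computing a z-score first.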



So, putting all this together, we can answer our question:

How unusual is it for someone with a vocabulary score of 120 to have a score of 90 or lower on reading comprehension?

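The expected value of Y (Ŷ) is:

\hat{Y}=\sigma_Y \rho_{XY}\dfrac{x-\mu_X}{\sigma_X}+\mu_Y=15\rho_{XY}\dfrac{120-100}{15}+100

Suppose that the correlation is 0.5. Therefore,

\hat{Y}=15(0.5)\dfrac{120-100}{15}+100=110

This means that among all the people with a vocabulary score of 120, the average reading comprehension score is 110. Now, how far off from that is 90?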

e= y - \hat{Y}=90-110=-20

What is the standard error of the estimate?

\sigma_e=\sigma_Y\sqrt{1-\rho_{XY}^2}=15\sqrt{1-0.5^2}\approx 12.99

Dividing the residual by the standard error of the estimate (the standard deviation of the conditional normal distribution) gives us a z-score. It represents how far from expectations this individual is in standard deviation units.

z=\dfrac{e}{\sigma_e} \approx\dfrac{-20}{12.99}\approx -1.54

Using the standard normal cumulative distribution function (Φ) gives us the proportion of people scoring 90 or less on reading comprehension (given a vocabulary score of 120).

\Phi(z)\approx\Phi(-1.54)\approx 0.06
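For readers who prefer code to a spreadsheet, the whole chain from predicted score to proportion can be sketched in Python. The variable names are illustrative; `NormalDist().cdf` plays the role of Φ.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, rho = 100, 15, 0.5     # both tests: mean 100, SD 15; correlation 0.5
vocabulary, reading = 120, 90

predicted = sigma * rho * (vocabulary - mu) / sigma + mu   # expected score, 110
see = sigma * sqrt(1 - rho ** 2)                           # standard error of the estimate
z = (reading - predicted) / see                            # residual in SD units
proportion = NormalDist().cdf(z)                           # standard normal CDF at z

print(f"z = {z:.2f}, proportion = {proportion:.3f}")
```

The z-score comes out near -1.54 and the proportion near 0.06, in line with the hand calculation above.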

In Microsoft Excel, the standard normal cumulative distribution function is NORMSDIST. Thus, entering this into any cell will give the answer:

=NORMSDIST(-1.54)

Conditional normal distribution when Vocabulary = 120

Multiple Regression

What proportion of people score 90 or less on reading comprehension if their vocabulary is 120 but their working memory capacity is 80?

Let’s call vocabulary X₁ and working memory capacity X₂, and suppose they correlate at 0.3. The correlation matrix among the predictors (R_X) is:

\mathbf{R_X}=\begin{bmatrix}1&\rho_{12}\\ \rho_{12}&1\end{bmatrix}=\begin{bmatrix}1&0.3\\ 0.3&1\end{bmatrix}

The validity coefficients are the correlations of Y with both predictors (R_XY):

\mathbf{R}_{XY}=\begin{bmatrix}\rho_{Y1}\\ \rho_{Y2}\end{bmatrix}=\begin{bmatrix}0.5\\ 0.4\end{bmatrix}

The standardized regression coefficients (β) are:

\pmb{\mathbf{\beta}}=\mathbf{R_{X}}^{-1}\mathbf{R}_{XY}\approx\begin{bmatrix}0.418\\ 0.275\end{bmatrix}
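With only two predictors, the matrix inverse has a simple closed form, so the coefficients can be checked without a linear algebra library. This is a sketch for the two-predictor case only; more predictors would need a general solver.

```python
# Correlation among the predictors and validity coefficients from the text
r12 = 0.3
r_y1, r_y2 = 0.5, 0.4

# The inverse of [[1, r12], [r12, 1]] is
# (1 / (1 - r12^2)) * [[1, -r12], [-r12, 1]]
det = 1 - r12 ** 2
beta1 = (r_y1 - r12 * r_y2) / det
beta2 = (r_y2 - r12 * r_y1) / det

print(round(beta1, 3), round(beta2, 3))
```

This should print 0.418 and 0.275, the values shown above.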

Unstandardized coefficients can be obtained by multiplying each standardized coefficient by the standard deviation of Y (σ_Y) and dividing by the standard deviation of the corresponding predictor (σ_X):

b_i=\beta_i\dfrac{\sigma_Y}{\sigma_{X_i}}

However, in this case all the variables have the same metric and thus the unstandardized and standardized coefficients are the same.

The vector of predictor means (μ_X) is used to calculate the intercept (b₀):

b_0=\mu_Y-\mathbf{b}' \pmb{\mathbf{\mu}}_X

b_0\approx 100-\begin{bmatrix}0.418\\ 0.275\end{bmatrix}^{'} \begin{bmatrix}100\\ 100\end{bmatrix}\approx 30.769

The predicted score when vocabulary is 120 and working memory capacity is 80 is:

\hat{Y}=b_0 + b_1 X_1 + b_2 X_2

\hat{Y}\approx 30.769+0.418*120+0.275*80\approx 102.9

The error in this case is 90 - 102.9 = -12.9.

The multiple R² is calculated with the standardized regression coefficients and the validity coefficients.

R^2 = \pmb{\mathbf{\beta}}'\pmb{\mathbf{R}}_{XY}\approx\begin{bmatrix}0.418\\ 0.275\end{bmatrix}^{'} \begin{bmatrix}0.5\\ 0.4\end{bmatrix}\approx0.319

The standard error of the estimate is thus:

\sigma_e=\sigma_Y\sqrt{1-R^2}\approx 15\sqrt{1-0.319}\approx 12.38

The proportion of people with vocabulary = 120 and working memory capacity = 80 who score 90 or less is:

\Phi\left(\dfrac{e}{\sigma_e}\right)\approx\Phi\left(\dfrac{-12.9}{12.38}\right)\approx 0.15
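The full two-predictor calculation can also be sketched end to end in Python. This is not the spreadsheet itself, just an illustration under the same assumptions (all variables on the index-score metric, two predictors); the variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

mu_y, sigma_y = 100, 15
mu_x = [100, 100]            # means of vocabulary and working memory
x = [120, 80]                # observed predictor scores
r12 = 0.3                    # correlation between the predictors
r_xy = [0.5, 0.4]            # validity coefficients
observed = 90

# Standardized coefficients: beta = R_X^-1 R_XY (closed form for 2 predictors)
det = 1 - r12 ** 2
beta = [(r_xy[0] - r12 * r_xy[1]) / det,
        (r_xy[1] - r12 * r_xy[0]) / det]

# All variables share the same SD, so b equals beta here
b = beta
b0 = mu_y - sum(bi * mi for bi, mi in zip(b, mu_x))
predicted = b0 + sum(bi * xi for bi, xi in zip(b, x))

# Multiple R^2 and the standard error of the estimate
r_squared = sum(bi * ri for bi, ri in zip(beta, r_xy))
see = sigma_y * sqrt(1 - r_squared)

proportion = NormalDist().cdf((observed - predicted) / see)
print(f"predicted = {predicted:.1f}, R^2 = {r_squared:.3f}, "
      f"SEE = {see:.2f}, proportion = {proportion:.2f}")
```

Note that the standard error of the estimate uses 1 - R², not 1 minus the square of R².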

Here is a spreadsheet that automates these calculations.

Multiple Regression Spreadsheet

I explain how to use this spreadsheet in this YouTube video:

Predicted Achievement Using Simple Linear Regression

There are two ways to make an estimate of a person’s abilities. A point estimate (a single number) is precise but usually wrong, whereas an interval estimate (a range of numbers) is usually right but can be so wide that it is nearly useless. Confidence intervals combine both types of estimates in order to balance the weaknesses of one type of estimate with the strengths of the other. If I say that Suzie’s expected reading comprehension is 85 ± 11, the 85 is the point estimate (also known as the expected score or the predicted score or just Ŷ). The ± 11 is called the margin of error. If the confidence level is left unspecified, by convention we mean the 95% margin of error. If I add 11 and subtract 11 to get a range from 74 to 96, I have the respective lower and upper bounds of the 95% confidence interval.

Calculating the Predicted Achievement Score

I will assume that both the IQ and achievement scores are index scores (μ = 100, σ = 15) to make things simple. The predicted achievement score is a point estimate. It represents the best guess we can make in the absence of other information. The equation below is called a regression equation.

\hat{Y}=\sigma_Y r_{XY} \frac{X-\mu_X}{\sigma_X}+\mu_Y

If X is IQ, Y is Achievement, and both scores are index scores (μ = 100, σ = 15), the regression equation simplifies to:

Predicted achievement = (Correlation between IQ and Achievement) (IQ – 100) + 100
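As a small illustration, the simplified equation is a one-line function in Python. The function name and the example values (an IQ of 85 and a correlation of 0.6) are hypothetical, chosen only to show the arithmetic.

```python
def predicted_achievement(iq, r_xy, mean=100):
    """Predicted achievement when IQ and achievement are both index scores."""
    return r_xy * (iq - mean) + mean


# Hypothetical example: IQ of 85 and an assumed correlation of 0.6
print(round(predicted_achievement(85, 0.6), 1))
```

Because the correlation is less than 1, the predicted score (91 here) is always closer to the mean than the IQ score is.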

Calculating the Confidence Interval for the Predicted Achievement Score

Whenever you make a prediction using regression, your estimate is not exactly right very often. It is expected to differ from the actual achievement score by a certain amount (on average). This amount is called the standard error of the estimate. It is the standard deviation of all the prediction errors. Thus, it is the standard to which all the errors in your estimates are compared. When both scores are index scores, the formula is

\text{Standard error of the estimate}=15\sqrt{1-r^2_{XY}}

To calculate the margin of error, multiply the standard error of the estimate by the z-score that corresponds to the degree of confidence desired. In Microsoft Excel the formula for the z-score corresponding to the 95% confidence interval is

=NORMSINV(1-(1-0.95)/2)

For the 95% confidence interval, multiply the standard error of the estimate by 1.96. The 95% confidence interval’s formula is

95% Confidence Interval = Predicted achievement ± 1.96 * Standard error of the estimate
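Here is a sketch of the interval calculation in Python. It takes the standard error of the estimate as σ_Y√(1 − r²), with σ_Y = 15 for index scores. The function name and the example values (IQ of 85, correlation of 0.6) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def achievement_interval(iq, r_xy, confidence=0.95, mean=100, sd=15):
    """Interval around predicted achievement when both scores are index scores."""
    predicted = r_xy * (iq - mean) + mean
    see = sd * sqrt(1 - r_xy ** 2)                        # standard error of the estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)    # about 1.96 for 95%
    margin = z * see                                      # margin of error
    return predicted - margin, predicted + margin


# Hypothetical child: IQ of 85, assumed IQ-achievement correlation of 0.6
low, high = achievement_interval(85, 0.6)
print(f"{low:.1f} to {high:.1f}")
```

With these inputs the predicted score is 91 and the margin of error is about 23.5, so the interval runs from roughly 67.5 to 114.5.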

This interval is expected to contain the achievement scores of about 95% of people with the same IQ as the child. About 2.5% will score below the interval and about 2.5% will score above it.

You can use Excel to estimate how unusual it is for an observed achievement score to differ from a predicted achievement score in a particular direction by using this formula,

=NORMSDIST(-1*ABS(Observed-Predicted)/(Standard error of the estimate))
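The same calculation can be written in Python with the standard library, for anyone working outside Excel. The function name and the example values are hypothetical.

```python
from statistics import NormalDist


def tail_proportion(observed, predicted, see):
    """Proportion expected to deviate at least this far in the same direction."""
    return NormalDist().cdf(-abs(observed - predicted) / see)


# Hypothetical values: observed 70, predicted 91, standard error of the estimate 12
print(round(tail_proportion(70, 91, 12), 3))
```

Taking the absolute value first means the function returns the same one-tailed proportion whether the observed score is above or below the prediction.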

If a child’s observed achievement score is unusually low, it does not automatically mean that the child has a learning disorder. Many other things need to be checked before that diagnosis can be considered valid. However, it does mean that an explanation for the unusually low achievement score should be sought.

This post is an excerpt from:

Schneider, W. J. (2013). Principles of assessment of aptitude and achievement. In D. Saklofske, C. Reynolds, & V. Schwean (Eds.), Oxford handbook of psychological assessment of children and adolescents (pp. 286–330). New York: Oxford

Simple Regression Tutorial Using Prezi

I have been experimenting with Prezi, an alternative to PowerPoint. I think that Prezi is a very interesting tool. I made a tutorial about simple regression for my introductory statistics class. I recommend clicking “More” and then “Fullscreen” to get the full effect.
Regression: Predicting the Future without ESP on Prezi