When a person scores exactly 2 standard deviations below the mean on several tests, it is intuitive that the composite score that summarizes these scores should also be exactly 2 standard deviations below the mean. Our intuitions let us down here, because the composite score is actually more than 2 standard deviations below the mean. I attempt to make this “composite score extremity effect” a little more intuitive in an Assessment Service Bulletin for the Woodcock-Johnson IV.
Schneider, W. J. (2016). Why Are WJ IV Cluster Scores More Extreme Than the Average of Their Parts? A Gentle Explanation of the Composite Score Extremity Effect (Woodcock-Johnson IV Assessment Service Bulletin No. 7). Itasca, IL: Houghton Mifflin Harcourt.
I thank Mark Ledbetter for the invitation to write the paper and support in the writing process, Erica LaForte for patiently editing a complex first draft down to a much more readable version, and Kevin McGrew for additional thoughtful comments and suggestions for improvement on the first draft.
The bulk of the paper is not mathematical. However, the first draft had a few bells and whistles, like the animated graph below, which shows how the composite score extremity effect grows as the average correlation among the tests decreases and the number of tests in the composite increases.
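For readers who want to see the numbers behind that pattern, here is a minimal sketch. It assumes every test is standardized (mean 0, SD 1), that the composite is an equally weighted sum, and that all pairs of tests share the same average correlation r. The function name `composite_z` is my own for illustration and does not come from the bulletin.

```python
import math


def composite_z(z, k, r):
    """z-score of an equally weighted composite of k tests, each at z.

    Assumes standardized tests with a common average correlation r,
    so the variance of the sum is k + k(k - 1)r.
    """
    sum_of_scores = k * z
    sd_of_sum = math.sqrt(k + k * (k - 1) * r)
    return sum_of_scores / sd_of_sum


# Every test is exactly 2 SDs below the mean.
for r in (0.8, 0.5, 0.2):
    for k in (2, 4, 8):
        print(f"r = {r:.1f}, k = {k}: composite z = {composite_z(-2, k, r):.2f}")
```

Under these assumptions, two tests correlated at 0.5 give a composite of about −2.31, and the composite moves further from −2 as r falls or k rises. When r = 1 (or k = 1), the composite stays at exactly −2.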
Another plot that was originally animated shows what our best guess of a latent variable X is if we have two indicators, X1 and X2, that are both exactly 2 standard deviations below the mean. X1 and X2 correlate with each other at 0.64 and with X at 0.8. If we only know that X1 = −2, our best guess is that X is −1.60. If we know that both X1 and X2 are −2, our best guess is that X is −1.95. Thus, our estimate is lower with two scores (−1.95) than with one score (−1.60).
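The two estimates above can be reproduced with ordinary regression arithmetic. This is only a sketch, assuming all variables are standardized and that "best guess" means the linear regression prediction; the function names are hypothetical. With two indicators that correlate equally with X, the two regression weights are equal, and each is r_x / (1 + r_12).

```python
def best_guess_one(x1, r_x):
    """Predicted X from one standardized indicator that correlates r_x with X."""
    return r_x * x1


def best_guess_two(x1, x2, r_x, r_12):
    """Predicted X from two standardized indicators.

    Assumes both indicators correlate r_x with X and r_12 with each other,
    so each regression weight is r_x / (1 + r_12).
    """
    weight = r_x / (1 + r_12)
    return weight * (x1 + x2)


print(f"{best_guess_one(-2, 0.8):.2f}")            # -1.60
print(f"{best_guess_two(-2, -2, 0.8, 0.64):.2f}")  # -1.95
```

Each indicator gets a smaller weight when there are two of them (about 0.49 rather than 0.8), but the weights sum to about 0.98, which is why the second score pulls the estimate closer to −2.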