CHC Theory, Cognitive Assessment

Intelligence and the Modern World of Work: A Special Issue of Human Resource Management Review

Charles Scherbaum and Harold Goldstein took an innovative approach to editing a special issue of Human Resource Management Review. They asked prominent I/O psychologists to collaborate with scholars from other disciplines to explore how advances in intelligence research might be incorporated into our understanding of the role of intelligence in the workplace.

It was an honor to be invited to participate, and it was a pleasure to be paired to work with Daniel Newman of the University of Illinois at Urbana/Champaign. Together we wrote an I/O psychology-friendly introduction to current psychometric theories of cognitive abilities, emphasizing Kevin McGrew‘s CHC theory. Before that could be done, we had to articulate compelling reasons I/O psychologists should care about assessing multiple cognitive abilities. This was a harder sell than I had anticipated.

Formal cognitive testing is not a part of most hiring decisions, though I imagine that employers typically have at least a vague sense of how bright job applicants are. When the hiring process does include formal cognitive testing, typically only general ability tests are used. Robust relationships between various aspects of job performance and general ability test scores have been established.

In comparison, the idea that multiple abilities should be measured and used in personnel selection decisions has not fared well in the marketplace of ideas. To explain this, there is no need to appeal to some conspiracy of test developers. I’m sure that they would love to develop and sell large, expensive, and complex test batteries to businesses. There is also no need to suppose that I/O psychology is peculiarly infected with a particularly virulent strain of g zealotry and that proponents of multiple ability theories have been unfairly excluded.

To the contrary, specific ability assessment has been given quite a bit of attention in the I/O psychology literature, mostly from researchers sympathetic to the idea of going beyond the assessment of general ability.  Dozens (if not hundreds) of high-quality studies were conducted to test whether using specific ability measures added useful information beyond general ability measures. In general, specific ability measures provide only modest amounts of additional information beyond what can be had from general ability scores (ΔR2 ≈ 0.02–0.06). In most cases, this incremental validity was not large enough to justify the added time, effort, and expense needed to measure multiple specific abilities. Thus it makes sense that relatively short measures of general ability have been preferred to longer, more complex measures of multiple abilities.

However, there are several reasons that the omission of specific ability tests in hiring decisions should be reexamined:

  • Since the time that those high quality studies were conducted, multidimensional theories of intelligence have advanced, and we have a better sense of which specific abilities might be important for specific tasks (e.g., working memory capacity for air traffic controllers). The tests measuring these specific abilities have also improved considerably.
  • With computerized administration, scoring, and interpretation, the cost of assessment and interpretation of multiple abilities is potentially far lower than it was in the past. Organizations that make use of the admittedly modest incremental validity of specific ability assessments would likely have a small but substantial advantage over organizations that do not. Over the long run, small advantages often accumulate into large advantages.
  • Measurement of specific abilities opens up degrees of freedom in balancing the need to maintain the predictive validity of cognitive ability assessments and the need to reduce the adverse impact on applicants from disadvantaged minority groups that can occur when using such assessments. Thus, organizations can benefit from using cognitive ability assessments in hiring decisions without sacrificing the benefits of diversity.

The publishers of Human Resource Management Review have made our paper available to download for free until January 25th, 2015.

Broad Abilities in CHC Theory

Broad Abilities in CHC Theory

Cognitive Assessment

John Willis’ Comments on Reports newsletter makes me happy.

Whenever I find that John Willis has posted a new edition of his Comments on Reports newsletter, I read it greedily and gleefully. Each newsletter is filled with sharp-witted observations, apt quotations, and practical wisdom about writing better psychological evaluation reports.

Recent gems:

From #251

The first caveat of writing reports is that readers will strive mightily to attach significant meaning to anything we write in the report. The second caveat is that readers will focus particularly on statements and numbers that are unimportant, potentially misleading, or — whenever possible — both. This is the voice of bitter experience.

Also from #251

Planning is so important that people are beginning to indulge in “preplanning,” which I suppose is better than “postplanning” after the fact. One activity we often do not plan is evaluations.

From #207:

I still recall one principal telling the entire team that, if he could not trust the spelling in my report, he could not trust any of the information in it. This happened recently (about 1975), so it is fresh in my mind. Names of tests are important to spell correctly. Alan and Nadeen Kaufman spell their last name with a single f and only one n. David Wechsler spelled his name as shown, never as Weschler. The American version of the Binet-Simon scale was developed at Stanford University, not Standford. I have to keep looking it up, but it is Differential Ability Scales even though it is a scale for several abilities. Richard Woodcock may, for all I know, have attended the concert, but his name is not Woodstock.

Cognitive Assessment

Three inspired presentations at the Richard Woodcock Institute Event

On 10/24/2014, I attended the Richard Woodcock Institute Event hosted at University of Texas at Austin. The three speakers had strikingly different presentation styles but were equally excellent.

Dick Woodcock gave the opening remarks. I loved hearing about the twists and turns of his career and how he made the most of unplanned opportunities. It was rather remarkable how diverse his contributions are (including an electronic Braille typewriter). Then he stressed the importance of communicating test results in ways that non-specialists can understand. He speculated on what psychological testing will look like in the future, focusing on integrative software that will guide test selection and interpretation in more sophisticated ways than has hitherto been possible. Given that he has been creating the future of assessment for decades now, I am betting that he is likely to be right. Later he graciously answered my questions about the WJ77 and how he came up with what I consider to be among the most ingenious test paradigms we have.

After a short break, Kevin McGrew gave us a wild ride of a talk about advances in CHC Theory. Actually it was more like a romp than a ride. I tried to keep track of all the interesting ideas for future research he presented but there were so many I quickly lost count. The visuals were stunning and his energy was infectious. He offered a quick overview of new research from diverse fields about the overlooked importance of auditory processing (beyond the usual focus on phonemic awareness). Later he talked about his evolving conceptualization of the memory factors in CHC theory and role of complexity in psychometric tests. My favorite part of the talk was a masterful presentation of information processing theory, judiciously supplemented with very clever animations.

After lunch, Cathy Fiorello gave one of the most thoughtful presentations I have ever heard. Instead of contrasting nomothetic and idiographic approaches to psychological assessment, Cathy stressed their integration. Most of the time, nomothetic interpretations are good first approximations and often are sufficient. However, there are certain test behaviors and other indicators that a more nuanced interpretation of the underlying processes of performance is warranted. Cathy asserted (and I agree) that well trained and highly experienced practitioners can get very good at spotting unusual patterns of test performance that completely alter our interpretations of test scores. She called on her fellow scholars to develop and refine methods of assessing these patterns so that practitioners do not require many years of experience to develop their expertise. She was not merely balanced in her remarks—lip service to a sort of bland pluralism is an easy and rather lazy trick to seem wise. Instead, she offered fresh insight and nuance in her balanced and integrative approach to cognitive and neuropsychological assessment. That is, she did the hard work of offering clear guidelines of how to integrate nomothetic and idiographic methods, all the while frankly acknowledging the limits of what can be known.

CHC Theory, Cognitive Assessment

Exploratory model of cognitive predictors of academic skills that I presented at APA 2014

I have many reservations about this model of cognitive predictors of academic abilities that I presented at APA today (along with co-presenters Lee Affrunti, Renée Tobin, and Kimberley Collins) but I think that it illustrates an important point: prediction and explanation of cognitive and academic abilities is so complex that it is impossible to do in one’s head. Eyeballing scores and making pronouncements is not likely to be accurate and will result in misinterpretations. We need good software that can manage the complex calculations for us. We can still think creatively in the diagnostic process but the creativity must be grounded in realistic probabilities.

The images from the poster are from a single exploratory model based on a clinical sample of 865 college students. The model was so big and complex I had to split the path diagram into two images:

Exploratory Model of WAIS and WJ III cognitive subtests

Exploratory Model of WAIS and WJ III cognitive subtests. Gc = Comprehension/Knowledge, Ga = Auditory processing, Gv = Visual processing, Gl = Long-term memory: Learning, Gr = Long-term memory: Retrieval speed, Gs = Processing speed, MS = Memory span, Gwm = Working memory capacity, g = Anyone’s guess

Exploratory model of cognitive predictors of WJ III academic subtests

Exploratory model of cognitive predictors of WJ III academic subtests. Percentages in error terms represent unexplained variance.

Cognitive Assessment

Cognitive profiles are rarely flat.

Because cognitive abilities are positively correlated, there is an assumption that cognitive abilities should be evenly developed. When psychologists examine cognitive profiles, they often describe any features that deviate from the expected flat profile.

It is true, mathematically, that the expected profile IS flat. However, this does not mean that flat profiles are common. There is a very large set of possible profiles and only a tiny fraction are perfectly flat. Profiles that are nearly flat are not particularly common, either. Variability is the norm.

Sometimes it helps to get a sense of just how uneven cognitive profiles typically are. That is, it is good to fine-tune our intuitions about the typical profile with many exemplars. Otherwise it is easy to convince ourselves that the reason that we see so many interesting profiles is that we only assess people with interesting problems.

If we use the correlation matrix from the WAIS-IV to randomly simulate multivariate normal profiles, we can see that even in the general population, flat, “plain-vanilla” profiles are relatively rare. There are features that draw the eye in most profiles.

WAISIVProfilesIf cognitive abilities were uncorrelated, profiles would be much more uneven than they are. But even with moderately strong positive correlations, there is still room for quite a bit of within-person variability.

Let’s see what happens when we look at profiles that have the exact same Full Scale IQ (80, in this case). The conditional distributions of the remaining scores are seen in the “violin” plots. There is still considerable diversity of profile shape even though the Full Scale IQ is held constant.

WAISIVProfiles80Note that the supplemental subtests have wider conditional distributions because they are not included in the Full Scale IQ, not necessarily because they are less g-loaded.


Two visualizations for explaining “variance explained”

In my introductory statistics class, I feel uneasy when I have explain what variance explained means. The term has two things I don’t like. First, I don’t like variance very much. I feel much more comfortable with standard deviations. I understand that at a deep level variance is a more fundamental concept than the standard deviation. However, variance is a poor descriptive statistic because there is no direct visual analog for variance in a probability distribution plot. In contrast, the standard deviation illustrates very clearly how much scores typically deviate from the mean. So, variance explained is hard to grasp in part because variance is hard to visualize.

The second thing I don’t like about variance explained is the whole “explained” business. As I mentioned in my last post, variance explained does not actually mean that we have explained anything, at least in a causal sense. That is, it does not imply that we know what is going on. It simply means that we can use one or more variables to predict things more accurately than before.

In many models, if X is correlated with Y, X can be said to “explain” variance in Y even though X does not really cause Y. However, in some situations the term variance explained is accurate in every sense:

X causes Y

X causes Y

In the model above, the arrow means that X really is a partial cause of Y. Why does Y vary? Because of variability in X, at least in part. In this example, 80% of Y’s variance is due to X, with the remaining variance due to something else (somewhat misleadingly termed error). It is not an “error” in that something is wrong or that someone is making a mistake. It is merely that which causes our predictions of Y to be off. Prediction error is probably not a single variable. It it likely to be the sum total of many influences.

Because X and error are uncorrelated z-scores in this example, the path coefficients are equal to the correlations with Y. Squaring the correlation coefficients yields the variance explained. The coefficients for X and error are actually the square roots of .8 and .2, respectively. Squaring the coefficients tells us that X explains 80% of the variance in Y and error explains the rest.

Visualizing Variance Explained

Okay, if X predicts Y, then the variance explained is equal to the correlation coefficient squared. Unfortunately, this is merely a formula. It does not help us understand what it means. Perhaps this visualization will help:

Variance Explained

Variance Explained

If you need to guess every value of Y but you know nothing about Y except that it has a mean of zero, then you should guess zero every time. You’ll be wrong most of the time, but pursuing other strategies will result in even larger errors. The variance of your prediction errors will be equal to the variance of Y. In the picture above, this corresponds to a regression line that passes through the mean of Y and has a slope of zero. No matter what X is, you guess that Y is zero. The squared vertical distance from Y to the line is represented by the translucent squares. The average area of the squares is the variance of Y.

If you happen to know the value of X each time you need to guess what Y will be, then you can use a regression equation to make a better guess. Your prediction of Y is called Y-hat (Ŷ):

\hat{Y}=b_0+b_1X=0+\sqrt{0.80}X\approx 0.89X

When X and Y have the same variance, the slope of the regression line is equal to the correlation coefficient, 0.89. The distance from Ŷ (the predicted value of Y) to the actual value of Y is the prediction error. In the picture above, the variance of the prediction errors (0.2) is the average size of the squares when the slope is equal to the correlation coefficient.

Thus, when X is not used to predict Y, our prediction errors have a variance of 1. When we do use X to predict Y, the average size of the prediction errors shrinks from 1 to 0.2, an 80% reduction. This is what is meant when we say that “X explains 80% of the variance in Y.” It is the proportion by which the variance of the prediction errors shrinks.

An alternate visualization

Suppose that we flip 50 coins and record how many heads there are. We do this over and over. The values we record constitute the variable Y. The number of heads we get each time we flip a coin happens to have a binomial distribution. The mean of a binomial distribution is determined by the probability p of an event occurring on a single trial (i.e., getting a head on a single toss) and the number of events k (i.e., the number of coins thrown). As k increases, the binomial distribution begins to resemble the normal distribution. The probability p of getting a head on any one coin toss is 0.5 and the number of coins k is 50. The mean number of heads over the long run is:

\mu = pk=0.5*50=25

The variance of the binomial distribution:

\sigma^2 = p(1-p)k=0.5*(1-0.5)*50=12.5

Before we toss the coins, we should guess that we will toss an average number of heads, 25. We will be wrong much of the time but our prediction errors will be as small as they can be, over the long run. The variance of our prediction errors is equal to the variance of Y, 12.5.

Now suppose that after tossing 80% of our coins (i.e., 40 coins), we count the number of heads. This value is recorded as variable X. The remaining 20% of the coins (10 coins) are then tossed and the total number of heads is counted from all 50 coins. We can use a regression equation to predict Y from X. The intercept will be the mean number of heads from the remaining 10 coins:

\hat{Y} = b_0+b_1X=5+X

In the diagram below, each peg represents a coin toss. If the outcome is heads, the dot moves right. If the outcome is tails, the dot moves left. The purple line represents the probability distribution of Y before any coin has been tossed.

X explains variance in Y

X explains 80% of the variance in Y.

When the dot gets to the red line (after 40 tosses or 80% of the total), we can make a new guess as to what Y is going to be. This conditional distribution is represented by a blue line. The variance of the conditional distribution has a mean equal to Ŷ, with a variance of 2.5 (the variance of the 10 remaining coins).

The variability in Y is caused by the outcomes of 50 coin tosses. If 80% of those coins are the variable X, then X explains 80% of the variance in Y. The remaining 10 coins represent the variability of Y that is not determined by X (i.e., the error term). They determine 20% of the variance in Y.

If X represented only the first 20 of 50 coins, then X would explain 40% of the variance in Y.


X explains 40% of the variance in Y.

X explains 40% of the variance in Y.