In this tutorial I discuss and present visual representations of covariance. Although covariance is not directly informative, it is a fundamental ingredient in almost all of the most frequently used statistical procedures.

Correlation? I get it. I have a gut-level sense of what it is. Covariance? Somehow it just eludes me. I mean, I know the formulas and I can give you a conceptual definition of it, but its meaning never really sank in.

One thing about covariance that always seemed counter-intuitive to me is that covariance between two variables of unequal variance can sometimes be larger than the variance of the variable with less variance. For example, if X has a variance of 9, Y has a variance of 64 and the correlation between X and Y is 0.5, the covariance between X and Y is 12. How can X have a larger covariance with Y than its own variance (i.e., its covariance with itself)? Never made sense to me.
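Here is a minimal Python sketch of the arithmetic in that example. It assumes the standard identity cov(X, Y) = r · sd(X) · sd(Y); the function name `covariance_from_correlation` is only an illustration, not part of any library.

```python
import math

def covariance_from_correlation(var_x, var_y, r):
    """Covariance implied by two variances and a correlation:
    cov(X, Y) = r * sd(X) * sd(Y)."""
    return r * math.sqrt(var_x) * math.sqrt(var_y)

# The example from the text: var(X) = 9, var(Y) = 64, r = 0.5
cov_xy = covariance_from_correlation(9, 64, 0.5)

# The largest covariance these two variances allow (r = 1)
max_cov = covariance_from_correlation(9, 64, 1.0)

print(cov_xy)   # 12.0
print(max_cov)  # 24.0
```

The ceiling on covariance is sd(X) · sd(Y), which is 3 · 8 = 24 here. That ceiling sits between the two variances, so nothing stops the covariance from exceeding the smaller variance of 9.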

I started using Mathematica recently and decided to make an interactive visualization tool that shows how large the covariance between two variables is. Click here (you might be prompted to download the Wolfram CDF Player plugin for your browser, and it may take a while to load). Play with the sliders at the bottom.

The area of the blue square is equal to the variance of X. The area of the red square is equal to the variance of Y. The pink rectangle (which is partially occluded by the purple rectangle) is how large the covariance could be if X and Y were perfectly correlated. The area of the purple rectangle is equal to the covariance between X and Y. The ratio of the area of the purple rectangle to the area of the pink rectangle is equal to the correlation between X and Y.
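The areas described above can be sketched in a few lines of Python. This is only an illustration of the geometry, not the Mathematica code behind the tool: the function name and dictionary keys are made up, and it assumes the squares have sides equal to the standard deviations of X and Y.

```python
import math

def covariance_areas(var_x, var_y, r):
    """Areas of the shapes in the visualization (hypothetical helper)."""
    sd_x = math.sqrt(var_x)  # side of the blue square
    sd_y = math.sqrt(var_y)  # side of the red square
    pink = sd_x * sd_y       # covariance if X and Y were perfectly correlated
    purple = r * pink        # actual covariance between X and Y
    return {
        "blue": sd_x ** 2,   # variance of X
        "red": sd_y ** 2,    # variance of Y
        "pink": pink,
        "purple": purple,
    }

areas = covariance_areas(9, 64, 0.5)
print(areas["purple"])                  # 12.0
print(areas["purple"] / areas["pink"])  # 0.5, the correlation
```

With variances of 9 and 64, the pink rectangle is 3 by 8, so a correlation of 0.5 fills half of it.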

I’m not sure why, but this visualization has made me feel better about covariance. It’s like we’re friends now. 😉