This document contains 3-D graphs. You can rotate by dragging and you can zoom with a mouse wheel or two-finger scrolling.

Suppose that we want to know about the effect of x on y and that we observe that x and y are correlated. It could be that x causes y, that y causes x, or that there is some unknown factor that explains the relationship. For example, there could be a variable (call it z) that causes both x and y. The idea of multiple regression or “controls” in a regression is that if we can observe z, then we can see if x and y are still correlated for subsets of data with equal values of z. If x and y are still related even when z does not change, then the relationship between them is not just a result of z.

We will create 1000 observations with values of three variables for each observation. By construction, z is a standard normal variable, and x and y are each caused by z but do not cause each other.

n = 1000
z = rnorm(n)
x = z + .5*rnorm(n)
y = z + .5*rnorm(n)
As seen in Figure 1, because x and y are each related to z, they are correlated even though there is no causal relationship between them.
library(ggplot2)  # provides ggplot(), aes(), and geom_point()
ggplot(data=NULL, aes(x=x,y=y)) + geom_point(alpha=.3)
Figure 1: x and y are correlated
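As a numerical check on Figure 1, the construction above implies a population correlation of Cov(x, y) / sqrt(Var(x) Var(y)) = 1 / 1.25 = 0.8: x and y share z, which has variance 1, and each adds independent noise with variance 0.25. The sketch below regenerates data of the same form under separate names so that it does not overwrite x, y, and z. The seed and the names ending in `Chk` are assumptions added for illustration and are not part of the original simulation.

```r
# Sketch: compare the sample correlation with the value implied by the construction.
set.seed(1)                      # arbitrary seed (an assumption), only for reproducibility
nChk = 1000
zChk = rnorm(nChk)               # common cause
xChk = zChk + .5*rnorm(nChk)     # caused by z only
yChk = zChk + .5*rnorm(nChk)     # caused by z only

impliedCor = 1 / (1 + .5^2)      # Cov(x,y) = Var(z) = 1; Var(x) = Var(y) = 1.25
sampleCor  = cor(xChk, yChk)     # should land near the implied value

print(c(implied = impliedCor, sample = sampleCor))
```

The sample value will differ slightly from 0.8 because of sampling noise, and it will change if the seed changes.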
We start exploring what it means to control for z by plotting the data in three dimensions, as in Figure 2. It is hard to tell just by looking at the graph, but x and y are related only because each increases as z increases. You can rotate the plot by dragging, and you can zoom in and out with a mouse wheel or two-finger scrolling.
library(rgl)  # provides plot3d() and points3d()
plot3d(x, y, z, mgp=c(0,1,2))

Figure 2: 3-D plot of x, y, and z. You can rotate the plot by dragging.
Let’s select a subsample of the data with values of z between -0.1 and 0.1 (this approximates holding z constant at 0) and plot those points in a different color:
zLB = -.1
zUB = 0.1
# Logical index of the observations whose z lies in [zLB, zUB]
inBand = z >= zLB & z <= zUB
zSubset = z[inBand]
xSubset = x[inBand]
ySubset = y[inBand]

palette2 = c("#E69F00", "#56B4E9")
plot3d(x, y, z, col=palette2[2], size=1.5)
points3d(xSubset, ySubset, zSubset, col=palette2[1], size=3)
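The highlighted band can also be summarized numerically. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it regenerates data of the same form under separate names (the seed and the names ending in `Chk` are assumptions), then compares the correlation of x and y across all observations with the correlation inside the band where z is close to 0. It also fits two regressions with `lm()`, one without z and one with z as a control, to show the same idea in regression form.

```r
# Sketch: what "holding z roughly constant" does to the x-y relationship.
set.seed(1)                          # arbitrary seed (an assumption)
nChk = 1000
zChk = rnorm(nChk)                   # common cause
xChk = zChk + .5*rnorm(nChk)
yChk = zChk + .5*rnorm(nChk)

inBandChk = zChk >= -.1 & zChk <= .1 # same band as zLB and zUB above

# Share of points expected in the band for a standard normal z (roughly 8%)
expectedShare = pnorm(.1) - pnorm(-.1)
observedShare = mean(inBandChk)

corAll  = cor(xChk, yChk)                          # all observations
corBand = cor(xChk[inBandChk], yChk[inBandChk])    # only observations with z near 0

# Slope on x without a control, and with z included as a control
slopeNoControl   = coef(lm(yChk ~ xChk))["xChk"]
slopeWithControl = coef(lm(yChk ~ xChk + zChk))["xChk"]

print(round(c(expectedShare = expectedShare, observedShare = observedShare,
              corAll = corAll, corBand = corBand), 3))
print(round(c(slopeNoControl, slopeWithControl), 3))
```

Because x and y are built to depend only on z, the within-band correlation and the slope on x with z controlled should both be close to zero, while the full-sample correlation and the uncontrolled slope should be clearly positive. The band holds only a small share of the 1000 points, so the within-band correlation is a noisy estimate.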