This document contains 3-D graphs. You can rotate by dragging and you can zoom with a mouse wheel or two-finger scrolling.
Suppose that we want to know about the effect of x on y and that we observe that x and y are correlated. It could be that x causes y, that y causes x, or that there is some unknown factor that explains the relationship. For example, there could be a variable (call it z) that causes both x and y. The idea of multiple regression or “controls” in a regression is that if we can observe z, then we can see if x and y are still correlated for subsets of data with equal values of z. If x and y are still related even when z does not change, then the relationship between them is not just a result of z.
We will create 1000 observations with values of three variables for each observation. By construction, z is a standard normal variable, and x and y are each caused by z but do not cause each other.
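n = 1000
z = rnorm(n)          # common cause: standard normal
x = z + .5*rnorm(n)   # x depends on z plus independent noise
y = z + .5*rnorm(n)   # y depends on z plus independent noise, not on x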
As seen in Figure 1, because x and y are each related to z, they are correlated even though there is no causal relationship between them.
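The size of this correlation can be checked numerically. The following is a minimal sketch, not the code behind Figure 1: it regenerates the data with an assumed seed (`set.seed(1)`, which is not in the original) and uses the illustrative names `r_xy`, `fit_simple`, and `slope_simple`. Under the construction above, the covariance of x and y is the variance of z (1), and x and y each have variance 1 + 0.25 = 1.25, so both the correlation and the slope from regressing y on x should be close to 1/1.25 = 0.8.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible

n <- 1000
z <- rnorm(n)
x <- z + .5 * rnorm(n)
y <- z + .5 * rnorm(n)

# Sample correlation between x and y
r_xy <- cor(x, y)

# Simple regression of y on x, ignoring z
fit_simple <- lm(y ~ x)
slope_simple <- unname(coef(fit_simple)["x"])

print(r_xy)
print(slope_simple)
```

A regression that ignores z therefore reports a sizable slope on x, even though x has no causal effect on y in this simulation.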
We start exploring controlling for z by plotting the data in three dimensions, as in Figure 2. It is hard to tell just by looking at the graph, but x and y are only related because each increases as z increases. You can rotate the plot by dragging, and you can zoom in and out with a mouse wheel or with two-finger scrolling.
Let’s select a subsample of the data with values of z between -0.1 and 0.1 (this approximates holding z constant at 0) and plot those points in a different color.
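The selection step might look like the sketch below. It regenerates the data with an assumed seed, and the names `near_zero`, `point_color`, and `n_sub` are illustrative rather than taken from the original. It only builds the logical index and a color vector; how that vector is passed to the 3-D plotting function depends on the plotting library, which the text does not name.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible

n <- 1000
z <- rnorm(n)
x <- z + .5 * rnorm(n)
y <- z + .5 * rnorm(n)

# TRUE for observations whose z lies between -0.1 and 0.1
near_zero <- abs(z) < 0.1

# One color per observation: gold inside the band, gray elsewhere
point_color <- ifelse(near_zero, "gold", "gray")

# Number of observations in the subsample
n_sub <- sum(near_zero)
print(n_sub)
```

Because z is standard normal, roughly 8 percent of observations fall in this band, so the subsample should contain on the order of 80 of the 1000 points.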
Rotate the top of the graph toward you so that the z axis is coming straight out toward you. You should have the x axis going to the right and the y axis pointing up. You should see that within the golden points there is no relationship between x and y. Figure 4 shows only these points. What we learn from the graph is that conditional on z (holding z constant near 0), there is no relationship between x and y.
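The same conclusion can be checked numerically in two ways: by computing the correlation within the subsample, and by adding z as a control in a regression on the full sample. This is a sketch under the same assumptions as the simulation above, with an assumed seed and the illustrative names `r_sub`, `fit_control`, and `b`.

```r
set.seed(1)  # assumed seed, only so that the sketch is reproducible

n <- 1000
z <- rnorm(n)
x <- z + .5 * rnorm(n)
y <- z + .5 * rnorm(n)

# Correlation of x and y among observations with z near 0
near_zero <- abs(z) < 0.1
r_sub <- cor(x[near_zero], y[near_zero])

# Multiple regression: effect of x on y, controlling for z
fit_control <- lm(y ~ x + z)
b <- coef(fit_control)

print(r_sub)
print(b)
```

With z held approximately constant, the correlation between x and y should be close to zero, although it is noisy because the subsample is small. In the regression that controls for z, the coefficient on x should be close to 0 and the coefficient on z close to 1, matching how the data were constructed.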