Interaction Effects
R Tutorials for Applied Statistics
In this tutorial, we expand our linear regression framework to include interaction effects. An interaction effect is when the combination of two variables has a different effect on the outcome variable than just the sum of the impact from each variable in isolation.
We will illustrate the meaning and use of interaction effects with an example.
1 Example: Factors Affecting Monthly Earnings
Let us examine a data set that explores the relationship between total monthly earnings (MonthlyEarnings) and a number of factors that may influence them, including each person’s IQ (IQ), a measure of knowledge of their job (Knowledge), years of education (YearsEdu), years of experience (YearsExperience), and years at current job (Tenure).
The code below downloads data on the above variables from 1980 for 663 individuals and assigns it to a dataframe called df.
The following call to lm() estimates a multiple regression predicting monthly earnings based on the five explanatory variables given above. The next call to summary() displays some summary statistics for the estimated regression.
# Multiple regression with main effects only (no interaction term)
lmwages <- lm(MonthlyEarnings ~
                IQ + Knowledge + YearsEdu + YearsExperience + Tenure,
              data = df)
summary(lmwages)
##
## Call:
## lm(formula = MonthlyEarnings ~ IQ + Knowledge + YearsEdu + YearsExperience +
## Tenure, data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -849.51 -244.91  -41.28  191.41 2225.88
##
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)     -512.540    139.180  -3.683 0.000250 ***
## IQ                 3.889      1.204   3.230 0.001299 **
## Knowledge          8.048      2.246   3.582 0.000366 ***
## YearsEdu          46.308      8.833   5.243 2.13e-07 ***
## YearsExperience   13.445      4.064   3.309 0.000989 ***
## Tenure             3.392      3.016   1.124 0.261262
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 369.6 on 657 degrees of freedom
## Multiple R-squared: 0.1796, Adjusted R-squared: 0.1733
## F-statistic: 28.76 on 5 and 657 DF, p-value: < 2.2e-16
Notice that the regression predicts that each additional year of experience raises monthly earnings by $13.45, holding the other variables fixed. The return to an additional year of education is $46.31.
Is it possible that the return to additional years of experience is larger for people with higher levels of education? Could this be another benefit of higher education that is not yet in our model? People with more education may benefit not only from higher base pay, but also from larger pay increases for each additional year of experience in the kinds of jobs they hold.
We introduce an interaction effect between experience and education to allow for this effect and estimate its significance. We introduce a new term into the regression that is equal to the YearsEdu variable multiplied by the YearsExperience variable. The following call to lm() estimates such a model:
# Same regression plus the education-by-experience interaction term
lmwages <- lm(MonthlyEarnings ~
                IQ + Knowledge + YearsEdu + YearsExperience + Tenure +
                YearsEdu:YearsExperience,
              data = df)
summary(lmwages)
##
## Call:
## lm(formula = MonthlyEarnings ~ IQ + Knowledge + YearsEdu + YearsExperience +
## Tenure + YearsEdu:YearsExperience, data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -863.30 -241.41  -41.33  190.62 2213.41
##
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                -1.020    284.112  -0.004 0.997135
## IQ                          3.800      1.202   3.162 0.001636 **
## Knowledge                   7.859      2.243   3.504 0.000489 ***
## YearsEdu                    9.403     19.937   0.472 0.637325
## YearsExperience           -33.477     23.097  -1.449 0.147699
## Tenure                      4.235      3.037   1.395 0.163607
## YearsEdu:YearsExperience    3.547      1.719   2.064 0.039450 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 368.7 on 656 degrees of freedom
## Multiple R-squared: 0.1849, Adjusted R-squared: 0.1774
## F-statistic: 24.8 on 6 and 656 DF, p-value: < 2.2e-16
The coefficient on the interaction term is equal to \(3.547\) and it is statistically significant at the 5% level (p-value = \(0.039\)). We found sufficient statistical evidence that experience and education have an interaction effect.
The coefficient is positive, which implies that higher levels of education lead to a higher return to an additional year of experience. Equivalently, higher levels of experience lead to a higher return to an additional year of education.
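The interaction term can be written in a few equivalent ways in R's formula syntax. The sketch below illustrates this on simulated data, since the tutorial's data set is not reproduced here; the data frame sim, its made-up values, and the helper column EduTimesExp are assumptions introduced only for this illustration.

```r
# Simulated data only: variable names mirror the tutorial, values are made up
set.seed(1)
n <- 200
sim <- data.frame(
  YearsEdu = sample(9:18, n, replace = TRUE),
  YearsExperience = sample(1:25, n, replace = TRUE)
)
sim$MonthlyEarnings <- 200 + 30 * sim$YearsEdu + 5 * sim$YearsExperience +
  2 * sim$YearsEdu * sim$YearsExperience + rnorm(n, sd = 100)

# Main effects plus an explicit interaction term, as in the tutorial
fit_colon <- lm(MonthlyEarnings ~ YearsEdu + YearsExperience +
                  YearsEdu:YearsExperience, data = sim)

# The * operator expands to both main effects plus their interaction
fit_star <- lm(MonthlyEarnings ~ YearsEdu * YearsExperience, data = sim)

# Multiplying the two columns by hand gives the same regressor
sim$EduTimesExp <- sim$YearsEdu * sim$YearsExperience
fit_manual <- lm(MonthlyEarnings ~ YearsEdu + YearsExperience + EduTimesExp,
                 data = sim)

# Compare coefficient values (names differ for the hand-built column)
all.equal(unname(coef(fit_colon)), unname(coef(fit_star)))
all.equal(unname(coef(fit_colon)), unname(coef(fit_manual)))
```

All three fits estimate the same model; the a:b form simply makes the interaction term visible as its own line in the summary() output.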
2 Marginal Effects
What is the impact on monthly earnings of an additional year of experience? This is called the marginal effect of a year of experience.
In the multiple regression with no interaction effects at the opening of this tutorial, the marginal effect is exactly the same as the estimate of the coefficient on experience.
With interaction effects, you can no longer answer the question by looking at only one coefficient. Taken alone, the coefficient on YearsExperience suggests that monthly earnings decrease by $33.48 when experience increases by one year, but this describes only a hypothetical person with zero years of education. That is not the end of the story.
The coefficient on the interaction effect says that each year of education the person has obtained raises the return to an additional year of experience by $3.55. The marginal effect is therefore different for each person, depending on that person’s level of education.
The marginal effect for experience is given by,
\[ me = b_{Exp} + b_{Exp*Edu} x_{Edu} \]
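A brief sketch of where this formula comes from, writing out only the terms of the fitted model that involve education and experience (the symbol \(\hat{y}\) for predicted monthly earnings is notation introduced here for illustration):
\[ \hat{y} = \dots + b_{Edu} x_{Edu} + b_{Exp} x_{Exp} + b_{Exp*Edu} x_{Edu} x_{Exp} \]
Holding everything else fixed, the change in \(\hat{y}\) from one more year of experience is the derivative with respect to \(x_{Exp}\):
\[ \frac{\partial \hat{y}}{\partial x_{Exp}} = b_{Exp} + b_{Exp*Edu} x_{Edu} \]
By the same argument, the marginal effect of a year of education depends on experience:
\[ \frac{\partial \hat{y}}{\partial x_{Edu}} = b_{Edu} + b_{Exp*Edu} x_{Exp} \]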
For someone with 16 years of education (most likely a four-year college degree), the marginal effect for experience is equal to:
\[ me = -33.48 + 3.55 (16) = \$23.32 \]
The marginal effect on monthly earnings of an additional year of experience for a four-year college graduate is $23.32.
Let us examine the marginal effect for someone with 12 years of education (most likely someone with a high school degree and no college):
\[ me = -33.48 + 3.55 (12) = \$9.12 \]
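The two calculations above can be reproduced for several education levels at once. The following is a minimal sketch that copies the rounded coefficients from the regression output rather than reading them from the fitted model; the function name me_experience and the chosen education levels are hypothetical, introduced only for this illustration.

```r
# Coefficients copied from the regression output above (three decimals)
b_exp <- -33.477      # coefficient on YearsExperience
b_exp_edu <- 3.547    # coefficient on YearsEdu:YearsExperience

# Hypothetical helper: marginal effect of one more year of experience
# for a person with the given number of years of education
me_experience <- function(years_edu) {
  b_exp + b_exp_edu * years_edu
}

# Evaluate at a few illustrative education levels
edu_levels <- c(12, 14, 16, 18)
data.frame(YearsEdu = edu_levels,
           MarginalEffect = me_experience(edu_levels))
```

Because this uses three-decimal coefficients instead of the two-decimal values in the hand calculations, the results for 12 and 16 years (roughly $9.1 and $23.3) differ from those figures by a few cents.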
To compute the average marginal effect, we can substitute the mean value for years of education into the formula above:
\[ \bar{me} = b_{Exp} + b_{Exp*Edu} \bar{x}_{Edu} \]
Let us compute this in R:
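# Named vector of estimated coefficients from the interaction model
b <- coef(lmwages)
# Marginal effect of experience evaluated at the sample mean of education
me_exp <- b["YearsExperience"] + b["YearsEdu:YearsExperience"] * mean(df$YearsEdu)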
me_exp
## YearsExperience
## 15.05299
The average return to an additional year of experience, evaluated at the mean level of education, is equal to $15.05.
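Because the marginal effect is linear in years of education, plugging in mean education gives the same number as averaging each person's own marginal effect. The sketch below checks this on simulated education values (the vector years_edu is made up and is not the tutorial's data), again using the rounded coefficients from the regression output.

```r
# Simulated education values only; not the tutorial's data
set.seed(42)
years_edu <- sample(9:18, 500, replace = TRUE)

# Coefficients copied from the regression output above (three decimals)
b_exp <- -33.477      # coefficient on YearsExperience
b_exp_edu <- 3.547    # coefficient on YearsEdu:YearsExperience

# Marginal effect of experience for each simulated person
me_each <- b_exp + b_exp_edu * years_edu

# Averaging the individual marginal effects ...
mean(me_each)
# ... matches the formula evaluated at mean education
b_exp + b_exp_edu * mean(years_edu)
```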