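library(sandwich)  # vcovHC()
library(lmtest)    # coeftest(), waldtest()
lmsmoke <- lm(smoke ~ log(cigpric) + log(income) + educ + age + restaurn + white,
              data=smoke)
vv <- vcovHC(lmsmoke, type="HC1")  # heteroskedasticity-robust covariance matrix
coeftest(lmsmoke, vcov=vv)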
##
## t test of coefficients:
##
##                 Estimate  Std. Error t value  Pr(>|t|)
## (Intercept)   0.75242714  0.87042381  0.8644  0.387607
## log(cigpric) -0.06565084  0.20908613 -0.3140  0.753611
## log(income)   0.04152261  0.02480127  1.6742  0.094480 .
## educ         -0.02530213  0.00562607 -4.4973 7.900e-06 ***
## age          -0.00366676  0.00093711 -3.9129 9.897e-05 ***
## restaurn     -0.10277101  0.03855269 -2.6657  0.007837 **
## white        -0.01084839  0.05113556 -0.2121  0.832044
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##               Coefficient   2.5% Limit  97.5% Limit
## (Intercept)   0.752427141 -0.956157127  2.461011409
## log(cigpric) -0.065650835 -0.476073063  0.344771392
## log(income)   0.041522610 -0.007160642  0.090205861
## educ         -0.025302131 -0.036345733 -0.014258528
## age          -0.003666764 -0.005506242 -0.001827287
## restaurn     -0.102771015 -0.178447393 -0.027094637
## white        -0.010848387 -0.111224106  0.089527333
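The code that produced the interval table is not shown. As a minimal sketch, the limits appear to be the usual estimate plus or minus a t critical value times the robust standard error. The residual degrees of freedom used here (800) are an assumption, inferred from the 807 observations implied by the Wald test output later in the document minus the 7 coefficients in this model. The example uses the restaurn row, with the estimate and standard error copied from the coeftest() output.

```r
# Estimate and HC1 robust standard error for restaurn, copied from the output above.
est <- -0.10277101
se  <- 0.03855269

# Assumed residual degrees of freedom: 807 observations - 7 coefficients.
df <- 807 - 7

# Two-sided 95% interval based on the t distribution.
crit <- qt(0.975, df)
ci <- est + c(-1, 1) * crit * se
round(ci, 6)
```

With the fitted objects available, `coefci(lmsmoke, vcov. = vv)` from the lmtest package should give the same limits for every coefficient, though its column labels differ from the table above.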
Problem 1
What impact does an increase in the cost of cigarettes by 1% have on the probability that someone smokes? Construct a 95% confidence interval.
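Based on the regression above, a 1% increase in cigarette price is associated with a decrease of about 0.066 percentage points in the probability that someone smokes (the coefficient on log(cigpric), -0.0657, divided by 100). The 95% confidence interval is wide and includes both decreases and increases in probability: the lower bound is a 0.48 percentage point decrease in the probability of smoking and the upper bound is a 0.34 percentage point increase, so the effect is not statistically distinguishable from zero.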
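For reference, the level-log arithmetic behind this interpretation can be written out as follows. Because cigarette price enters in logs while the outcome is a probability in levels, the coefficient is divided by 100 to get the change in probability per 1% change in price. The coefficient and interval limits are those reported in the tables above.

\[
\Delta\widehat{\text{smoke}} \approx \frac{\hat\beta_{\log(\text{cigpric})}}{100}\cdot \%\Delta\text{cigpric}
= \frac{-0.0657}{100}\cdot 1 \approx -0.00066
\]

\[
\text{95\% CI: } \left[\frac{-0.4761}{100},\ \frac{0.3448}{100}\right] \approx [-0.0048,\ 0.0034]
\]

In probability terms these are changes of -0.066, -0.48, and +0.34 percentage points, respectively.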
Problem 2
Does imposing a ban on smoking in restaurants cause smoking prevalence to decrease? Test the appropriate hypothesis and construct and interpret a 95% confidence interval for the impact the smoking ban has on smoking prevalence.
Based on the regression above, with a p-value equal to \(0.007837\), there is sufficient statistical evidence that a restaurant ban does lead to a decrease in the incidence of smoking.
With 95% confidence, a restaurant smoking ban lowers the probability that someone smokes by between 2.7 and 17.8 percentage points.
Problem 3
Suppose someone has an income equal to $6,500, has 16 years of education, is 45 years old, is white, does not have a smoking ban in his state, and the price of cigarettes is $5 per pack. What does the regression predict is the probability that the person smokes?
person <- data.frame(educ=16, income=6500, age=45, white=1, restaurn=0, cigpric=5)
predict(lmsmoke, newdata=person)
## 1
## 0.4306295
There is a 43% chance that such a person smokes.
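As a check on the predict() call, the same number can be rebuilt by hand from the coefficient table. This is a sketch using the rounded estimates copied from the coeftest() output; the vector names `b` and `x` are only illustrative.

```r
# Rounded coefficient estimates copied from the coeftest() output above.
b <- c(intercept = 0.75242714, lcigpric = -0.06565084, lincome = 0.04152261,
       educ = -0.02530213, age = -0.00366676, restaurn = -0.10277101,
       white = -0.01084839)

# Regressor values for the person in Problem 3, in the same order as b:
# constant, log(cigpric = 5), log(income = 6500), educ, age, restaurn, white.
x <- c(1, log(5), log(6500), 16, 45, 0, 1)

# Linear probability model prediction is the dot product of the two vectors.
pred <- sum(b * x)
round(pred, 4)  # 0.4306, matching predict() up to rounding
```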
Problem 4
Accounting for the other variables in the model, does race affect whether or not a person smokes? Test the appropriate hypothesis and construct and interpret a 95% confidence interval for the difference in smoking prevalence by race.
Based on the regression above, with a p-value equal to \(0.832044\), we fail to find statistical evidence that smoking prevalence differs between white and non-white people.
We are 95% confident that, holding the other variables fixed, the difference in smoking probability between white and non-white people is between -11.1 and 9.0 percentage points.
Problem 5
Re-estimate the regression model with the following interaction terms: restaurn by (log) income, and restaurn by age.
lmsmoke2 <- lm(smoke ~ log(cigpric) + log(income) + educ + age + restaurn + white +
restaurn:log(income) + restaurn:age,
data=smoke)
vv <- vcovHC(lmsmoke2, type="HC1")
coeftest(lmsmoke2, vcov=vv)
##
## t test of coefficients:
##
##                        Estimate Std. Error t value  Pr(>|t|)
## (Intercept)           0.8203856  0.8800932  0.9322 0.3515369
## log(cigpric)         -0.0641136  0.2095152 -0.3060 0.7596774
## log(income)           0.0358662  0.0290864  1.2331 0.2179041
## educ                 -0.0253926  0.0056423 -4.5004 7.791e-06 ***
## age                  -0.0041473  0.0011055 -3.7514 0.0001886 ***
## restaurn             -0.3785871  0.4823588 -0.7849 0.4327650
## white                -0.0092158  0.0514536 -0.1791 0.8578977
## log(income):restaurn  0.0201501  0.0484339  0.4160 0.6774981
## age:restaurn          0.0019617  0.0020325  0.9651 0.3347660
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
5(a)
Does the restaurant ban have differential effects depending on people’s income? Test the appropriate hypothesis. If so, describe the relationship.
No, with a p-value equal to \(0.6775\) on the log(income):restaurn interaction, we fail to find evidence that the effect of the restaurant ban differs across income levels.
5(b)
Does the restaurant ban have differential effects depending on people’s age? Test the appropriate hypothesis. If so, describe the relationship.
No, with a p-value equal to \(0.3348\) on the age:restaurn interaction, we fail to find evidence that the effect of the restaurant ban differs by age.
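To make the role of the interaction terms concrete: in this model the estimated effect of the ban is not a single coefficient but depends on income and age. The expression below uses the estimates from the interaction model; the evaluation point (income of $6,500 and age 45) is borrowed from Problem 3 purely as an illustration.

\[
\widehat{\text{effect of ban}} = \hat\beta_{\text{restaurn}} + \hat\beta_{\log(\text{income}):\text{restaurn}}\,\log(\text{income}) + \hat\beta_{\text{age}:\text{restaurn}}\,\text{age}
\]

\[
-0.3786 + 0.0202\,\log(6500) + 0.0020\cdot 45 \approx -0.3786 + 0.1769 + 0.0883 \approx -0.113
\]

For that person the point estimate is a decrease of roughly 11 percentage points, close to the -0.103 estimated in the model without interactions.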
5(c)
With this model that includes interaction effects, test the hypothesis that the smoking ban influences the probability that a person smokes.
lmsmoke2res <- lm(smoke ~ log(cigpric) + log(income) + educ + age + white,
data=smoke)
waldtest(lmsmoke2, lmsmoke2res, vcov=vv)
## Wald test
##
## Model 1: smoke ~ log(cigpric) + log(income) + educ + age + restaurn +
## white + restaurn:log(income) + restaurn:age
## Model 2: smoke ~ log(cigpric) + log(income) + educ + age + white
##   Res.Df Df      F Pr(>F)
## 1    798
## 2    801 -3 2.6374 0.0486 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
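With a p-value equal to \(0.0486\), we reject at the 5% level the joint null hypothesis that the coefficients on restaurn and its two interaction terms are all zero, so we do find statistical evidence that a smoking ban affects the probability that a person smokes.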