Ordinary Least Squares
Estimate a linear regression that predicts the average price of a fast food burger or chicken entree in a zip code based on the following explanatory variables: starting wage for fast food workers, median family income, the proportion of the population that lives in poverty, the average crime rate (per 1000 in population), the population density, and the proportion of the population that is black / African American.
lme <- lm(pentree ~ wagest + income + prppov + crmrte + density + prpblck, data=discrim)
summary(lme)
##
## Call:
## lm(formula = pentree ~ wagest + income + prppov + crmrte + density +
## prpblck, data = discrim)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.1055 -0.3926 -0.2611  0.1865  2.3784
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.002e+00  4.877e-01   2.055  0.04059 *
## wagest       1.309e-01  1.005e-01   1.301  0.19390
## income      -5.996e-06  3.697e-06  -1.622  0.10563
## prppov      -5.557e-01  9.848e-01  -0.564  0.57288
## crmrte       1.376e-01  9.853e-01   0.140  0.88901
## density     -9.479e-06  7.199e-06  -1.317  0.18875
## prpblck      6.966e-01  2.597e-01   2.683  0.00762 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6368 on 370 degrees of freedom
## (33 observations deleted due to missingness)
## Multiple R-squared: 0.04796, Adjusted R-squared: 0.03253
## F-statistic: 3.107 on 6 and 370 DF, p-value: 0.00556
a. Report the estimated regression equation.
\[ \begin{aligned} PriceEntree_i = {} & 1.002 + 0.1309 (Wage_i) - 0.000006 (Income_i) - 0.5557 (PovertyProp_i) \\ & + 0.1376 (CrimeRate_i) - 0.000009 (PopDensity_i) + 0.6966 (PropBlack_i) + e_i \end{aligned} \]
b. Is there statistical evidence that there is racial discrimination in fast food prices, after accounting for fast food workers' wages, median family income, proportion of the population in poverty, crime rate, and population density? Test the appropriate hypothesis.
Yes. We test \(H_0: \beta_{PropBlack} = 0\) against \(H_1: \beta_{PropBlack} \neq 0\). The p-value on the proportion of the population that is black is 0.00762, which is below 0.05, so we reject the null hypothesis at the 5% significance level. We found sufficient statistical evidence that communities with larger black populations have different average fast food entree prices.
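The test behind this answer can be reproduced from the printed output alone. The sketch below assumes a two-sided test of a zero coefficient on prpblck and plugs in the rounded estimate and standard error from the summary table, so it matches the printed t value and p-value only up to rounding.

```r
# Values copied (rounded) from the summary(lme) table above
est <- 0.6966   # coefficient on prpblck
se  <- 0.2597   # its standard error
df  <- 370      # residual degrees of freedom

# t statistic for the null hypothesis that the prpblck coefficient is zero
t_stat <- est / se

# Two-sided p-value from the t distribution
p_val <- 2 * pt(abs(t_stat), df = df, lower.tail = FALSE)

round(c(t = t_stat, p = p_val), 4)
```

The same numbers come straight from the prpblck row of the coefficient table in the summary output; the manual version only makes the calculation explicit.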
c. Report the explanatory variables where you have statistical evidence that they influence the fast food prices.
Just one. At the 5% significance level, only the proportion of the population that is black is statistically significant.
d. What percentage of the variability in fast food prices is accounted for by your explanatory variables?
The \(R^2\) value is 0.04796, therefore 4.8% of the variability in fast food prices is explained by the explanatory variables.
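As a check on the printed output, the adjusted \(R^2\) can be recovered from the multiple \(R^2\). This is a sketch that assumes \(n = 377\) observations (the 370 residual degrees of freedom plus the 7 estimated coefficients) and \(k = 6\) explanatory variables; it agrees with the reported 0.03253 up to rounding.
\[ \bar{R}^2 = 1 - (1 - R^2)\frac{n - 1}{n - k - 1} = 1 - (1 - 0.04796)\frac{376}{370} \approx 0.0325 \]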
e. Test the hypothesis that at least one of your regression variables is useful in explaining prices of fast food entrees.
The p-value on the F-test is 0.00556. We found sufficient statistical evidence that at least one variable was useful in explaining prices of fast food entrees.
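The overall F-test can be reconstructed from the reported \(R^2\). The sketch below assumes the null hypothesis that all six slope coefficients are zero and uses the rounded \(R^2\) from the summary output, so the result matches the printed F-statistic and p-value only approximately.

```r
# Values copied (rounded) from the summary(lme) output above
r2 <- 0.04796  # Multiple R-squared
k  <- 6        # number of explanatory variables
df <- 370      # residual degrees of freedom

# F statistic for the null hypothesis that all six slopes are zero
f_stat <- (r2 / k) / ((1 - r2) / df)

# Upper-tail p-value from the F distribution with (k, df) degrees of freedom
p_val <- pf(f_stat, df1 = k, df2 = df, lower.tail = FALSE)

round(c(F = f_stat, p = p_val), 4)
```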
f. What does your regression predict would be the change in fast food entree prices for each $1,000 of additional median family income?
The coefficient on income is -0.000006. Therefore, \[\$1000 \times -0.000006 = -\$0.006.\] Holding the other variables fixed, the regression predicts a $0.006 drop in the price of fast food entrees for each additional $1,000 of median family income.
Log Transformations
Estimate a linear regression that predicts the natural log of the average price of a fast food burger or chicken entree in a zip code based on the following explanatory variables: starting wage for fast food workers, the natural log of median family income, the proportion of the population that lives in poverty, the average crime rate, the natural log of the population density, and the proportion of the population that is black / African American.
lme_log <- lm(log(pentree) ~ wagest + log(income) + prppov +
crmrte + log(density) + prpblck, data=discrim)
summary(lme_log)
##
## Call:
## lm(formula = log(pentree) ~ wagest + log(income) + prppov + crmrte +
## log(density) + prpblck, data = discrim)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.0083 -0.2477 -0.1439  0.2097  1.0782
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)   1.594531   1.557540   1.024   0.3066
## wagest        0.112815   0.064503   1.749   0.0811 .
## log(income)  -0.174552   0.142445  -1.225   0.2212
## prppov       -0.660919   0.753220  -0.877   0.3808
## crmrte        0.151684   0.634245   0.239   0.8111
## log(density) -0.008293   0.023991  -0.346   0.7298
## prpblck       0.431052   0.168041   2.565   0.0107 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4091 on 370 degrees of freedom
## (33 observations deleted due to missingness)
## Multiple R-squared: 0.04115, Adjusted R-squared: 0.02561
## F-statistic: 2.647 on 6 and 370 DF, p-value: 0.01587
a. The same outcome and explanatory variables are used in this problem as the previous problem, but the outcome variable and some of the explanatory variables are expressed instead as a natural log. How much variability in this transformed outcome variable is accounted for by your explanatory variables? How does this compare to the previous model?
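The \(R^2\) value is 0.04115, so about 4.1% of the variability in the log of entree prices is explained by the explanatory variables. This is very similar to, and slightly lower than, the 4.8% in the previous model.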
b. With this new regression structure, is there statistical evidence that there is racial discrimination in fast food prices? Test the appropriate hypothesis.
Yes. The p-value on the proportion of the population that is black is 0.0107. We found sufficient statistical evidence that communities with larger black populations have different average fast food entree prices.
c. Accounting for all the explanatory variables in your regression model, how does a 1% increase in median income influence fast food prices? Construct and interpret a 95% confidence interval.
confint(lme_log)
##                    2.5 %     97.5 %
## (Intercept)  -1.46820940 4.65727183
## wagest       -0.01402343 0.23965344
## log(income)  -0.45465557 0.10555204
## prppov       -2.14204726 0.82020986
## crmrte       -1.09549418 1.39886128
## log(density) -0.05546995 0.03888323
## prpblck       0.10061690 0.76148764
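The coefficient on \(log(income)\) is -0.17. Because both price and income enter in logs, a 1% increase in income is associated on average with a 0.17% decrease in the price of a fast food entree, holding the other variables fixed. We are 95% confident that the true impact on the price of fast food entrees is between -0.45% and 0.11%. Because this interval contains zero, the effect is not statistically significant at the 5% level.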
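The interval for \(log(income)\) can be rebuilt by hand as the estimate plus or minus a t critical value times the standard error. The sketch below uses the rounded estimate and standard error from the summary output and the 370 residual degrees of freedom, so it should agree with the printed interval to about four decimal places.

```r
# Values copied from the log(income) row of summary(lme_log)
est <- -0.174552  # coefficient on log(income)
se  <-  0.142445  # its standard error
df  <-  370       # residual degrees of freedom

# Critical value for a two-sided 95% interval
t_crit <- qt(0.975, df = df)

# Lower and upper limits of the confidence interval
ci <- est + c(-1, 1) * t_crit * se

round(ci, 4)
```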
d. Accounting for all the explanatory variables in your regression model, how does a 1% increase in population density influence fast food prices? Construct and interpret a 95% confidence interval.
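The coefficient on \(log(density)\) is -0.008. Therefore, a 1% increase in population density is associated on average with a 0.008% decrease in the price of a fast food entree, holding the other variables fixed. We are 95% confident that the true impact on the price of fast food entrees is between -0.055% and 0.039%. Because this interval contains zero, the effect is not statistically significant at the 5% level.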
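Reading a log-log coefficient directly as a percentage change is an approximation. The sketch below compares it with the exact implied change for a 1% increase, using the rounded coefficients from the summary output; `pct_change` is a hypothetical helper introduced here for illustration, not part of the original analysis.

```r
# Coefficients copied from summary(lme_log)
b_income  <- -0.174552  # coefficient on log(income)
b_density <- -0.008293  # coefficient on log(density)

# Exact percentage change in price implied by a pct% increase in x
# when both price and x enter the model in logs (hypothetical helper)
pct_change <- function(beta, pct = 1) {
  100 * ((1 + pct / 100)^beta - 1)
}

pct_income  <- pct_change(b_income)
pct_density <- pct_change(b_density)

round(c(income = pct_income, density = pct_density), 4)
```

For changes as small as 1%, the exact figures are nearly identical to the coefficients themselves, which is why the approximate reading is used in the answers above.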