Introduction

Load the tidyverse library, which provides functions used in this assignment.

Open the dataset and view its structure. The structure listing looks like this:

## Observations: 1,063
## Variables: 7
## $ MALE       <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
## $ WHITE      <int> 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,…
## $ EDUCYRS    <int> 16, 15, 18, 16, 14, 16, 12, 19, 18, 0, 12, 14, 12, 15…
## $ UHRSWORKT  <int> 72, 40, 45, 45, 40, 37, 70, 38, 25, 7, 50, 40, 40, 67…
## $ EARNWEEK   <dbl> 800.00, 1000.00, 2884.61, 1538.46, 380.00, 962.00, 62…
## $ SPUSUALHRS <int> 32, 50, 40, 40, 40, 35, 70, 40, 40, 40, 15, 40, 40, 1…
## $ SPEARNWEEK <dbl> 217.50, 1057.69, 2307.69, 961.53, 568.80, 865.00, 617…

This loads the data frame into memory. The data frame object is named df.
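Without access to the course data file, the commands in the problems below can still be tried on a stand-in. The following is a minimal sketch that builds a simulated data frame with the same seven column names and types as the listing above. The values are random and are not the real data, and the sample parameters (means, ranges, proportions) are arbitrary assumptions. The inspection step assumes the listing above came from dplyr's glimpse() and falls back to base R's str() if dplyr is not installed.

```r
# Simulated stand-in for the course data set (NOT the real data).
# Column names and types follow the structure listing; values are random.
set.seed(1)
n <- 1063
df <- data.frame(
  MALE       = rep(1L, n),
  WHITE      = rbinom(n, 1, 0.85),
  EDUCYRS    = sample(8:20, n, replace = TRUE),
  UHRSWORKT  = as.integer(round(pmax(1, rnorm(n, 44, 10)))),
  EARNWEEK   = round(runif(n, 200, 2900), 2),
  SPUSUALHRS = as.integer(round(pmax(1, rnorm(n, 38, 10)))),
  SPEARNWEEK = round(runif(n, 200, 2900), 2)
)

# View the structure: one line per variable with its type and first values.
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::glimpse(df)
} else {
  str(df)
}
```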

Answer the questions below. Type up your answers and submit them to the appropriate Canvas assignment folder. Include in your answers (1) the code you used, (2) the output from the code, and (3) your written interpretation of the output, as needed to answer the question.

Problem 1

Do men on average work a different number of hours per week than their partners? Test the appropriate hypothesis. Compute and interpret a 95% confidence interval for the difference between men’s usual weekly hours and their partners’.

## 
##  Paired t-test
## 
## data:  df$UHRSWORKT and df$SPUSUALHRS
## t = 14.751, df = 1062, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  5.695319 7.442969
## sample estimates:
## mean of the differences 
##                6.569144

With a p-value less than \(2.2 \times 10^{-16}\), we have sufficient statistical evidence that men on average work a different number of hours per week than their partners.

We are 95% confident that men on average work between 5.7 and 7.4 more hours per week than their partners.
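The test above treats each man and his partner as a matched pair. As a minimal sketch of how such a result can be produced, the block below runs a paired t-test on simulated stand-in data and checks that it matches a one-sample t-test on the within-couple differences. The values are random and are not the course data; only the column names UHRSWORKT and SPUSUALHRS are taken from the output above.

```r
# Simulated stand-in data (NOT the course data): usual weekly hours for
# 200 hypothetical men and their partners, one couple per row.
set.seed(42)
n <- 200
df <- data.frame(
  UHRSWORKT  = round(rnorm(n, mean = 44, sd = 10)),
  SPUSUALHRS = round(rnorm(n, mean = 38, sd = 12))
)

# Paired t-test: the two columns are matched row by row.
paired <- t.test(df$UHRSWORKT, df$SPUSUALHRS, paired = TRUE)

# Equivalent one-sample t-test on the within-couple differences.
diffs <- df$UHRSWORKT - df$SPUSUALHRS
one_sample <- t.test(diffs, mu = 0)

paired$conf.int      # 95% CI for the mean difference (men minus partners)
one_sample$conf.int  # same interval
```

The sign of the interval follows the order of the arguments: the first vector minus the second.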

Problem 2

Do White/Caucasian people have different average weekly earnings than non-white people? Test the appropriate hypothesis. Compute and interpret a 95% confidence interval for the difference between average weekly earnings for whites versus non-whites.

## 
##  Welch Two Sample t-test
## 
## data:  df$EARNWEEK by df$WHITE
## t = -0.27822, df = 208.01, p-value = 0.7811
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -147.2326  110.8153
## sample estimates:
## mean in group 0 mean in group 1 
##        1193.459        1211.667

With a p-value equal to 0.7811, we fail to find statistical evidence of a difference in average weekly earnings between white people and non-white people.

The interval is for the mean of group 0 (non-white) minus the mean of group 1 (white). We are 95% confident that white people earn on average somewhere between $110.82 less and $147.23 more per week than non-white people.
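The direction of the interval matters when interpreting a two-sample test. As a minimal sketch on simulated stand-in data (random values, not the course data; only the column names WHITE and EARNWEEK come from the output above), the block below runs a Welch t-test with the formula interface and confirms that the interval is centred on the group 0 mean minus the group 1 mean.

```r
# Simulated stand-in data (NOT the course data).
set.seed(7)
n <- 300
df <- data.frame(WHITE = rbinom(n, 1, 0.8))
df$EARNWEEK <- round(pmax(0, rnorm(n, mean = 1200, sd = 600)), 2)

# Welch two-sample t-test (unequal variances is the default in t.test).
welch <- t.test(df$EARNWEEK ~ df$WHITE)

# Group means, and the difference in the order t.test uses: group 0 - group 1.
means <- tapply(df$EARNWEEK, df$WHITE, mean)
diff_01 <- unname(means["0"] - means["1"])

mean(welch$conf.int)  # midpoint of the interval equals diff_01

# To state the interval as white minus non-white, flip signs and swap ends.
white_minus_nonwhite <- -rev(as.numeric(welch$conf.int))
white_minus_nonwhite
```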

Problem 3

Is there a relationship between educational attainment and weekly earnings? Test the appropriate hypothesis. Construct and interpret a 95% confidence interval for the appropriate statistic. Comment on the nature of the relationship.

## 
##  Pearson's product-moment correlation
## 
## data:  df$EDUCYRS and df$EARNWEEK
## t = 16.329, df = 1061, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.398753 0.494927
## sample estimates:
##       cor 
## 0.4481357

With a p-value less than \(2.2 \times 10^{-16}\), we find sufficient statistical evidence that there is a correlation between education and earnings. We are 95% confident the true correlation coefficient is between 0.40 and 0.49. The correlation is positive and moderate in strength. On average, men who have higher levels of education have higher earnings.
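As a minimal sketch of the correlation test, the block below runs cor.test on simulated stand-in data and recomputes the t statistic by hand from the sample correlation, using \(t = r\sqrt{n-2}/\sqrt{1-r^2}\). The values are random and are not the course data, and the slope and noise level used to generate them are arbitrary assumptions; only the column names EDUCYRS and EARNWEEK come from the output above.

```r
# Simulated stand-in data (NOT the course data).
set.seed(3)
n <- 300
EDUCYRS  <- sample(8:20, n, replace = TRUE)
EARNWEEK <- 100 + 70 * EDUCYRS + rnorm(n, sd = 500)
df <- data.frame(EDUCYRS, EARNWEEK)

# Pearson correlation test (the default method in cor.test).
ct <- cor.test(df$EDUCYRS, df$EARNWEEK)
ct$estimate   # sample correlation r
ct$conf.int   # 95% CI for the true correlation
ct$p.value

# The reported t statistic follows from r and the degrees of freedom n - 2.
r <- unname(ct$estimate)
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)
t_manual
```

Applying the same formula to the reported values above (r = 0.4481357 with 1061 degrees of freedom) reproduces the reported t of about 16.33.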

Problem 4

Test the hypothesis that employed men work on average more than 40 hours per week. Construct and interpret a 95% confidence interval for the average number of hours that men usually work per week.

## 
##  One Sample t-test
## 
## data:  df$UHRSWORKT
## t = 15.06, df = 1062, p-value < 2.2e-16
## alternative hypothesis: true mean is greater than 40
## 95 percent confidence interval:
##  43.98167      Inf
## sample estimates:
## mean of x 
##  44.47037

With a p-value less than \(2.2 \times 10^{-16}\), we find sufficient statistical evidence that men on average work more than 40 hours per week.

## 
##  One Sample t-test
## 
## data:  df$UHRSWORKT
## t = 15.06, df = 1062, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 40
## 95 percent confidence interval:
##  43.88789 45.05284
## sample estimates:
## mean of x 
##  44.47037

We are 95% confident that men on average work between 43.9 and 45.1 hours per week.
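Two calls are needed here because a one-sided test reports only a one-sided confidence bound. As a minimal sketch on simulated stand-in data (random values, not the course data; only the column name UHRSWORKT comes from the output above), the block below runs both versions and compares their intervals.

```r
# Simulated stand-in data (NOT the course data).
set.seed(11)
df <- data.frame(UHRSWORKT = round(rnorm(250, mean = 44, sd = 10)))

# One-sided test of H0: mean = 40 against Ha: mean > 40.
one_sided <- t.test(df$UHRSWORKT, mu = 40, alternative = "greater")

# Two-sided version, used for the usual two-sided confidence interval.
two_sided <- t.test(df$UHRSWORKT, mu = 40)

one_sided$p.value
one_sided$conf.int  # lower bound only; the upper end is Inf
two_sided$conf.int  # finite lower and upper limits
```

The t statistic is the same in both calls; only the p-value and the form of the interval change.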

Problem 5

What other variables in the dataset do you think may affect usual hourly earnings? Describe the relationships you would expect. What statistical tests would you use to determine whether these relationships exist?

We found evidence that education is positively correlated with earnings.

I expect that usual hours worked is also positively correlated with earnings. This could be tested with a Pearson correlation test, as in Problem 3.

I expect that partners’ earnings are also positively correlated with earnings. This could be tested with a Pearson correlation test as well.
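As a minimal sketch of how these checks could be run together, the block below applies cor.test to each candidate variable and collects the estimate, interval, and p-value in one table. The data are simulated stand-ins (random values, not the course data), the relationships built into them are arbitrary assumptions, and only the column names come from the structure listing in the Introduction.

```r
# Simulated stand-in data (NOT the course data).
set.seed(5)
n <- 300
df <- data.frame(
  UHRSWORKT  = round(rnorm(n, 44, 10)),
  SPEARNWEEK = round(runif(n, 200, 2500), 2)
)
df$EARNWEEK <- 20 * df$UHRSWORKT + 0.2 * df$SPEARNWEEK + rnorm(n, sd = 400)

# Run a Pearson correlation test of each candidate against weekly earnings.
candidates <- c("UHRSWORKT", "SPEARNWEEK")
results <- t(sapply(candidates, function(v) {
  ct <- cor.test(df[[v]], df$EARNWEEK)
  c(cor     = unname(ct$estimate),
    lower   = ct$conf.int[1],
    upper   = ct$conf.int[2],
    p_value = ct$p.value)
}))
results  # one row per candidate variable
```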