https://github.com/sta-112-f23/hw-01-multiple-linear-regression.git
HW 01 - Multiple Linear Regression
Due: 2023-10-31 at 11:59pm Turn your .html file in on Canvas
Getting started
Go to RStudio Pro and click:
Step 1. File > New Project
Step 2. “Version Control”
Step 3. Git
Step 4. Copy the following into the “Repository URL”:
Exercises
The Diamonds
data set can be used to examine the relationship between a diamond’s characteristics and it’s price. Using the Diamonds
data set in the Stat2Data
package, we want to assess following models:
(#1) TotalPrice
= \(\beta_0 + \beta_1\)Depth
+ \(\beta_2\)Depth
\(^2 + \epsilon\)
(#2) TotalPrice
= \(\beta_0 + \beta_1\)Carat
+ \(\beta_2\)Depth
\(+ \epsilon\)
(#3) TotalPrice
= \(\beta_0 + \beta_1\)Carat
+ \(\beta_2\)Depth
+ \(\beta_3\)Depth
\(\times\)Carat
\(+ \epsilon\)
Using Multiple linear regression
How many observations are in this data set? What are the observations? How many variables are in this data set? Is there any missing data? If so, handle the missing data and report how you did so.
What are the assumptions of multiple regression?
Fit the three models described above. For each model (1, 2, and 3), create two plots: (1) a Residuals vs Fits plot and (2) a Q-Q plot. Assess whether the assumptions in exercise 1 hold (if an assumption is not testable, state that).
Inference
- For each model (1, 2, and 3), report the \(\hat\beta\) values and confidence intervals. Interpret all \(\hat\beta\) coefficients in the context of the model fit.
Prediction
Examine \(R^2\) and adjusted \(R^2\) for each model (1, 2, and 3). Use this metric to select your final prediction model. Which did you select? Why did you select this model? What is the equation for this prediction model?
A diamond sales person has a shipment of 1 carat diamonds with a depth of 60. They want to know what the average total price will be for these diamonds. Report this value along with an appropriate confidence interval.
Your relative has a 1 carat diamond with a depth of 59.7 – they want to know what you would predict the total price for this diamond would be. Report this value along with the appropriate confidence interval.