Drawing Inference

Lucy D’Agostino McGowan

Data

data_sample 
# A tibble: 33 × 5
      id screen_time battery_percent campus_location pit_meals
   <int>       <dbl>           <dbl> <chr>               <dbl>
 1     9         204              21 Quad                    7
 2     9         181              23 Quad                    7
 3     9         202              14 Quad                    5
 4     9         278              54 South Campus            5
 5     9         255              89 Quad                    6
 6     9         233              13 South Campus            9
 7     9         299              73 Quad                   10
 8     9         282              73 Quad                    2
 9     9         294              86 Quad                    6
10     9         231              42 Quad                    9
# ℹ 23 more rows

Survey data

Code
ggplot(data_sample, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent")

Full Survey data

Code
ggplot(data, aes(x = screen_time)) + 
  geom_histogram(bins = 50)
ggplot(data, aes(x = battery_percent)) + 
  geom_histogram(bins = 50)

Data Cleaning

data_clean <- data |>
  # drop rows with impossible battery percentages (over 100)
  filter(battery_percent <= 100) |>
  # recode one implausible screen time entry (1,210 minutes) to 132 minutes
  mutate(screen_time = ifelse(screen_time == 1210, 132, screen_time))

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_point(data = data_sample, color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent")

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE, formula = "y ~ x", lty = 2, color = "orange") +
  geom_point(data = data_sample, color = "cornflowerblue") + 
  geom_smooth(data = data_sample, method = "lm", se = FALSE, 
              formula = "y ~ x", color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent",
       caption = "ID: 9")

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE, formula = "y ~ x", lty = 2, color = "orange") +
  geom_point(data = data[data$id == 43,], color = "cornflowerblue") + 
  geom_smooth(data = data[data$id == 43,], method = "lm", se = FALSE, 
              formula = "y ~ x", color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent",
       caption = "ID: 43")

Survey data

Code
ggplot(data_clean, aes(x = screen_time, y = battery_percent)) +
  geom_point() + 
  geom_smooth(method = "lm", se = FALSE, formula = "y ~ x", lty = 2, color = "orange") +
  geom_point(data = data[data$id == 39,], color = "cornflowerblue") + 
  geom_smooth(data = data[data$id == 39,], method = "lm", se = FALSE, 
              formula = "y ~ x", color = "cornflowerblue") + 
  labs(x = "Average Daily Screen Time (Minutes)",
       y = "Battery Percent",
       caption = "ID: 39")

Survey data

What if I want to know the relationship between screen time and battery percent for Wake Forest Students?


How can we quantify how much we’d expect the slope to differ from one random sample to another?

  • We need a measure of uncertainty
  • How about the standard error of the slope?
  • The standard error is how much we expect \(\hat{\beta}_1\) to vary from one random sample to another.
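As a rough sketch of this idea, suppose we treat data_clean as though it were the whole population and repeatedly draw random samples of 33 rows (the size of data_sample, ignoring that data_sample actually comes from a single student). Refitting the model each time shows how much the slope bounces around from sample to sample:

set.seed(1)
slopes <- replicate(1000, {
  samp <- data_clean[sample(nrow(data_clean), 33), ]        # a new random sample of 33 rows
  coef(lm(battery_percent ~ screen_time, data = samp))[2]   # refit and keep the slope
})
sd(slopes)  # how much the slope varies from sample to sample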

Survey data

How can we quantify how much we’d expect the slope to differ from one random sample to another?

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05

Survey data

We need a test statistic that incorporates \(\hat{\beta}_1\) and the standard error

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05
  • \(t = \frac{\hat{\beta}_1}{SE_{\hat{\beta}_1}}\)
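As a quick check, we can build this statistic by hand from the coefficient table (the screen_time row of coef(summary(mod))):

est <- coef(summary(mod))["screen_time", "Estimate"]    # beta-hat-1
se  <- coef(summary(mod))["screen_time", "Std. Error"]  # its standard error
est / se                                                # ~4.53, the t value reported above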

Survey data

How do we interpret this?

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05
  • \(\hat{\beta}_1\) is 4.53 standard errors above a slope of zero

Survey data

How do we know what values of this statistic are worth paying attention to?

mod <- lm(battery_percent ~ screen_time, data = data_sample)
summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05
  • confidence intervals, p-values
  • Hypothesis testing: \(H_0: \beta_1 = 0\) \(H_A: \beta_1 \neq 0\)

Survey data

How do we get a confidence interval for \(\hat{\beta}_1\)? What function can we use in R?


confint(mod)
                   2.5 %     97.5 %
(Intercept) -103.9495418 -8.7592428
screen_time    0.2434083  0.6425661
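Under the hood, confint() is computing estimate ± t* × SE with 31 degrees of freedom. A sketch of the same interval by hand:

est <- coef(summary(mod))["screen_time", "Estimate"]
se  <- coef(summary(mod))["screen_time", "Std. Error"]
est + c(-1, 1) * qt(0.975, df = 31) * se   # matches the screen_time row above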

How do we interpret this interval?

Application Exercise

  1. Create a new project from this template in RStudio Pro:
https://github.com/sta-112-f23/appex-08.git
  2. Fit the model of battery_percent and screen_time in your data
  3. Calculate a confidence interval for the estimate \(\hat\beta_1\)
  4. Interpret this interval
(5 minutes)

Hypothesis testing

  • So far, we have estimated the relationship between screen time and battery percent
  • This could be useful if we wanted to understand, on average, how these variables are related (estimation)
  • This could also be useful if we wanted to guess someone’s battery percent from their screen time (prediction)
  • What if we just want to know whether there is some relationship between the two? (hypothesis testing)

Hypothesis testing

  • Null hypothesis: There is no relationship between screen time and battery percent
    • \(H_0: \beta_1 = 0\)
  • Alternative hypothesis: There is a relationship between screen time and battery percent
    • \(H_A: \beta_1 \neq 0\)

Hypothesis testing

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05

Is \(\hat\beta_1\) different from 0?

Hypothesis testing

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05

Is \(\beta_1\) different from 0? (notice the lack of the hat!)

p-value

The probability of observing a statistic as extreme as or more extreme than the observed test statistic, given that the null hypothesis is true
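For our slope, the statistic is the t value and its reference distribution is a t distribution with n − 2 = 31 degrees of freedom, so a by-hand sketch of the calculation looks like this:

t_obs <- coef(summary(mod))["screen_time", "t value"]
2 * pt(abs(t_obs), df = 31, lower.tail = FALSE)   # two-sided p-value, ~8.3e-05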

p-value

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05

What is the p-value? What is the interpretation?

Hypothesis testing

  • Null hypothesis: \(\beta_1 = 0\) (there is no relationship between screen time and battery percent)
  • Alternative hypothesis: \(\beta_1 \neq 0\) (there is a relationship between screen time and battery percent)
  • Often we have an \(\alpha\) level cutoff to compare the p-value to, for example 0.05.
  • If the p-value < 0.05, we reject the null hypothesis
  • If the p-value > 0.05, we fail to reject the null hypothesis (this decision rule is sketched in code after this list)
  • Why don’t we ever “accept” the null hypothesis?
  • Absence of evidence is not evidence of absence
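A minimal sketch of that decision rule, pulling the screen_time p-value out of the model we fit earlier:

alpha <- 0.05
p_value <- coef(summary(mod))["screen_time", "Pr(>|t|)"]
ifelse(p_value < alpha,
       "reject the null hypothesis",        # here: 8.29e-05 < 0.05, so we reject
       "fail to reject the null hypothesis")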

p-value

summary(mod)

Call:
lm(formula = battery_percent ~ screen_time, data = data_sample)

Residuals:
    Min      1Q  Median      3Q     Max 
-61.226 -14.151   2.546  12.516  40.836 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -56.35439   23.33652  -2.415   0.0218 *  
screen_time   0.44299    0.09786   4.527 8.29e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.51 on 31 degrees of freedom
Multiple R-squared:  0.398, Adjusted R-squared:  0.3786 
F-statistic: 20.49 on 1 and 31 DF,  p-value: 8.293e-05

Do we reject the null hypothesis?

Application Exercise

  1. Open appex-08.qmd
  2. Examine the summary of the model of battery_percent and screen_time with your data
  3. Test the null hypothesis that there is no relationship between screen time and battery percent
  4. What is the p-value? What is the result of your hypothesis test?
  5. Turn this in on Canvas
(2 minutes)

Survey Data

Code
# slope from the full cleaned survey data
overall <- coef(lm(battery_percent ~ screen_time, data = data_clean))[2]

data_clean |>
  nest_by(id) |>
  # fit a separate regression for each student
  mutate(model = list(lm(battery_percent ~ screen_time, data = data))) |>
  reframe(broom::tidy(model, conf.int = TRUE)) |>
  filter(term == "screen_time") |>
  # flag whether each student's interval covers the overall slope
  mutate(yes = ifelse(conf.low < overall & conf.high > overall, 1, 0)) |>
  ggplot(aes(y = factor(id), xmin = conf.low, x = estimate, xmax = conf.high, color = yes)) +
  geom_pointrange() +
  geom_vline(xintercept = overall, lty = 2) + 
  theme(legend.position = "none") + 
  ylab("id") + 
  xlab("Estimated Slope (Battery Percent per Minute of Screen Time)")