Logistic Regression

Lucy D’Agostino McGowan

Outcome variable

What does it mean to be a binary variable?

  • So far, we’ve only had continuous (numeric, quantitative) outcome variables (\(y\))
  • We’ve just learned about categorical and binary explanatory variables (\(x\))
  • What if we have a binary outcome variable?

Let’s look at an example

  • 446 teens were asked, “On an average school night, do you get at least 7 hours of sleep?”
  • The outcome is coded 1 = “Yes”, 0 = “No”
  • Is Age related to this outcome?
  • What if I try to fit this as a linear regression model?
Code
data("LosingSleep")

# plot the raw 0/1 outcomes with the linear regression fit overlaid
ggplot(LosingSleep, aes(Age, Outcome)) + 
  geom_point() + 
  geom_line(aes(x = Age, y = predict(lm(Outcome ~ Age, data = LosingSleep))))

Let’s look at an example

  • Rather than plotting the raw 0/1 outcomes, we can plot the proportion answering “Yes” at each age, with the same linear fit overlaid
Code
LosingSleep |>
  group_by(Age) |>
  count(Age, Outcome) |>
  mutate(p = n / sum(n)) |>   # proportion within each age
  filter(Outcome == 1) |>     # keep the proportion answering "Yes"
  ggplot(aes(Age, p)) + 
  geom_point() + 
  geom_line(data = LosingSleep, aes(x = Age, y = predict(lm(Outcome ~ Age, data = LosingSleep)))) + 
  ylim(0, 1)

Let’s look at an example

  • What happens if we predict beyond the observed ages? Extending the line to ages 0 through 40 gives predicted values below 0 and above 1, which are impossible for a probability
Code
new_data <- data.frame(Age = 0:40)   # ages well beyond the observed range
LosingSleep |>
  group_by(Age) |>
  count(Age, Outcome) |>
  mutate(p = n / sum(n)) |>
  filter(Outcome == 1) |>
  ggplot(aes(Age, p)) + 
  geom_point() + 
  geom_line(data = new_data, aes(x = Age, y = predict(lm(Outcome ~ Age, data = LosingSleep), newdata = new_data))) + 
  ylim(-1, 2) + 
  xlim(0, 40) +
  geom_hline(yintercept = c(0, 1), lty = 2) 

Let’s look at an example

  • Perhaps it would be sensible to find a function that would not extend beyond 0 and 1?
Code
LosingSleep |>
  group_by(Age) |>
  count(Age, Outcome) |>
  mutate(p = n / sum(n)) |>
  filter(Outcome == 1) |>
  ggplot(aes(Age, p)) + 
  geom_point() + 
  # highlight the regions where the fitted line leaves [0, 1]
  annotate("rect", xmin = 0, xmax = 5, ymin = 1, ymax = 1.2, fill = "yellow", color = "black") +
  annotate("rect", xmin = 35, xmax = 40, ymin = 0, ymax = -0.2, fill = "yellow", color = "black") +
  geom_line(data = new_data, aes(x = Age, y = predict(lm(Outcome ~ Age, data = LosingSleep), newdata = new_data))) + 
  ylim(-1, 2) + 
  xlim(0, 40) +
  geom_hline(yintercept = c(0, 1), lty = 2) 

Let’s look at an example

  • Logistic regression gives us such a function: the fitted curve stays between 0 and 1
Code
LosingSleep |>
  group_by(Age) |>
  count(Age, Outcome) |>
  mutate(p = n / sum(n)) |>
  filter(Outcome == 1) |>
  ggplot(aes(Age, p)) + 
  geom_point() + 
  annotate("rect", xmin = 0, xmax = 5, ymin = 1, ymax = 1.2, fill = "yellow", color = "black") +
  annotate("rect", xmin = 35, xmax = 40, ymin = 0, ymax = -0.2, fill = "yellow", color = "black") +
  geom_line(data = new_data,
            aes(x = Age, y = predict(
              glm(Outcome ~ Age, family = "binomial", data = LosingSleep),
              newdata = new_data,
              type = "response"))) + 
  ylim(-1, 2) + 
  xlim(0, 40) +
  geom_hline(yintercept = c(0, 1), lty = 2) 

Let’s look at an example

Code
LosingSleep |>
  group_by(Age) |>
  count(Age, Outcome) |>
  mutate(p = n / sum(n)) |>
  filter(Outcome == 1) |>
  ggplot(aes(Age, p)) + 
  geom_point() + 
  # the logistic regression fit: predicted probabilities stay within [0, 1]
  geom_line(data = new_data,
            aes(x = Age, y = predict(
              glm(Outcome ~ Age, family = "binomial", data = LosingSleep),
              newdata = new_data,
              type = "response"))) + 
  xlim(0, 40) +
  geom_hline(yintercept = c(0, 1), lty = 2) 
  • This line is fit using a logistic regression model

How does this compare to linear regression?

Model                        Outcome   Form
Ordinary linear regression   Numeric   \(y \approx \beta_0 + \beta_1 x\)
Number of Doctors example    Numeric   \(\sqrt{\textrm{Number of doctors}} \approx \beta_0 + \beta_1 x\)
Logistic regression          Binary    \(\log\left(\frac{\pi}{1-\pi}\right) = \beta_0 + \beta_1 x\)

  • \(\pi\) is the probability that \(y = 1\) (\(P(y = 1)\))

Notation

  • \(\log\left(\frac{\pi}{1-\pi}\right)\): the “log odds”
  • \(\pi\) is the probability that \(y = 1\), that is, the probability that your outcome is 1
  • \(\frac{\pi}{1-\pi}\) is a ratio representing the odds that \(y = 1\)
  • \(\log\left(\frac{\pi}{1-\pi}\right)\) is the log odds
  • The transformation from \(\pi\) to \(\log\left(\frac{\pi}{1-\pi}\right)\) is called the logistic or logit transformation
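  • To see the logit transformation numerically, here is a minimal R sketch (0.75 is just an example probability):
Code
p <- 0.75
odds <- p / (1 - p)   # odds that y = 1: 3
log(odds)             # the log odds (logit): about 1.099
qlogis(p)             # base R's built-in logit, same value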

A bit about “odds”

  • The “odds” tell you how likely an event is
  • 👛 If I flip a fair coin, what is the probability that I’d get heads?
    • \(p = 0.5\)
  • What is the probability that I’d get tails?
    • \(1 - p = 0.5\)
  • The odds are 1:1 (0.5:0.5)
  • The odds can be written as \(\frac{p}{1-p} = \frac{0.5}{0.5} = 1\)

A bit about “odds”

  • The “odds” tell you how likely an event is
  • ⛱ Let’s say there is a 60% chance of rain today
  • What is the probability that it will rain?
    • \(p = 0.6\)
  • What is the probability that it won’t rain?
    • \(1-p = 0.4\)
  • What are the odds that it will rain?
    • 3 to 2, 3:2, \(\frac{0.6}{0.4} = 1.5\)
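  • Checking the arithmetic in R (the two examples above):
Code
p <- 0.5
p / (1 - p)   # fair coin: odds of 1
p <- 0.6
p / (1 - p)   # rain: odds of 1.5, i.e., 3 to 2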

Transforming logs

How would you get the odds from the log(odds)?

  • How do you “undo” a \(\log\) base \(e\)?
  • Use \(e\)! For example:
    • \(e^{\log(10)} = 10\)
    • \(e^{\log(1283)} = 1283\)
    • \(e^{\log(x)} = x\)
  • \(e^{\log(odds)} = odds\)
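  • A quick check in R (1.5 is the odds from the rain example):
Code
odds <- 1.5
exp(log(odds))   # recovers the odds: 1.5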

Transforming odds

  • odds \(= \frac{\pi}{1-\pi}\)
  • Solving for \(\pi\):
    • \(\pi = \frac{\textrm{odds}}{1+\textrm{odds}}\)
  • Plugging in \(e^{\log(odds)} = odds\):
    • \(\pi = \frac{e^{\log(odds)}}{1+e^{\log(odds)}}\)
  • Plugging in \(\log(odds) = \beta_0 + \beta_1x\):
    • \(\pi = \frac{e^{\beta_0 + \beta_1x}}{1+e^{\beta_0 + \beta_1x}}\)
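  • We can verify this algebra numerically; this sketch uses made-up values \(\beta_0 = -2\), \(\beta_1 = 0.5\), and \(x = 3\) purely for illustration:
Code
b0 <- -2
b1 <- 0.5
x <- 3
l <- b0 + b1 * x        # log(odds) = beta0 + beta1 * x
exp(l) / (1 + exp(l))   # probability: about 0.378
plogis(l)               # base R's inverse logit, same value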

The logistic model

  • ✌️ Two forms:

Form              Model
Logit form        \(\log\left(\frac{\pi}{1-\pi}\right) = \beta_0 + \beta_1x\)
Probability form  \(\pi = \frac{e^{\beta_0 + \beta_1x}}{1+e^{\beta_0 + \beta_1x}}\)

The logistic model

Going from probability to log(odds):

probability   odds                    log(odds)
\(\pi\)       \(\frac{\pi}{1-\pi}\)   \(\log\left(\frac{\pi}{1-\pi}\right) = l\)

⬅️ And back, from log(odds) to probability:

log(odds)   odds      probability
\(l\)       \(e^l\)   \(\frac{e^l}{1+e^l} = \pi\)

The logistic model

  • ✌️ Two forms:
  • log(odds): \(l = \beta_0 + \beta_1x\)
  • P(Outcome = Yes): \(\pi = \frac{e^{\beta_0 + \beta_1x}}{1+e^{\beta_0 + \beta_1x}}\)

Example

  • We are interested in the probability of getting accepted to medical school given a college student’s GPA
data("MedGPA")
ggplot(MedGPA, aes(Accept, GPA)) + 
  geom_boxplot() + 
  geom_jitter()

Example

What is the equation for the model we are going to fit?

  • \(\log(odds) = \beta_0 + \beta_1 GPA\)
  • P(Accept): \(\frac{e^{\beta_0 + \beta_1GPA}}{1+e^{\beta_0 + \beta_1GPA}}\)

Example

  • Now let’s fit this model in R
glm(Acceptance ~ GPA, data = MedGPA,
    family = "binomial") 

Call:  glm(formula = Acceptance ~ GPA, family = "binomial", data = MedGPA)

Coefficients:
(Intercept)          GPA  
    -19.207        5.454  

Degrees of Freedom: 54 Total (i.e. Null);  53 Residual
Null Deviance:      75.79 
Residual Deviance: 56.84    AIC: 60.84
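
  • Plugging the fitted coefficients into the probability form by hand (a GPA of 3.5 is just an illustrative value):
Code
b0 <- -19.207
b1 <- 5.454
l <- b0 + b1 * 3.5      # log(odds) of acceptance for a GPA of 3.5
exp(l) / (1 + exp(l))   # predicted probability: about 0.47
  • The same value comes from predict() on the fitted model with type = "response"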

Example

  • fitted() returns the model’s predicted probability of acceptance for each student
glm(Acceptance ~ GPA, data = MedGPA,
    family = "binomial") |>
  fitted() 
         1          2          3          4          5          6          7 
0.63124876 0.85036851 0.16944765 0.71491359 0.31617156 0.74706022 0.88186412 
         8          9         10         11         12         13         14 
0.27099332 0.73661581 0.88186412 0.92030775 0.45723876 0.79506038 0.61846455 
        15         16         17         18         19         20         21 
0.23009846 0.52528952 0.66845438 0.52528952 0.18535740 0.88186412 0.73661581 
        22         23         24         25         26         27         28 
0.79506038 0.89276358 0.87606259 0.70366831 0.55238896 0.39074729 0.57918073 
        29         30         31         32         33         34         35 
0.34021444 0.83595184 0.63124876 0.08681729 0.88186412 0.72589836 0.17726253 
        36         37         38         39         40         41         42 
0.86372477 0.52528952 0.34021444 0.87001815 0.11101432 0.30449919 0.31617156 
        43         44         45         46         47         48         49 
0.63124876 0.90745180 0.30449919 0.29307306 0.92030775 0.06749390 0.22057873 
        50         51         52         53         54         55 
0.69217046 0.01247874 0.55238896 0.44373791 0.01917403 0.39074729
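
  • These fitted values are the probability form evaluated at each student’s GPA; a minimal sketch of the fitted curve, in the style of the earlier plots:
Code
fit <- glm(Acceptance ~ GPA, data = MedGPA, family = "binomial")
ggplot(MedGPA, aes(GPA, Acceptance)) + 
  geom_point() + 
  geom_line(aes(y = fitted(fit)))   # logistic curve at the observed GPAs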