
Logistic Regression

Zhaoxia Yu

Load packages, read data

Code
library(fabricerin)
library(tidyverse)
library(ggplot2)
Code
#read, select variables
#remove impaired, and define new variables
alzheimer_data <- read.csv('../data/alzheimer_data.csv') %>% 
  select(id, diagnosis, age, educ, female, height, weight, lhippo, rhippo) %>%
  filter(diagnosis!=1) %>%
  mutate(alzh=(diagnosis>0)*1, female = as.factor(female), hippo=lhippo+rhippo) 

table(alzheimer_data$alzh)

   0    1 
1534  553 
Code
ggplot(data=alzheimer_data, aes(x=alzh)) +
  geom_bar()

Outline

  • Introduction
    • Motivating example: alzh ~ hippo
  • Ex1: One binary predictor
  • Ex2: One numerical predictor
  • Ex3: Multiple predictors
  • Additional examples
    • Ex4: one numerical predictor
    • Ex5: multiple predictors

Introduction

Motivating example: alzh ~ hippo

Code
ggplot(data=alzheimer_data, aes(x=hippo, y=alzh)) +
  geom_point() +
  geom_smooth(method='lm', se=FALSE) +
  labs(title="alzh vs hippo")
  • Any comments about the linear fit?

Predict Binary Responses (wrong way)

  • Now let us do predictions using linear regression: alzh.lm
Code
alzh.lm=lm(alzh ~ hippo, data=alzheimer_data)
#add lm.pred variable
alzheimer_data$lm.pred=predict(alzh.lm)
#add red flags
alzheimer_data$lm.pred.red = alzheimer_data$lm.pred
alzheimer_data$lm.pred.red[alzheimer_data$lm.pred<1 &
                             alzheimer_data$lm.pred>0]=NA
ggplot(alzheimer_data) +
  geom_point(aes(x=alzh, y=lm.pred)) +
  geom_point(aes(x=alzh, y=lm.pred.red), col="red") +
  geom_hline(yintercept = c(0,1), col='red') +
  labs(x="true alzh", y="predicted by lm")
  • Is alzh.lm a good model?

Binary Response Variables

  • Recall that alzh is a binary variable.

  • We tried linear regression: alzh \(\sim\) hippo

    • It does not fit well: the difference between fitted and observed is large for most data points.
    • The predicted values for the binary response variable alzh can be greater than 1 or less than 0.

Binary Response Variables

  • Now consider situations where the response variable \(Y\) is a binary random variable (e.g., disease status).

  • Recall that a Bernoulli distribution can be used to characterize the behavior of a binary random variable.

\[Y \sim Bernoulli(p),\]

where \(p\) is the probability of the outcome of interest (denoted as 1) given the explanatory variables.
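
  • As a quick illustration, Bernoulli draws can be simulated in R with rbinom() (a minimal sketch; the sample size and probability below are arbitrary):

Code
set.seed(1)                                # arbitrary seed, for reproducibility
y <- rbinom(n = 10, size = 1, prob = 0.3)  # 10 Bernoulli(0.3) draws
mean(y)                                    # sample proportion estimates p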

Odds

  • In statistics, the odds quantifies the relative probability of an event to its complement:

\[odds=\frac{P(A)}{P(A^c)}=\frac{P(A)}{1-P(A)}.\]

  • For a binary variable \(Y\) with \(Y\sim Bernoulli(p)\), the odds is

\[odds = \frac{P(Y=1)}{P(Y=0)}=\frac{p}{1-p}\]

  • The range of odds is \([0,\infty)\). Can you find a transformation such that the range is \((-\infty, \infty)\)?

Log-Odds and Logit Transformation

  • \(log \left( \frac{p}{1-p}\right)\) is called the logit function, i.e.,

\[logit(p)=log \left( \frac{p}{1-p}\right).\]

Code
p=seq(0.01, 0.99, 0.01)
logit.p=log(p/(1-p))
plot(p, logit.p, type="l", xlab="p", main="the logit function")

How to Model Binary Responses?

  • Linear regression is not a good choice for a binary response variable.

  • For linear regression models, the response variable, \(Y\), is assumed to be a real-valued continuous random variable.

  • When the response variable is binary, such as alzh, how should we model it using one or multiple covariates?

  • For such problems, it is common to use logistic regression instead.

Logistic Regression

  • In a logistic regression, we model the log odds using a linear combination of the explanatory variables (predictors):

\[\begin{align*} \log \left(\frac{\hat{p}}{1- \hat{p}} \right) & = b_{0} + b_{1}x_1 + \ldots + b_{q}x_{q} \end{align*}\]

  • The term \(\Big(\frac{\hat{p}}{1- \hat{p}} \Big)\) is called the odds of \(Y=1\).

  • The term \(\log \Big(\frac{\hat{p}}{1- \hat{p}} \Big)\), i.e., log of odds, is called the logit function.

  • Although \(p\) is a real number between 0 and 1, its logit transformation can be any real number from \(-\infty\) to \(+\infty\).
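
  • In R, the logit and its inverse are available as qlogis() and plogis(); a quick sanity check:

Code
p <- 0.25
qlogis(p)          # logit(p) = log(p / (1 - p))
plogis(qlogis(p))  # inverse logit recovers p = 0.25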

Ex1: One Binary Predictor

Birthweight data

  • As an example, we use the birthwt data set to model the relationship between having low birthweight babies (a binary variable), \(Y\), and smoking during pregnancy, \(X\).

  • The binary variable low identifies low birthweight babies (low = 1 for low birthweight babies, and 0 otherwise).

  • The binary variable smoke identifies mothers who were smoking during pregnancy (smoke=1 for smoking during pregnancy, and 0 otherwise).

Birthweight data

Code
library(MASS)
data(birthwt)
table(birthwt$low, birthwt$smoke)
   
     0  1
  0 86 44
  1 29 30
  • chisq.test() can be used to test the association between them, as sketched below.

  • A logistic regression provides more information.
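
  • For reference, the chi-squared test can be run directly on the 2x2 table (a minimal sketch):

Code
# test association between low birthweight and smoking
chisq.test(table(birthwt$low, birthwt$smoke))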

Logistic Regression with One Binary Predictor: Generalized linear model (glm) in R

  • We can use the glm() function in R to fit a regression model
Code
birthwt <- birthwt %>% 
  mutate(across(c(low, smoke, race, ht, ui), factor))
glimpse(birthwt)
Rows: 189
Columns: 10
$ low   <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ age   <int> 19, 33, 20, 21, 18, 21, 22, 17, 29, 26, 19, 19, 22, 30, 18, 18, …
$ lwt   <int> 182, 155, 105, 108, 107, 124, 118, 103, 123, 113, 95, 150, 95, 1…
$ race  <fct> 2, 3, 1, 1, 1, 3, 1, 3, 1, 1, 3, 3, 3, 3, 1, 1, 2, 1, 3, 1, 3, 1…
$ smoke <fct> 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0…
$ ptl   <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0…
$ ht    <fct> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ ui    <fct> 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1…
$ ftv   <int> 0, 3, 1, 2, 0, 0, 1, 1, 1, 0, 0, 1, 0, 2, 0, 0, 0, 3, 0, 1, 2, 3…
$ bwt   <int> 2523, 2551, 2557, 2594, 2600, 2622, 2637, 2637, 2663, 2665, 2722…

Generalized linear model (glm) in R

Code
fit <- glm(low ~ smoke, family = 'binomial', data = birthwt)
fit

Call:  glm(formula = low ~ smoke, family = "binomial", data = birthwt)

Coefficients:
(Intercept)       smoke1  
    -1.0871       0.7041  

Degrees of Freedom: 188 Total (i.e. Null);  187 Residual
Null Deviance:      234.7 
Residual Deviance: 229.8    AIC: 233.8

Generalized linear model (glm) in R

  • We can use the glm() function in R to fit a regression model
Code
confint(fit)
                 2.5 %    97.5 %
(Intercept) -1.5243118 -0.679205
smoke1       0.0786932  1.335154

Generalized linear model (glm) in R

  • We can use the glm() function in R to fit a regression model
Code
summary(fit)

Call:
glm(formula = low ~ smoke, family = "binomial", data = birthwt)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.0197  -0.7623  -0.7623   1.3438   1.6599  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -1.0871     0.2147  -5.062 4.14e-07 ***
smoke1        0.7041     0.3196   2.203   0.0276 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 234.67  on 188  degrees of freedom
Residual deviance: 229.80  on 187  degrees of freedom
AIC: 233.8

Number of Fisher Scoring iterations: 4

Estimation

  • For the above example, the estimated values of the intercept and the regression coefficient are \(b_0=-1.09\) and \(b_1=0.70\), respectively.

  • Therefore,

\[\begin{align*} \frac{\hat{p}}{1- \hat{p}} & = & \exp(-1.09 + 0.70x) \end{align*}\]

  • Here, \(\hat{p}\) is the estimated probability of having a low birthweight baby for a given \(x\).

  • The left-hand side of the above equation is the estimated odds of having a low birthweight baby.

Estimation

  • For non-smoking mothers, \(x=0\), the odds of having a low birthweight baby is \[\begin{align*} \frac{\hat{p}_{0}}{1- \hat{p}_{0}} & = & \exp(-1.09) \\ & = & 0.34 \end{align*}\]

  • That is, the exponential of the intercept is the odds when \(x=0\), which is sometimes referred to as the baseline odds.

Estimation

  • For mothers who smoke during pregnancy, \(x=1\), and \[\begin{align*} \frac{\hat{p}_{1}}{1- \hat{p}_{1}} & = & \exp(-1.09 + 0.7)\\ & = & \exp(-1.09) \exp(0.7)\\ & = & 0.68 \end{align*}\]

  • As we can see, corresponding to a one-unit increase in \(x\) from \(x=0\) (non-smoking) to \(x=1\) (smoking), the odds increases multiplicatively by the exponential of the regression coefficient.
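
  • These numbers can be read off the fitted model by exponentiating the coefficients (a quick check, assuming fit is the low ~ smoke model above):

Code
exp(coef(fit))  # (Intercept): baseline odds ~0.34; smoke1: odds ratio ~2.02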

Interpretation

  • Note that

\[\begin{align*} \frac{\frac{\hat{p}_{1}}{1- \hat{p}_{1}}}{\frac{\hat{p}_{0}}{1- \hat{p}_{0}}} & = & \frac{\exp(-1.09) \exp(0.7)}{\exp(-1.09)} = \exp(0.7) = 2.01 \end{align*}\]

  • We can interpret the exponential of the regression coefficient as the odds ratio of having low birthweight babies for smoking mothers compared to non-smoking mothers.

  • Here, the estimated odds ratio is \(\exp(0.7) = 2.01\) so the odds of having a low birthweight baby almost doubles for smoking mothers compared to non-smoking mothers.

Interpretation

  • In general,

    • if \(b_1>0\), then \(\exp(b_1) > 1\) so the odds increases as \(X\) increases;

    • if \(b_1<0\), then \(0 < \exp(b_1) < 1\) so the odds decreases as \(X\) increases;

    • if \(b_1=0\), the odds ratio is 1 so the odds does not change with \(X\) according to the assumed model.

Prediction

  • We can use logistic regression models to predict the unknown value of the response variable \(Y\) given the value of the predictor \(X\).

\[\begin{align*} \hat{p} & = & \frac{\exp(b_0 + b_1x)}{1 + \exp(b_0 + b_1x)} \end{align*}\]

  • For the above example, \[\begin{align*} \hat{p} & = & \frac{\exp(-1.09 + 0.70x)}{1 + \exp(-1.09 + 0.70x)} \end{align*}\]

Prediction

  • Therefore, the estimated probability of having a low birthweight baby for non-smoking mothers, \(x=0\), is \[\begin{align*} \hat{p} & = & \frac{\exp(-1.09)}{1 + \exp(-1.09)} = 0.25 \end{align*}\]

  • This probability increases for mothers who smoke during pregnancy, \(x=1\), \[\begin{align*} \hat{p} & = & \frac{\exp(-1.09 + 0.7)}{1 + \exp(-1.09 + 0.7)} = 0.40 \end{align*}\]

  • That is, the estimated risk of having a low birthweight baby increases by 60% (from 0.25 to 0.40) if a mother smokes during her pregnancy.
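
  • Both probabilities can be reproduced with predict() on the response scale (a minimal sketch; newmoms is an illustrative name):

Code
newmoms <- data.frame(smoke = factor(c(0, 1)))      # non-smoking and smoking
predict(fit, newdata = newmoms, type = "response")  # ~0.25 and ~0.40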

Ex 2: Logistic Regression with One Numerical Predictor

  • For the most part, we follow similar steps to fit the model, estimate regression parameters, perform hypothesis testing, and predict unknown values of the response variable.

  • As an example, we want to investigate the relationship between having a low birthweight baby, \(Y\), and mother’s age at the time of pregnancy, \(X\).

Logistic Regression with One Numerical Predictor

Code
fit <- glm(low ~ age, family = 'binomial', data = birthwt)
fit

Call:  glm(formula = low ~ age, family = "binomial", data = birthwt)

Coefficients:
(Intercept)          age  
    0.38458     -0.05115  

Degrees of Freedom: 188 Total (i.e. Null);  187 Residual
Null Deviance:      234.7 
Residual Deviance: 231.9    AIC: 235.9

Logistic Regression with One Numerical Predictor

  • Finding confidence intervals and performing hypothesis testing remain as before, so we focus on prediction and interpreting the point estimates.

  • For the above example, the point estimates for the regression parameters are \(b_0=0.38\) and \(b_1=-0.05\).

  • While the intercept is the log odds when \(x=0\), it is not reasonable to interpret its exponential as the baseline odds since mother’s age cannot be zero.

Logistic Regression with One Numerical Predictor

  • To interpret \(b_1\), consider mothers whose age is 20 years old at the time of pregnancy, \[\begin{align*} \log \Big(\frac{\hat{p}_{20}}{1- \hat{p}_{20}} \Big) & = & 0.38 - 0.05 \times 20\\ \frac{\hat{p}_{20}}{1- \hat{p}_{20}} & = & \exp(0.38 - 0.05 \times 20)\\ & = & \exp(0.38) \exp(- 0.05 \times 20) \end{align*}\]

Logistic Regression with One Numerical Predictor

  • For mothers who are one year older (i.e., one unit increase in age), we have \[\begin{align*} \log \Big(\frac{\hat{p}_{21}}{1- \hat{p}_{21}} \Big) & = & 0.38 - 0.05 \times 21\\ \frac{\hat{p}_{21}}{1- \hat{p}_{21}} & = & \exp(0.38 - 0.05 \times 21)\\ & = & \exp(0.38) \exp(- 0.05 \times 21) \end{align*}\]

Logistic Regression with One Numerical Predictor

  • The odds ratio for comparing 21 year old mothers to 20 year old mothers is \[\begin{align*} \frac{\frac{\hat{p}_{21}}{1- \hat{p}_{21}}}{\frac{\hat{p}_{20}}{1- \hat{p}_{20}}} & = & \frac{ \exp(0.38) \exp(- 0.05 \times 21))}{ \exp(0.38) \exp(- 0.05 \times 20)}\\ & = & \exp(- 0.05 \times 21 + 0.05 \times 20)\\ & = & \exp(- 0.05) \end{align*}\]

  • Therefore, \(\exp(b_1)\) is the estimated odds ratio comparing 21 year old mothers to 20 year old mothers.

Logistic Regression with One Numerical Predictor

  • In general, \(\exp(b_1)\) is the estimated odds ratio for comparing two subpopulations, whose predictor values are \(x+1\) and \(x\),

\[\begin{align*} \frac{\frac{\hat{p}_{x+1}}{1- \hat{p}_{x+1}}}{\frac{\hat{p}_{x}}{1- \hat{p}_{x}}} & = & \exp(b_1) \end{align*}\]
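
  • Equivalently in R, exponentiating the age coefficient gives this odds ratio (assuming fit is the low ~ age model above):

Code
exp(coef(fit)["age"])  # odds ratio per one-year increase in age, ~0.95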

Logistic Regression with One Numerical Predictor

  • As before, we can use the estimated regression parameters to find \(\hat{p}\) and predict the unknown value of the response variable. \[\begin{align*} \hat{p} & = & \frac{\exp(b_0 + b_1x)}{1 + \exp(b_0 + b_1x)} = \frac{\exp(0.38 -0.05x)}{1 + \exp(0.38 -0.05x)}. \end{align*}\]

  • For example, for mothers who are 20 years old at the time of pregnancy, the estimated probability of having a low birthweight baby is \[\begin{align*} \hat{p} & = & \frac{\exp(0.38 -0.05 \times 20)}{1 + \exp(0.38 -0.05 \times 20)} = 0.35. \end{align*}\]
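
  • The same number can be obtained with predict() (a quick check):

Code
predict(fit, newdata = data.frame(age = 20), type = "response")  # ~0.35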

Ex3: Logistic Regression with Multiple Variables

  • Including multiple explanatory variables (predictors) in a logistic regression model is easy.

  • Similar to linear regression models, we specify the model formula by entering the response variable on the left side of the “~” symbol and the explanatory variables (separated by “+” signs) on the right side.

Logistic Regression with Multiple Variables

Code
fit <- glm(low ~ age + smoke, family = 'binomial', 
           data = birthwt)
fit

Call:  glm(formula = low ~ age + smoke, family = "binomial", data = birthwt)

Coefficients:
(Intercept)          age       smoke1  
    0.06091     -0.04978      0.69185  

Degrees of Freedom: 188 Total (i.e. Null);  186 Residual
Null Deviance:      234.7 
Residual Deviance: 227.3    AIC: 233.3

Logistic Regression with Multiple Variables

Code
summary(fit)

Call:
glm(formula = low ~ age + smoke, family = "binomial", data = birthwt)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.1589  -0.8668  -0.7470   1.2821   1.7925  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)  
(Intercept)  0.06091    0.75732   0.080   0.9359  
age         -0.04978    0.03197  -1.557   0.1195  
smoke1       0.69185    0.32181   2.150   0.0316 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 234.67  on 188  degrees of freedom
Residual deviance: 227.28  on 186  degrees of freedom
AIC: 233.28

Number of Fisher Scoring iterations: 4

Logistic Regression with Multiple Variables

Code
confint(fit)
                  2.5 %     97.5 %
(Intercept) -1.41611481 1.56446444
age         -0.11450032 0.01132921
smoke1       0.06214062 1.32718026

Ex 4: Logistic Regression with One Numerical Predictor

alzh vs hippo

Code
alzh.logistic=glm(alzh ~ hippo, family="binomial", data=alzheimer_data)
summary(alzh.logistic) %>% coefficients()
             Estimate Std. Error   z value     Pr(>|z|)
(Intercept)  6.093409 0.41017858  14.85550 6.408616e-50
hippo       -1.184699 0.06916233 -17.12926 8.979352e-66
  • Note, glm stands for generalized linear models, which include linear regression and logistic regression as special cases.

  • \(b_0=6.093409, b_1=-1.184699\).

  • How should we interpret these values?

Interpret Results

  • Consider \(x_i=x_0\). The log-odds of AD is

\[logit(\hat p_0) = b_0 + b_1 x_0\]

  • If we increase the explanatory variable by one unit, the log-odds becomes

\[logit(\hat p_1) = b_0 + b_1 (x_0+1)\]

  • Therefore, \(logit(\hat p_1)-logit(\hat p_0)=b_1=-1.184699\).

Interpret Results

  • We have obtained the change in log-odds associated with one-unit increase in \(x\).

  • How can we convert it to a change in odds, which is easier to interpret?

  • By the definition of logit, we have

\[logit(\hat p_1) - logit(\hat p_0)=log\frac{\hat p_1}{1-\hat p_1} -log \frac{\hat p_0}{1-\hat p_0}=log \frac{\frac{\hat p_1}{1-\hat p_1}}{\frac{\hat p_0}{1-\hat p_0}}=-1.184699\]

Interpret Results

  • Taking the exponential of both sides of the equation,

\[\frac{\frac{\hat p_1}{1-\hat p_1}}{\frac{\hat p_0}{1-\hat p_0}}=exp(-1.184699)=0.3058\]

  • The new odds is 30.58% of the old odds.
  • In other words, if the hippocampal volume increases by 1 cc, the odds of AD decreases by 1 - 30.58% = 69.42%.
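
  • This conversion can be done directly from the fitted model (a minimal sketch):

Code
b1 <- coef(alzh.logistic)["hippo"]
exp(b1)      # new odds as a fraction of the old odds, ~0.3058
1 - exp(b1)  # estimated decrease in odds per 1 cc, ~0.6942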

Confidence intervals

  • The confint function in R can be used to find confidence intervals for log-odds.
Code
confint.default(alzh.logistic)[2,]
    2.5 %    97.5 % 
-1.320255 -1.049144 
  • We need to convert the results to the odds scale.
Code
exp(confint.default(alzh.logistic)[2,])
    2.5 %    97.5 % 
0.2670672 0.3502375 

Confidence intervals

Code
1-exp(confint.default(alzh.logistic))[2,]
    2.5 %    97.5 % 
0.7329328 0.6497625 
  • Recall that the estimated decrease of odds in AD associated with one-unit increase of hippocampal volume is 69.42%.

  • A 95% confidence interval is [64.98%, 73.29%].

P-value

Code
summary(alzh.logistic) %>% coefficients()
             Estimate Std. Error   z value     Pr(>|z|)
(Intercept)  6.093409 0.41017858  14.85550 6.408616e-50
hippo       -1.184699 0.06916233 -17.12926 8.979352e-66
  • The p-value is very small, about \(9.0\times 10^{-66}\),
    • leading us to reject the null hypothesis \(H_0: \beta_1=0\)
    • and suggesting that AD is significantly associated with hippocampal volume.

Predicted Values vs Observed Values

  • Note, when using predict, it is necessary to specify the correct type of prediction.
    • The default is on the scale of the linear predictors (log-odds).
    • The response type gives predicted probabilities (see the sketch below).
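
  • A quick sketch of the two scales (plogis() is the inverse logit):

Code
head(predict(alzh.logistic))                     # linear predictor: log-odds
head(predict(alzh.logistic, type = "response"))  # predicted probabilities
head(plogis(predict(alzh.logistic)))             # same as type = "response"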

Predicted vs Observed

Code
plot(alzheimer_data$alzh, predict(alzh.logistic, type="response"),
     xlab="observed alzh", ylab="predicted alzh")

Predicted vs \(X\)

Code
plot(alzheimer_data$hippo, predict(alzh.logistic, type="response"),
     xlab="hippo volume", ylab="predicted alzh")

Ex 5: Logistic Regression with Multiple Variables

  • Similar to linear regression, we can include multiple explanatory variables in logistic regression.

  • Connect \(p\) (i.e., \(P(Y=1)\)) to a linear function of the predictors: \[log\frac{\hat p}{1-\hat p} = b_0 + b_1 x_{1} + \ldots + b_q x_{q}\]

Example

Code
alzh.glm.multi = glm(alzh ~ age + female + educ, family=binomial, data=alzheimer_data)
summary(alzh.glm.multi)$coefficients
               Estimate  Std. Error   z value     Pr(>|z|)
(Intercept) -2.02463880 0.455844285 -4.441514 8.932811e-06
age          0.04015417 0.005001781  8.027974 9.909571e-16
female1     -0.87341587 0.105761683 -8.258339 1.477131e-16
educ        -0.08759757 0.015192451 -5.765862 8.124175e-09

Interpretation

  • Consider the age variable. The estimated coefficient is 0.04015417. What information does it provide?
  • The estimated log-odds of AD is \[logit(\hat p_0) = b_0 + b_{age} \, age + b_{female} \, female + b_{educ} \, educ\]
  • Let \(\hat p_1\) denote the estimated probability after one more year of age: \[logit(\hat p_1) = b_0 + b_{age} \, (age+1) + b_{female} \, female + b_{educ} \, educ\]

Interpretation

  • The estimated change in log-odds is \[logit(\hat p_1) - logit(\hat p_0)=log\frac{\hat p_1}{1-\hat p_1} -log \frac{\hat p_0}{1-\hat p_0}=0.04015417\]
  • Taking the exponential of both sides, we have \[\frac{\frac{\hat p_1}{1-\hat p_1}}{\frac{\hat p_0}{1-\hat p_0}} = exp(0.04015417)=104.1\%\]
  • The new odds is 104.1% of the old odds, i.e., the odds of AD increases by 104.1% - 1 = 4.1% with a one-year increase in age.
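
  • The same numbers can be computed from the fitted model (assuming alzh.glm.multi from above):

Code
exp(coef(alzh.glm.multi)["age"])      # ~1.041: new odds / old odds
exp(coef(alzh.glm.multi)["age"]) - 1  # ~0.041: a 4.1% increase per year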

Interpret Coefficients

  • The odds of AD one year later is \(exp(0.04015417)=104.1\%\) of the current odds.

  • When holding gender and education fixed, the estimated increase in the odds of AD in a year is \(e^{0.04015417}-1=4.097\%\).

  • A 95% confidence interval is [3.08%, 5.12%].

Code
confint.default(alzh.glm.multi)[2,]
     2.5 %     97.5 % 
0.03035086 0.04995748 
Code
exp(confint.default(alzh.glm.multi)[2,])-1
     2.5 %     97.5 % 
0.03081614 0.05122640 

Interpret Coefficients

  • What if we are interested in the increase in odds of AD in ten years (everything else is fixed)?
  • The estimated increase in odds of AD in 10 years is \[e^{10*0.04015417}-1=49.4\%\]
  • A \(95\%\) C.I. for 10-year increase in odds: [35.5%, 64.8%].
Code
10*confint.default(alzh.glm.multi)[2,]
    2.5 %    97.5 % 
0.3035086 0.4995748 
Code
exp(10*confint.default(alzh.glm.multi)[2,])-1
    2.5 %    97.5 % 
0.3546032 0.6480204 

Significance and P-value

  • Lastly, let’s take a look at significance.
  • All three explanatory variables are significant.
Code
summary(alzh.glm.multi)$coefficients[-1,]
           Estimate  Std. Error   z value     Pr(>|z|)
age      0.04015417 0.005001781  8.027974 9.909571e-16
female1 -0.87341587 0.105761683 -8.258339 1.477131e-16
educ    -0.08759757 0.015192451 -5.765862 8.124175e-09