2. The Binary Logit Regression Model
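The binary logit model is represented as $\text{Prob}(y=1 \mid x) = \frac{\exp(x\beta)}{1+\exp(x\beta)} = \Lambda(x\beta)$, where the link function $\Lambda$ is the cumulative distribution function of the standard logistic distribution. This chapter examines how car ownership (owncar) is affected by monthly income (income), age, and gender (male). See the appendix for details about the data set.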
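As a minimal sketch of the link function described above, the following Python snippet evaluates the standard logistic distribution function and its inverse (the log odds). The function names `logistic_cdf` and `logit` are illustrative choices, not names used by any of the packages in this chapter.

```python
import math

def logistic_cdf(z: float) -> float:
    """Cumulative standard logistic distribution: 1 / (1 + exp(-z))."""
    # Branch on the sign of z so that exp() is never called on a large positive value
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logit(p: float) -> float:
    """Inverse of the logistic CDF: the log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# A linear index of zero corresponds to a probability of one half
print(logistic_cdf(0.0))
```

Applying `logit` to a predicted probability recovers the linear index, which is why the coefficients of the model are read as changes in log odds.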
2.1 Binary Logit in STATA (.logit)
. logistic owncar income age male
LR chi2(3) = 18.24
Prob > chi2 = 0.0004
Log likelihood = -273.84758 Pseudo R2 = 0.0322
------------------------------------------------------------------------------
owncar | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
income | .9898826 .5677504 -0.02 0.986 .3216431 3.046443
age | 1.279626 .088997 3.55 0.000 1.116561 1.466505
male | 1.513669 .3111388 2.02 0.044 1.011729 2.264633
------------------------------------------------------------------------------
. logit
. logit owncar income age male
Iteration 1: log likelihood = -273.93537
Iteration 2: log likelihood = -273.84761
Iteration 3: log likelihood = -273.84758
Logistic regression Number of obs = 437
LR chi2(3) = 18.24
Prob > chi2 = 0.0004
Log likelihood = -273.84758 Pseudo R2 = 0.0322
------------------------------------------------------------------------------
owncar | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
income | -.010169 .5735533 -0.02 0.986 -1.134313 1.113975
age | .2465678 .0695492 3.55 0.000 .1102539 .3828817
male | .4145366 .2055527 2.02 0.044 .0116606 .8174126
_cons | -4.682741 1.474519 -3.18 0.001 -7.572745 -1.792738
------------------------------------------------------------------------------
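The two listings above report the same model on different scales: the odds ratios are the exponentiated coefficients. The Python sketch below checks this using the coefficients and standard errors copied from the .logit output. The delta-method standard error and the 1.959964 critical value are assumptions about how the odds-ratio columns are obtained; they reproduce the printed figures to the digits shown.

```python
import math

# Coefficients and standard errors copied from the .logit output above
coef = {"income": -0.010169, "age": 0.2465678, "male": 0.4145366}
se = {"income": 0.5735533, "age": 0.0695492, "male": 0.2055527}

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution

for name, b in coef.items():
    odds_ratio = math.exp(b)
    # Delta method: se(exp(b)) is approximately exp(b) * se(b)
    or_se = odds_ratio * se[name]
    # The interval is built on the coefficient scale and then exponentiated
    lower = math.exp(b - Z_975 * se[name])
    upper = math.exp(b + Z_975 * se[name])
    print(f"{name:7s} OR={odds_ratio:.6f} SE={or_se:.6f} CI=({lower:.6f}, {upper:.6f})")
```

Because the interval is exponentiated rather than computed as OR plus or minus a multiple of its standard error, it is not symmetric around the odds ratio.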
. predict r, residual
. test income age
( 1) income = 0
( 2) age = 0
chi2( 2) = 12.57
Prob > chi2 = 0.0019
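The p-value of the joint Wald test can be checked by hand. With two degrees of freedom the chi-squared distribution is exponential with mean 2, so its upper-tail probability has a closed form. The helper name `chi2_sf_df2` below is hypothetical, and the sketch uses only the rounded statistic printed above.

```python
import math

def chi2_sf_df2(x: float) -> float:
    """Upper-tail probability of a chi-squared variable with 2 degrees of freedom."""
    # For df = 2 the survival function reduces to exp(-x / 2)
    return math.exp(-x / 2.0)

wald_chi2 = 12.57  # chi2(2) statistic from the .test output above
p_value = chi2_sf_df2(wald_chi2)
print(f"Prob > chi2 = {p_value:.4f}")  # prints 0.0019
```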
2.2 Using the SPost Module in STATA
. fitstat
Log-Lik Intercept Only: -282.965 Log-Lik Full Model: -273.848
D(433): 547.695 LR(3): 18.235
Prob > LR: 0.000
McFadden's R2: 0.032 McFadden's Adj R2: 0.018
Maximum Likelihood R2: 0.041 Cragg & Uhler's R2: 0.056
McKelvey and Zavoina's R2: 0.059 Efron's R2: 0.040
Variance of y*: 3.495 Variance of error: 3.290
Count R2: 0.638 Adj Count R2: -0.033
AIC: 1.272 AIC*n: 555.695
BIC: -2084.916 BIC': 0.005
. di 2*(-273.848 - (-282.965))
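Several of the fitstat measures follow directly from the two log likelihoods. The Python sketch below recomputes them from the three-decimal values printed above, so results agree with the listing only up to rounding. It assumes K = 4 estimated parameters (three slopes plus the intercept), which is what reproduces the adjusted McFadden figure.

```python
import math

ll_full = -273.848   # Log-Lik Full Model
ll_null = -282.965   # Log-Lik Intercept Only
n_obs = 437
k_params = 4         # assumption: 3 regressors plus the constant

lr = 2.0 * (ll_full - ll_null)                       # likelihood ratio statistic
mcfadden = 1.0 - ll_full / ll_null                   # McFadden's R2
mcfadden_adj = 1.0 - (ll_full - k_params) / ll_null  # McFadden's adjusted R2
ml_r2 = 1.0 - math.exp(-lr / n_obs)                  # Maximum Likelihood R2
# Cragg & Uhler rescale the ML R2 by its maximum attainable value
cragg_uhler = ml_r2 / (1.0 - math.exp(2.0 * ll_null / n_obs))
aic = (-2.0 * ll_full + 2.0 * k_params) / n_obs      # per-observation AIC

print(f"LR={lr:.3f} McFadden={mcfadden:.3f} Adj={mcfadden_adj:.3f}")
print(f"ML R2={ml_r2:.3f} Cragg-Uhler={cragg_uhler:.3f} AIC={aic:.3f}")
```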
. listcoef, help
Odds of: 1 vs 0
----------------------------------------------------------------------
owncar | b z P>|z| e^b e^bStdX SDofX
-------------+--------------------------------------------------------
income | -0.01017 -0.018 0.986 0.9899 0.9982 0.1792
age | 0.24657 3.545 0.000 1.2796 1.4876 1.6108
male | 0.41454 2.017 0.044 1.5137 1.2279 0.4953
----------------------------------------------------------------------
b = raw coefficient
z = z-score for test of b=0
P>|z| = p-value for z-test
e^b = exp(b) = factor change in odds for unit increase in X
e^bStdX = exp(b*SD of X) = change in odds for SD increase in X
SDofX = standard deviation of X
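The e^b and e^bStdX columns can be reproduced from the b and SDofX columns using the definitions in the help text. The following Python sketch does so with the rounded values from the table above, so agreement is to about four decimals.

```python
import math

# (b, SDofX) pairs copied from the listcoef table above
rows = {
    "income": (-0.01017, 0.1792),
    "age": (0.24657, 1.6108),
    "male": (0.41454, 0.4953),
}

factor_change = {}
for name, (b, sd_x) in rows.items():
    e_b = math.exp(b)              # factor change in odds for a unit increase in X
    e_b_std = math.exp(b * sd_x)   # factor change in odds for a one-SD increase in X
    factor_change[name] = (e_b, e_b_std)
    print(f"{name:7s} e^b={e_b:.4f} e^bStdX={e_b_std:.4f}")
```

For example, a one standard deviation increase in age (about 1.6 years) multiplies the odds of owning a car by roughly 1.49, holding the other variables constant.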
. prtab male
----------------------
male | Prediction
----------+-----------
0 | 0.6017
1 | 0.6958
----------------------
income age male
x= .61683982 20.691076 .57208238
. prvalue, x(male=0) rest(mean)
Pr(y=1|x): 0.6017 95% ci: (0.5286,0.6706)
Pr(y=0|x): 0.3983 95% ci: (0.3294,0.4714)
income age male
x= .61683982 20.691076 0
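The predicted probabilities reported by .prtab and .prvalue can be verified by plugging the coefficient estimates and the sample means into the logistic function. This is a sketch under the assumption that income and age are held at the means printed above while male is set to 0 or 1; the function name `predicted_prob` is illustrative.

```python
import math

# Coefficient estimates from the .logit output in section 2.1
b_cons, b_income, b_age, b_male = -4.682741, -0.010169, 0.2465678, 0.4145366
# Sample means printed below the .prtab table
mean_income, mean_age = 0.61683982, 20.691076

def predicted_prob(male: int) -> float:
    """Pr(owncar = 1) with income and age at their means and male fixed."""
    xb = b_cons + b_income * mean_income + b_age * mean_age + b_male * male
    return 1.0 / (1.0 + math.exp(-xb))

p_female = predicted_prob(0)
p_male = predicted_prob(1)
print(f"male=0: {p_female:.4f}  male=1: {p_male:.4f}")
```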
. prchange, help
min->max 0->1 -+1/2 -+sd/2 MargEfct
income -0.0019 -0.0023 -0.0023 -0.0004 -0.0023
age 0.4404 0.0032 0.0555 0.0893 0.0556
male 0.0940 0.0940 0.0932 0.0462 0.0934
0 1
Pr(y|x) 0.3430 0.6570
income age male
x= .61684 20.6911 .572082
sd(x)= .17918 1.61081 .495344
Pr(y|x): probability of observing each y for specified x values
Avg|Chg|: average of absolute value of the change across categories
Min->Max: change in predicted probability as x changes from its minimum to
its maximum
0->1: change in predicted probability as x changes from 0 to 1
-+1/2: change in predicted probability as x changes from 1/2 unit below
base value to 1/2 unit above
-+sd/2: change in predicted probability as x changes from 1/2 standard
dev below base to 1/2 standard dev above
MargEfct: the partial derivative of the predicted probability/rate with
respect to a given independent variable
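For a logit model the marginal effect of a continuous regressor is the coefficient times p(1 - p), and the 0->1 change for a dummy is a difference of two predicted probabilities. The Python sketch below recomputes the MargEfct column and the 0->1 change for male, assuming all variables are evaluated at the means shown in the listing. Variable names are illustrative.

```python
import math

coef = {"income": -0.010169, "age": 0.2465678, "male": 0.4145366}
cons = -4.682741
means = {"income": 0.61684, "age": 20.6911, "male": 0.572082}

def prob(x: dict) -> float:
    """Pr(y = 1 | x) under the fitted logit model."""
    xb = cons + sum(coef[k] * x[k] for k in coef)
    return 1.0 / (1.0 + math.exp(-xb))

p_bar = prob(means)  # Pr(y=1|x) at the means
# Partial derivative of the probability with respect to each regressor
marg_efct = {k: coef[k] * p_bar * (1.0 - p_bar) for k in coef}
# Discrete change in probability as male moves from 0 to 1
change_male = prob({**means, "male": 1}) - prob({**means, "male": 0})

print(f"Pr(y=1|x) = {p_bar:.4f}")
print({k: round(v, 4) for k, v in marg_efct.items()})
print(f"male 0->1 = {change_male:.4f}")
```

For a binary regressor such as male, the discrete change is usually the more meaningful quantity, since a half-unit or infinitesimal change in a dummy has no direct interpretation.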
. prgen income, from(.1) to(1.5) x(male=1) rest(median) generate(ppcar)
income age male
x= .58200002 21 1
2.3 Using the SAS LOGISTIC and PROBIT Procedures
PROC LOGISTIC DESCENDING DATA = masil.students;
MODEL owncar = income age male;
RUN;
Model Information
Data Set MASIL.STUDENTS
Response Variable owncar
Number of Response Levels 2
Model binary logit
Optimization Technique Fisher's scoring
Number of Observations Read 437
Number of Observations Used 437
Response Profile
Ordered Total
Value owncar Frequency
1 1 284
2 0 153
Probability modeled is owncar=1.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 567.930 555.695
SC 572.010 572.015
-2 Log L 565.930 547.695
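The AIC and SC rows can be derived from the -2 Log L row. The sketch below assumes the usual definitions, AIC = -2 Log L + 2k and SC = -2 Log L + k ln(N), with k counting the intercept; these assumptions reproduce the printed values.

```python
import math

n_obs = 437
neg2ll = {"intercept_only": 565.930, "with_covariates": 547.695}
n_params = {"intercept_only": 1, "with_covariates": 4}

fit = {}
for model, value in neg2ll.items():
    k = n_params[model]
    aic = value + 2 * k               # Akaike information criterion
    sc = value + k * math.log(n_obs)  # Schwarz criterion
    fit[model] = (aic, sc)
    print(f"{model:16s} AIC={aic:.3f} SC={sc:.3f}")
```

The AIC favors the model with covariates, while the SC values are nearly identical (572.010 versus 572.015) because the Schwarz criterion penalizes the three extra parameters more heavily.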
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 18.2351 3 0.0004
Score 17.4697 3 0.0006
Wald 16.7977 3 0.0008
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -4.6827 1.4745 10.0855 0.0015
income 1 -0.0102 0.5736 0.0003 0.9859
age 1 0.2466 0.0695 12.5686 0.0004
male 1 0.4145 0.2056 4.0670 0.0437
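The Wald chi-square reported by SAS is the square of the z statistic reported by STATA, and its p-value is the upper tail of a chi-squared distribution with one degree of freedom. The Python sketch below uses the full-precision estimates from the STATA listing in section 2.1, since the four-decimal SAS values are too coarse to reproduce the statistics exactly. The helper `wald_test` is a hypothetical name.

```python
import math

def wald_test(estimate: float, std_err: float) -> tuple:
    """Return the Wald chi-square (1 df) and its p-value for H0: parameter = 0."""
    chi2 = (estimate / std_err) ** 2
    # P(chi2_1 > x) equals erfc(sqrt(x / 2)), the two-sided normal tail
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

params = {
    "Intercept": (-4.682741, 1.474519),
    "income": (-0.010169, 0.5735533),
    "age": (0.2465678, 0.0695492),
    "male": (0.4145366, 0.2055527),
}

results = {name: wald_test(b, se) for name, (b, se) in params.items()}
for name, (chi2, p) in results.items():
    print(f"{name:9s} chi2={chi2:.4f} p={p:.4f}")
```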
Odds Ratio Estimates
Point 95% Wald
Effect Estimate Confidence Limits
income 0.990 0.322 3.046
age 1.280 1.117 1.467
male 1.514 1.012 2.265
Association of Predicted Probabilities and Observed Responses
Percent Concordant 58.9 Somers' D 0.246
Percent Discordant 34.3 Gamma 0.264
Percent Tied 6.8 Tau-a 0.112
Pairs 43452 c 0.623
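The four rank-correlation indexes on the right of the association table are functions of the concordant, discordant, and tied percentages on the left. The following sketch recomputes them from the rounded percentages, so it matches the listing to three decimals. The formulas are the standard definitions of these indexes and are stated here as an assumption about what the procedure computes.

```python
n_events, n_nonevents = 284, 153
n_obs = n_events + n_nonevents
pairs = n_events * n_nonevents  # every event paired with every non-event

pct_concordant, pct_discordant, pct_tied = 58.9, 34.3, 6.8

# Difference between concordant and discordant shares of all pairs
somers_d = (pct_concordant - pct_discordant) / 100.0
# Gamma ignores tied pairs in the denominator
gamma = (pct_concordant - pct_discordant) / (pct_concordant + pct_discordant)
# Tau-a divides by all possible pairs of observations, not only event/non-event pairs
tau_a = somers_d * pairs / (0.5 * n_obs * (n_obs - 1))
# c counts each tied pair as half concordant
c_stat = (pct_concordant + 0.5 * pct_tied) / 100.0

print(pairs, round(somers_d, 3), round(gamma, 3), round(tau_a, 3), round(c_stat, 3))
```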
PROC LOGISTIC DESCENDING DATA = masil.students;
MODEL owncar(EVENT='1') = income age male;
RUN;
PROC LOGISTIC DESCENDING DATA = masil.students;
MODEL owncar = income age male;
UNITS income=SD age=SD;
RUN;
Effect Unit Estimate
income 0.1792 0.998
age 1.6108 1.488
PROC PROBIT DATA = masil.students;
CLASS owncar;
MODEL owncar = income age male /DIST=LOGISTIC;
RUN;
Model Information
Data Set MASIL.STUDENTS
Dependent Variable owncar
Number of Observations 437
Name of Distribution Logistic
Log Likelihood -273.847577
Number of Observations Read 437
Number of Observations Used 437
Class Level Information
Name Levels Values
owncar 2 0 1
Response Profile
Ordered Total
Value owncar Frequency
1 0 153
2 1 284
PROC PROBIT is modeling the probabilities of levels of owncar having LOWER Ordered Values in
the response profile table.
Algorithm converged.
Type III Analysis of Effects
Wald
Effect DF Chi-Square Pr > ChiSq
income 1 0.0003 0.9859
age 1 12.5686 0.0004
male 1 4.0670 0.0437
Analysis of Parameter Estimates
Standard 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
Intercept 1 4.6827 1.4745 1.7927 7.5727 10.09 0.0015
income 1 0.0102 0.5736 -1.1140 1.1343 0.00 0.9859
age 1 -0.2466 0.0695 -0.3829 -0.1103 12.57 0.0004
male 1 -0.4145 0.2056 -0.8174 -0.0117 4.07 0.0437
2.4 Using the SAS GENMOD and QLIM Procedures
PROC GENMOD DATA = masil.students DESC;
MODEL owncar = income age male /DIST=BINOMIAL LINK=LOGIT;
RUN;
Model Information
Data Set MASIL.STUDENTS
Distribution Binomial
Link Function Logit
Dependent Variable owncar
Number of Observations Read 437
Number of Observations Used 437
Number of Events 284
Number of Trials 437
Response Profile
Ordered Total
Value owncar Frequency
1 1 284
2 0 153
PROC GENMOD is modeling the probability that owncar='1'.
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 433 547.6952 1.2649
Scaled Deviance 433 547.6952 1.2649
Pearson Chi-Square 433 436.4352 1.0079
Scaled Pearson X2 433 436.4352 1.0079
Log Likelihood -273.8476
Algorithm converged.
Analysis Of Parameter Estimates
Standard Wald 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
Intercept 1 -4.6827 1.4745 -7.5727 -1.7927 10.09 0.0015
income 1 -0.0102 0.5736 -1.1343 1.1140 0.00 0.9859
age 1 0.2466 0.0695 0.1103 0.3829 12.57 0.0004
male 1 0.4145 0.2056 0.0117 0.8174 4.07 0.0437
Scale 0 1.0000 0.0000 1.0000 1.0000
NOTE: The scale parameter was held fixed.
PROC GENMOD DATA = masil.students DESC;
CLASS male;
MODEL owncar = income age male /DIST=BINOMIAL LINK=LOGIT;
RUN;
PROC GENMOD DATA = masil.students DESC;
FWDLINK link=LOG(_MEAN_/(1-_MEAN_));
INVLINK invlink=1/(1+EXP(-1*_XBETA_));
MODEL owncar = income age male /DIST=BINOMIAL;
RUN;
PROC QLIM DATA=masil.students;
MODEL owncar = income age male;
ENDOGENOUS owncar ~ DISCRETE (DIST=LOGIT);
RUN;
PROC QLIM DATA=masil.students;
MODEL owncar = income age male /DISCRETE (DIST=LOGIT);
RUN;
Discrete Response Profile of owncar
Index Value Frequency Percent
1 0 153 35.01
2 1 284 64.99
Model Fit Summary
Number of Endogenous Variables 1
Endogenous Variable owncar
Number of Observations 437
Log Likelihood -273.84758
Maximum Absolute Gradient 9.63219E-6
Number of Iterations 8
AIC 555.69515
Schwarz Criterion 572.01489
Goodness-of-Fit Measures
Measure Value Formula
Likelihood Ratio (R) 18.235 2 * (LogL - LogL0)
Upper Bound of R (U) 565.93 - 2 * LogL0
Aldrich-Nelson 0.0401 R / (R+N)
Cragg-Uhler 1 0.0409 1 - exp(-R/N)
Cragg-Uhler 2 0.0563 (1-exp(-R/N)) / (1-exp(-U/N))
Estrella 0.0415 1 - (1-R/U)^(U/N)
Adjusted Estrella 0.0234 1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
McFadden's LRI 0.0322 R / U
Veall-Zimmermann 0.071 (R * (U+N)) / (U * (R+N))
McKelvey-Zavoina 0.1699
N = # of observations, K = # of regressors
Algorithm converged.
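The Formula column of the goodness-of-fit table can be evaluated directly. As a check, the Python sketch below applies three of the listed formulas (Aldrich-Nelson, Estrella, and Veall-Zimmermann) to the R, U, and N values printed in the table. Variable names are illustrative.

```python
lr_stat = 18.235      # R: likelihood ratio, 2 * (LogL - LogL0)
upper_bound = 565.93  # U: upper bound of R, -2 * LogL0
n_obs = 437           # N: number of observations

# R / (R + N)
aldrich_nelson = lr_stat / (lr_stat + n_obs)
# 1 - (1 - R/U)^(U/N)
estrella = 1.0 - (1.0 - lr_stat / upper_bound) ** (upper_bound / n_obs)
# (R * (U + N)) / (U * (R + N))
veall_zimmermann = (lr_stat * (upper_bound + n_obs)) / (upper_bound * (lr_stat + n_obs))

print(f"Aldrich-Nelson   = {aldrich_nelson:.4f}")
print(f"Estrella         = {estrella:.4f}")
print(f"Veall-Zimmermann = {veall_zimmermann:.4f}")
```

Veall-Zimmermann is the Aldrich-Nelson measure divided by its upper bound U / (U + N), which is why it is the larger of the two.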
Parameter Estimates
                           Standard             Approx
Parameter     Estimate        Error  t Value  Pr > |t|
Intercept    -4.682741     1.474519    -3.18    0.0015
income       -0.010169     0.573553    -0.02    0.9859
age           0.246568     0.069549     3.55    0.0004
male          0.414537     0.205553     2.02    0.0437
PROC CATMOD DATA = masil.students;
DIRECT income age;
MODEL owncar = income age male /NOPROFILE;
RUN;
2.5 Binary Logit in LIMDEP (Logit$)
LOGIT;
Lhs=owncar;
Rhs=ONE,income,age,male;
Marginal Effects; Means$
+---------------------------------------------+
| Multinomial Logit Model |
| Maximum Likelihood Estimates |
| Model estimated: Sep 17, 2005 at 05:31:28PM.|
| Dependent variable OWNCAR |
| Weighting variable None |
| Number of observations 437 |
| Iterations completed 5 |
| Log likelihood function -273.8476 |
| Restricted log likelihood -282.9651 |
| Chi squared 18.23509 |
| Degrees of freedom 3 |
| Prob[ChiSqd > value] = .3933723E-03 |
| Hosmer-Lemeshow chi-squared = 8.44648 |
| P-value= .39111 with deg.fr. = 8 |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
Characteristics in numerator of Prob[Y = 1]
Constant -4.682741385 1.4745190 -3.176 .0015
INCOME -.1016896029E-01 .57355331 -.018 .9859 .61683982
AGE .2465677833 .69549211E-01 3.545 .0004 20.691076
MALE .4145365774 .20555276 2.017 .0437 .57208238
(Note: E+nn or E-nn means multiply by 10 to + or -nn power.)
+--------------------------------------------------------------------+
| Information Statistics for Discrete Choice Model. |
| M=Model MC=Constants Only M0=No Model |
| Criterion F (log L) -273.84758 -282.96512 -302.90532 |
| LR Statistic vs. MC 18.23509 .00000 .00000 |
| Degrees of Freedom 3.00000 .00000 .00000 |
| Prob. Value for LR .00039 .00000 .00000 |
| Entropy for probs. 273.84758 282.96512 302.90532 |
| Normalized Entropy .90407 .93417 1.00000 |
| Entropy Ratio Stat. 58.11548 39.88039 .00000 |
| Bayes Info Criterion 565.93495 584.17004 624.05044 |
| BIC - BIC(no model) 58.11548 39.88039 .00000 |
| Pseudo R-squared .03222 .00000 .00000 |
| Pct. Correct Prec. 63.84439 .00000 50.00000 |
| Means: y=0 y=1 y=2 y=3 yu=4 y=5, y=6 y>=7 |
| Outcome .3501 .6499 .0000 .0000 .0000 .0000 .0000 .0000 |
| Pred.Pr .3501 .6499 .0000 .0000 .0000 .0000 .0000 .0000 |
| Notes: Entropy computed as Sum(i)Sum(j)Pfit(i,j)*logPfit(i,j). |
| Normalized entropy is computed against M0. |
| Entropy ratio statistic is computed against M0. |
| BIC = 2*criterion - log(N)*degrees of freedom. |
| If the model has only constants or if it has no constants, |
| the statistics reported here are not useable. |
+--------------------------------------------------------------------+
+-------------------------------------------+
| Partial derivatives of probabilities with |
| respect to the vector of characteristics. |
| They are computed at the means of the Xs. |
+-------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
Characteristics in numerator of Prob[Y = 1]
Constant -1.055282283 .33183024 -3.180 .0015
INCOME -.2291632775E-02 .12925338 -.018 .9859 .61683982
AGE .5556544593E-01 .15534022E-01 3.577 .0003 20.691076
Marginal effect for dummy variable is P|1 - P|0.
MALE .9403411023E-01 .46726710E-01 2.012 .0442 .57208238
(Note: E+nn or E-nn means multiply by 10 to + or -nn power.)
+----------------------------------------+
| Fit Measures for Binomial Choice Model |
| Logit model for variable OWNCAR |
+----------------------------------------+
| Proportions P0= .350114 P1= .649886 |
| N = 437 N0= 153 N1= 284 |
| LogL = -273.84758 LogL0 = -282.9651 |
| Estrella = 1-(L/L0)^(-2L0/n) = .04153 |
+----------------------------------------+
| Efron | McFadden | Ben./Lerman |
| .03963 | .03222 | .56318 |
| Cramer | Veall/Zim. | Rsqrd_ML |
| .04010 | .07099 | .04087 |
+----------------------------------------+
| Information Akaike I.C. Schwarz I.C. |
| Criteria 1.27161 572.01489 |
+----------------------------------------+
Frequencies of actual & predicted outcomes
Predicted outcome has maximum probability.
Threshold value for predicting Y=1 = .5000
Predicted
------ ---------- + -----
Actual 0 1 | Total
------ ---------- + -----
0 21 132 | 153
1 26 258 | 284
------ ---------- + -----
Total 47 390 | 437
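The classification table above is the basis for the count R2 measures. The sketch below recomputes the share of correct predictions and the adjusted count R2, assuming the usual definition of the latter: correct predictions beyond those obtained by always predicting the more frequent outcome, relative to the cases that rule gets wrong.

```python
# Rows are actual outcomes, columns are predicted outcomes (threshold 0.5)
table = {
    0: {0: 21, 1: 132},
    1: {0: 26, 1: 258},
}

n_obs = sum(sum(row.values()) for row in table.values())
n_correct = table[0][0] + table[1][1]
# Size of the most frequent actual outcome (284 owners)
n_modal = max(sum(row.values()) for row in table.values())

count_r2 = n_correct / n_obs
adj_count_r2 = (n_correct - n_modal) / (n_obs - n_modal)

print(f"Count R2 = {count_r2:.3f}, Adj Count R2 = {adj_count_r2:.3f}")
```

The adjusted measure is negative because the model classifies 279 cases correctly, fewer than the 284 that would be correct by predicting that every student owns a car.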
2.6 Binary Logit in SPSS
LOGISTIC REGRESSION VAR=owncar
/METHOD=ENTER income age male
/CRITERIA PIN(.05) POUT(.10) ITERATE(20) CUT(.5) .