3. The Binary Probit Regression Model
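The binary probit model is $\Pr(y = 1 \mid x) = \Phi(x\beta)$, where $\Phi$ indicates the cumulative standard normal probability distribution; the probit link function is its inverse, $\Phi^{-1}$.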
3.1 Binary Probit in STATA (.probit)
. probit owncar income age male
Iteration 1: log likelihood = -273.84832
Iteration 2: log likelihood = -273.81741
Iteration 3: log likelihood = -273.81741
Probit regression Number of obs = 437
LR chi2(3) = 18.30
Prob > chi2 = 0.0004
Log likelihood = -273.81741 Pseudo R2 = 0.0323
------------------------------------------------------------------------------
owncar | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
income | .0005613 .3476842 0.00 0.999 -.6808873 .6820098
age | .1487005 .0409837 3.63 0.000 .068374 .2290271
male | .2579112 .1256085 2.05 0.040 .0117231 .5040993
_cons | -2.823671 .8730955 -3.23 0.001 -4.534907 -1.112435
------------------------------------------------------------------------------
. listcoef
Observed SD: .47755228
Latent SD: 1.0371456
-------------------------------------------------------------------------------
owncar | b z P>|z| bStdX bStdY bStdXY SDofX
-------------+-----------------------------------------------------------------
income | 0.00056 0.002 0.999 0.0001 0.0005 0.0001 0.1792
age | 0.14870 3.628 0.000 0.2395 0.1434 0.2309 1.6108
male | 0.25791 2.053 0.040 0.1278 0.2487 0.1232 0.4953
-------------------------------------------------------------------------------
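The standardized columns of the listcoef table can be recovered from the raw coefficient, the standard deviation of each regressor (SDofX), and the latent SD. The following is a minimal Python sketch, assuming the usual definitions bStdX = b * SDofX, bStdY = b / (latent SD), and bStdXY = b * SDofX / (latent SD). The function name `standardize` is illustrative and not part of any package.

```python
# Values copied from the .probit and .listcoef output above.
latent_sd = 1.0371456
coefs = {"income": 0.0005613, "age": 0.1487005, "male": 0.2579112}
sd_x = {"income": 0.1792, "age": 1.6108, "male": 0.4953}


def standardize(b, sd, latent_sd):
    """Return (bStdX, bStdY, bStdXY) for one coefficient (assumed definitions)."""
    b_std_x = b * sd                # x-standardized
    b_std_y = b / latent_sd         # y*-standardized
    b_std_xy = b * sd / latent_sd   # fully standardized
    return b_std_x, b_std_y, b_std_xy


table = {name: standardize(b, sd_x[name], latent_sd) for name, b in coefs.items()}
for name, row in table.items():
    print(name, [round(v, 4) for v in row])
```

Under these assumptions the age row agrees with the listcoef values 0.2395, 0.1434, and 0.2309 up to rounding.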
. prchange, x(income=1 age=21 male=0)
min->max 0->1 -+1/2 -+sd/2 MargEfct
income 0.0002 0.0002 0.0002 0.0000 0.0002
age 0.4900 0.0014 0.0567 0.0912 0.0567
male 0.0937 0.0937 0.0981 0.0487 0.0984
0 1
Pr(y|x) 0.3822 0.6178
income age male
x= 1 21 0
sd(x)= .17918 1.61081 .495344
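The prchange figures can be checked by hand from the probit coefficients. The sketch below is an illustration in Python rather than the SPost implementation. It evaluates the predicted probability at income=1, age=21, male=0, the marginal effects (standard normal density at the linear index times each coefficient), and the discrete change for male going from 0 to 1. The helper `linear_index` is a hypothetical name.

```python
from statistics import NormalDist

# Coefficients copied from the .probit output above.
beta = {"_cons": -2.823671, "income": 0.0005613, "age": 0.1487005, "male": 0.2579112}
# Reference point used in the .prchange command (the constant term is 1).
x = {"_cons": 1.0, "income": 1.0, "age": 21.0, "male": 0.0}

std_normal = NormalDist()  # mean 0, standard deviation 1


def linear_index(beta, x):
    """Compute the linear index x*beta."""
    return sum(beta[k] * x[k] for k in beta)


xb = linear_index(beta, x)
p1 = std_normal.cdf(xb)   # Pr(owncar = 1 | x)
p0 = 1.0 - p1             # Pr(owncar = 0 | x)

# Marginal effect of each regressor: phi(x*beta) * beta_k.
marginal = {k: std_normal.pdf(xb) * beta[k] for k in ("income", "age", "male")}

# Discrete change in Pr(owncar = 1) when male moves from 0 to 1.
x_male = dict(x, male=1.0)
male_0_to_1 = std_normal.cdf(linear_index(beta, x_male)) - p1

print(round(p0, 4), round(p1, 4))
print({k: round(v, 4) for k, v in marginal.items()})
print(round(male_0_to_1, 4))
```

For a binary regressor such as male, the discrete change (0.0937 in the table) is usually the more meaningful quantity than the marginal effect (0.0984), since the latter treats the variable as continuous.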
3.2 Using the PROBIT and LOGISTIC Procedures
PROC PROBIT DATA = masil.students;
CLASS owncar;
MODEL owncar = income age male;
RUN;
Model Information
Data Set MASIL.STUDENTS
Dependent Variable owncar
Number of Observations 437
Name of Distribution Normal
Log Likelihood -273.8174115
Number of Observations Read 437
Number of Observations Used 437
Class Level Information
Name Levels Values
owncar 2 0 1
Response Profile
Ordered Total
Value owncar Frequency
1 0 153
2 1 284
PROC PROBIT is modeling the probabilities of levels of owncar having LOWER Ordered Values in
the response profile table.
Algorithm converged.
Type III Analysis of Effects
Wald
Effect DF Chi-Square Pr > ChiSq
income 1 0.0000 0.9987
age 1 13.1644 0.0003
male 1 4.2160 0.0400
Analysis of Parameter Estimates
Standard 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
Intercept 1 2.8237 0.8731 1.1124 4.5349 10.46 0.0012
income 1 -0.0006 0.3477 -0.6820 0.6809 0.00 0.9987
age 1 -0.1487 0.0410 -0.2290 -0.0684 13.16 0.0003
male 1 -0.2579 0.1256 -0.5041 -0.0117 4.22 0.0400
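Note that PROC PROBIT reports coefficients with signs opposite to those from Stata, because it models the lower ordered value (owncar=0) rather than owncar=1. Since $\Phi(-x\beta) = 1 - \Phi(x\beta)$, the two parameterizations describe the same fitted model. The following Python sketch illustrates this at the point income=1, age=21, male=0 used in section 3.1; the variable names are illustrative.

```python
from statistics import NormalDist

# Order of entries: intercept, income, age, male.
beta_sas_probit = [2.8237, -0.0006, -0.1487, -0.2579]       # PROC PROBIT (models owncar=0)
beta_stata = [-2.823671, 0.0005613, 0.1487005, 0.2579112]   # Stata .probit (models owncar=1)
x = [1.0, 1.0, 21.0, 0.0]

std_normal = NormalDist()


def prob(beta, x):
    """Probability of the modeled outcome: Phi(x*beta)."""
    return std_normal.cdf(sum(b * v for b, v in zip(beta, x)))


p_owncar0 = prob(beta_sas_probit, x)  # probability that owncar = 0
p_owncar1 = prob(beta_stata, x)       # probability that owncar = 1

# The two probabilities should sum to one up to rounding of the coefficients.
print(round(p_owncar0, 4), round(p_owncar1, 4), round(p_owncar0 + p_owncar1, 4))
```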
PROC LOGISTIC DATA = masil.students DESC;
MODEL owncar = income age male /LINK=PROBIT;
RUN;
Model Information
Data Set MASIL.STUDENTS
Response Variable owncar
Number of Response Levels 2
Model binary probit
Optimization Technique Fisher's scoring
Number of Observations Read 437
Number of Observations Used 437
Response Profile
Ordered Total
Value owncar Frequency
1 1 284
2 0 153
Probability modeled is owncar=1.
Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.
Model Fit Statistics
Intercept
Intercept and
Criterion Only Covariates
AIC 567.930 555.635
SC 572.010 571.955
-2 Log L 565.930 547.635
Testing Global Null Hypothesis: BETA=0
Test Chi-Square DF Pr > ChiSq
Likelihood Ratio 18.2954 3 0.0004
Score 17.4697 3 0.0006
Wald 17.4690 3 0.0006
Analysis of Maximum Likelihood Estimates
Standard Wald
Parameter DF Estimate Error Chi-Square Pr > ChiSq
Intercept 1 -2.8237 0.8796 10.3048 0.0013
income 1 0.000548 0.3496 0.0000 0.9987
age 1 0.1487 0.0413 12.9602 0.0003
male 1 0.2579 0.1257 4.2096 0.0402
Association of Predicted Probabilities and Observed Responses
Percent Concordant 57.8 Somers' D 0.249
Percent Discordant 32.9 Gamma 0.274
Percent Tied 9.3 Tau-a 0.113
Pairs 43452 c 0.624
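The association statistics follow from the concordant, discordant, and tied percentages. The sketch below assumes the standard definitions of Somers' D, Gamma, Tau-a, and c in terms of pair counts. Because it starts from the rounded percentages printed above, its results agree with the SAS values only to about three decimal places.

```python
# Response frequencies and percentages copied from the PROC LOGISTIC output.
n_events, n_nonevents = 284, 153
pct_concordant, pct_discordant, pct_tied = 57.8, 32.9, 9.3

n_obs = n_events + n_nonevents
pairs = n_events * n_nonevents  # pairs with different observed responses

# Approximate pair counts recovered from the rounded percentages.
nc = pct_concordant / 100 * pairs
nd = pct_discordant / 100 * pairs
nt = pct_tied / 100 * pairs

somers_d = (nc - nd) / pairs
gamma = (nc - nd) / (nc + nd)
tau_a = (nc - nd) / (0.5 * n_obs * (n_obs - 1))
c_stat = (nc + 0.5 * nt) / pairs  # area under the ROC curve

print(pairs, round(somers_d, 3), round(gamma, 3), round(tau_a, 3), round(c_stat, 3))
```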
3.3 Using the GENMOD and QLIM Procedures
PROC GENMOD DATA = masil.students DESC;
MODEL owncar = income age male /DIST=BINOMIAL LINK=PROBIT;
RUN;
Model Information
Data Set MASIL.STUDENTS
Distribution Binomial
Link Function Probit
Dependent Variable owncar
Number of Observations Read 437
Number of Observations Used 437
Number of Events 284
Number of Trials 437
Response Profile
Ordered Total
Value owncar Frequency
1 1 284
2 0 153
PROC GENMOD is modeling the probability that owncar='1'.
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 433 547.6348 1.2647
Scaled Deviance 433 547.6348 1.2647
Pearson Chi-Square 433 437.0270 1.0093
Scaled Pearson X2 433 437.0270 1.0093
Log Likelihood -273.8174
Algorithm converged.
Analysis Of Parameter Estimates
Standard Wald 95% Confidence Chi-
Parameter DF Estimate Error Limits Square Pr > ChiSq
Intercept 1 -2.8237 0.8731 -4.5349 -1.1124 10.46 0.0012
income 1 0.0006 0.3477 -0.6809 0.6820 0.00 0.9987
age 1 0.1487 0.0410 0.0684 0.2290 13.16 0.0003
male 1 0.2579 0.1256 0.0117 0.5041 4.22 0.0400
Scale 0 1.0000 0.0000 1.0000 1.0000
NOTE: The scale parameter was held fixed.
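The Wald chi-square statistics reported by SAS are the squares of the z statistics reported by Stata, and the 95% confidence limits are the estimate plus or minus about 1.96 standard errors. The following Python sketch illustrates that relationship using the Stata estimates and standard errors from section 3.1.

```python
from statistics import NormalDist

# (estimate, standard error) copied from the Stata .probit output.
estimates = {
    "_cons": (-2.823671, 0.8730955),
    "income": (0.0005613, 0.3476842),
    "age": (0.1487005, 0.0409837),
    "male": (0.2579112, 0.1256085),
}

z_crit = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value, about 1.96

z_stat = {}
wald = {}
conf_int = {}
for name, (b, se) in estimates.items():
    z_stat[name] = b / se                              # Stata's z
    wald[name] = z_stat[name] ** 2                     # SAS Wald chi-square (1 df)
    conf_int[name] = (b - z_crit * se, b + z_crit * se)

for name in estimates:
    lo, hi = conf_int[name]
    print(name, round(z_stat[name], 2), round(wald[name], 2), round(lo, 4), round(hi, 4))
```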
PROC QLIM DATA=masil.students;
MODEL owncar = income age male /DISCRETE (DIST=NORMAL);
RUN;
Discrete Response Profile of owncar
Index Value Frequency Percent
1 0 153 35.01
2 1 284 64.99
Model Fit Summary
Number of Endogenous Variables 1
Endogenous Variable owncar
Number of Observations 437
Log Likelihood -273.81741
Maximum Absolute Gradient 3.82848E-8
Number of Iterations 10
AIC 555.63482
Schwarz Criterion 571.95456
Goodness-of-Fit Measures
Measure Value Formula
Likelihood Ratio (R) 18.295 2 * (LogL - LogL0)
Upper Bound of R (U) 565.93 - 2 * LogL0
Aldrich-Nelson 0.0402 R / (R+N)
Cragg-Uhler 1 0.041 1 - exp(-R/N)
Cragg-Uhler 2 0.0565 (1-exp(-R/N)) / (1-exp(-U/N))
Estrella 0.0417 1 - (1-R/U)^(U/N)
Adjusted Estrella 0.0235 1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
McFadden's LRI 0.0323 R / U
Veall-Zimmermann 0.0712 (R * (U+N)) / (U * (R+N))
McKelvey-Zavoina 0.0702
N = # of observations, K = # of regressors
Algorithm converged.
Parameter Estimates
                            Standard              Approx
Parameter     Estimate         Error   t Value   Pr > |t|
Intercept    -2.823671      0.873096     -3.23     0.0012
income        0.000561      0.347684      0.00     0.9987
age           0.148701      0.040984      3.63     0.0003
male          0.257911      0.125608      2.05     0.0400
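Most of the goodness-of-fit measures reported by PROC QLIM depend only on the two log likelihoods and the sample size, using the formulas printed in the table. The Python sketch below recomputes them, taking LogL0 from the intercept-only -2 Log L value (565.930) in the PROC LOGISTIC output. The adjusted Estrella and McKelvey-Zavoina measures are omitted because they need information not used here.

```python
import math

loglik = -273.81741        # log likelihood of the fitted model
loglik0 = -565.930 / 2     # intercept-only log likelihood, from -2 Log L = 565.930
n = 437                    # number of observations

r = 2 * (loglik - loglik0)  # likelihood ratio (R)
u = -2 * loglik0            # upper bound of R (U)

measures = {
    "Aldrich-Nelson": r / (r + n),
    "Cragg-Uhler 1": 1 - math.exp(-r / n),
    "Cragg-Uhler 2": (1 - math.exp(-r / n)) / (1 - math.exp(-u / n)),
    "Estrella": 1 - (1 - r / u) ** (u / n),
    "McFadden's LRI": r / u,
    "Veall-Zimmermann": (r * (u + n)) / (u * (r + n)),
}

print(round(r, 3), round(u, 2))
for name, value in measures.items():
    print(name, round(value, 4))
```

McFadden's LRI is the same quantity that Stata labels Pseudo R2 (0.0323).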
3.4 Binary Probit in LIMDEP (Probit$)
PROBIT;
Lhs=owncar;
Rhs=ONE,income,age,male$
+---------------------------------------------+
| Binomial Probit Model |
| Maximum Likelihood Estimates |
| Model estimated: Sep 17, 2005 at 10:28:56PM.|
| Dependent variable OWNCAR |
| Weighting variable None |
| Number of observations 437 |
| Iterations completed 4 |
| Log likelihood function -273.8174 |
| Restricted log likelihood -282.9651 |
| Chi squared 18.29542 |
| Degrees of freedom 3 |
| Prob[ChiSqd > value] = .3822542E-03 |
| Hosmer-Lemeshow chi-squared = 8.18372 |
| P-value= .41573 with deg.fr. = 8 |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
Index function for probability
Constant -2.823670829 .87309548 -3.234 .0012
INCOME .5612515407E-03 .34768423 .002 .9987 .61683982
AGE .1487005234 .40983697E-01 3.628 .0003 20.691076
MALE .2579111914 .12560848 2.053 .0400 .57208238
(Note: E+nn or E-nn means multiply by 10 to + or -nn power.)
+----------------------------------------+
| Fit Measures for Binomial Choice Model |
| Probit model for variable OWNCAR |
+----------------------------------------+
| Proportions P0= .350114 P1= .649886 |
| N = 437 N0= 153 N1= 284 |
| LogL = -273.81741 LogL0 = -282.9651 |
| Estrella = 1-(L/L0)^(-2L0/n) = .04166 |
+----------------------------------------+
| Efron | McFadden | Ben./Lerman |
| .03984 | .03233 | .56327 |
| Cramer | Veall/Zim. | Rsqrd_ML |
| .04016 | .07121 | .04100 |
+----------------------------------------+
| Information Akaike I.C. Schwarz I.C. |
| Criteria 1.27148 571.95456 |
+----------------------------------------+
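The Akaike value printed by LIMDEP (1.27148) differs from the AIC reported by SAS (555.635) only in scale: it appears to be the usual AIC divided by the number of observations. The following Python sketch checks this, assuming AIC = -2 LogL + 2k and SC = -2 LogL + k ln(N) with k = 4 estimated parameters.

```python
import math

loglik = -273.81741   # log likelihood of the fitted model
k = 4                 # estimated parameters: intercept, income, age, male
n = 437               # number of observations

aic = -2 * loglik + 2 * k
schwarz = -2 * loglik + k * math.log(n)
aic_per_obs = aic / n  # the normalized form that matches LIMDEP's Akaike I.C.

print(round(aic, 3), round(schwarz, 3), round(aic_per_obs, 5))
```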
Frequencies of actual & predicted outcomes
Predicted outcome has maximum probability.
Threshold value for predicting Y=1 = .5000
Predicted
------ ---------- + -----
Actual 0 1 | Total
------ ---------- + -----
0 5 148 | 153
1 8 276 | 284
------ ---------- + -----
Total 13 424 | 437
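The classification table can be summarized with a few standard rates. The Python sketch below computes overall accuracy, sensitivity, and specificity from the four cell counts. These summary measures are not printed in the LIMDEP output and are shown only as an illustration.

```python
# Cell counts copied from the table of actual and predicted outcomes.
true_neg, false_pos = 5, 148    # actual 0: predicted 0, predicted 1
false_neg, true_pos = 8, 276    # actual 1: predicted 0, predicted 1

total = true_neg + false_pos + false_neg + true_pos

accuracy = (true_neg + true_pos) / total          # share classified correctly
sensitivity = true_pos / (true_pos + false_neg)   # correct among actual 1
specificity = true_neg / (true_neg + false_pos)   # correct among actual 0

print(total, round(accuracy, 4), round(sensitivity, 4), round(specificity, 4))
```

With a 0.5 threshold the model predicts owncar=1 for 424 of the 437 cases, so the accuracy is close to the sample share of car owners (P1 = .649886) while the specificity is very low.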
3.5 Binary Probit in SPSS
COMPUTE n=1.
PROBIT owncar OF n WITH income age male
/LOG NONE /MODEL PROBIT
/PRINT FREQ /CRITERIA ITERATE(20) STEPLIMIT(.1).