Probit Regression with SAS


LOGISTIC Procedure

The SAS LOGISTIC procedure can also be used for probit regression. To fit a probit model, specify the LINK=NORMIT (or LINK=PROBIT) option in the MODEL statement:

  proc logistic;
  model y=x1 x2 / link=normit;
  run;

Example 11: SAS Probit Regression in PROC LOGISTIC

Using the data in Example 1, you can use:

  proc logistic data=ingot;
  model s=t / link=normit;
  run;

You will have the following SAS output:

                       Sample Program: Probit Regression                       

                             The LOGISTIC Procedure

     Data Set: WORK.INGOT
     Response Variable: S
     Response Levels: 2
     Number of Observations: 387
     Link Function: Normit

                                Response Profile

                           Ordered
                             Value       S     Count

                                 1       0        12
                                 2       1       375

      Model Fitting Information and Testing Global Null Hypothesis BETA=0

                               Intercept
                 Intercept        and
   Criterion       Only       Covariates    Chi-Square for Covariates

   AIC             108.988        99.018         .
   SC              112.947       106.934         .
   -2 LOG L        106.988        95.018       11.971 with 1 DF (p=0.0005)
   Score              .             .          15.100 with 1 DF (p=0.0001)

                   Analysis of Maximum Likelihood Estimates

                 Parameter   Standard      Wald         Pr >      Standardized
 Variable   DF    Estimate     Error    Chi-Square   Chi-Square     Estimate

 INTERCPT   1      -2.8004     0.3284      72.7050       0.0001              .
 T          1       0.0391     0.0113      11.9525       0.0005       0.388259

         Association of Predicted Probabilities and Observed Responses

                   Concordant = 59.2%          Somers' D = 0.499
                   Discordant =  9.4%          Gamma     = 0.727
                   Tied       = 31.4%          Tau-a     = 0.030
                   (4500 pairs)                c         = 0.749
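
The fit statistics in the output above are linked by simple arithmetic, which a short DATA step can confirm. This is a minimal sketch, assuming the usual definitions AIC = -2 Log L + 2k and SC = -2 Log L + k log(n) for k parameters and n observations. The data set name check_fit and all variable names are made up for illustration.

```sas
/* Illustrative check of the fit statistics printed by PROC LOGISTIC. */
/* All names here (check_fit, m2ll0, m2ll1, ...) are hypothetical.    */
data check_fit;
  n     = 387;      /* number of observations              */
  m2ll0 = 106.988;  /* -2 Log L, intercept only            */
  m2ll1 = 95.018;   /* -2 Log L, intercept and covariate T */

  aic0 = m2ll0 + 2*1;        /* one parameter: intercept        */
  aic1 = m2ll1 + 2*2;        /* two parameters: intercept and T */
  sc0  = m2ll0 + 1*log(n);   /* log() is the natural logarithm  */
  sc1  = m2ll1 + 2*log(n);

  lr_chisq = m2ll0 - m2ll1;             /* likelihood ratio chi-square, 1 DF */
  lr_p     = 1 - probchi(lr_chisq, 1);  /* upper-tail p-value                */
run;

proc print data=check_fit;
run;
```

The computed values should agree with the AIC, SC, and -2 LOG L rows of the output up to rounding of the printed figures.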

PROBIT Procedure

You can use PROC PROBIT to fit a probit model. The basic syntax is:

  proc probit;
  class y;
  model y=x1 x2;
  run;

or

  proc probit;
  model r/n=x1 x2;
  run;

or

  proc probit;
  class x2;
  model r/n=x1 x2;
  run;

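depending on the nature of the data set. The first form fits a binary response variable y recorded one observation per row, the second fits events/trials data (r events out of n trials), and the third adds a categorical regressor x2 through the CLASS statement.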

Example 12: SAS Probit Regression in PROC PROBIT

Using the data in Example 1, you can use:

  proc probit data=ingot;
  class s;
  model s=t;
  run;

You will have the following SAS output:

                       Sample Program: Probit Regression                       

                                Probit Procedure
                            Class Level Information

                            Class    Levels    Values

                            S             2    0 1

                       Number of observations used = 387


                                Probit Procedure

   Data Set          =WORK.INGOT
   Dependent Variable=S

         Weighted Frequency Counts for the Ordered Response Categories

                                   Level     Count
                                       0        12
                                       1       375

   Log Likelihood for NORMAL  -47.5087804

                                Probit Procedure

          Variable  DF   Estimate  Std Err ChiSquare  Pr>Chi Label/Value

          INTERCPT   1 -2.8003508 0.331621  71.30839  0.0001 Intercept
          T          1  0.0390757 0.011425  11.69807  0.0006

                Probit Model in Terms of Tolerance Distribution

                                     MU         SIGMA
                               71.66476      25.59135

              Estimated Covariance Matrix for Tolerance Parameters

                                            MU             SIGMA

                          MU        186.336614         98.799500
                       SIGMA         98.799500         55.985053
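
The MU and SIGMA values in the tolerance distribution table are a reparameterization of the regression estimates. With a single regressor, intercept b0 and slope b1, the numbers above are consistent with MU = -b0/b1 and SIGMA = 1/b1. The DATA step below is a small sketch of that arithmetic; the data set name tolerance and the variable names are illustrative, not part of the original program.

```sas
/* Recompute MU and SIGMA from the PROC PROBIT estimates in Example 12. */
/* The names tolerance, b0, b1, mu and sigma are hypothetical.          */
data tolerance;
  b0 = -2.8003508;   /* INTERCPT estimate */
  b1 =  0.0390757;   /* T estimate        */

  mu    = -b0 / b1;  /* value of T at which the modeled probability is 0.5 */
  sigma = 1 / b1;    /* scale of the tolerance distribution                */
run;

proc print data=tolerance;
run;
```

The results should match the printed MU (71.66476) and SIGMA (25.59135) up to rounding.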

Example 13: SAS Probit Regression in PROC PROBIT (categorical regressors)

Using the data in Example 7, if you use:

  proc probit data=drug;
  class drug;
  model r/n=x drug;
  run;

you will have the result:

                       Sample Program: Probit Regression                       

                                Probit Procedure
                            Class Level Information

                         Class    Levels    Values

                         DRUG          5    A B C D E

                        Number of observations used = 18

                                Probit Procedure

   Data Set          =WORK.DRUG
   Dependent Variable=R
   Dependent Variable=N
   Number of Observations=  18
   Number of Events      =      99    Number of Trials =      237

   Log Likelihood for NORMAL -114.6516555


                                Probit Procedure

          Variable  DF   Estimate  Std Err ChiSquare  Pr>Chi Label/Value

          INTERCPT   1 0.19031335  0.24926  0.582954  0.4452 Intercept
          X          1 1.15885442 0.438333  6.989539  0.0082

          DRUG       4                      64.33502  0.0001
                     1 -1.7087998 0.331686   26.5416  0.0001 A
                     1 -1.2286831 0.239099  26.40741  0.0001 B
                     1 -2.2309708 0.343196   42.2574  0.0001 C
                     1 -0.5079719 0.291889  3.028612  0.0818 D
                     0          0        0         .   .     E
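
In this output drug E is the reference level (estimate 0 with 0 DF), so each DRUG coefficient is a shift relative to E on the probit scale. The sketch below turns the estimates into response probabilities for each drug. The covariate value x = 0.5 is a made-up illustration value, not taken from the data, and the names drug_pred, effect, eta, and phat are likewise hypothetical.

```sas
/* Predicted response probability by drug from the Example 13 estimates. */
/* x = 0.5 is a hypothetical covariate value chosen only for illustration. */
data drug_pred;
  input drug $ effect;                         /* effect = DRUG coefficient */
  x    = 0.5;                                  /* assumed value of X        */
  eta  = 0.19031335 + 1.15885442*x + effect;   /* linear predictor          */
  phat = probnorm(eta);                        /* standard normal CDF       */
  datalines;
A -1.7087998
B -1.2286831
C -2.2309708
D -0.5079719
E 0
;
run;

proc print data=drug_pred;
run;
```

Because all four drug coefficients are negative, every drug has a lower predicted response probability than E at the same value of X.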

By default, PROC PROBIT models the probability of Y=0, or more generally of the lowest sorted value of Y. You can change this default with the ORDER= option in the PROC PROBIT statement. For example,

  proc probit order=freq;

sorts the levels of the classification variables (those named in the CLASS statement) by descending frequency count, so that the levels with the most observations come first.

Example 14: Altering Order

You may need to model the probability of the value with the higher count. In Example 12, S=1 has a count of 375 and S=0 has a count of 12. If you use:

  proc probit order=freq data=ingot;
  class s;
  model s=t;
  run;

you will have the following output:

                       Sample Program: Probit Regression                      

                                Probit Procedure
                            Class Level Information

                            Class    Levels    Values

                            S             2    1 0

                       Number of observations used = 387

                                Probit Procedure

   Data Set          =WORK.INGOT
   Dependent Variable=S

         Weighted Frequency Counts for the Ordered Response Categories

                                   Level     Count
                                       1       375
                                       0        12

   Log Likelihood for NORMAL  -47.5087804

                                Probit Procedure

          Variable  DF   Estimate  Std Err ChiSquare  Pr>Chi Label/Value

          INTERCPT   1 2.80035085 0.331621  71.30839  0.0001 Intercept
          T          1 -0.0390757 0.011425  11.69807  0.0006

                Probit Model in Terms of Tolerance Distribution

                                     MU         SIGMA
                               71.66476      25.59135

              Estimated Covariance Matrix for Tolerance Parameters

                                            MU             SIGMA

                          MU        186.336614         98.799500
                       SIGMA         98.799500         55.985053
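
Compared with Example 12, only the signs of the estimates change; the standard errors, chi-square values, and log likelihood are identical. This follows from the symmetry of the standard normal distribution. Writing $\Phi$ for the standard normal cumulative distribution function and using the rounded estimates from the two outputs:

```latex
P(S=1 \mid T) = 1 - P(S=0 \mid T)
              = 1 - \Phi(-2.8004 + 0.0391\,T)
              = \Phi(2.8004 - 0.0391\,T)
```

The two runs therefore describe the same fitted model from opposite sides of the response.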

Sometimes you need the predicted probabilities themselves. For example, to find the probability that an ingot is not ready for rolling (Y=0) at T=7 in Example 12, substitute the parameter estimates into the probit model:

$$P(Y=0 \mid T=7) = \Phi(-2.8004 + 0.0391 \times 7) = \Phi(-2.5267) \approx 0.0058$$

where $\Phi$ is the standard normal cumulative distribution function, whose values can be read from a standard normal probability table. SAS can do this computation for you through the OUTPUT statement and the PRINT procedure:

  proc probit;
  model r/n=x1 x2;
  output out=filename prob=varname;
  run;
  proc print data=filename;
  run;

where filename is the name of the output data set and varname is the name of the variable that holds the predicted probabilities. PROC PRINT then lists the predicted probability for every observation.
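
The OUTPUT statement can also save the linear predictor and its standard error alongside the probability. The sketch below assumes the XBETA= and STD= keywords of the PROC PROBIT OUTPUT statement as described in the SAS/STAT documentation; check them against your release. The names mydata, pred, phat, eta, and se_eta are placeholders.

```sas
/* Save the probability, linear predictor, and its standard error. */
/* mydata, pred, phat, eta and se_eta are placeholder names.       */
proc probit data=mydata;
  model r/n=x1 x2;
  output out=pred prob=phat xbeta=eta std=se_eta;
run;

/* For the default normal distribution with no natural response rate, */
/* the saved probability should equal the normal CDF of eta.          */
data pred_check;
  set pred;
  phat_check = probnorm(eta);
run;

proc print data=pred_check;
run;
```

Keeping the linear predictor is convenient when you want to build confidence limits on the probit scale before transforming back to probabilities.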

Example 15: Predicted Probability Computation

Using the data in Example 2, if you use:

  proc probit data=ingot2;
  model r/n=t;
  output out=prob2 prob=phat;
  run;
  proc print data=prob2;
  run;

you will have the following additional result:

                       Sample Program: Probit Regression                     

                        OBS     T    S     N       PHAT

                         1      7    0     55    0.00576
                         2     14    2    157    0.01212
                         3     27    7    159    0.04047
                         4     51    3     16    0.20969

GENMOD Procedure

Probit regression is a generalized linear model in which the response distribution is binomial and the link function is probit. You can therefore use PROC GENMOD to fit a probit model:

  proc genmod;
  model r/n=x1 x2 / dist=binomial link=probit;
  run;

Example 16: SAS Probit Regression in PROC GENMOD

Using the data in Example 2, you can use:

  proc genmod data=ingot2;
  model r/n=t / dist=binomial link=probit;
  run;

You will have the following SAS output:

                   Sample Program: Probit Regression                      

                              The GENMOD Procedure

                               Model Information

                   Description                     Value

                   Data Set                        WORK.INGOT2
                   Distribution                    BINOMIAL
                   Link Function                   PROBIT
                   Dependent Variable              R
                   Dependent Variable              N
                   Observations Used               4
                   Number Of Events                12
                   Number Of Trials                387


                     Criteria For Assessing Goodness Of Fit

              Criterion             DF         Value      Value/DF

              Deviance               2        0.7392        0.3696
              Scaled Deviance        2        0.7392        0.3696
              Pearson Chi-Square     2        0.4228        0.2114
              Scaled Pearson X2      2        0.4228        0.2114
              Log Likelihood         .      -47.5088             .


                        Analysis Of Parameter Estimates

          Parameter    DF    Estimate     Std Err   ChiSquare  Pr>Chi

          INTERCEPT     1     -2.8004      0.3316     71.3084  0.0001
          T             1      0.0391      0.0114     11.6981  0.0006
          SCALE         0      1.0000      0.0000           .       .

NOTE:  The scale parameter was held fixed.
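
Predicted probabilities can be requested from PROC GENMOD as well. The sketch below assumes a SAS release in which PROC GENMOD supports the OUTPUT statement with the PRED= keyword; in older releases the OBSTATS option of the MODEL statement prints the same per-observation statistics instead. The names gpred and phat are illustrative.

```sas
/* Fit the probit model and save the predicted probabilities. */
/* gpred and phat are illustrative names.                     */
proc genmod data=ingot2;
  model r/n=t / dist=binomial link=probit;
  output out=gpred pred=phat;
run;

proc print data=gpred;
run;
```

Since the parameter estimates match those from PROC PROBIT, the predicted probabilities should match the PHAT values in Example 15.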