Ordered Logit Regression

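The ordered logit model has the form:

$$\log\frac{P(y \le j \mid x)}{1 - P(y \le j \mid x)} = \alpha_j + \beta' x, \qquad j = 1, \dots, J-1$$

where J is the number of ordered response categories. This model is known as the proportional-odds model because the odds ratio of the event $y \le j$ is independent of the category j: each cumulative logit has its own intercept $\alpha_j$ but shares the same slope vector $\beta$, so the odds ratio is assumed to be constant for all categories.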
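
To see why the odds ratio does not depend on j, compare the odds of $y \le j$ at two covariate settings. The short derivation below is only a sketch: it assumes the cumulative-logit form $\log\{P(y \le j \mid x)/[1 - P(y \le j \mid x)]\} = \alpha_j + \beta' x$, and $x_a$ and $x_b$ are illustrative names for two covariate vectors.

```latex
\frac{P(y \le j \mid x_a) \,/\, [1 - P(y \le j \mid x_a)]}
     {P(y \le j \mid x_b) \,/\, [1 - P(y \le j \mid x_b)]}
  = \frac{\exp(\alpha_j + \beta' x_a)}{\exp(\alpha_j + \beta' x_b)}
  = \exp\{\beta'(x_a - x_b)\}
```

The category-specific intercept $\alpha_j$ cancels, so the same ratio applies at every cutpoint j.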

Ordered Logit Regression with SAS

LOGISTIC Procedure

SAS PROC LOGISTIC can fit an ordered multiple-outcome model as well as a binary model. The earlier discussion of binary logistic regression estimation in PROC LOGISTIC also applies to the ordered logit model. To fit an ordered logit model, you can use:


  proc logistic;
    model y=x1 x2;    /* ordinal response Y, regressors X1 and X2 */
  run;

where Y is the ordinally scaled multiple response variable, and X1 and X2 are two regressors of interest.

Example 19: SAS Ordered Logit Regression in PROC LOGISTIC

The following data are from McCullagh and Nelder (McCullagh and Nelder, 1989, Generalized Linear Models, London, Chapman & Hall, p. 175) and were used in a SAS manual (SAS, 1996, SAS/STAT Software Changes and Enhancements through Release 6.11, pp. 435-438). Consider a study of the effects on taste of various cheese additives. Researchers tested four cheese additives and obtained 52 response ratings for each additive. Each response was measured on a scale of nine values ranging from strong dislike (1) to excellent taste (9).

The data set CHEESE has five variables: Y, X1, X2, X3, and F. The variable Y contains the response rating, and the variables X1, X2, and X3 are dummy variables representing the first, second, and third additive, respectively; for the fourth additive, X1=X2=X3=0. F gives the frequency of occurrence of the observation. The following DATA step creates the data set CHEESE:


  data cheese;

  input x1 x2 x3 y f;

  cards;

  1    0    0    1    0

  1    0    0    2    0

  1    0    0    3    1

  1    0    0    4    7

  1    0    0    5    8

  1    0    0    6    8

  1    0    0    7   19

  1    0    0    8    8

  1    0    0    9    1

  0    1    0    1    6

  0    1    0    2    9

  0    1    0    3   12

  0    1    0    4   11

  0    1    0    5    7

  0    1    0    6    6

  0    1    0    7    1

  0    1    0    8    0

  0    1    0    9    0

  0    0    1    1    1

  0    0    1    2    1

  0    0    1    3    6

  0    0    1    4    8

  0    0    1    5   23

  0    0    1    6    7

  0    0    1    7    5

  0    0    1    8    1

  0    0    1    9    0

  0    0    0    1    0

  0    0    0    2    0

  0    0    0    3    0

  0    0    0    4    1

  0    0    0    5    3

  0    0    0    6    7

  0    0    0    7   14

  0    0    0    8   16

  0    0    0    9   11

  ;
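
Before fitting the model, it can help to confirm that the data were entered correctly. The step below is a minimal sketch, assuming the CHEESE data set above is in the WORK library; it weights each row by F and tabulates the additive dummies.

```sas
/* Sanity check: weighted counts per additive (combination of X1, X2, X3) */
proc freq data=cheese;
  weight f;                     /* each row counts F times */
  tables x1*x2*x3 / list;       /* one line per additive */
run;
```

Each of the four additive combinations should show a frequency of 52, for 208 ratings in total.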

Because the response variable Y is ordinally scaled, you can estimate an ordered logit model. You can use:


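  proc logistic data=cheese;
    freq f;           /* F is the number of times each row occurs */
    model y=x1-x3;    /* Y has nine ordered levels, so cumulative logits are fitted */
  run;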

You will have the following SAS output:


                    Sample Program: Ordered Logit Regression                   



                             The LOGISTIC Procedure



     Data Set: WORK.CHEESE

     Response Variable: Y

     Response Levels: 9

     Number of Observations: 28

     Frequency Variable: F

     Link Function: Logit





                                Response Profile



                           Ordered

                             Value       Y     Count



                                 1       1         7

                                 2       2        10

                                 3       3        19

                                 4       4        27

                                 5       5        41

                                 6       6        28

                                 7       7        39

                                 8       8        25

                                 9       9        12



NOTE: 8 observation(s) having zero frequencies or weights were excluded since

      they do not contribute to the analysis.







                Score Test for the Proportional Odds Assumption



                   Chi-Square = 17.2868 with 21 DF (p=0.6936)





      Model Fitting Information and Testing Global Null Hypothesis BETA=0



                               Intercept

                 Intercept        and

   Criterion       Only       Covariates    Chi-Square for Covariates



   AIC             875.802       733.348         .

   SC              902.502       770.061         .

   -2 LOG L        859.802       711.348      148.454 with 3 DF (p=0.0001)

   Score              .             .         111.267 with 3 DF (p=0.0001)





                    Analysis of Maximum Likelihood Estimates



               Parameter Standard    Wald       Pr >    Standardized     Odds

   Variable DF  Estimate   Error  Chi-Square Chi-Square   Estimate      Ratio



   INTERCP1 1    -7.0802   0.5624   158.4865     0.0001            .     .

   INTERCP2 1    -6.0250   0.4755   160.5507     0.0001            .     .

   INTERCP3 1    -4.9254   0.4272   132.9477     0.0001            .     .

   INTERCP4 1    -3.8568   0.3902    97.7086     0.0001            .     .

   INTERCP5 1    -2.5206   0.3431    53.9713     0.0001            .     .

   INTERCP6 1    -1.5685   0.3086    25.8379     0.0001            .     .

   INTERCP7 1    -0.0669   0.2658     0.0633     0.8013            .     .

   INTERCP8 1     1.4930   0.3310    20.3443     0.0001            .     .

   X1       1     1.6128   0.3778    18.2258     0.0001     0.385954    5.017

   X2       1     4.9646   0.4741   109.6453     0.0001     1.188080  143.257

   X3       1     3.3227   0.4251    61.0936     0.0001     0.795146   27.735





                             The LOGISTIC Procedure



         Association of Predicted Probabilities and Observed Responses



                   Concordant = 67.6%          Somers' D = 0.578

                   Discordant =  9.8%          Gamma     = 0.746

                   Tied       = 22.6%          Tau-a     = 0.500

                   (18635 pairs)               c         = 0.789

This result shows eight fitted regression lines as follows:

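$$
\begin{aligned}
\operatorname{logit}(p_1) &= -7.0802 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3 \\
\operatorname{logit}(p_1 + p_2) &= -6.0250 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3 \\
\operatorname{logit}(p_1 + p_2 + p_3) &= -4.9254 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3 \\
\operatorname{logit}(p_1 + \cdots + p_4) &= -3.8568 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3 \\
\operatorname{logit}(p_1 + \cdots + p_5) &= -2.5206 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3 \\
\operatorname{logit}(p_1 + \cdots + p_6) &= -1.5685 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3 \\
\operatorname{logit}(p_1 + \cdots + p_7) &= -0.0669 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3 \\
\operatorname{logit}(p_1 + \cdots + p_8) &= \phantom{-}1.4930 + 1.6128\,X_1 + 4.9646\,X_2 + 3.3227\,X_3
\end{aligned}
$$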

where p1 is the probability of being strongly disliked, i.e., the probability of Y=1, and so on. The positive coefficients of X1, X2 and X3 indicate that, relative to the fourth additive, each of those additives is associated with a higher cumulative probability of the lower ratings, that is, an increased probability of the cheese being disliked. The estimated odds ratios are reported as 5.017, 143.257 and 27.735 for X1, X2 and X3, respectively. Each odds ratio is constant across all categories.
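
Each value in the Odds Ratio column is the exponential of the corresponding parameter estimate. The sketch below recomputes them from the estimates printed above; the data set name ORCHECK and its variable names are illustrative and not part of the original example.

```sas
/* Recompute odds ratios as exp(estimate); ORCHECK is a hypothetical name */
data orcheck;
  input variable $ estimate;
  odds_ratio = exp(estimate);   /* e.g. exp(1.6128) for X1 */
  cards;
X1 1.6128
X2 4.9646
X3 3.3227
;
run;

proc print data=orcheck;
run;
```

The printed values should agree with the Odds Ratio column up to rounding of the estimates.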

Example 20: Predicted Probability Computation
You can compute the predicted probability at a certain level of independent variables. For example, you can use the following formula to compute the predicted probabilities at X1=1, X2=0 and X3=0 for the model in Example 19:
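
$$p_1 = \frac{\exp(-7.0802 + 1.6128)}{1 + \exp(-7.0802 + 1.6128)} = 0.00420$$

$$p_1 + p_2 = \frac{\exp(-6.0250 + 1.6128)}{1 + \exp(-6.0250 + 1.6128)} = 0.01198$$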

and so on. However, these probabilities can be obtained easily for every combination of additives by using:


  proc logistic data=cheese;
    freq f;
    model y=x1-x3;
    output out=prob predicted=phat;   /* save cumulative predicted probabilities as PHAT */
  run;

  proc print data=prob;               /* list the predicted probabilities */
  run;

You will have the following additional output:


                    Sample Program: Ordered Logit Regression                   



            OBS    X1    X2    X3    Y      F     _LEVEL_      PHAT



              1     1     0     0    1      0        1       0.00420

              2     1     0     0    1      0        2       0.01198

              3     1     0     0    1      0        3       0.03514

              4     1     0     0    1      0        4       0.09587

              5     1     0     0    1      0        5       0.28746

              6     1     0     0    1      0        6       0.51106

              7     1     0     0    1      0        7       0.82432

              8     1     0     0    1      0        8       0.95713

                                        .

                                        .

            281     0     0     0    9     11        1       0.00084

            282     0     0     0    9     11        2       0.00241

            283     0     0     0    9     11        3       0.00721

            284     0     0     0    9     11        4       0.02070

            285     0     0     0    9     11        5       0.07443

            286     0     0     0    9     11        6       0.17242

            287     0     0     0    9     11        7       0.48329

            288     0     0     0    9     11        8       0.81652

In this output you have 8 observations for each additive-response combination. The observation with _LEVEL_=1 shows the predicted probability of Y=1, the observation with _LEVEL_=2 shows the predicted probability of Y=1 or 2, and so on.
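
Because PHAT is cumulative, the probability of a single rating is the difference between two adjacent cumulative probabilities. The DATA step below is a minimal sketch that redoes this by hand for X1=1, X2=0, X3=0, using the estimates printed in Example 19. The data set name HANDCALC and the variables LEVEL, CUMPROB and CATPROB are illustrative names, not SAS output names.

```sas
/* Cumulative and per-category probabilities at X1=1, X2=0, X3=0 */
data handcalc;
  /* eight intercepts INTERCP1-INTERCP8 from the PROC LOGISTIC output */
  array alpha{8} _temporary_
    (-7.0802 -6.0250 -4.9254 -3.8568 -2.5206 -1.5685 -0.0669 1.4930);
  beta1 = 1.6128;                 /* coefficient of X1 */
  prev  = 0;                      /* cumulative probability of the previous level */
  do level = 1 to 9;
    if level < 9 then
      cumprob = 1 / (1 + exp(-(alpha{level} + beta1)));   /* P(Y <= level) */
    else
      cumprob = 1;                /* P(Y <= 9) is 1 by definition */
    catprob = cumprob - prev;     /* P(Y = level) */
    prev = cumprob;
    output;
  end;
  drop beta1 prev;
run;

proc print data=handcalc;
run;
```

For levels 1 through 8, CUMPROB should agree with the PHAT values listed for the first additive, up to rounding of the estimates.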

PROBIT Procedure

To fit an ordered logit model with SAS PROC PROBIT, use the syntax:


  proc probit;
    class y;                           /* treat Y as an ordered categorical response */
    model y = x1 x2 / d=logistic;      /* logistic distribution gives the logit model */
  run;

where Y is the ordinally scaled multiple response variable, and X1 and X2 are two regressors of interest.

Example 21: SAS Ordered Logit Regression in PROC PROBIT

In this example, you are using the same data set as in Example 19. However, SAS PROC PROBIT does not accept a data set in a frequency format. You need to have the same data set in an individual data format; i.e., you need to have:


    X1   X2   X3    Y

  

     1    0    0    3 

     1    0    0    4

     1    0    0    4

             .

             . 

     1    0    0    4 (7 rows of the same data)

     1    0    0    5

     1    0    0    5

             .

             . 

     1    0    0    5 (8 rows of the same data) 

             .

             . 

             . 
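
One way to build such a data set is to write each row of CHEESE as many times as its frequency. The DATA step below is a sketch, assuming CHEESE from Example 19 is still in the WORK library; the loop counter I is a scratch variable introduced here.

```sas
/* Expand the frequency data to one row per rating */
data cheese2;
  set cheese;
  do i = 1 to f;     /* rows with F=0 produce no output */
    output;
  end;
  drop i f;
run;
```

The result should have 208 rows, one for each of the 52 ratings on the four additives.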

With this new data set, CHEESE2, you can use:


  proc probit data=cheese2;
    class y;                           /* nine ordered response levels */
    model y = x1-x3 / d=logistic;      /* ordered logit rather than ordered probit */
  run;

The resulting SAS output will be:


                    Sample Program: Ordered Logit Regression                  



                                Probit Procedure

                            Class Level Information



                     Class    Levels    Values



                     Y             9    1 2 3 4 5 6 7 8 9



                       Number of observations used = 208





                                Probit Procedure



   Data Set          =WORK.CHEESE2

   Dependent Variable=Y



         Weighted Frequency Counts for the Ordered Response Categories



                                   Level     Count

                                       1         7

                                       2        10

                                       3        19

                                       4        27

                                       5        41

                                       6        28

                                       7        39

                                       8        25

                                       9        12







   Log Likelihood for LOGISTIC -355.6739524





                                Probit Procedure



          Variable  DF   Estimate  Std Err ChiSquare  Pr>Chi Label/Value



          INTERCPT   1 -7.0801649  0.56401  157.5844  0.0001 Intercept

          X1         1  1.6127909 0.380544  17.96169  0.0001

          X2         1 4.96463991 0.476721  108.4546  0.0001

          X3         1 3.32268278  0.42183  62.04439  0.0001

          INTER.2    1  1.0551848 0.324654                              2

          INTER.3    1 2.15474934 0.387165                              3

          INTER.4    1 3.22336352 0.420573                              4

          INTER.5    1 4.55961327 0.454216                              5

          INTER.6    1  5.5116267 0.479248                              6

          INTER.7    1 7.01328969 0.520899                              7

          INTER.8    1 8.57313924 0.587685                              8

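Notice that PROC PROBIT reports the first intercept as INTERCPT and the remaining ones as increments INTER.j, so the intercept parameter estimates are computed as:

$$\alpha_1 = \text{INTERCPT}, \qquad \alpha_j = \text{INTERCPT} + \text{INTER.}j, \quad j = 2, \dots, 8$$

For example, $\alpha_2 = -7.0802 + 1.0552 = -6.0250$ and $\alpha_8 = -7.0802 + 8.5731 = 1.4930$, which agree with INTERCP2 and INTERCP8 in the PROC LOGISTIC output of Example 19.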
