Multinomial Logit Regression

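The multinomial logit model for a response with J categories has the form:

$$\Pr(y_i = j) = \frac{\exp(x_i \beta_j)}{\sum_{k=1}^{J} \exp(x_i \beta_k)}, \qquad j = 1, \dots, J$$

$\beta_1$ can be set to 0 (the zero vector) as a normalization, and thus:

$$\Pr(y_i = j) = \frac{\exp(x_i \beta_j)}{1 + \sum_{k=2}^{J} \exp(x_i \beta_k)}, \quad j = 2, \dots, J, \qquad \Pr(y_i = 1) = \frac{1}{1 + \sum_{k=2}^{J} \exp(x_i \beta_k)}$$

As a result, the j-th logit has the form:

$$\log \frac{\Pr(y_i = j)}{\Pr(y_i = 1)} = x_i \beta_j, \qquad j = 2, \dots, J$$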
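
The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name `mnl_probabilities` and the coefficient values are hypothetical and do not come from any data set in this tutorial. The first category is taken as the reference, so its coefficient vector is fixed at zero.

```python
import math

def mnl_probabilities(x, betas):
    """Return Pr(y = j), j = 1..J, for a multinomial logit model.

    x     : regressor values (include 1.0 if the model has an intercept)
    betas : coefficient vectors for categories 2..J; category 1 is the
            reference, so its coefficient vector is fixed at zero.
    """
    # The linear predictor of the reference category is 0, so exp(0) = 1.
    scores = [0.0] + [sum(b * v for b, v in zip(beta, x)) for beta in betas]
    denom = sum(math.exp(s) for s in scores)
    return [math.exp(s) / denom for s in scores]

# Hypothetical coefficients for a three-category response.
x = [1.0, 2.0]
betas = [[0.5, -0.25], [-1.0, 0.75]]
probs = mnl_probabilities(x, betas)

# Each logit against the reference category recovers the linear predictor.
logit_2 = math.log(probs[1] / probs[0])  # 0.5*1.0 + (-0.25)*2.0
logit_3 = math.log(probs[2] / probs[0])  # -1.0*1.0 + 0.75*2.0
```

The last two lines show why the normalization is convenient: the log odds of category j against the reference category depend only on the coefficients of category j.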

Multinomial Logit Regression with SAS

CATMOD Procedure

The SAS CATMOD procedure can fit several types of multinomial logit models. The basic syntax to fit a multinomial logit model is:

  proc catmod;
  direct x1;
  response logits;
  model y=x1 x2;
  run;

where X1 is a continuous quantitative variable and X2 is a categorical variable. You must specify your continuous regressors in the DIRECT statement.

The RESPONSE statement specifies the functions of the response probabilities that are modeled as a linear combination of the parameters. Depending on your model, you can specify response functions other than LOGITS. For example, among others:

  ALOGITS     adjacent-category logits
  CLOGITS     cumulative logits
  JOINT       joint response probabilities
  MARGINALS   marginal probabilities
  MEANS       mean scores

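The default is LOGITS (generalized logits), which compares each response level with the last one and models:

$$\log \frac{\pi_j}{\pi_J}, \qquad j = 1, \dots, J-1$$

where $\pi_j$ is the probability of response level j and J is the number of response levels.

Example 25: SAS Multinomial Logit Regression in PROC CATMOD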

This example uses a modified version of the data set CHEESE from Example 19. Zero frequencies for some observations make the data sparse and can cause problems in fitting the multinomial logit model. To avoid zero frequencies, the following modified data are used, where F is the frequency of each row:

 X1   X2   X3    Y    F

  1    0    0    1    1
  1    0    0    2    1
  1    0    0    3    1
  1    0    0    4    5
  1    0    0    5    8
  1    0    0    6    8
  1    0    0    7   19
  1    0    0    8    8
  1    0    0    9    1
  0    1    0    1    6
  0    1    0    2    9
  0    1    0    3   12
  0    1    0    4   11
  0    1    0    5    7
  0    1    0    6    4
  0    1    0    7    1
  0    1    0    8    1
  0    1    0    9    1
  0    0    1    1    1
  0    0    1    2    1
  0    0    1    3    6
  0    0    1    4    8
  0    0    1    5   23
  0    0    1    6    7
  0    0    1    7    4
  0    0    1    8    1
  0    0    1    9    1
  0    0    0    1    1
  0    0    0    2    1
  0    0    0    3    1
  0    0    0    4    1
  0    0    0    5    1
  0    0    0    6    6
  0    0    0    7   14
  0    0    0    8   16
  0    0    0    9   11

Rearrange these data in the same way as in Example 21. In addition, to keep the illustration simple, collapse the nine response categories into three by creating a new response variable YLESS such that

  if y=1 or y=2 or y=3 then yless=1;
  else if y=4 or y=5 or y=6 then yless=2;
  else yless=3;
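
As a quick check on the collapsing rule, the following Python sketch applies the same rule to the frequencies F listed above and totals them by additive. The variable names are illustrative, and the integer keys 1 to 4 simply number the four blocks of rows in the table (the fourth block is the one with X1 = X2 = X3 = 0).

```python
from collections import defaultdict

# Frequencies F for Y = 1..9, copied from the table above, one list per
# block of rows (block 4 is the block with X1 = X2 = X3 = 0).
freq = {
    1: [1, 1, 1, 5, 8, 8, 19, 8, 1],
    2: [6, 9, 12, 11, 7, 4, 1, 1, 1],
    3: [1, 1, 6, 8, 23, 7, 4, 1, 1],
    4: [1, 1, 1, 1, 1, 6, 14, 16, 11],
}

def collapse(y):
    """Same rule as the IF/ELSE statements above."""
    if y in (1, 2, 3):
        return 1
    if y in (4, 5, 6):
        return 2
    return 3

collapsed = {}
for block, counts in freq.items():
    totals = defaultdict(int)
    for y, f in enumerate(counts, start=1):
        totals[collapse(y)] += f
    collapsed[block] = [totals[1], totals[2], totals[3]]

for block in sorted(collapsed):
    print(block, collapsed[block])
```

Each block totals 52 observations, which gives the overall frequency of 208.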

With this new data set CHEESE3, which also contains a dummy variable X4 equal to 1 for the rows where X1, X2, and X3 are all 0, if you use:

  proc catmod data=cheese3;
  direct x1-x4;
  response logits;
  model yless=x1-x4 / noiter freq;
  run;

the resulting output will be:

                  Sample Program: Multinomial Logit Regression                 

                                CATMOD PROCEDURE

        Response: YLESS                       Response Levels (R)=     3
        Weight Variable: None                 Populations     (S)=     4
        Data Set: CHEESE3                     Total Frequency (N)=   208
        Frequency Missing: 0                  Observations  (Obs)=   208


                               POPULATION PROFILES
                                                  Sample
                        Sample  X1  X2  X3  X4     Size 
                            1   0   0   0   1         52
                            2   0   0   1   0         52
                            3   0   1   0   0         52
                            4   1   0   0   0         52


                               RESPONSE PROFILES

                                Response  YLESS
                                     1      1
                                     2      2
                                     3      3


                              RESPONSE FREQUENCIES

                                    Response Number
                       Sample        1        2        3
                           1         3        8       41
                           2         8       38        6
                           3        27       22        3
                           4         3       21       28



                 MAXIMUM-LIKELIHOOD ANALYSIS-OF-VARIANCE TABLE

               Source                   DF   Chi-Square      Prob
               --------------------------------------------------
               INTERCEPT                 2        33.46    0.0000
               X1                        2         7.79    0.0203
               X2                        2        36.33    0.0000
               X3                        2        37.07    0.0000
               X4                        0*         .       .

               LIKELIHOOD RATIO          0          .       .

               NOTE: Effects marked with '*' contain one or more
                     redundant or restricted parameters.


                    ANALYSIS OF MAXIMUM-LIKELIHOOD ESTIMATES

                                               Standard    Chi-
        Effect            Parameter  Estimate    Error    Square   Prob
        ----------------------------------------------------------------
        INTERCEPT                 1   -2.6150    0.5981    19.12  0.0000
                                  2   -1.6341    0.3865    17.88  0.0000
        X1                        3    0.3814    0.8525     0.20  0.6546
                                  4    1.3464    0.4824     7.79  0.0053
        X2                        5    4.8122    0.8532    31.81  0.0000
                                  6    3.6266    0.7267    24.90  0.0000
        X3                        7    2.9026    0.8058    12.97  0.0003
                                  8    3.4800    0.5851    35.37  0.0000
        X4                        9         .         .      .     .
                                 10         .         .      .     .

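The estimation results give two regression lines, one for each generalized logit, where $P_j$ denotes Pr(YLESS = j). X4 is redundant and therefore drops out:

$$\log(P_1/P_3) = -2.6150 + 0.3814\,X_1 + 4.8122\,X_2 + 2.9026\,X_3$$

$$\log(P_2/P_3) = -1.6341 + 1.3464\,X_1 + 3.6266\,X_2 + 3.4800\,X_3$$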

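Thus, the estimated logits at each combination of the X's are as follows, where logit 1 is log[Pr(YLESS=1)/Pr(YLESS=3)] and logit 2 is log[Pr(YLESS=2)/Pr(YLESS=3)]:

          Logit 1                          Logit 2
  X1=1    -2.6150 + 0.3814 = -2.2336       -1.6341 + 1.3464 = -0.2877
  X2=1    -2.6150 + 4.8122 =  2.1972       -1.6341 + 3.6266 =  1.9925
  X3=1    -2.6150 + 2.9026 =  0.2876       -1.6341 + 3.4800 =  1.8459
  X4=1    -2.6150                          -1.6341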

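If you need the predicted probabilities, you can compute them from the estimated logits. For example, at X1=1, X2=0, and X3=0, the logits are -2.6150 + 0.3814 = -2.2336 and -1.6341 + 1.3464 = -0.2877, so

$$\Pr(\text{YLESS}=1) = \frac{e^{-2.2336}}{1 + e^{-2.2336} + e^{-0.2877}} \approx 0.0577$$

$$\Pr(\text{YLESS}=2) = \frac{e^{-0.2877}}{1 + e^{-2.2336} + e^{-0.2877}} \approx 0.4038$$

$$\Pr(\text{YLESS}=3) = \frac{1}{1 + e^{-2.2336} + e^{-0.2877}} \approx 0.5385$$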
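
The same arithmetic can be repeated for all four populations with a short Python sketch. The estimates are copied from the ANALYSIS OF MAXIMUM-LIKELIHOOD ESTIMATES table above; the names `intercept`, `effects`, and `predicted_probs` are illustrative. Because the printed estimates are rounded to four decimals, the results should agree with exact values only to about four decimal places.

```python
import math

# Estimates from the output above, as (logit 1, logit 2) pairs, where
# logit 1 compares YLESS=1 with YLESS=3 and logit 2 compares YLESS=2 with 3.
intercept = (-2.6150, -1.6341)
effects = {
    "X1": (0.3814, 1.3464),
    "X2": (4.8122, 3.6266),
    "X3": (2.9026, 3.4800),
    "X4": (0.0, 0.0),  # redundant parameter, so no additional effect
}

def predicted_probs(group):
    """Response probabilities for the population whose dummy `group` is 1."""
    logits = [a + b for a, b in zip(intercept, effects[group])]
    denom = 1.0 + sum(math.exp(v) for v in logits)
    # The reference category YLESS=3 has logit 0, hence the numerator 1.
    return [math.exp(v) / denom for v in logits] + [1.0 / denom]

for group in effects:
    print(group, [round(p, 4) for p in predicted_probs(group)])
```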

You can obtain these probabilities directly by adding the PROB option to the MODEL statement:

  proc catmod data=cheese3;
  direct x1-x4;
  response logits;
  model yless=x1-x4 / noiter freq prob;
  run;

In addition to the output above, you will get the following:

                             RESPONSE PROBABILITIES

                                    Response Number
                       Sample        1        2        3
                           1   0.05769  0.15385  0.78846
                           2   0.15385  0.73077  0.11538
                           3   0.51923  0.42308  0.05769
                           4   0.05769  0.40385  0.53846

Example 26: SAS Multinomial Logit Regression in PROC CATMOD (categorical regressors)

In Example 25, X1, X2, X3, and X4 are dummy variables to denote each type of additive. The same result can be obtained by employing a categorical variable to represent types of additives. To do this, you can create a new variable X such that

  if x1=1 then x=1;
  else if x2=1 then x=2;
  else if x3=1 then x=3;
  else x=4;

Now if you use:

  proc catmod data=cheese3;
  response logits;
  model yless = x / noiter freq prob;
  run;

the resulting SAS output will be:

                  Sample Program: Multinomial Logit Regression                

                                CATMOD PROCEDURE

        Response: YLESS                       Response Levels (R)=     3
        Weight Variable: None                 Populations     (S)=     4
        Data Set: CHEESE3                     Total Frequency (N)=   208
        Frequency Missing: 0                  Observations  (Obs)=   208

                              POPULATION PROFILES
                                           Sample
                              Sample  X     Size 
                                  1   1        52
                                  2   2        52
                                  3   3        52
                                  4   4        52

                               RESPONSE PROFILES

                                Response  YLESS
                                     1      1
                                     2      2
                                     3      3

                              RESPONSE FREQUENCIES

                                    Response Number
                       Sample        1        2        3
                           1         3       21       28
                           2        27       22        3
                           3         8       38        6
                           4         3        8       41


                             RESPONSE PROBABILITIES

                                    Response Number
                       Sample        1        2        3
                           1   0.05769  0.40385  0.53846
                           2   0.51923  0.42308  0.05769
                           3   0.15385  0.73077  0.11538
                           4   0.05769  0.15385  0.78846


                 MAXIMUM-LIKELIHOOD ANALYSIS-OF-VARIANCE TABLE

               Source                   DF   Chi-Square      Prob
               --------------------------------------------------
               INTERCEPT                 2        18.26    0.0001
               X                         6        78.70    0.0000

               LIKELIHOOD RATIO          0          .       .


                    ANALYSIS OF MAXIMUM-LIKELIHOOD ESTIMATES

                                               Standard    Chi-
        Effect            Parameter  Estimate    Error    Square   Prob
        ----------------------------------------------------------------
        INTERCEPT                 1   -0.5909    0.2946     4.02  0.0449
                                  2    0.4791    0.2242     4.57  0.0326
        X                         3   -1.6427    0.5209     9.95  0.0016
                                  4   -0.7668    0.3032     6.39  0.0114
                                  5    2.7881    0.5215    28.59  0.0000
                                  6    1.5133    0.4895     9.56  0.0020
                                  7    0.8786    0.4823     3.32  0.0685
                                  8    1.3667    0.3831    12.73  0.0004

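CATMOD codes a categorical regressor as deviations from the mean, so the effect of the last level (X=4) is the negative of the sum of the effects of the other levels. Each estimated logit can therefore be calculated as follows, where logit 1 is log[Pr(YLESS=1)/Pr(YLESS=3)] and logit 2 is log[Pr(YLESS=2)/Pr(YLESS=3)]:

  X=1   Logit 1:  -0.5909 - 1.6427 = -2.2336
        Logit 2:   0.4791 - 0.7668 = -0.2877
  X=2   Logit 1:  -0.5909 + 2.7881 =  2.1972
        Logit 2:   0.4791 + 1.5133 =  1.9924
  X=3   Logit 1:  -0.5909 + 0.8786 =  0.2877
        Logit 2:   0.4791 + 1.3667 =  1.8458
  X=4   Logit 1:  -0.5909 - (-1.6427 + 2.7881 + 0.8786) = -2.6149
        Logit 2:   0.4791 - (-0.7668 + 1.5133 + 1.3667) = -1.6341

Up to rounding, these are the same logits as in the previous example.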
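
The following Python sketch recovers the logits from the Example 26 estimates. It assumes the deviation-from-mean coding that CATMOD applies to categorical regressors, under which the effects sum to zero over the levels of X; the variable names are illustrative.

```python
# Estimates copied from the Example 26 output, as (logit 1, logit 2) pairs.
intercept = (-0.5909, 0.4791)
x_effect = {
    1: (-1.6427, -0.7668),
    2: (2.7881, 1.5133),
    3: (0.8786, 1.3667),
}

# The effects sum to zero over the levels of X, so the effect for X=4
# is minus the sum of the three estimated effects.
x_effect[4] = tuple(
    -sum(x_effect[level][k] for level in (1, 2, 3)) for k in range(2)
)

logits = {
    level: tuple(round(intercept[k] + effect[k], 4) for k in range(2))
    for level, effect in x_effect.items()
}

for level in sorted(logits):
    print(level, logits[level])
```

Comparing the printed values with the intercepts and sums from Example 25 shows agreement to within rounding error in the fourth decimal place.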

Multinomial Logit Regression with SPSS

GENLOG Procedure

The SPSS GENLOG procedure performs general loglinear analysis, and the logit model can be treated as a special class of loglinear models. However, the procedure can only handle a multinomial logit model with categorical independent variables. The basic syntax to fit a multinomial logit model is:

  genlog y by x1 x2
   /model=multinomial
   /design y y*x1 y*x2 y*x1*x2. 

where Y is the response variable and X1 and X2 are categorical regressors.

Example 27: SPSS Multinomial Logit Regression in GENLOG

This example uses the same data as Example 25, with the categorical variable X created in Example 26. You can use:

  genlog yless by x
   /model=multinomial
   /print freq estim
   /plot none
   /criteria=cin(95) iteration(20) converge(.001) delta(0)
   /design yless yless*x. 

The resulting SPSS output will be:

     
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                        GENERALIZED LOGLINEAR ANALYSIS
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Data Information

       208 cases are accepted.
         0 cases are rejected because of missing data.
       208 weighted cases will be used in the analysis.
        12 cells are defined.
         0 structural zeros are imposed by design.
         0 sampling zeros are encountered.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Variable Information

Factor     Levels    Value

YLESS          3
                      1.00
                      2.00
                      3.00

X              4
                      1.00
                      2.00
                      3.00
                      4.00

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Model and Design Information

 Model: Multinomial Logit
Design: Constant + YLESS + YLESS*X

Note: There is a separate constant term for each combination of levels
      of the independent factors.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Correspondence Between Parameters and Terms of the Design

Parameter   Aliased  Term

        1            Constant for [X = 1.00]
        2            Constant for [X = 2.00]
        3            Constant for [X = 3.00]
        4            Constant for [X = 4.00]
        5            [YLESS = 1.00]
        6            [YLESS = 2.00]
        7       x    [YLESS = 3.00]
        8            [YLESS = 1.00]*[X = 1.00]
        9            [YLESS = 1.00]*[X = 2.00]
       10            [YLESS = 1.00]*[X = 3.00]
       11       x    [YLESS = 1.00]*[X = 4.00]
       12            [YLESS = 2.00]*[X = 1.00]
       13            [YLESS = 2.00]*[X = 2.00]
       14            [YLESS = 2.00]*[X = 3.00]
       15       x    [YLESS = 2.00]*[X = 4.00]
       16       x    [YLESS = 3.00]*[X = 1.00]
       17       x    [YLESS = 3.00]*[X = 2.00]
       18       x    [YLESS = 3.00]*[X = 3.00]
       19       x    [YLESS = 3.00]*[X = 4.00]

Note: 'x' indicates an aliased (or a redundant) parameter.
      These parameters are set to zero.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Convergence Information

Maximum number of iterations:            20
Relative difference tolerance:         .001
Final relative difference:      2.92779E-14

Maximum likelihood estimation converged at iteration 1.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Table Information

                Observed            Expected
Factor   Value     Count       %       Count       %

X       1.00
 YLESS   1.00       3.00 (  5.77)       3.00 (  5.77)
 YLESS   2.00      21.00 ( 40.38)      21.00 ( 40.38)
 YLESS   3.00      28.00 ( 53.85)      28.00 ( 53.85)

X       2.00
 YLESS   1.00      27.00 ( 51.92)      27.00 ( 51.92)
 YLESS   2.00      22.00 ( 42.31)      22.00 ( 42.31)
 YLESS   3.00       3.00 (  5.77)       3.00 (  5.77)

X       3.00
 YLESS   1.00       8.00 ( 15.38)       8.00 ( 15.38)
 YLESS   2.00      38.00 ( 73.08)      38.00 ( 73.08)
 YLESS   3.00       6.00 ( 11.54)       6.00 ( 11.54)

X       4.00
 YLESS   1.00       3.00 (  5.77)       3.00 (  5.77)
 YLESS   2.00       8.00 ( 15.38)       8.00 ( 15.38)
 YLESS   3.00      41.00 ( 78.85)      41.00 ( 78.85)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Goodness-of-fit Statistics

                    Chi-Square       DF       Sig.

Likelihood Ratio         .0000        0      .
         Pearson         .0000        0      .

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Analysis of Dispersion

Source of Dispersion      Entropy  Concentration       DF

Due to Model              55.4019        35.2404       12
Due to Residual          163.2376        97.3462      402
Total                    218.6395       132.5865      414

Measures of Association

      Entropy =  .2534
Concentration =  .2658

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Parameter Estimates

 Constant   Estimate

        1     3.3322
        2     1.0986
        3     1.7918
        4     3.7136

Note: Constants are not parameters under multinomial assumption.
      Therefore, standard errors are not calculated.

                                               Asymptotic 95% CI
Parameter   Estimate         SE    Z-value      Lower      Upper

        5    -2.6150      .5981      -4.37      -3.79      -1.44
        6    -1.6341      .3865      -4.23      -2.39       -.88
        7      .0000      .            .          .          .
        8      .3814      .8525        .45      -1.29       2.05
        9     4.8122      .8533       5.64       3.14       6.48
       10     2.9026      .8058       3.60       1.32       4.48
       11      .0000      .            .          .          .
       12     1.3464      .4824       2.79        .40       2.29
       13     3.6266      .7268       4.99       2.20       5.05
       14     3.4800      .5851       5.95       2.33       4.63
       15      .0000      .            .          .          .
       16      .0000      .            .          .          .
       17      .0000      .            .          .          .
       18      .0000      .            .          .          .
       19      .0000      .            .          .          .

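The SPSS output corresponds to the following loglinear model structure:

$$\log m_{ij} = \alpha_j + \lambda^{Y}_{i} + \lambda^{YX}_{ij}$$

where $m_{ij}$ is the predicted count for YLESS=i at X=j, $\alpha_j$ is the constant for X=j, $\lambda^{Y}_{i}$ is the parameter for [YLESS=i], and $\lambda^{YX}_{ij}$ is the parameter for [YLESS=i]*[X=j]. The aliased parameters are set to zero.

Assuming YLESS=3 is the reference category, the estimated logit of YLESS=1 at X=1 is

$$\log \frac{m_{11}}{m_{31}} = \lambda^{Y}_{1} + \lambda^{YX}_{11}$$

where m11 is the predicted count for YLESS=1 at X=1 and m31 is the predicted count for YLESS=3 at X=1. Similarly, the estimated logit of YLESS=2 at X=1 is

$$\log \frac{m_{21}}{m_{31}} = \lambda^{Y}_{2} + \lambda^{YX}_{21}$$

where m21 is the predicted count for YLESS=2 at X=1. The logits at X=2, 3, and 4 can be derived in the same manner.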

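From the parameter estimates, the estimated logits log(m1j/m3j) and log(m2j/m3j) at X=j are:

  X    log(m1j/m3j)                     log(m2j/m3j)
  1    -2.6150 + 0.3814 = -2.2336       -1.6341 + 1.3464 = -0.2877
  2    -2.6150 + 4.8122 =  2.1972       -1.6341 + 3.6266 =  1.9925
  3    -2.6150 + 2.9026 =  0.2876       -1.6341 + 3.4800 =  1.8459
  4    -2.6150 + 0.0000 = -2.6150       -1.6341 + 0.0000 = -1.6341

These agree with the CATMOD results in Examples 25 and 26.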
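
As a further check, the GENLOG constants and parameters can be combined to reproduce the expected counts in the Table Information section. The sketch below assumes that the log of each expected count is the constant for its level of X plus the [YLESS] parameter plus the [YLESS]*[X] parameter, with aliased parameters equal to zero; the dictionary and function names are illustrative.

```python
import math

# Constants for X = 1..4 and parameters 5-15 from the GENLOG output above.
constant = {1: 3.3322, 2: 1.0986, 3: 1.7918, 4: 3.7136}
yless_main = {1: -2.6150, 2: -1.6341, 3: 0.0}
interaction = {
    (1, 1): 0.3814, (1, 2): 4.8122, (1, 3): 2.9026, (1, 4): 0.0,
    (2, 1): 1.3464, (2, 2): 3.6266, (2, 3): 3.4800, (2, 4): 0.0,
}

def expected_count(yless, x):
    """Expected count for YLESS=yless at X=x under the fitted model."""
    # Interactions involving YLESS=3 are aliased, so they default to zero.
    log_m = constant[x] + yless_main[yless] + interaction.get((yless, x), 0.0)
    return math.exp(log_m)

for x in (1, 2, 3, 4):
    print(x, [round(expected_count(y, x), 2) for y in (1, 2, 3)])
```

Since the model is saturated, the expected counts should match the observed counts up to the rounding of the printed estimates, and each constant is the log of the count in the reference category YLESS=3.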
