Conditional Logit Regression

The conditional logit model has the form:

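$$P(y_i = j) = \frac{\exp(x_{ij}'\beta)}{\sum_{k=1}^{J} \exp(x_{ik}'\beta)}$$

where x_{ij} is the vector of attributes of alternative j as faced by subject i, J is the number of alternatives, and \beta is a parameter vector common to all alternatives.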

In this model, subjects are presented with a set of choice alternatives and asked to choose the most preferred one. The set of alternatives is typically the same for all subjects, and the explanatory variables are all choice specific, that is, they describe attributes of the alternatives rather than of the subjects. Unlike in the multinomial logit model, the parameters are common to all alternatives rather than specific to each choice.
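The choice probabilities described above can be sketched in a few lines of Python. This is only an illustration: the function name choice_probabilities and the example attribute values and coefficients are hypothetical and do not come from the original text.

```python
import math

def choice_probabilities(attributes, beta):
    """Return conditional logit choice probabilities for one choice set.

    attributes: list of attribute vectors, one per alternative.
    beta: list of coefficients shared by all alternatives.
    """
    # Linear index x_j' beta for each alternative.
    scores = [sum(b * x for b, x in zip(beta, row)) for row in attributes]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: three alternatives described by two attributes.
example_attributes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
example_beta = [0.5, -0.25]
probs = choice_probabilities(example_attributes, example_beta)
print([round(p, 3) for p in probs])
```

Because the same coefficient vector is applied to every alternative, only differences in attributes between alternatives affect the probabilities.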

Conditional Logit Regression with SAS

PHREG Procedure

The SAS PHREG procedure performs regression analysis of survival data based on the Cox proportional hazards model. Its likelihood function is similar to that of the conditional logit model.
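The connection between the two likelihoods can be made explicit. The following is a sketch under stated assumptions: each subject forms one stratum, the chosen alternative is the only event in that stratum, and it occurs before all other alternatives. The symbols c(i) (the alternative chosen by subject i) and R_i (the risk set at the event time in stratum i) are notation introduced here for illustration.

```latex
% Stratified Cox partial likelihood with one event per stratum:
L_{\mathrm{Cox}}(\beta)
  = \prod_{i=1}^{n}
    \frac{\exp(x_{i,c(i)}'\beta)}{\sum_{k \in R_i} \exp(x_{ik}'\beta)}

% The chosen alternative has the smallest time in its stratum, so every
% alternative is still at risk at the event time: R_i = \{1, \dots, J\}.
% Substituting gives the conditional logit likelihood:
L_{\mathrm{Cox}}(\beta)
  = \prod_{i=1}^{n}
    \frac{\exp(x_{i,c(i)}'\beta)}{\sum_{k=1}^{J} \exp(x_{ik}'\beta)}
  = L_{\mathrm{CL}}(\beta)
```

With a single event per stratum there are no tied event times, so the choice of ties-handling method should not affect the result in this setup.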

To fit a conditional logit model with PROC PHREG, you need to rearrange the data set into a form that is consistent with survival analysis data. The most preferred choice is said to occur at time 1, and all other choices are said to occur at later times or to be censored. You also need to create a status variable that denotes whether the observation was censored, i.e., whether the alternative was not chosen. The status variable has the value 0 if the alternative was censored (not chosen) and 1 if it was not censored (chosen). The basic syntax is:

  proc phreg;
  strata strata_varname;
  model time_varname*status_varname(0) = x1 x2;
  run;

where strata_varname is the name of the variable that determines the stratification, time_varname is the name of the failure time variable (the smaller value means the alternative was chosen), status_varname is the name of the censoring indicator variable, with the value in parentheses (here 0) indicating censoring, and x1 and x2 are explanatory variables.
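The rearrangement into survival form can be illustrated with a short Python sketch. The function name to_survival_rows, the dictionary keys, and the example alternatives are hypothetical. Coding the time as 2 minus the status is one convenient choice that gives the chosen alternative the smallest time.

```python
def to_survival_rows(subject, alternatives, chosen_index):
    """Expand one subject's choice into one row per alternative.

    Each row carries a status flag (1 = chosen, 0 = censored) and a
    time value (1 for the chosen alternative, 2 for all others).
    """
    rows = []
    for j, attrs in enumerate(alternatives):
        status = 1 if j == chosen_index else 0
        rows.append({
            "subject": subject,   # stratification variable
            "status": status,     # censoring indicator
            "time": 2 - status,   # smaller value means chosen
            "attrs": attrs,       # explanatory variables
        })
    return rows

# Hypothetical subject choosing the second of three alternatives.
rows = to_survival_rows(subject=1,
                        alternatives=[(0, 0), (0, 1), (1, 0)],
                        chosen_index=1)
for row in rows:
    print(row)
```

Each subject contributes as many rows as there are alternatives, and exactly one of those rows is an event.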

Example 28: SAS Conditional Logit Regression in PROC PHREG

This example is from SAS (SAS, 1995, Logistic Regression Examples Using the SAS System, pp. 2-3). Chocolate candy data are generated in which 10 subjects are presented with eight different chocolate candies. Each subject chooses one preferred candy from among the eight types. The eight candies are the eight combinations of dark (1) or milk (0) chocolate, soft (1) or hard (0) center, and nuts (1) or no nuts (0). The following data step creates the data set CHOCO:

  data choco;
  input subject choose dark soft nuts @@;
  t=2-choose;
  cards;
   1 0 0 0 0   1 0 0 0 1   1 0 0 1 0   1 0 0 1 1
   1 1 1 0 0   1 0 1 0 1   1 0 1 1 0   1 0 1 1 1
   2 0 0 0 0   2 0 0 0 1   2 0 0 1 0   2 0 0 1 1
   2 0 1 0 0   2 1 1 0 1   2 0 1 1 0   2 0 1 1 1
   3 0 0 0 0   3 0 0 0 1   3 0 0 1 0   3 0 0 1 1
   3 0 1 0 0   3 0 1 0 1   3 1 1 1 0   3 0 1 1 1
   4 0 0 0 0   4 0 0 0 1   4 0 0 1 0   4 0 0 1 1
   4 1 1 0 0   4 0 1 0 1   4 0 1 1 0   4 0 1 1 1
   5 0 0 0 0   5 1 0 0 1   5 0 0 1 0   5 0 0 1 1
   5 0 1 0 0   5 0 1 0 1   5 0 1 1 0   5 0 1 1 1
   6 0 0 0 0   6 0 0 0 1   6 0 0 1 0   6 0 0 1 1
   6 0 1 0 0   6 1 1 0 1   6 0 1 1 0   6 0 1 1 1
   7 0 0 0 0   7 1 0 0 1   7 0 0 1 0   7 0 0 1 1
   7 0 1 0 0   7 0 1 0 1   7 0 1 1 0   7 0 1 1 1
   8 0 0 0 0   8 0 0 0 1   8 0 0 1 0   8 0 0 1 1
   8 0 1 0 0   8 1 1 0 1   8 0 1 1 0   8 0 1 1 1
   9 0 0 0 0   9 0 0 0 1   9 0 0 1 0   9 0 0 1 1
   9 0 1 0 0   9 1 1 0 1   9 0 1 1 0   9 0 1 1 1
  10 0 0 0 0  10 0 0 0 1  10 0 0 1 0  10 0 0 1 1
  10 0 1 0 0  10 1 1 0 1  10 0 1 1 0  10 0 1 1 1
  ;

where SUBJECT is the subject number, CHOOSE is the status variable (1 if the candy was chosen, 0 otherwise), and T is the time variable, which equals 1 for the chosen candy and 2 for all others. Because this data set is arranged in survival analysis form, you can use PROC PHREG with the following syntax:

  proc phreg data=choco;
  strata subject;
  model t*choose(0)=dark soft nuts;
  run;

This produces the following output:

                 Sample Program: Conditional Logit Regression                  

                              The PHREG Procedure

     Data Set: WORK.CHOCO
     Dependent Variable: T
     Censoring Variable: CHOOSE
     Censoring Value(s): 0
     Ties Handling: BRESLOW


              Summary of the Number of Event and Censored Values

                                                                  Percent
      Stratum    SUBJECT        Total       Event    Censored    Censored

            1    1                  8           1           7       87.50
            2    2                  8           1           7       87.50
            3    3                  8           1           7       87.50
            4    4                  8           1           7       87.50
            5    5                  8           1           7       87.50
            6    6                  8           1           7       87.50
            7    7                  8           1           7       87.50
            8    8                  8           1           7       87.50
            9    9                  8           1           7       87.50
           10    10                 8           1           7       87.50
        Total                      80          10          70       87.50

                     Testing Global Null Hypothesis: BETA=0

                   Without        With
    Criterion    Covariates    Covariates    Model Chi-Square

    -2 LOG L         41.589        28.727      12.862 with 3 DF (p=0.0049)
    Score              .             .         11.600 with 3 DF (p=0.0089)
    Wald               .             .          8.928 with 3 DF (p=0.0303)

                    Analysis of Maximum Likelihood Estimates

                    Parameter     Standard      Wald         Pr >          Risk
 Variable   DF       Estimate       Error    Chi-Square   Chi-Square      Ratio

 DARK        1       1.386294      0.79057      3.07490       0.0795      4.000
 SOFT        1      -2.197225      1.05409      4.34502       0.0371      0.111
 NUTS        1       0.847298      0.69007      1.50762       0.2195      2.333

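The estimated model is:

$$\hat{P}(j) = \frac{\exp(1.386\,\mathrm{DARK}_j - 2.197\,\mathrm{SOFT}_j + 0.847\,\mathrm{NUTS}_j)}{\sum_{k=1}^{8} \exp(1.386\,\mathrm{DARK}_k - 2.197\,\mathrm{SOFT}_k + 0.847\,\mathrm{NUTS}_k)}$$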

The positive parameter estimates for DARK and NUTS mean that dark chocolate and nuts each increase the preference. The negative parameter estimate for SOFT means that a soft center decreases the preference.
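The Risk Ratio column of the output is the exponential of each parameter estimate. A small Python sketch, using the estimates copied from the PROC PHREG output, shows the relationship. The variable names are illustrative only.

```python
import math

# Parameter estimates copied from the PROC PHREG output above.
estimates = {"DARK": 1.386294, "SOFT": -2.197225, "NUTS": 0.847298}

# exp(estimate) is the ratio of choice probabilities between two candies
# that differ only in that attribute (attribute = 1 versus attribute = 0).
risk_ratios = {name: math.exp(b) for name, b in estimates.items()}

for name, ratio in risk_ratios.items():
    print(f"{name}: {ratio:.3f}")
```

Under this reading, a dark chocolate candy is about four times as likely to be chosen as an otherwise identical milk chocolate candy.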

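For each of the eight types of candy, the predicted probability of being chosen is exp(x'b) for that candy divided by the sum of exp(x'b) over all eight candies:

$$\hat{P}_j = \frac{\exp(x_j'\hat{\beta})}{\sum_{k=1}^{8} \exp(x_k'\hat{\beta})}$$

Using the parameter estimates above:

  DARK   SOFT   NUTS    exp(x'b)    Probability

     0      0      0       1.000          0.054
     0      0      1       2.333          0.126
     0      1      0       0.111          0.006
     0      1      1       0.259          0.014
     1      0      0       4.000          0.216
     1      0      1       9.333          0.504
     1      1      0       0.444          0.024
     1      1      1       1.037          0.056

   Sum                    18.519          1.000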

This shows that the most preferred type of candy is the dark chocolate with a hard center and nuts.
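The predicted probabilities can be checked with a short Python sketch that applies the parameter estimates from the output to all eight attribute combinations. The variable names are illustrative, and the coefficients are copied from the PROC PHREG output.

```python
import itertools
import math

# Parameter estimates copied from the PROC PHREG output.
beta = {"dark": 1.386294, "soft": -2.197225, "nuts": 0.847298}

# All eight candies as (dark, soft, nuts) combinations of 0 and 1.
candies = list(itertools.product([0, 1], repeat=3))

# exp(x'b) for each candy, then normalize so the probabilities sum to 1.
weights = [
    math.exp(beta["dark"] * d + beta["soft"] * s + beta["nuts"] * n)
    for d, s, n in candies
]
total = sum(weights)
probabilities = {c: w / total for c, w in zip(candies, weights)}

# The candy with the largest predicted probability.
best = max(probabilities, key=probabilities.get)

for candy in candies:
    print(candy, round(probabilities[candy], 3))
print("most preferred (dark, soft, nuts):", best)
```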

Conditional Logit Regression with SPSS

COXREG Procedure

With SPSS, you can use the COXREG procedure to fit a conditional logit model. The basic syntax is:

  coxreg time_varname with X1 X2
   /status=status_varname(1)
   /strata=strata_varname.

where time_varname is the name of the failure time variable (the smaller value means the alternative was chosen), status_varname is the name of the censoring indicator variable, with the value in parentheses (here 1) indicating that the event has occurred (not censored), strata_varname is the name of the variable that determines the stratification, and X1 and X2 are explanatory variables.

Example 29: SPSS Conditional Logit Regression in COXREG procedure

Using the data in Example 28, if you use:

  coxreg t with dark soft nuts
   /status=choose(1)
   /strata=subject.

you will have the following SPSS output:

                        C O X   R E G R E S S I O N

      80  Total cases read
         0  Cases with missing values
         0  Valid cases with non-positive times
         0  Censored cases before the earliest event in a stratum
       0  Total cases dropped
      80  Cases available for the analysis

Dependent Variable:  T

SUBJECT      Events  Censored

    1.00          1         7 (87.5%)
    2.00          1         7 (87.5%)
    3.00          1         7 (87.5%)
    4.00          1         7 (87.5%)
    5.00          1         7 (87.5%)
    6.00          1         7 (87.5%)
    7.00          1         7 (87.5%)
    8.00          1         7 (87.5%)
    9.00          1         7 (87.5%)
   10.00          1         7 (87.5%)

Total            10        70 (87.5%)

Beginning Block Number 0.  Initial Log Likelihood Function

-2 Log Likelihood      41.589

Beginning Block Number 1.  Method:  Enter

Variable(s) Entered at Step Number 1..
    DARK
    NUTS
    SOFT

Coefficients converged after 5 iterations.

-2 Log Likelihood      28.727

                   Chi-Square    df    Sig
Overall (score)        11.600     3  .0089
Change (-2LL) from
 Previous Block        12.862     3  .0049
 Previous Step         12.862     3  .0049


-------------------- Variables in the Equation ---------------------

Variable         B      S.E.      Wald  df     Sig       R    Exp(B)

DARK        1.3863     .7906    3.0749   1   .0795   .1608    4.0000
NUTS         .8473     .6901    1.5076   1   .2195   .0000    2.3333
SOFT       -2.1972    1.0541    4.3450   1   .0371  -.2375     .1111

Covariate Means

Variable         Mean

DARK            .5000
NUTS            .5000
SOFT            .5000

The estimation result is exactly the same as what you obtained with SAS.
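As an independent cross-check, the conditional logit likelihood for the chocolate data can be maximized directly. The following Python sketch uses a plain Newton-Raphson iteration. It is not how PROC PHREG or COXREG is implemented, and the function names are hypothetical. The chosen candies are transcribed from the CHOCO data step. The code should reproduce the parameter estimates and the two -2 log likelihood values reported above, up to rounding.

```python
import itertools
import math

# The eight candies as (dark, soft, nuts), identical for every subject.
ALTERNATIVES = list(itertools.product([0, 1], repeat=3))

# Candy chosen by subjects 1 to 10, transcribed from the CHOCO data step.
CHOSEN = [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 0, 0), (0, 0, 1),
          (1, 0, 1), (0, 0, 1), (1, 0, 1), (1, 0, 1), (1, 0, 1)]

def probabilities(beta):
    """Choice probabilities of the eight candies for a given beta."""
    weights = [math.exp(sum(b * x for b, x in zip(beta, alt)))
               for alt in ALTERNATIVES]
    total = sum(weights)
    return [w / total for w in weights]

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit(iterations=25):
    """Maximize the conditional logit log likelihood by Newton-Raphson."""
    n = len(CHOSEN)
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        p = probabilities(beta)
        # Expected attribute values under the current probabilities.
        mean = [sum(pk * alt[a] for pk, alt in zip(p, ALTERNATIVES))
                for a in range(3)]
        # Gradient: observed attribute totals minus expected totals.
        grad = [sum(c[a] for c in CHOSEN) - n * mean[a] for a in range(3)]
        # Information matrix: n times the covariance of the attributes.
        info = [[n * (sum(pk * alt[a] * alt[b]
                          for pk, alt in zip(p, ALTERNATIVES))
                      - mean[a] * mean[b])
                 for b in range(3)] for a in range(3)]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

def neg2_log_likelihood(beta):
    """Return -2 times the log likelihood of the observed choices."""
    p = dict(zip(ALTERNATIVES, probabilities(beta)))
    return -2.0 * sum(math.log(p[c]) for c in CHOSEN)

beta_hat = fit()
print("estimates (dark, soft, nuts):", [round(b, 4) for b in beta_hat])
print("-2 log L without covariates:", round(neg2_log_likelihood([0.0] * 3), 3))
print("-2 log L with covariates:", round(neg2_log_likelihood(beta_hat), 3))
```

The fixed number of iterations keeps the sketch short. A production routine would normally stop on a convergence criterion instead.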

