Conditional Logit Regression

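The conditional logit model has the form:

$$\Pr(y_i = j) = \frac{\exp(x_{ij}'\beta)}{\sum_{k=1}^{J} \exp(x_{ik}'\beta)}$$

where $y_i$ is the alternative chosen by subject $i$, $x_{ij}$ is the vector of attributes of alternative $j$ as presented to subject $i$, $J$ is the number of alternatives, and $\beta$ is a single parameter vector shared by all alternatives.

In this model, each subject is presented with a set of alternatives and asked to choose the most preferred one. The set of alternatives is typically the same for all subjects, and the explanatory variables are all choice specific: they describe the alternatives rather than the subjects. Unlike in the multinomial logit model, the parameters are not specific to the choice.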
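
As a minimal sketch of how the model assigns probabilities, the Python function below exponentiates each alternative's linear index and divides by the sum over all alternatives, using one coefficient vector for every alternative. The function name, the three alternatives, and the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def choice_probabilities(alternatives, beta):
    """Return P(alternative j is chosen) for one subject.

    alternatives: list of attribute vectors, one per alternative.
    beta: one coefficient per attribute, shared by every alternative.
    """
    # Linear index x_j'beta for each alternative
    utilities = [sum(b * x for b, x in zip(beta, attrs)) for attrs in alternatives]
    # Subtract the maximum before exponentiating for numerical stability
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical example: three alternatives described by two attributes
alternatives = [[0, 0], [1, 0], [1, 1]]
beta = [0.5, -1.0]  # illustrative values, not estimates from any data
probs = choice_probabilities(alternatives, beta)
print([round(p, 3) for p in probs])
```

Because the coefficients are shared, adding a new alternative changes only the denominator; no new parameters are needed.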

Conditional Logit Regression with SAS

PHREG Procedure

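The SAS PHREG procedure performs regression analysis of survival data based on the Cox proportional hazards model. When each stratum contains exactly one event, its stratified partial likelihood has the same form as the likelihood of the conditional logit model, so the procedure can be used to fit that model.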
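
The connection can be made explicit. As a sketch, write $x_{ij}$ for the attributes of alternative $j$ offered to subject $i$, $y_i$ for the alternative that subject chose, and $R_i(t)$ for the set of alternatives in stratum $i$ still at risk at time $t$ (this notation is assumed here for illustration). With one stratum per subject, the stratified Cox partial likelihood is a product over the observed events:

$$L(\beta) = \prod_{i=1}^{n} \frac{\exp(x_{i y_i}'\beta)}{\sum_{k \in R_i(t_{i y_i})} \exp(x_{ik}'\beta)}$$

If the chosen alternative is the only event in its stratum and has the earliest time, every one of the $J$ alternatives is still at risk when it occurs, so $R_i(t_{i y_i}) = \{1, \dots, J\}$ and each factor reduces to

$$\frac{\exp(x_{i y_i}'\beta)}{\sum_{k=1}^{J} \exp(x_{ik}'\beta)},$$

which is the conditional logit probability of the observed choice.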

To fit a conditional logit model with PROC PHREG, you need to rearrange the data set into the form used for survival analysis, with one observation per subject and alternative. The most preferred choice is said to occur at time 1, and all other choices are said to occur at later times or to be censored. You also need a status variable that records whether each observation was censored, i.e., whether the alternative was chosen. This censoring indicator takes the value 0 if the alternative was censored (not chosen) and 1 if it was not censored (chosen). The basic syntax is:


  proc phreg;
    strata strata_varname;
    model time_varname*status_varname(0) = x1 x2;
  run;

where strata_varname is the variable that defines the strata (one stratum per subject), time_varname is the failure time variable (the smaller value means the alternative was chosen), status_varname is the censoring indicator variable, with the 0 in parentheses naming the value that indicates censoring, and x1 and x2 are explanatory variables.
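
The recoding into survival form can be sketched outside SAS. The Python helper below is a hypothetical illustration (the function name and the toy input are assumptions): it expands one subject's choice into one row per alternative, with status 1 and time 1 for the chosen alternative and status 0 and time 2 for the rest.

```python
def to_survival_rows(subject, chosen_index, alternatives):
    """Build one row per alternative using the time/status coding described above."""
    rows = []
    for j, attrs in enumerate(alternatives):
        status = 1 if j == chosen_index else 0  # 1 = chosen (event), 0 = censored
        time = 2 - status                       # chosen -> 1, not chosen -> 2
        rows.append({"subject": subject, "time": time, "status": status, "x": attrs})
    return rows

# Hypothetical subject who picks the third of three alternatives
rows = to_survival_rows(subject=1, chosen_index=2, alternatives=[[0, 0], [0, 1], [1, 0]])
for row in rows:
    print(row)
```

Any time value larger than 1 would serve for the censored rows; only the ordering within a stratum matters.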

Example 28: SAS Conditional Logit Regression in PROC PHREG

This example is from SAS (SAS, 1995, Logistic Regression Examples Using the SAS System, pp. 2-3). In the chocolate candy data, 10 subjects are each presented with eight different chocolate candies and choose the one they prefer. The eight candies are the eight combinations of dark (1) or milk (0) chocolate, soft (1) or hard (0) center, and nuts (1) or no nuts (0). The following DATA step creates the data set CHOCO:


  data choco;
    input subject choose dark soft nuts @@;
    t=2-choose;   /* chosen: t=1 (event); not chosen: t=2 (censored) */
    cards;
   1 0 0 0 0   1 0 0 0 1   1 0 0 1 0   1 0 0 1 1
   1 1 1 0 0   1 0 1 0 1   1 0 1 1 0   1 0 1 1 1
   2 0 0 0 0   2 0 0 0 1   2 0 0 1 0   2 0 0 1 1
   2 0 1 0 0   2 1 1 0 1   2 0 1 1 0   2 0 1 1 1
   3 0 0 0 0   3 0 0 0 1   3 0 0 1 0   3 0 0 1 1
   3 0 1 0 0   3 0 1 0 1   3 1 1 1 0   3 0 1 1 1
   4 0 0 0 0   4 0 0 0 1   4 0 0 1 0   4 0 0 1 1
   4 1 1 0 0   4 0 1 0 1   4 0 1 1 0   4 0 1 1 1
   5 0 0 0 0   5 1 0 0 1   5 0 0 1 0   5 0 0 1 1
   5 0 1 0 0   5 0 1 0 1   5 0 1 1 0   5 0 1 1 1
   6 0 0 0 0   6 0 0 0 1   6 0 0 1 0   6 0 0 1 1
   6 0 1 0 0   6 1 1 0 1   6 0 1 1 0   6 0 1 1 1
   7 0 0 0 0   7 1 0 0 1   7 0 0 1 0   7 0 0 1 1
   7 0 1 0 0   7 0 1 0 1   7 0 1 1 0   7 0 1 1 1
   8 0 0 0 0   8 0 0 0 1   8 0 0 1 0   8 0 0 1 1
   8 0 1 0 0   8 1 1 0 1   8 0 1 1 0   8 0 1 1 1
   9 0 0 0 0   9 0 0 0 1   9 0 0 1 0   9 0 0 1 1
   9 0 1 0 0   9 1 1 0 1   9 0 1 1 0   9 0 1 1 1
  10 0 0 0 0  10 0 0 0 1  10 0 0 1 0  10 0 0 1 1
  10 0 1 0 0  10 1 1 0 1  10 0 1 1 0  10 0 1 1 1
  ;

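where SUBJECT is the subject number, CHOOSE is the status variable (1 if the candy was chosen, 0 otherwise), T is the time variable (1 for the chosen candy, 2 for the others), and DARK, SOFT, and NUTS are the candy attributes. Because this data set is already arranged in survival analysis form, you can fit the model with PROC PHREG: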


  proc phreg data=choco;
    strata subject;
    model t*choose(0)=dark soft nuts;
  run;

This produces the following output:


                 Sample Program: Conditional Logit Regression

                              The PHREG Procedure

     Data Set: WORK.CHOCO
     Dependent Variable: T
     Censoring Variable: CHOOSE
     Censoring Value(s): 0
     Ties Handling: BRESLOW

              Summary of the Number of Event and Censored Values

                                                                  Percent
      Stratum    SUBJECT        Total       Event    Censored    Censored

            1    1                  8           1           7       87.50
            2    2                  8           1           7       87.50
            3    3                  8           1           7       87.50
            4    4                  8           1           7       87.50
            5    5                  8           1           7       87.50
            6    6                  8           1           7       87.50
            7    7                  8           1           7       87.50
            8    8                  8           1           7       87.50
            9    9                  8           1           7       87.50
           10    10                 8           1           7       87.50
        Total                      80          10          70       87.50

                     Testing Global Null Hypothesis: BETA=0

                   Without        With
    Criterion    Covariates    Covariates    Model Chi-Square

    -2 LOG L         41.589        28.727      12.862 with 3 DF (p=0.0049)
    Score              .             .         11.600 with 3 DF (p=0.0089)
    Wald               .             .          8.928 with 3 DF (p=0.0303)

                    Analysis of Maximum Likelihood Estimates

                    Parameter     Standard      Wald         Pr >          Risk
 Variable   DF       Estimate       Error    Chi-Square   Chi-Square      Ratio

 DARK        1       1.386294      0.79057      3.07490       0.0795      4.000
 SOFT        1      -2.197225      1.05409      4.34502       0.0371      0.111
 NUTS        1       0.847298      0.69007      1.50762       0.2195      2.333

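The parameter estimates give the fitted model:

$$\Pr(\text{candy } j \text{ is chosen}) = \frac{\exp(1.386\,\mathrm{DARK}_j - 2.197\,\mathrm{SOFT}_j + 0.847\,\mathrm{NUTS}_j)}{\sum_{k=1}^{8} \exp(1.386\,\mathrm{DARK}_k - 2.197\,\mathrm{SOFT}_k + 0.847\,\mathrm{NUTS}_k)}$$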

The positive parameter estimates for DARK and NUTS mean that dark chocolate and nuts each increase the preference for a candy. The negative parameter estimate for SOFT means that a soft center decreases the preference.
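
The remaining columns of the estimates table follow from the estimates and standard errors. As a rough check in Python (standard library only), the Wald chi-square is the squared ratio of estimate to standard error, its p-value is the upper tail of a chi-square distribution with 1 DF, and the Risk Ratio is the exponentiated estimate. The dictionary layout and variable names are assumptions made for this sketch; the numbers are copied from the PHREG output.

```python
import math

# (estimate, standard error) as printed in the PHREG output
estimates = {
    "DARK": (1.386294, 0.79057),
    "SOFT": (-2.197225, 1.05409),
    "NUTS": (0.847298, 0.69007),
}

results = {}
for name, (b, se) in estimates.items():
    wald = (b / se) ** 2                      # Wald chi-square with 1 DF
    p_value = math.erfc(math.sqrt(wald / 2))  # upper tail of chi-square(1)
    risk_ratio = math.exp(b)                  # exp(estimate), the Risk Ratio column
    results[name] = (wald, p_value, risk_ratio)
    print(f"{name}: Wald={wald:.4f}  p={p_value:.4f}  ratio={risk_ratio:.3f}")
```

A ratio of 4.000 for DARK, for example, means that a dark candy has four times the odds of being chosen over an otherwise identical milk chocolate candy.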

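For each of the eight types of candy, the predicted probability is exp(xb) for that candy divided by the sum of exp(xb) over all eight candies (18.519), where xb = 1.386 DARK - 2.197 SOFT + 0.847 NUTS:

  DARK  SOFT  NUTS    exp(xb)    Probability
     0     0     0      1.000          0.054
     0     0     1      2.333          0.126
     0     1     0      0.111          0.006
     0     1     1      0.259          0.014
     1     0     0      4.000          0.216
     1     0     1      9.333          0.504
     1     1     0      0.444          0.024
     1     1     1      1.037          0.056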

This shows that the most preferred type of candy is the dark chocolate with a hard center and nuts.
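
The predicted probabilities and the two -2 LOG L values in the output can be reproduced from the reported estimates. The Python sketch below is only a cross-check, not part of the SAS workflow; the variable names are assumptions, and the list of chosen candies is read off the CHOCO data lines where CHOOSE is 1.

```python
import math
from itertools import product

beta = {"dark": 1.386294, "soft": -2.197225, "nuts": 0.847298}  # PHREG estimates

# The eight candies: every (dark, soft, nuts) combination of 0/1
candies = list(product([0, 1], repeat=3))
weights = {c: math.exp(beta["dark"] * c[0] + beta["soft"] * c[1] + beta["nuts"] * c[2])
           for c in candies}
total = sum(weights.values())
probs = {c: w / total for c, w in weights.items()}

# Chosen candy (dark, soft, nuts) for subjects 1..10, read from the CHOCO data
chosen = [(1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 0, 0), (0, 0, 1),
          (1, 0, 1), (0, 0, 1), (1, 0, 1), (1, 0, 1), (1, 0, 1)]

# Without covariates every candy has probability 1/8
neg2ll_null = -2 * len(chosen) * math.log(1 / len(candies))
# With covariates, sum the log of each subject's predicted probability
neg2ll_model = -2 * sum(math.log(probs[c]) for c in chosen)

for c in candies:
    print(c, round(probs[c], 3))
print(round(neg2ll_null, 3), round(neg2ll_model, 3))
```

The last line should agree with the 41.589 and 28.727 reported under "Testing Global Null Hypothesis", and their difference is the model chi-square of 12.862.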

Conditional Logit Regression with SPSS

COXREG Procedure

With SPSS, you can use the COXREG procedure to fit a conditional logit model. The basic syntax is:


  coxreg time_varname with X1 X2
   /status=status_varname(1)
   /strata=strata_varname.

where time_varname is the failure time variable (the smaller value means the alternative was chosen), status_varname is the censoring indicator variable, with the 1 in parentheses naming the value that indicates the event occurred (not censored), strata_varname is the variable that defines the strata, and X1 and X2 are explanatory variables. Note that COXREG names the event value, whereas PROC PHREG names the censoring value.

Example 29: SPSS Conditional Logit Regression in COXREG procedure

Using the data in Example 28, if you use:


  coxreg t with dark soft nuts

   /status=choose(1)

   /strata=subject.

you will have the following SPSS output:


                        C O X   R E G R E S S I O N

        80  Total cases read

         0  Cases with missing values
         0  Valid cases with non-positive times
         0  Censored cases before the earliest event in a stratum
         0  Total cases dropped

        80  Cases available for the analysis

Dependent Variable:  T

SUBJECT      Events  Censored

    1.00          1         7 (87.5%)
    2.00          1         7 (87.5%)
    3.00          1         7 (87.5%)
    4.00          1         7 (87.5%)
    5.00          1         7 (87.5%)
    6.00          1         7 (87.5%)
    7.00          1         7 (87.5%)
    8.00          1         7 (87.5%)
    9.00          1         7 (87.5%)
   10.00          1         7 (87.5%)

Total            10        70 (87.5%)

Beginning Block Number 0.  Initial Log Likelihood Function

-2 Log Likelihood      41.589

Beginning Block Number 1.  Method:  Enter

Variable(s) Entered at Step Number 1..
    DARK
    NUTS
    SOFT

Coefficients converged after 5 iterations.

-2 Log Likelihood      28.727

                   Chi-Square    df    Sig
Overall (score)        11.600     3  .0089
Change (-2LL) from
 Previous Block        12.862     3  .0049
 Previous Step         12.862     3  .0049

-------------------- Variables in the Equation ---------------------

Variable         B      S.E.      Wald  df     Sig       R    Exp(B)

DARK        1.3863     .7906    3.0749   1   .0795   .1608    4.0000
NUTS         .8473     .6901    1.5076   1   .2195   .0000    2.3333
SOFT       -2.1972    1.0541    4.3450   1   .0371  -.2375     .1111

Covariate Means

Variable         Mean

DARK            .5000
NUTS            .5000
SOFT            .5000

The parameter estimates, standard errors, and test statistics are the same as those obtained with SAS.

