### 8. The Nested Logit Regression Model

Consider a nested structure of choices, in which a first choice is made and a second choice follows conditional on the first. When the independence of irrelevant alternatives (IIA) assumption is violated, one alternative to the multinomial logit model is the nested (multinomial) logit model. This chapter replicates the nested logit model discussed in Greene (2003).
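As a rough sketch of what "conditional on the first choice" means, a common textbook form of the two-level nested logit model (the nonnormalized form) is written out here. The notation is an assumption made for illustration and is not taken from the source: $x_{j|b}$ holds the child-level attributes of alternative $j$ in branch $b$, $z_b$ holds the parent-level attributes of branch $b$, $IV_b$ is the inclusive value of branch $b$, and $\tau_b$ is its inclusive value parameter.

$$
\begin{aligned}
P(j, b) &= P(j \mid b)\, P(b), \\
P(j \mid b) &= \frac{\exp(x_{j|b}'\beta)}{\sum_{k \in b} \exp(x_{k|b}'\beta)}, \\
IV_b &= \ln \sum_{k \in b} \exp(x_{k|b}'\beta), \\
P(b) &= \frac{\exp(z_b'\gamma + \tau_b\, IV_b)}{\sum_{l} \exp(z_l'\gamma + \tau_l\, IV_l)}.
\end{aligned}
$$

If every $\tau_b$ equals 1, the product $P(j \mid b)\,P(b)$ reduces to a conditional logit probability, for which IIA holds across all alternatives.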

8.1 Nested Logit in Stata (.nlogit)

The Stata .nlogit command estimates the nested logit model by full information maximum likelihood (FIML). First, create a variable that encodes the tree specification using the .nlogitgen command. From the top, the parent level has fly and ground branches; at the child level, the fly branch has air flight (1), and the ground branch has train (2), bus (3), and car (4). Note that fly and ground below are not variable names but arbitrary labels of your choosing.

. nlogitgen tree = mode(fly: 1, ground: 2 | 3 | 4)

new variable tree is generated with 2 groups
label list lb_tree
lb_tree:
1 fly
2 ground

The .nlogittree command displays the tree structure defined by the .nlogitgen command.

. nlogittree mode tree

tree structure specified for the nested logit model

top --> bottom

    tree          mode
--------------------------
     fly             1
  ground             2
                     3
                     4

In Stata 10, .nlogit by default uses a parameterization consistent with random utility maximization and introduces a syntax different from that of previous versions (Stata 2007: 434). The command is followed by a binary dependent variable, a list of independent variables, the specification of each level, and options. The case() option is required to specify the case identification variable, and nonnormalized requests the unscaled parameterization. Recall that tree was defined by .nlogitgen above.

. nlogit choice air train bus cost time || tree: air_inc || ///
mode:, case(subject) nonnormalized nolog noconstant notree

The notree option suppresses the display of the tree structure, and nolog suppresses the iteration log of the log likelihood. Note that /// joins the next command line to the current line.

If you prefer the old syntax, list a binary dependent (choice) variable, the utility functions of the parent and child levels, and options. The group() option is equivalent to case() in version 10. Do not forget to run the .version command first so that the command interpreter of a previous version is used.

. version 9
. nlogit choice (mode=air train bus cost time) (tree=air_inc), ///
group(subject) notree nolog

Nested logit regression
Levels             =          2                 Number of obs      =       840
Dependent variable =     choice                 LR chi2(8)         =  194.9313
Log likelihood     = -193.65615                 Prob > chi2        =    0.0000

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mode         |
         air |   6.042255   1.198907     5.04   0.000     3.692441     8.39207
       train |   5.064679   .6620317     7.65   0.000     3.767121    6.362237
         bus |   4.096302   .6151582     6.66   0.000     2.890614     5.30199
        cost |  -.0315888   .0081566    -3.87   0.000    -.0475754   -.0156022
        time |  -.1126183   .0141293    -7.97   0.000    -.1403111   -.0849254
-------------+----------------------------------------------------------------
tree         |
     air_inc |   .0153337   .0093814     1.63   0.102    -.0030534    .0337209
-------------+----------------------------------------------------------------
(incl. value |
 parameters) |
tree         |
        /fly |   .5859993   .1406199     4.17   0.000     .3103894    .8616092
     /ground |   .3889488   .1236623     3.15   0.002     .1465753    .6313224
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(2)=   10.94    Prob > chi2 = 0.0042
------------------------------------------------------------------------------
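The last line of the output reports a likelihood ratio test of the restriction that both inclusive value parameters equal 1. As a minimal check of the reported p-value, the Python sketch below evaluates the upper tail of a chi-squared distribution with 2 degrees of freedom at 10.94. The helper name chi2_sf_even_df is hypothetical, and the closed form it uses is valid only for even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    # For df = 2m: P(X > x) = exp(-x/2) * sum_{i=0}^{m-1} (x/2)^i / i!
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))

lr_stat = 10.94   # chi2(2) statistic copied from the Stata output
p_value = chi2_sf_even_df(lr_stat, 2)
print(round(p_value, 4))
```

Because the p-value is well below 0.05, the restriction that both inclusive value parameters equal 1 is rejected at the conventional levels.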


8.2 Nested Logit in SAS

The SAS MDC procedure fits the conditional logit model as well as the nested multinomial logit model. For the nested logit model, use the UTILITY statement to specify the utility functions of the parent level (level 2) and the child level (level 1), and the NEST statement to construct the decision-tree structure. Note that “2 3 4 @ 2” means that three nodes (2, 3, and 4) at the child level fall under branch 2 at the parent level.

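PROC MDC DATA=masil.travel;
   /* Nested logit; mode identifies the alternative within each case */
   MODEL choice = air train bus cost time air_inc / TYPE=NLOGIT CHOICE=(mode);
   ID subject;
   /* Utility functions for level 1 (child) and level 2 (parent) */
   UTILITY U(1,) = air train bus cost time,
           U(2, 1 2) = air_inc;
   /* Tree: node 1 under branch 1; nodes 2, 3, and 4 under branch 2 */
   NEST LEVEL(1) = (1 @ 1, 2 3 4 @ 2),
        LEVEL(2) = (1 2 @ 1);
RUN;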

The MDC Procedure

Nested Logit Estimates

Algorithm converged.

Model Fit Summary

Dependent Variable                   choice
Number of Observations                  210
Number of Cases                         840
Log Likelihood                   -193.65615
Maximum Absolute Gradient         0.0000147
Number of Iterations                     15
Optimization Method          Newton-Raphson
AIC                               403.31230
Schwarz Criterion                 430.08916

Discrete Response Profile

Index     mode     Frequency    Percent

0          1            58      27.62
1          2            63      30.00
2          3            30      14.29
3          4            59      28.10

Goodness-of-Fit Measures

Measure                       Value    Formula

Likelihood Ratio (R)         194.93    2 * (LogL - LogL0)
Upper Bound of R (U)         582.24    - 2 * LogL0
Aldrich-Nelson               0.4814    R / (R+N)
Cragg-Uhler 1                0.6048    1 - exp(-R/N)
Cragg-Uhler 2                0.6451    (1-exp(-R/N)) / (1-exp(-U/N))
Estrella                     0.6771    1 - (1-R/U)^(U/N)
Adjusted Estrella            0.6485    1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
McFadden's LRI               0.3348    R / U
Veall-Zimmermann              0.655    (R * (U+N)) / (U * (R+N))

N = # of observations, K = # of regressors

Nested Logit Estimates

Parameter Estimates

                                    Standard                 Approx
Parameter       DF     Estimate        Error    t Value    Pr > |t|

air_L1           1       6.0423       1.1989       5.04     <.0001
train_L1         1       5.0646       0.6620       7.65     <.0001
bus_L1           1       4.0963       0.6152       6.66     <.0001
cost_L1          1      -0.0316     0.008156      -3.87     0.0001
time_L1          1      -0.1126       0.0141      -7.97     <.0001
air_inc_L2G1     1       0.0153     0.009381       1.63     0.1022
INC_L2G1C1       1       0.5860       0.1406       4.17     <.0001
INC_L2G1C2       1       0.3890       0.1237       3.15     0.0017
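Several of the fit statistics in the SAS output above can be recomputed from the formulas printed next to them. The Python sketch below is a minimal check. It copies the log likelihood, the number of observations, and U from the output, and it assumes that the number of parameters used in AIC and the Schwarz criterion is 8, the count of rows in the parameter table. The Adjusted Estrella measure is left out because the output does not state the value of K it uses. Since U is taken in its rounded form (582.24), the results should match the reported values only up to rounding.

```python
import math

log_l = -193.65615   # Log Likelihood from the Model Fit Summary
u = 582.24           # Upper Bound of R, i.e. -2 * LogL0, as reported (rounded)
n_obs = 210          # Number of Observations
k_params = 8         # assumption: rows in the Parameter Estimates table

# Likelihood ratio R = 2 * (LogL - LogL0) = 2 * LogL + U
r = 2.0 * log_l + u

aldrich_nelson = r / (r + n_obs)
cragg_uhler_1 = 1.0 - math.exp(-r / n_obs)
cragg_uhler_2 = cragg_uhler_1 / (1.0 - math.exp(-u / n_obs))
estrella = 1.0 - (1.0 - r / u) ** (u / n_obs)
mcfadden_lri = r / u
veall_zimmermann = (r * (u + n_obs)) / (u * (r + n_obs))

# Information criteria: -2 * LogL plus a penalty on the parameter count
aic = -2.0 * log_l + 2.0 * k_params
schwarz = -2.0 * log_l + k_params * math.log(n_obs)

for name, value in [
    ("Likelihood Ratio (R)", r),
    ("Aldrich-Nelson", aldrich_nelson),
    ("Cragg-Uhler 1", cragg_uhler_1),
    ("Cragg-Uhler 2", cragg_uhler_2),
    ("Estrella", estrella),
    ("McFadden's LRI", mcfadden_lri),
    ("Veall-Zimmermann", veall_zimmermann),
    ("AIC", aic),
    ("Schwarz Criterion", schwarz),
]:
    print(f"{name:<22}{value:12.4f}")
```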

The /fly and /ground parameters in the Stata output above correspond to INC_L2G1C1 and INC_L2G1C2 in the SAS output. SAS and Stata produce estimates that agree to about three decimal places.
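To illustrate how the inclusive value parameters enter the choice probabilities, the Python sketch below evaluates a two-level nested logit in the nonnormalized form, where the probability of a branch is proportional to exp(branch utility + tau * inclusive value). The function name nested_logit_probs and all utility values are hypothetical and chosen only for illustration. The two tau values are the rounded /fly and /ground estimates from the output above.

```python
import math

def nested_logit_probs(child_utils, branch_utils, taus):
    """Return {branch: {alternative: joint probability}} for a two-level tree."""
    # Inclusive value of each branch: log of the summed exponentiated child utilities
    inclusive = {b: math.log(sum(math.exp(v) for v in alts.values()))
                 for b, alts in child_utils.items()}
    # Branch index: branch-level utility plus tau times the inclusive value
    index = {b: branch_utils[b] + taus[b] * inclusive[b] for b in child_utils}
    denom = sum(math.exp(v) for v in index.values())
    probs = {}
    for b, alts in child_utils.items():
        p_branch = math.exp(index[b]) / denom
        within = sum(math.exp(v) for v in alts.values())
        # Joint probability = P(branch) * P(alternative | branch)
        probs[b] = {j: p_branch * math.exp(v) / within for j, v in alts.items()}
    return probs

# Hypothetical utility indices for one traveler (not estimates from the data)
child_utils = {"fly": {"air": 0.5},
               "ground": {"train": 1.0, "bus": 0.2, "car": 0.0}}
branch_utils = {"fly": 0.3, "ground": 0.0}
taus = {"fly": 0.586, "ground": 0.389}   # rounded /fly and /ground estimates

probs = nested_logit_probs(child_utils, branch_utils, taus)
for branch, alts in probs.items():
    for alt, p in alts.items():
        print(f"{branch:>6} {alt:<6} {p:.4f}")
```

Setting every tau to 1 and every branch utility to 0 makes the function return ordinary multinomial logit probabilities, which is the restriction examined by the LR test in the Stata output.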
