8. The Nested Logit Regression Model
Consider a nested structure of choices: a first choice is made, and a second choice follows conditional
on the first.
When the IIA (independence of irrelevant alternatives) assumption is violated, one alternative to the conditional logit model is the nested (multinomial) logit model. This chapter replicates the nested logit model discussed in Greene (2003).
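As a rough illustration of the two-step structure described above, the sketch below computes nested logit choice probabilities as P(alternative) = P(branch) x P(alternative | branch). It is a minimal sketch, not the estimator used by Stata or SAS: the utilities in v, the inclusive value parameters in tau, and the function name nested_logit_probs are all hypothetical, and the utilities are assumed not to be rescaled by the inclusive value parameters.

```python
import math

def nested_logit_probs(v, nests, tau, w=None):
    """Two-level nested logit probabilities (illustrative sketch).

    v     : dict alternative -> child-level utility (hypothetical values)
    nests : dict branch -> list of alternatives in that branch
    tau   : dict branch -> inclusive value parameter
    w     : dict branch -> parent-level utility (defaults to 0 for every branch)
    """
    if w is None:
        w = {b: 0.0 for b in nests}
    # Inclusive value of each branch: log-sum of the utilities of its alternatives
    iv = {b: math.log(sum(math.exp(v[j]) for j in alts)) for b, alts in nests.items()}
    denom = sum(math.exp(w[b] + tau[b] * iv[b]) for b in nests)
    probs = {}
    for b, alts in nests.items():
        p_branch = math.exp(w[b] + tau[b] * iv[b]) / denom   # P(branch)
        for j in alts:
            p_within = math.exp(v[j] - iv[b])                # P(alternative | branch)
            probs[j] = p_branch * p_within
    return probs

# Hypothetical inputs: 1 = air, 2 = train, 3 = bus, 4 = car
nests = {"fly": [1], "ground": [2, 3, 4]}
v = {1: 0.5, 2: 1.0, 3: 0.2, 4: 0.0}
tau = {"fly": 0.6, "ground": 0.4}

probs = nested_logit_probs(v, nests, tau)
for alt in sorted(probs):
    print(alt, round(probs[alt], 4))
```

Setting every inclusive value parameter to 1 collapses the formula to the ordinary multinomial logit, in which IIA holds across all four alternatives.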
8.1 Nested Logit in Stata (.nlogit)
The Stata .nlogit command estimates the nested logit model using the full information maximum-likelihood
(FIML) method. First, create a variable that encodes the tree specification using the
.nlogitgen command. From the top, the parent level has fly and ground branches; at the child level,
the fly branch has air flight (1) and the ground branch has train (2), bus (3), and car (4). Note that
fly and ground below are not variable names but arbitrary labels of your choosing.
. nlogitgen tree = mode(fly: 1, ground: 2 | 3 | 4)
new variable tree is generated with 2 groups
label list lb_tree
lb_tree:
1 fly
2 ground
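The mapping that .nlogitgen builds can be pictured with a short sketch. This is only an illustration of the resulting coding, not Stata's implementation; the names TREE_SPEC and make_tree are hypothetical.

```python
# Branch label -> mode codes under that branch, mirroring
# mode(fly: 1, ground: 2 | 3 | 4) in the .nlogitgen call.
TREE_SPEC = {"fly": [1], "ground": [2, 3, 4]}

def make_tree(spec):
    """Return (value labels, mode -> tree code) for a two-level tree."""
    # Branches are numbered 1, 2, ... in the order they are listed
    labels = {code: name for code, name in enumerate(spec, start=1)}
    mode_to_tree = {
        mode: code
        for code, name in labels.items()
        for mode in spec[name]
    }
    return labels, mode_to_tree

labels, mode_to_tree = make_tree(TREE_SPEC)
print(labels)        # value labels of the new tree variable
print(mode_to_tree)  # tree code assigned to each mode
```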
The .nlogittree command displays the tree structure defined by the .nlogitgen command.
. nlogittree mode tree
tree structure specified for the nested logit model
top --> bottom
tree mode
--------------------------
fly 1
ground 2
3
4
In Stata 10, .nlogit by default uses a parameterization consistent with random utility maximization and
introduces a new syntax that differs from the one in the previous release (Stata 2007: 434). The command is
followed by a binary dependent variable, a list of independent variables, the specification of each level,
and options. The case() option is required to specify the case identification variable, and nonnormalized
requests the unscaled parameterization. Recall that tree was defined by .nlogitgen above.
. nlogit choice air train bus cost time || tree: air_inc || ///
mode:, case(subject) nonnormalized nolog noconstant notree
The notree option suppresses the display of the tree structure, and nolog suppresses the iteration log of
the log likelihood. Note that /// joins the next line to the current command line.
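The command above expects the data in long form: one row per traveler and alternative, with choice equal to 1 on the chosen row and case(subject) identifying the rows that belong to the same traveler. The sketch below builds such rows for two made-up travelers. The helper names and the chosen modes are hypothetical, and coding air, train, and bus as dummies for modes 1 to 3 is an assumption about how those regressors are defined.

```python
# Mode codes used in this chapter
MODES = {1: "air", 2: "train", 3: "bus", 4: "car"}

def make_case(subject, chosen_mode):
    """Return one row per alternative for a single traveler (illustrative)."""
    rows = []
    for mode in MODES:
        rows.append({
            "subject": subject,                 # case identifier, as in case(subject)
            "mode": mode,                       # alternative on this row
            "choice": int(mode == chosen_mode), # 1 only on the chosen alternative
            "air": int(mode == 1),              # assumed alternative-specific dummies
            "train": int(mode == 2),
            "bus": int(mode == 3),
        })
    return rows

def one_choice_per_case(rows):
    """Check that every case has exactly one chosen alternative."""
    counts = {}
    for r in rows:
        counts[r["subject"]] = counts.get(r["subject"], 0) + r["choice"]
    return all(c == 1 for c in counts.values())

# Two hypothetical travelers: the first picks car, the second picks train
data = make_case(1, 4) + make_case(2, 2)
print(len(data), one_choice_per_case(data))
```

With four alternatives per traveler, the 840 rows reported in the output correspond to 840 / 4 = 210 travelers.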
If you prefer the old syntax, list a binary dependent (choice) variable, the utility function of each
level, and options. The group() option is equivalent to case() in version 10. Do not forget to run
the .version command first so that a previous version of the command interpreter is used.
. version 9
. nlogit choice (mode=air train bus cost time) (tree=air_inc), ///
group(subject) notree nolog
Nested logit regression
Levels = 2 Number of obs = 840
Dependent variable = choice LR chi2(8) = 194.9313
Log likelihood = -193.65615 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
mode |
air | 6.042255 1.198907 5.04 0.000 3.692441 8.39207
train | 5.064679 .6620317 7.65 0.000 3.767121 6.362237
bus | 4.096302 .6151582 6.66 0.000 2.890614 5.30199
cost | -.0315888 .0081566 -3.87 0.000 -.0475754 -.0156022
time | -.1126183 .0141293 -7.97 0.000 -.1403111 -.0849254
-------------+----------------------------------------------------------------
tree |
air_inc | .0153337 .0093814 1.63 0.102 -.0030534 .0337209
-------------+----------------------------------------------------------------
(incl. value |
parameters) |
tree |
/fly | .5859993 .1406199 4.17 0.000 .3103894 .8616092
/ground | .3889488 .1236623 3.15 0.002 .1465753 .6313224
------------------------------------------------------------------------------
LR test of homoskedasticity (iv = 1): chi2(2)= 10.94 Prob > chi2 = 0.0042
------------------------------------------------------------------------------
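Several numbers in the output above can be checked by hand. The sketch below recomputes the p-value of the LR test that both inclusive value parameters equal 1, and the z statistic, p-value, and 95% confidence interval for air_inc. It assumes the usual chi-squared (2 degrees of freedom) and standard normal reference distributions; the helper names are hypothetical.

```python
import math

def chi2_sf_df2(x):
    """Upper-tail probability of a chi-squared variable with 2 df: exp(-x/2)."""
    return math.exp(-x / 2.0)

def normal_two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# LR test of iv = 1, chi2(2) = 10.94 as reported in the output
lr_stat = 10.94
p_lr = chi2_sf_df2(lr_stat)

# air_inc row: coefficient and standard error as reported in the output
coef, se = 0.0153337, 0.0093814
z = coef / se
p_z = normal_two_sided_p(z)
z_crit = 1.959964  # 97.5th percentile of the standard normal
ci = (coef - z_crit * se, coef + z_crit * se)

print(round(p_lr, 4), round(z, 2), round(p_z, 3))
print(tuple(round(b, 7) for b in ci))
```

A small p-value for the LR test is evidence against restricting both inclusive value parameters to 1, that is, against collapsing the nested model to the conditional logit model.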
The SAS MDC procedure fits the conditional logit model as well as the nested multinomial logit model. For the nested logit model, use the UTILITY statement to specify the utility functions of the child level (level 1) and the parent level (level 2), and the NEST statement to construct the decision-tree structure. Note that “2 3 4 @ 2” means that three nodes at the child level (2, 3, and 4) fall under branch 2 at the parent level.
PROC MDC DATA=masil.travel;
   MODEL choice = air train bus cost time air_inc /TYPE=NLOGIT CHOICE=(mode);
   ID subject;
   UTILITY U(1,) = air train bus cost time,
           U(2, 1 2) = air_inc;
   NEST LEVEL(1) = (1 @ 1, 2 3 4 @ 2),
        LEVEL(2) = (1 2 @ 1);
RUN;
The MDC Procedure
Nested Logit Estimates
Algorithm converged.
Model Fit Summary
Dependent Variable choice
Number of Observations 210
Number of Cases 840
Log Likelihood -193.65615
Maximum Absolute Gradient 0.0000147
Number of Iterations 15
Optimization Method Newton-Raphson
AIC 403.31230
Schwarz Criterion 430.08916
Discrete Response Profile
Index mode Frequency Percent
0 1 58 27.62
1 2 63 30.00
2 3 30 14.29
3 4 59 28.10
Goodness-of-Fit Measures
Measure Value Formula
Likelihood Ratio (R) 194.93 2 * (LogL - LogL0)
Upper Bound of R (U) 582.24 - 2 * LogL0
Aldrich-Nelson 0.4814 R / (R+N)
Cragg-Uhler 1 0.6048 1 - exp(-R/N)
Cragg-Uhler 2 0.6451 (1-exp(-R/N)) / (1-exp(-U/N))
Estrella 0.6771 1 - (1-R/U)^(U/N)
Adjusted Estrella 0.6485 1 - ((LogL-K)/LogL0)^(-2/N*LogL0)
McFadden's LRI 0.3348 R / U
Veall-Zimmermann 0.655 (R * (U+N)) / (U * (R+N))
N = # of observations, K = # of regressors
Nested Logit Estimates
Parameter Estimates
Standard Approx
Parameter DF Estimate Error t Value Pr > |t|
air_L1 1 6.0423 1.1989 5.04 <.0001
train_L1 1 5.0646 0.6620 7.65 <.0001
bus_L1 1 4.0963 0.6152 6.66 <.0001
cost_L1 1 -0.0316 0.008156 -3.87 0.0001
time_L1 1 -0.1126 0.0141 -7.97 <.0001
air_inc_L2G1 1 0.0153 0.009381 1.63 0.1022
INC_L2G1C1 1 0.5860 0.1406 4.17 <.0001
INC_L2G1C2 1 0.3890 0.1237 3.15 0.0017
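The fit statistics in the SAS output follow directly from the formulas printed next to them. The sketch below recomputes most of them from the reported log likelihood, with N = 210 and K = 8 estimated parameters. It assumes that LogL0 is the log likelihood of a model assigning probability 1/4 to each of the four modes, which appears consistent with the reported upper bound U = 582.24; the Adjusted Estrella measure is left out of this sketch.

```python
import math

loglik = -193.65615   # Log Likelihood from the output
n = 210               # Number of Observations
k = 8                 # estimated parameters (rows of the parameter table)
n_alt = 4             # air, train, bus, car

# Assumption: the null model gives each alternative probability 1/4
loglik0 = -n * math.log(n_alt)

R = 2.0 * (loglik - loglik0)   # Likelihood Ratio (R)
U = -2.0 * loglik0             # Upper Bound of R (U)

fit = {
    "aldrich_nelson": R / (R + n),
    "cragg_uhler_1": 1.0 - math.exp(-R / n),
    "cragg_uhler_2": (1.0 - math.exp(-R / n)) / (1.0 - math.exp(-U / n)),
    "estrella": 1.0 - (1.0 - R / U) ** (U / n),
    "mcfadden_lri": R / U,
    "veall_zimmermann": (R * (U + n)) / (U * (R + n)),
    "aic": -2.0 * loglik + 2.0 * k,
    "schwarz": -2.0 * loglik + k * math.log(n),
}

print(round(R, 2), round(U, 2))
for name, value in fit.items():
    print(name, round(value, 4))
```

The likelihood ratio R computed this way also matches the LR chi2(8) statistic in the Stata output.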
The /fly and /ground parameters in the Stata output above correspond to INC_L2G1C1 and INC_L2G1C2 in the SAS output. SAS and Stata produce the same estimates up to rounding.



