Chapter 14 Using EXCEL in Multiple Regression


From William E. Becker, Statistics for Business and Economics Using Microsoft Excel 97, S.R.B. Publishing, 1997, p. 509-511. Reproduced with permission of S.R.B. Publishing.

QUERY 14.6: An article in Newsweek (January 22, 1996) stated that "Compared with real wine-drinking countries, the United States is practically dry. That may be a reason, scientists say, that our rate of heart disease is higher." Dr. Charles Fuchs is also quoted saying that drinking beer versus wine may produce extraordinary differences in life expectancy. Questions: 1) Does wine, beer or liquor consumption provide an explanation of death rates from heart disease? 2) Does wine consumption significantly lower death rates from heart disease? The following data on average country death rates and average country alcoholic beverage consumption, in liters per capita, were provided in the Newsweek article. "Heart Disease" is defined as the death rate per 100,000.

Death Rate from
Heart Disease

Wine*

Beer*

Liquor*

France

61.1

63.5

40.1

2.5

Italy

94.1

58.0

25.1

0.9

Switzerland

106.4

46.0

65.0

1.7

Australia

173.0

15.7

102.1

1.2

Britain

199.7

12.2

100.0

1.5

U.S.A.

176.0

8.9

87.8

2.0

Russia

373.6

2.7

17.1

3.8

Czech Republic

283.7

1.7

140.0

1.0

Japan

34.7

1.0

55.0

2.1

Mexico

36.4

0.2

50.4

0.8

*Consumption Per Capita

 

ANSWER: Death rates from heart disease can be specified as a function of wine, beer and liquor consumption. The implied multiple regression can then be estimated in EXCEL via the path "Tools" to "Data Analysis" to "Regression." The regression dialog box is completed as shown in Chart 14.2. (Notice that the "Input X range" requires that the columns of explanatory variables abut.) The resulting regression output is shown in Chart 14.3. The "R Square 0.474945" indicates that 47.5 percent of the variability in death rates from heart disease, around the sample mean death rate from heart disease, is associated with wine, beer and liquor consumption. That is, the explanatory power of this regression is not great. In fact, as seen by the relatively small calculated "t Stat," none of the explanatory variables show much statistical significance. Each of the 95 percent confidence intervals for the respective population slope coefficients includes zero. More specifically, although the sample coefficient for wine is negative, its one-tail p-value is 0.31405 = P(t < -0.51024). Thus, at the typical Type I error levels, the null hypothesis b = 0 cannot be rejected in favor of the alternative b < 0 for wine consumption. The idea that death rates from heart disease are negatively related to wine consumption in the population of all countries is not supported.

 

Chart 14.2 EXCEL Dialog Box for Multiple Regression of the Death Rate from Heart Disease on Wine, Beer and Liquor Consumption

Regression Statistics
Multiple R 0.689162
R Square 0.474945
Adjusted R Square 0.212417
Standard Error 98.346537
Observations 10

ANOVA df SS MS F Significance F
Regression 3 52493.7524 17497.9175 1.8091 0.2457
Residual 6 58032.2486 9672.0414
Total 9 110526.0010

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercepts -68.02613 150.42307 -0.45223 0.66699 -436.09838 300.0461
Wine -0.76033 1.49014 -0.51024 0.62810 -4.40657 2.8859
Beer 1.43092 1.09326 1.30886 0.23847 -1.24418 4.1060
Liquor 80.10335 42.67670 1.87698 0.10961 -24.32284 184.5295

RESIDUAL OUTPUT
Observation Predict Heart Residuals
1 141.3310 -80.2310
2 -4.1164 98.2164
3 126.1841 -19.7841
4 162.2577 10.7423
5 185.9449 13.7551
6 211.0485 -35.0485
7 258.7824 114.8176
8 211.1136 72.5864
9 178.1312 -143.4312
10 68.0229 -31.6229

 

 


Go to:

Examples of EXCEL Use
Statistics for Business and Economics WWW Page