// Task 1: routine setup
capture log close
set more off
estimates clear
log using cda08c-ex8done.log, replace text
version 9.2
set scheme s2mono

//  pgm:     cda08c-ex8done.do
//  task:    8 - Count Outcomes - Exercise
//  project: CDA Lab Guide
//  author:  your name
//  date:    today's date

use science3, clear

// Task 2: examine and clean the variables to be analyzed.
keep pub3 faculty mcit3 pub1
sum
// drop observations with missing values on any analysis variable
drop if pub3>=. | faculty>=. | mcit3>=. | pub1>=.
tab1 pub3 faculty mcit3 pub1, missing

// Task 3: Use poisson and nbreg to regress pub3 on faculty, mcit3, and pub1.
//         Compute the standardized and unstandardized coefficients.

// Task 3a: estimate a PRM and list the coefficients.
poisson pub3 faculty mcit3 pub1, nolog
listcoef, help

// Task 3b: estimate a NBRM and list the coefficients.
nbreg pub3 faculty mcit3 pub1, nolog
listcoef, help

// Task 3c: for the NBRM, interpret the standardized factor change coefficient
//          for pub1 and the unstandardized coefficients for pub1 and faculty.

/* Type your answer here

For a standard deviation increase in a scientist's first-year publications,
the expected number of publications in year three increases by a factor of
1.7, holding all other variables constant.

For each additional publication in the first year, the expected number of
publications in year three increases by a factor of 1.2, holding all other
variables constant.

Being a faculty member in a university increases a scientist's expected
number of publications in year three by a factor of 1.3, holding all other
variables constant.
*/

// Task 3d: test the NBRM against the alternative of the PRM.

/* Type your answer here

Because there is significant evidence of overdispersion (X2=91.52, p<.01),
the negative binomial regression model is preferred to the Poisson
regression model.
*/

// Task 4: Use prcounts to create predicted probabilities and plot them.

// Task 4a: estimate the PRM of pub3 on faculty, mcit3, and pub1.
poisson pub3 faculty mcit3 pub1, nolog

// Task 4b: compute predictions with prcounts.
prcounts prm, plot max(9)

// Task 4c: estimate the NBRM.
nbreg pub3 faculty mcit3 pub1, nolog

// Task 4d: compute predictions with prcounts.
prcounts nbr, plot max(9)
sum prm* nbr*

// Task 4e: plot predicted probabilities
graph twoway connected nbrobeq nbrpreq prmpreq nbrval, ///
    connect(direct direct direct)
graph export cda08c-ex8-fig1.emf, replace

// Task 4f: decide which model you would prefer.

/* Type your answer here

Since the NBRM does a better job of reproducing the observed proportions of
counts and because the dispersion parameter was significant, I prefer the
NBRM.
*/

// Task 5: Estimate the same model using zip and zinb.

// Step 5.1: zip model
zip pub3 faculty mcit3 pub1, inf(faculty mcit3 pub1) nolog

// Step 5.2: zinb model
zinb pub3 faculty mcit3 pub1, inf(faculty mcit3 pub1) nolog

// Step 5.3: which model do you prefer?

/* Type your answer here

*/

log close
exit