********************************************************* ** Sitka Spruce lab ************************************* clear set more off * read in the data using the infile command * change the location of the data to where you've saved it cd "C:\Documents and Settings\Sandrah Eckel\Desktop\LDA lab7" infile logsize days chamber ozone year tree using "sitka.data" * the first observation is blank because the 'sitka.data' file * had the column names listed in the file, so drop the first observation drop in 1 * rename the tree identifier from 'tree' to 'id' rename tree id * you may want to save the data as .dta file for later convenience save "sitka.dta", replace ** limit our analysis to the 1988 growing season keep if year==88 ** explore missing data patterns xtset id days xtdes ** reshape wide for some data exploration reshape wide logsize, i(id) j(days) ** describe the pattern of data ** summ for continuous variable summ logsize152-logsize258 ** tab for (non-time varying) categorical variable ** use xttab if the categorical variable changes over time tab chamber tab ozone ** exploratory analysis in long format reshape long logsize, i(id) j(days) ** Describe the change in logsize over time, across normal and ozone environments ** ozone = 0 (normal), ozone = 1 (ozone) ** scatter plot ** twoway scatter logsize days, by(ozone) ylab(2 (2) 8) twoway scatter logsize days, by(ozone) jitter(3) ylab(2 (2) 8) ** box plot ** graph box logsize, over(ozone) over(days) ylab(2 (2) 8) ** mean trend plot ** ** using offset option to shift the plots, so that they do not lay on top of each other ** xtgraph logsize, group(ozone) ti("Mean logsize vs days") bar(ci) offset(1.5) xtgraph logsize, group(ozone) ti("Median logsize vs days") av(median) bar(iqr) offset(2) ** ** Spaghetti plots sort ozone id days twoway line logsize days, by(ozone) c(L) ** in order to reproduce the results on p 76 of the Diggle, Heagerty, Liang and Zeger textbook, ** generate an interaction between day and ozone using following notation gen control=1-ozone gen day_control=days/100*control ** briefly explore the correlation structure on residuals from an OLS regression * indep corr * xi, noomit: reg logsize i.days control day_control, noconstant predict residforcorr, resid autocor residforcorr days id ** independent corr ** xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(ind) xtcorr ** uniform corr ** xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(exc) xtcorr ** AR1 corr ** *xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(ar1) * above code doesn't work since data aren't equally spaced * use the 'force' option to make stata treat the observations as equally spaced xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(ar1) force xtcorr ** unstructured corr ** xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(uns) xtcorr ** Model Choice ** ** install QIC, the extension of AIC for xtgee models ** ssc install qic ** QIC for GEE models with the 4 different correlation structures xtset id days * independent xi, noomit: qic logsize i.days control day_control, noconstant fam(gaussian) corr(indep) * uniform xi, noomit: qic logsize i.days control day_control, noconstant fam(gaussian) corr(exc) * ar1 xi, noomit: qic logsize i.days control day_control, noconstant fam(gaussian) corr(ar1) force * unstructured xi, noomit: qic logsize i.days control day_control, noconstant fam(gaussian) corr(unst) ** AIC - compare random effect models ** * no random intercept - i.e., independent quietly xi, noomit: reg logsize i.days control day_control, noconstant estat ic * random intercept equiv to gee with uniform corr * quietly xi, noomit: xtreg logsize i.days control day_control, re i(id) mle noconstant nolog estimates store model_unif estat ic * random intercept without the effect of ozone quietly xi, noomit: xtreg logsize i.days , re i(id) mle noconstant nolog estimates store model_unif_noozone estat ic lrtest model_unif_noozone model_unif ** robust estimation ** xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(ind) robust nolog xtcorr xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(exc) robust nolog xtcorr xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(exc) nolog xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(uns) robust nolog xtcorr xi, noomit: xtgee logsize i.days control day_control, noconstant i(id) corr(uns) nolog