*clear out any existing data clear *increase the memory set memory 40m *increase the number of variables you can include in the datset set matsize 500 ** read in the data set ** ** Make sure you specify the path correctly! cd "C:\Documents and Settings\Sandrah Eckel\Desktop\LDA lab 4" use pigs.stata.dta, clear ** alternatively click on: file>open>pigs.stata.dta ** reshape to long ** reshape long week, i(Id) j(time) ** rename the weight variable ** rename week weight ** ALways look at your data ** ** sort data appropriately ** sort Id time ** spaghetti plot data ** twoway line weight time,c(L) ** OLS estimates by hand using stata matrix functions gen intercept=1 mkmat intercept time, matrix(X) **look at which matrices stata has defined matrix dir **read in outcome variable as a vector mkmat weight, matrix(Y) ** use definition of OLS estimates to get parameter ests matrix betahatOLS = invsym(X'*X)*X'*Y ** look at your estimates matrix list betahatOLS ** WOW! ** Check and see if your estimates are the same as STATAs ** regress weight time ** Try Uniform corr, WLS estimates ** xtreg weight time, be wls i(Id) **--- Stata doesn't really have a generic WLS command, do it ourselves ** WLS: 1) Get uniform (exchangeable) correlation estimate ** (Use GEE routine for convenience) ** (Another possibility, average all the corrs in the acf) ** the quietly option hides the results from the output window ** quietly xtgee weight time, i(Id) corr(exch) xtcorr ** So a good estimate of rho is 0.7717 *** WLS: 2) Create WLS weighting matrix *** *** Use uniform correlation formula on slide 13 from lecture 4 *** Corr = (1-rho)I + rho(11') *** Using Stata matrix commands ** this is one way to define a number(scaler) in STATA, since the ** STATA data set only deals with variables matrix rho=0.7717 matrix list rho ** create a vector of 1s, with the length as the number of observations within each subject, here is 9 ** matrix one = (1\1\1\1\1\1\1\1\1) matrix list one ** create the matrix of all 1s ** matrix oneonet = one*(one') matrix list oneonet ** create the identity matrix using the I() command, ** ** with the dimension as the number of observations within each subject, here is 9 ** matrix I = I(9) matrix list I ** create the correlation matrix according to the fomular ** matrix R = (1-rho)*I + rho*oneonet matrix list R ** My WLS trick: directly specify the correlation structure you created ** in a marginal model xtgee weight time, i(Id) t(time) corr(fixed R) ** try WLS with another uniform correlation, like 0.5 for fun! *** Generate uniform correlation structure: Corr = (1-rho)I + rho(11') *** Using Stata matrix commands matrix rho=0.5 matrix one = (1\1\1\1\1\1\1\1\1) matrix oneonet = one*(one') matrix I = I(9) matrix R = (1-rho)*I + rho*oneonet matrix list R ** Do WLS with new weighting(correlation) matrix and compare xtgee weight time, i(Id) t(time) corr(fixed R) ** Comparison of the available procedure in STATA for different correlation structure: ** ** OLS with plot ** regress weight time predict wtpred twoway line weight wtpred time,c(L) drop wtpred ** population average regression with indep corr, to compare with OLS ** xtgee weight time, i(Id) corr(ind) xtcorr predict wtpred twoway line weight wtpred time,c(L) drop wtpred ** ~Same ** Alternative commands, all give same answers with indep corr ** ** xtreg , pa is just GEE estimate ** xtreg weight time, pa i(Id) corr(ind) ** xtgls does generalized least square estimates, which is a special case for WLS. ** xtgls weight time, i(Id) corr(ind) ** Marginal specifications: Uniform corr structure ** xtgee weight time, i(Id) corr(exch) ** Alternative to fit marginal uniform corr with GEE ** xtreg weight time, pa i(Id) corr(exch) ** look at the estimated correlation matrix ** xtcorr ** Marginal uniform corr plot ** xtreg weight time, pa i(Id) corr(exch) predict wtpred twoway line weight wtpred time,c(L) drop wtpred ** Conditional (Random intercept) specifications: yield Uniform corr structure ** ** fit random intercept model with GLS ** xtreg weight time, re i(Id) ** fit random intercept model with MLE ** xtreg weight time, re i(Id) mle ** Alternative to fit random intercept model with MLE ** xtreg weight time, i(Id) mle ** random intercepts plot ** quietly xtreg weight time, re i(Id) mle ** predict the marginal mean function ** predict wtpred ** predict the individual pig mean function ** predict wtpredi, xbu twoway line weight wtpredi wtpred time,c(L) drop wtpred wtpredi ** AR(1) GEE ** xtgee weight time, i(Id) corr(AR1) t(time) xtcorr ** AR(1) RE ** ** xtregar cannot use the variable specificed for time as covariate in regression ** ** generate variable "week" for the tsset command ** gen week=time tsset Id week xtregar weight time ** predict the marginal mean trend ** predict wtpred ** predict the random intercept ** predict ui, u ** generate the individual mean trend ** gen wtpredi = wtpred + ui sort Id time twoway line weight wtpredi wtpred time,c(L) drop wtpred wtpredi u tsset Id time ** note that the marginal and conditional models ** now give slightly different results ** Marginal Model with unstructured corr GEE ** xtgee weight time, i(Id) corr(unstr) t(time) xtcorr * Q1? does this look stationary? * Q2? how can we program an "unstructured" corr structure with random effects?