140.655 - Third Term 2005
LONGITUDINAL DATA ANALYSIS
Instructor: Francesca Dominici
Teaching Assistants: Sorina Eftim, Yue Yin and Yijie Zhou
LECTURE: Hampton House B14B M, W 10:30 - 12:00
LAB: Hampton House B14B M, W 9:15 - 10:15
IMPORTANT ANNOUNCEMENTS:
  • Solution to the final exam is posted.
  • Final exam is ready for pick-up in E3527 (main biostat office).

INDEX
Course info
Announcements
Exams
Lecture Notes
Taped Lectures
Lab Notes
Data sets
Software (S+, STATA, SAS)
LDA books
Acknowledgments

COURSE INFO
  • COURSE OBJECTIVES: [ps] [pdf]
  • Dr. Dominici's Office Hour: Monday, 12:30-1:30 pm E3634
  • TA office hours: Wednesday, Thursday 12:15-1:15 pm W3031

ANNOUNCEMENTS AND IMPORTANT DATES

HOMEWORKS     Homework is not required and will not be part of the grade. However, homework turned in before the due date will be graded and feedback will be provided.

EXAMS

LECTURE NOTES
  • 1. Examples of Longitudinal Data Sets [ps] [pdf]
  • 2. Exploratory Data Analysis [ps] [pdf]
  • 3. Linear Regression: a review [ps] [pdf]
  • 4. Linear Models for Correlated data: examples [ps] [pdf]
  • 5. Linear Models for Correlated data: inference [ps] [pdf]
  • 6. Parametric Models for Covariance Structure [ps] [pdf]
  • 7. Parametric Models for Covariance Structure: examples [ps] [pdf]
  • 8. Generalized Linear Models for Longitudinal Data [ps] [pdf]
    READING ASSIGNMENT
  • Longitudinal Data Analysis Using Generalized Linear Models by Liang K.Y. and Zeger S.L. Biometrika 1986 [pdf]
  • 9. Marginal Logistic Regression Model and GEE [ps] [pdf]
  • 10. Marginal Poisson Regression Model and GEE [ps] [pdf]
  • 11. Generalized Linear Models with Random Effects [ps] [pdf]
  • 12. Transition Models [ps] [pdf]

TAPED LECTURES
  • Lecture on Monday, 1/24 [rm] WARNING: NOT OF GOOD QUALITY.
  • Lecture on Wednesday, 1/26 [rm]
  • Lecture on Monday, 1/31 [rm]
  • Lecture on Wednesday, 2/02 [rm]
  • Lecture on Monday, 2/07 [rm]
  • Lecture on Wednesday, 2/09 [rm]
  • Lecture on Monday, 2/14 [rm]
  • Lecture on Wednesday, 2/16 [rm]
  • No Lecture on Monday, 2/21
  • Lecture on Wednesday, 2/23 - no tape available
  • Lecture on Monday, 2/28 [rm]
  • Lecture on Wednesday, 3/2 [rm]
  • Lecture on Monday, 3/14 [rm]
  • Lecture on Wednesday, 3/16 [rm]

LAB NOTES
Download each of the STATA *.ado files and the *.hlp files. Please use "Save As Source" when you save them to your hard disk from your web browser. To use the *.ado files, put them in your current directory, in your STATA "ado" directory, or in another directory on STATA's search path. These functions are ***not*** thoroughly tested. Please let a TA know of any bugs you find in them.
INTRODUCTION AND EXPLORATORY DATA ANALYSIS

LAB 1, Wednesday 1/26: Introduction to Statistical software: STATA

  • Introduction to STATA for longitudinal data analysis [stata_intro.pdf] [stata_intro2.pdf] [lab1.log] [lab1.do]

    Additional Material

  • Faster function to generate smooth model fits [ksmapprox.ado] [ksmapprox.hlp]
  • Function for making plots of means over time [xtgraph.ado] [xtgraph.hlp] pdf demonstration file
  • Function to compute sample autocorrelation function for fixed time points of equal lag [autocor.ado] pdf help file
  • Introductions to SAS [sas_intro1.pdf] [sas_intro2.pdf]
  • Glossary of Macros [sascode.pdf]
  • Also, please check the corresponding SAS functions and analysis in the Software part.


  • LAB 2, Monday 1/31: Exploratory Data Analysis and Exploring the correlation structure

    NOTES BEING CORRECTED FOR THE SPAGHETTI PLOT PART. Using Dental Dataset [lab2.do] [Lab2.doc]

    Additional Material

  • Estimating variance within subjects and between subjects [xtsumcorr.ado] [xtsumcorr.hlp]
  • Variogram Plot [variogram.ado] [variogram.hlp] This function requires [xtdiff.ado] and [ksmapprox.ado]
  • Introduction to Matrix Algebra [matrix_intro.pdf]

  • LAB 3, Wed. 2/2: EDA AND LINEAR MODELS FOR LONGITUDINAL DATA

    • STATA analysis of the cows data set. Interpretation of plots, variogram and autocorrelation output. [cows_Lab3.pdf] [cows_Lab3.do]

      Additional Material

    • Downloading the .ado files tutorial. [ado.pdf]
    • Multiple regression in matrix notation [matrix.pdf]
    • Ordinary Least Squares in STATA [pdf]

    LAB 4, Mon. 2/7: Ordinary Least Squares and Weighted Least Squares for LONGITUDINAL DATA

    NOTES BEING CORRECTED FOR THE SPAGHETTI PLOT PART.
  • Independence correlation structure, uniform correlation structure, and random intercept model: a full analysis of the pig data
  • [Lab4_pig.pdf] [Lab4.do]

    LINEAR MODELS FOR CORRELATED DATA

    LAB 5, Wednesday 2/9: Independent and Uniform Correlation Models. Random Effect Models.

    • Use the [Lab4_pig.pdf] handout. Discussion : [Lab5.pdf]
    • Handout with STATA Commands for analysis of continuous longitudinal data [pdf] Note: The handout states that xtreg, mle and xtreg, re are equivalent. They are equivalent not in estimation method (MLE vs. GLS) but in that both fit a uniform correlation structure model.
    • XTSUMCORR vs. ANOVA [pdf]
    • STATA commands Ordinary Least Squares [pdf]
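
    The note above says that xtreg, mle and xtreg, re both fit a uniform correlation structure. The reason is that a random intercept model implies a uniform correlation within each subject. The following is a minimal Python sketch of that fact, not code from the lab handouts; the function names and the variance components (4.0 and 1.0) are hypothetical and chosen only for illustration.

```python
import numpy as np

def uniform_correlation(n_times, rho):
    """Correlation matrix with 1 on the diagonal and rho everywhere else."""
    return (1.0 - rho) * np.eye(n_times) + rho * np.ones((n_times, n_times))

def random_intercept_covariance(n_times, var_between, var_within):
    """Covariance of one subject's responses under y_ij = mu_ij + U_i + e_ij,
    where U_i has variance var_between and e_ij has variance var_within."""
    shared = var_between * np.ones((n_times, n_times))   # from the shared U_i
    noise = var_within * np.eye(n_times)                 # from independent e_ij
    return shared + noise

# Hypothetical variance components, for illustration only.
var_between, var_within = 4.0, 1.0
cov = random_intercept_covariance(4, var_between, var_within)

# Convert the covariance matrix to a correlation matrix.
sd = np.sqrt(np.diag(cov))
corr = cov / np.outer(sd, sd)

# The implied common correlation is the between-subject share of the variance.
rho = var_between / (var_between + var_within)
print(corr)
print(np.allclose(corr, uniform_correlation(4, rho)))
```

    Every off-diagonal correlation equals var_between / (var_between + var_within), whatever the time lag, which is the uniform (exchangeable) structure.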

    • ROBUST ESTIMATION FOR LINEAR MODELS FOR CORRELATED DATA

      LAB 6, Monday 2/14:

    • Scientific questions for the growth of Sitka spruce [Lab7-question.doc]
    • STATA analysis: do file [lab6.do], Results: [Lab7-tree-output]
    • Data set for class memo: Growth of Sitka spruce [sitka.raw]

    LAB 7, Wednesday 2/16: More on Robust estimation. Parametric Models for Covariance Structure

  • WLS and GLS. STATA Commands for analysis of continuous longitudinal data [pdf]. Discussion: [Lab7.pdf]
  • Parametric Models for Covariance Structure

    Additional Material

  • Robust Estimation of the sitka spruce data set: STATA analysis [sitka_Lab7.pdf] Do file: [sitka_Lab7.do]

  • LAB 8, Monday 2/21: Parametric Models for Covariance Structure

    Additional Material

  • Analysis with Nepal data but with different scientific question [pdf]

  • LAB 9, Wednesday 2/23: Parametric Models for Covariance Structure. Model fitting.

    Additional Material


    LOGISTIC REGRESSION FOR LONGITUDINAL DATA

    LAB 10, Monday 2/28: Introduction to commands: logistic regression for both cross-sectional and longitudinal data analysis.

    • Review of Logistic regression in STATA for uncorrelated data.
    • The autocorrelation function and variogram cannot be used for the logistic model.
    • Logistic STATA analysis of the Nepal data set. [pdf] [pdf] [.do file]

    ROBUST option for XTGEE command

      Without the ROBUST option, XTGEE reports the conventional (model-based) variance estimate from the iteratively reweighted least squares fit. With the ROBUST option, XTGEE uses a sandwich estimator of variance, which under certain conditions keeps your standard error estimates valid even if you specify the wrong correlation matrix. This is one of the key ideas of GEE.
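
      The sandwich idea can be sketched in a few lines of Python. This is a minimal illustration for a linear marginal model with a working independence correlation, not XTGEE's implementation: it omits any finite-sample adjustment, and the function name, simulated data, and parameter values are all hypothetical. It compares the model-based standard errors with the cluster-robust sandwich standard errors when the working correlation is wrong.

```python
import numpy as np

def fit_independence_gee(X, y, cluster_ids):
    """Linear marginal model fit under a working independence correlation.

    Returns the coefficient estimates, the model-based (naive) standard
    errors, and the cluster-robust sandwich standard errors.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    n, p = X.shape

    # "Bread": inverse of X'X, shared by both variance estimates.
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta

    # Model-based variance: assumes independent errors with common variance.
    sigma2 = resid @ resid / (n - p)
    naive_cov = sigma2 * bread

    # "Meat": sum over clusters of (X_i' r_i)(X_i' r_i)'.
    meat = np.zeros((p, p))
    for cid in np.unique(cluster_ids):
        rows = cluster_ids == cid
        score = X[rows].T @ resid[rows]
        meat += np.outer(score, score)

    robust_cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(naive_cov)), np.sqrt(np.diag(robust_cov))


# Simulated data (hypothetical): 200 subjects, 5 visits each, with a
# subject-level random intercept and a subject-level binary covariate.
rng = np.random.default_rng(0)
n_subjects, n_visits = 200, 5
ids = np.repeat(np.arange(n_subjects), n_visits)
group = np.repeat(rng.integers(0, 2, n_subjects), n_visits).astype(float)
subject_effect = np.repeat(rng.normal(0.0, 1.0, n_subjects), n_visits)
y = 1.0 + 0.5 * group + subject_effect + rng.normal(0.0, 1.0, ids.size)
X = np.column_stack([np.ones(ids.size), group])

beta, naive_se, robust_se = fit_independence_gee(X, y, ids)
print("beta     :", beta)
print("naive SE :", naive_se)
print("robust SE:", robust_se)
```

      Because the repeated measures within a subject are positively correlated and the covariate is constant within subject, the naive standard error for the group effect is too small here, while the sandwich estimate accounts for the within-subject correlation.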

    LAB 11, Wednesday 3/2: Logistic Regression for Longitudinal Data.

    Logistic regression analysing AFCR data

    Additional Material

    • Introduction to Logistic Regression for independent data: STATA Analysis of the Myocardial Infarction Data [pdf] [.do file]
    • Introduction to Logistic Regression: SAS Analysis of the Myocardial infarction data [program] and output [output]

    LAB 12, Wednesday 3/9: Review of HW2. [HW2.do]


    LAB 13, Wednesday 3/9: FAQ Session. [LDA.FAQ]


    LAB 14, Monday 3/14: POISSON REGRESSION AND GEE

    STATA Analyses of the Epileptic seizures data set [.do] (Final data after transformation) [.dta]
    • Marginal Poisson Regression and GEE
    • Random effect Poisson regression

    Additional Material:

    • Analysis of epileptic seizure data using a population-averaged model and GEE (PROC GENMOD) [pdf] [program] and output [output]
    • SAS and STATA analyses of the CD4+ data [output]

    LAB 15, Wednesday 3/16: Analysis of the 3x3 pain crossover trial data

  • STATA and SAS analyses of the 3x3 Pain Crossover Trial Data [lab15crossover3x3.pdf]
  • [lab15crossover3x3.v2.pdf]
  • STATA Analysis of the 2x2 Crossover Trial Data: transition model, RE model, alternating logistic regression model [.do file] [.dta file]
    DATA SETS
    The data sets are posted in raw format so that they can be analyzed in SAS, STATA, S-plus, and R. Please see the readme file for the column names.

    SOFTWARE
    Stata functions:

    S-plus functions:

    SAS functions:
    • Macro for calculating autocorrelation function in SAS [pdf] [readme]
    • Macro for fitting splines to Nepal Data [splinfit.sas]
    • Generate correlated normal data [gendat.sas]
    • PROC MIXED for the sitka.data [sitka.sas] and handout [ps] [pdf]
    • Fit OLS and WLS models for gendat.sas data [owlsfit.sas]
    • SAS analysis of the dental data set [program] [output]
    • Analysis of dental data using a random coefficient model (PROC MIXED) [program] and output [output]
    • Analysis of dental data using linear mixed effects model (PROC MIXED) [program] and output [output]
    • Fit a Logistic Regression Model to the Myocardial infarction data [program] and output [output]
    • Analysis of epileptic seizure data using a population-averaged model and GEE (PROC GENMOD) [program] and output [output]
    • Comparing the SAS GLM and Mixed Procedures for Repeated Measures [pdf]

    LDA BOOKS