Abstract

The replication of important findings by multiple independent investigators is fundamental to the accumulation of scientific evidence. Researchers in the biological and physical sciences expect results to be replicated using independent data, analytic methods, laboratories, and instruments. Epidemiologic studies are commonly used to quantify small health effects of important, but subtle, risk factors, and replication is of critical importance where results can inform substantial policy decisions. However, because of the time, expense, and opportunism of many current epidemiologic studies, it is often impossible to fully replicate their findings. An attainable minimum standard is "reproducibility", which calls for datasets and software to be made available for verifying published findings and conducting alternative analyses. We outline a standard for reproducibility and evaluate the reproducibility of current epidemiologic research. We also propose methods for reproducible research and implement them using a case study in air pollution and health.

The full text of the article is available from the American Journal of Epidemiology.

Data

Data for the literature review conducted in the article "Reproducible Epidemiologic Research" by Peng, Dominici, and Zeger are available here as a comma-separated values (CSV) file and as an R workspace file (.rda).

If you download the R workspace file ('litreview-data.rda'), you do not need to download the 'conclusions.csv' file.

Links

Compendium for "Seasonal analyses of air pollution and mortality in 100 US cities" by Peng et al.

The questionnaire used to review articles.