Advances in statistical methodology and computing have played an important role in allowing researchers to more accurately assess the health effects of ambient air pollution. The methods and software developed in this area are applicable to a wide array of problems in environmental epidemiology. This book provides an overview of the methods used for investigating the health effects of air pollution and gives examples and case studies in R which demonstrate the application of those methods to real data. The book will be useful to statisticians, epidemiologists, and graduate students working in the area of air pollution and health and others analyzing similar data.

The authors describe the different existing approaches to statistical modeling and cover basic aspects of analyzing and understanding air pollution and health data. The case studies in each chapter demonstrate how to use R to apply and interpret different statistical models and to explore the effects of potential confounding factors. A working knowledge of R and regression modeling is assumed. In-depth knowledge of R programming is not required to understand and run the examples.

Researchers in this area will find the book useful as a "live" reference. Software for all of the analyses in the book is downloadable from the web and is available under a Free Software license. The reader is free to run the examples in the book and modify the code to suit their needs. In addition to providing the software for developing the statistical models, the authors provide the entire database from the National Morbidity, Mortality, and Air Pollution Study (NMMAPS) in a convenient R package. With the database, readers can run the examples and experiment with their own methods and ideas.


NOTE: Due to a request from the National Center for Health Statistics, we have had to remove the NMMAPS data and the NMMAPSlite package from CRAN. Therefore, the code in the book involving the NMMAPSlite package will no longer work.


  1. Studies of Air Pollution and Health
  2. Introduction to R and Air Pollution and Health Data
  3. Reproducible Research Tools
  4. Statistical Issues in Estimating the Health Effects of Spatial–Temporal Environmental Exposures
  5. Exploratory Data Analyses
  6. Statistical Models
  7. Pooling Risks Across Locations and Quantifying Spatial Heterogeneity
  8. A Reproducible Seasonal Analysis of Particulate Matter and Mortality in the United States

Sample Chapter

The preface and Chapter 5 of the book are available as a free download.

Reproducibility Packages

Reproducibility packages for the chapters in the book are available via the Reproducible Research Archive and can be downloaded directly using the cacher package. The identification strings for each package are:
Chapter 5: 2a04c4d5523816f531f98b141c0eb17c6273f308
Chapter 6: 49c090223e7b16d72240a928f69bccd72a0a164c
Chapter 7: fd9f843bd5ad0b9e2265dacf1a8cda3fb813db50
Chapter 8: 3b720fc96d96a1ffb12a334fa91956e02a163e9b
For example, to download the reproducibility package for Chapter 5, load the cacher package and call the following function in R:
library(cacher)
clonecache(id = "2a04c4d5523816f531f98b141c0eb17c6273f308")
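The same call can be wrapped so that a chapter number selects its identification string. The following is a minimal sketch, not code from the book: `cache_ids` and `fetch_chapter` are hypothetical names introduced here, and the sketch assumes the cacher package is installed and that the archive is reachable when `fetch_chapter` is called.

```r
# Identification strings as listed above, keyed by chapter number.
cache_ids <- c(
  "5" = "2a04c4d5523816f531f98b141c0eb17c6273f308",
  "6" = "49c090223e7b16d72240a928f69bccd72a0a164c",
  "7" = "fd9f843bd5ad0b9e2265dacf1a8cda3fb813db50",
  "8" = "3b720fc96d96a1ffb12a334fa91956e02a163e9b"
)

# Each listed string is 40 lowercase hexadecimal characters, so a
# mismatch here most likely signals a copy-and-paste error.
ids_ok <- grepl("^[0-9a-f]{40}$", cache_ids)
stopifnot(all(ids_ok))

# Hypothetical helper (not part of cacher): look up the string for a
# chapter and pass it to clonecache(), as in the Chapter 5 example.
fetch_chapter <- function(chapter, ids = cache_ids) {
  key <- as.character(chapter)
  if (!key %in% names(ids)) {
    stop("no reproducibility package listed for chapter ", key)
  }
  if (!requireNamespace("cacher", quietly = TRUE)) {
    stop("the cacher package is not installed")
  }
  cacher::clonecache(id = ids[[key]])
}
```

Calling `fetch_chapter(5)` is then equivalent to the `clonecache` call shown for Chapter 5. Checking the chapter number first means a request for a chapter without a listed package fails with a clear message before any download is attempted.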

Miscellaneous Files


The authors can be reached at the following addresses.
Roger D. Peng
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
615 North Wolfe Street
Baltimore MD 21205

Francesca Dominici
Department of Biostatistics
Harvard School of Public Health
655 Huntington Avenue
SPH2, 4th Floor
Boston MA 02115