Students who successfully master this course will be able to:

1. Use statistical reasoning to formulate public health questions in
quantitative terms:

(a) Distinguish the summary measures of association
applicable to retrospective and prospective study designs.

(b) Distinguish among the appropriate regression models for
handling continuous outcomes, binary outcomes and
time-to-event outcomes.

(c) Conduct an intent-to-treat statistical analysis of data
from a randomized community trial and correctly interpret the
findings about the treatment efficacy.

(d) Conduct a basic analysis of data from a cohort study and
correctly interpret the findings about the association between
exposure and outcome.

(e) Conduct a basic analysis of data from a case-control
study and correctly interpret the findings about the association
between exposure and outcome.

(f) Use stratification in design and analysis to minimize
confounding and identify effect modification.

2. Design and interpret graphical and tabular displays of
statistical information:

(a) Use the statistical analysis package Stata to construct
journal-quality statistical tables and graphs.

3. Use probability models to describe trends and random variation
in public health data:

(a) Distinguish among the underlying probability
distributions for modeling continuous, categorical, binary and
time-to-event data.

(b) Calculate the sample size necessary for estimating either
the mean of a continuous outcome or the proportion of a binary
outcome in a single group.

(c) Estimate the sample size necessary for detecting a
statistically significant difference in either a continuous or
binary outcome between two groups.

(d) Recognize the assumptions required in performing
statistical tests assessing relationships between an outcome and
a risk factor.

4. Use statistical methods for inference, including confidence
intervals and tests, to draw valid public health inferences from
study data:

(a) Estimate two proportions and their difference, and
confidence intervals for each. Interpret the interval estimates
within a scientific context. Recognize the importance of other
sources of uncertainty beyond those captured by the confidence
interval.

(b) Estimate an odds ratio or relative risk and its associated
confidence interval. Explain the difference between the two
measures and when each is appropriate.

(c) Perform and interpret one-way analysis of variance to
test for differences in means among three or more populations.
Evaluate whether underlying probability model assumptions are
appropriate.

(d) Contrast mean outcomes among pairwise groups using
multiple comparisons procedures.

(e) Interpret the correlation coefficient as a measure of the
strength of a linear association between a continuous response
variable and a continuous predictor variable.

(f) Perform and correctly interpret the results from a simple
linear regression analysis to describe the dependence of a
continuous response variable on a single predictor variable.

(g) Use data transformations such as logs and square roots so
that regression model assumptions are more nearly satisfied.

(h) Perform and correctly interpret the results from a simple
logistic regression analysis to describe the dependence of a
dichotomous response variable on a single predictor variable.

The course is designed to enable students to develop their data
analysis skills. Throughout the 621-624 course sequence, students
will analyze four important datasets using the statistical package
Stata.