THESIS DEFENSE ABSTRACT
Likelihood Ratio Testing under Nonidentifiability with Applications to
Chongzhi Di, PhD Candidate, Johns Hopkins Department of Biostatistics
This dissertation contains statistical research in two main
areas. The first area is likelihood ratio testing when one of the key regularity
conditions -- identifiability -- does not hold under the null hypothesis. The
second area is statistical analysis of multilevel functional data. The work on
both topics is motivated by and applied to public health and biomedical studies.
The first main part of the dissertation considers likelihood ratio testing under
nonidentifiability. In particular, we focus on two classes of hypothesis testing
problems in which one or more parameters are present only under the alternative.
In Class 1, the null hypothesis is specified via the parameter of interest while
a nuisance parameter is not identifiable under the null. It has been established
that the LRT statistic converges to the supremum of a squared Gaussian process.
We characterize conditions under which such limiting distributions simplify to
chi-square. When these conditions are not satisfied, we also provide efficient
computational algorithms to approximate p values based on the principal
component decomposition of Gaussian processes. These are illustrated by
Anderson's stereotype models for ordinal response data. In Class 2, the null
hypothesis can be specified equivalently via each of the two parameters, and
under either specification, the other parameter is not identifiable. Motivating
examples for this class include testing homogeneity in admixture models and
testing linearity versus a nonlinear trend in generalized linear models. This
class has received relatively little attention in the literature, except for the
special case of mixture models. We derive the limiting distribution of the LRT
statistic in this situation. We also present a penalized likelihood ratio test
that has a simple chi-square limiting distribution under the null, based on
previous work in mixture models (Chen et al. 2001). These approaches are
compared through statistical power and illustrated in a genetic linkage study of
schizophrenia that is subject to genetic heterogeneity.
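As a rough illustration of how a principal component decomposition can be used to approximate p values for the supremum of a squared Gaussian process, the sketch below discretizes the nuisance parameter on a grid, truncates the eigen-expansion of the covariance matrix, and simulates the supremum by Monte Carlo. It is a minimal sketch under stated assumptions, not the algorithm developed in the dissertation: the function name `sup_sq_gp_pvalue`, the grid discretization, the 99% variance cutoff, and the use of simulation are all illustrative choices.

```python
import numpy as np


def sup_sq_gp_pvalue(cov, observed, n_sim=20000, var_explained=0.99, seed=0):
    """Monte Carlo estimate of P(sup_g Z(g)^2 >= observed) for Z ~ GP(0, cov).

    cov is the covariance matrix of the limiting process evaluated on a finite
    grid of the nuisance parameter (the grid is an assumption of this sketch).
    Returns the estimated p-value and the number of components retained.
    """
    cov = np.asarray(cov, dtype=float)
    # Eigendecomposition of the symmetric covariance matrix on the grid.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns ascending eigenvalues: reverse, and clip round-off negatives.
    eigvals = np.clip(eigvals[::-1], 0.0, None)
    eigvecs = eigvecs[:, ::-1]
    # Retain the leading components explaining the requested share of variance.
    cum_share = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum_share, var_explained)) + 1, len(eigvals))
    loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])  # shape (n_grid, k)
    # Simulate Z = sum_k sqrt(lambda_k) * xi_k * phi_k with xi_k iid N(0, 1).
    rng = np.random.default_rng(seed)
    scores = rng.standard_normal((n_sim, k))
    paths = scores @ loadings.T  # shape (n_sim, n_grid)
    # Supremum over the grid of the squared process, one value per simulation.
    sup_sq = (paths ** 2).max(axis=1)
    return float(np.mean(sup_sq >= observed)), k
```

For example, with a hypothetical grid `g = np.linspace(0, 1, 25)` and covariance `np.exp(-np.abs(g[:, None] - g[None, :]))`, calling `sup_sq_gp_pvalue(cov, lrt_stat)` returns an approximate p value for an observed statistic `lrt_stat`. When the covariance matrix has rank one, the supremum reduces to a single squared standard normal and the estimate should agree with the chi-square tail with one degree of freedom.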
The second part involves development of new statistical methods in functional
data analysis, motivated by the Sleep Heart Health Study (SHHS), a comprehensive
landmark study of sleep and its impacts on health outcomes. The SHHS data
contain quasi-continuous electroencephalographic (EEG) signals for each
subject at two visits. The volume and importance of these data present enormous
challenges for analysis. To address these challenges, we introduce multilevel
functional principal component analysis, a novel statistical methodology
designed to extract intra- and inter-subject geometric components of multilevel
functional data. The proposed methodology is generally applicable to many modern
scientific studies of hierarchical functional data.
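To make the idea of separating inter- and intra-subject components concrete, the sketch below assumes a two-level decomposition X_ij(t) = mu(t) + eta_j(t) + Z_i(t) + W_ij(t), where Z_i is a subject-level process and W_ij is a visit-level deviation uncorrelated with it. This model form, the moment-based covariance estimates, and the function name `mfpca_two_level` are assumptions made for illustration; the abstract does not specify the estimation procedure, so this should be read as one plausible sketch and not as the dissertation's method.

```python
import numpy as np


def mfpca_two_level(X, n_between=2, n_within=2):
    """Moment-based sketch of two-level functional PCA.

    X has shape (n_subjects, n_visits, n_grid), with curves on a common grid.
    Returns (between eigenvalues, between eigenvectors,
             within eigenvalues, within eigenvectors).
    Eigenvalues are on the grid (matrix) scale, not rescaled by grid spacing.
    """
    X = np.asarray(X, dtype=float)
    n_sub, n_vis, n_grid = X.shape
    # Remove visit-specific mean curves, i.e. mu(t) + eta_j(t).
    R = X - X.mean(axis=0, keepdims=True)
    # Total covariance: products of a curve with itself, pooled over visits.
    flat = R.reshape(n_sub * n_vis, n_grid)
    K_total = flat.T @ flat / (n_sub * n_vis)
    # Between-subject covariance: products across different visits of the
    # same subject, which share Z_i but have uncorrelated W_ij.
    K_between = np.zeros((n_grid, n_grid))
    n_pairs = 0
    for j in range(n_vis):
        for k in range(n_vis):
            if j != k:
                K_between += R[:, j, :].T @ R[:, k, :] / n_sub
                n_pairs += 1
    K_between /= n_pairs
    K_between = (K_between + K_between.T) / 2  # guard against round-off asymmetry
    # Within-subject covariance is what remains of the total.
    K_within = K_total - K_between

    def leading(K, n):
        # Leading n eigenpairs of a symmetric matrix, in descending order.
        vals, vecs = np.linalg.eigh(K)
        order = np.argsort(vals)[::-1][:n]
        return np.clip(vals[order], 0.0, None), vecs[:, order]

    lam_b, phi_b = leading(K_between, n_between)
    lam_w, phi_w = leading(K_within, n_within)
    return lam_b, phi_b, lam_w, phi_w
```

The design choice here is that only cross-visit products enter the between-subject estimate, so visit-level variation does not contaminate the subject-level components. With two visits per subject, as in the SHHS setting described above, the loop reduces to the single pair of visits taken in both orders.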