Ni Zhao

Associate Professor & PhD Program Director  ·  Department of Biostatistics
Bloomberg School of Public Health, Johns Hopkins University
📞 402-955-9993 ✉️ nzhao10@jhu.edu 📍 615 N. Wolfe St, E3622, Baltimore, MD

Research

Our team develops and applies statistical methods for high-dimensional biomedical and public health data, with a particular focus on omics fields including genomics, epigenomics, and metagenomics.

1. Understanding Microbial Community Structure and Its Role in Human Health

We developed a suite of methods for testing associations between the microbiome community and outcomes, including MiRKAT (Microbiome Regression-based Kernel Association Test) and its extensions to longitudinal, time-to-event, multivariate, and multi-categorical outcomes. Available as a CRAN package, MiRKAT is considered one of the classic approaches for microbiome community analysis.

MiRKAT method illustration

2. Addressing Biases, Batch Effects, and Other Heterogeneity in Microbiome Sequencing Studies

Microbiome bias illustration showing actual versus observed proportions across samples

Microbiome sequencing data are subject to systematic biases and batch effects introduced at multiple stages of the experimental and analytical pipeline. We develop methods to detect, characterize, and correct for these sources of variation to enable reproducible discovery across studies.

Our tools include ConQuR for batch effect removal via conditional quantile regression, QuanT for identifying unmeasured heterogeneity via quantile thresholding, and CAFT (Compositional Accelerated Failure Time model) for robust differential abundance analysis.

3. Microbiome Data Integration

Microbiome data integration across studies and sequencing platforms

Integrating microbiome data across multiple studies and platforms is essential for improving statistical power and reproducibility. We develop statistical frameworks for the integrative analysis of multiple microbiome datasets, addressing technical variability while enabling principled inference about microbial community structure and its relationship to health outcomes.

Our research includes methods for integrative analysis of alpha diversity and beta diversity (SMRmix).

4. Spatial Microbiome Data

Emerging spatial technologies now allow the microbiome to be profiled with spatial resolution, revealing how microbial communities are organized within tissues and ecosystems. We develop statistical approaches to characterize spatial organization and community structure in spatial microbiome data, addressing unique challenges posed by complex correlation structures and high-dimensional count data.

Our work focuses on identifying spatial microbial niches, characterizing microbial interaction patterns, and integrating spatial context into microbiome association analyses.

Spatial microbiome visualization

Selected Publications

# indicates corresponding author; * indicates student/postdoc mentee. For a complete list, visit Google Scholar.

  • Roy, A., Satten, G.A., Zhao, N.# (2026). DTH: A nonparametric test for homogeneity of multivariate dispersions. Bioinformatics. In Press.
  • Lu, J.*, Satten, G.A., Meyer, K.A., Launer, L.J., Ling, W., Zhao, N.# (2026). Identifying unmeasured heterogeneity in microbiome data via quantile thresholding (QuanT). Microbiome, 14, 84.
  • Samorodnitsky, S., Campbell, K., Little, A., Ling, W., Zhao, N., Chen, Y.C., & Wu, M.C.# (2026). Detecting clinically relevant topological structures in multiplexed spatial proteomics imaging using TopKAT. Patterns, 7(1), 101456.
  • He, M.*, & Zhao, N.# (2026). A mixed effect similarity matrix regression model (SMRmix) for integrating multiple microbiome datasets at community level and its application in HIV. Biometrics. In Press.
  • He, M.*, Zhao, N.#, Satten, G.A. (2024). MIDASim: A fast and simple simulator for realistic microbiome data. Microbiome, 12, 135.
  • Ling, W., Lu, J., Zhao, N.#, ⋯ Wu, M.C.# (2022). Batch effects removal for microbiome data via conditional quantile regression. Nature Communications, 13, 5418.
  • Zhang, H., Ahearn, T.U., Lecarpentier, J., ⋯ Zhao, N., ⋯ (2020). Genome-wide association study identifies 32 novel breast cancer susceptibility loci from overall and subtype-specific analyses. Nature Genetics, 52, 572–581.
  • Tuddenham, S., Koay, W., Zhao, N., White, J.R., Ghanem, K., Sears, C., and the HIV Microbiome Re-analysis Consortium. (2019). The impact of HIV infection on gut microbiota alpha-diversity: An individual level meta-analysis. Clinical Infectious Diseases, 70(4), 615–627.
  • Zhao, N., Chen, J., Carroll, I.M., ⋯ Wu, M.C.# (2015). Testing in microbiome profiling studies with the microbiome regression-based kernel association test (MiRKAT). American Journal of Human Genetics, 96(5), 797–807.

Teaching

  • 140.648 / 140.649 Essentials of Probability and Statistical Inference III–IV
    Primary Instructor, 2021–2023, 2025–2026  ·  Core curriculum for ScM and PhD students
  • Microbiome Data Analysis
    Primary Instructor, 2022
  • 140.688.01 Statistics for Genomics
    Primary Instructor, 2018–2020
  • 140.686.01 Advanced Methods for Statistical Genetics and Genomics
    New Course, Co-Instructor, 2019–2020  ·  ⭐ Outstanding Student Evaluation, 2020
  • Special Topic: Advanced Methods for Statistical Genetics and Genomics
    New Course, Co-Instructor, 2018

Presentations

Scientific Meetings

  • A Compositional Log-Linear Model for Microbiome Data with Zero Cells. Joint Statistical Meetings. Boston, MA. August 2026.
  • A Compositional Log-Linear Model for Microbiome Data with Zero Cells. International Chinese Statistical Association. Shenzhen, China. June 2026.
  • Addressing known and unknown heterogeneity in microbiome studies (Featured Invited Talk). Lloyd Roeling UL Lafayette Mathematics Conference / Louisiana Chapter of the ASA. Lafayette, LA. October 2025.
  • Addressing known and unknown heterogeneity in microbiome studies. Banff International Research Station (BIRS) Workshop. Banff, Alberta, Canada. June 2025.
  • Compositional analysis of microbiome data via accelerated failure time models (CAFT). Conference on Statistics in Genomics and Genetics (STATGEN 2025). Minneapolis, MN. May 2025.
  • Addressing known and unknown heterogeneity in microbiome studies. 10th Workshop on Biostatistics and Bioinformatics. Atlanta, GA. May 2025.
  • Identifying unmeasured heterogeneity in microbiome data via QUANtile Thresholding (QuanT). 7th International Conference on Econometrics and Statistics (EcoSta 2024). Beijing, China. July 2024.
  • Addressing unmeasured heterogeneity in microbiome data via QUANtile Thresholding (QuanT). Conference on Statistics in Genomics and Genetics (STATGEN 2024). Pittsburgh, PA. May 2024.
  • Integrative analysis of multiple microbiome studies. 6th International Conference on Econometrics and Statistics (EcoSta 2023). Tokyo, Japan. August 2023.
  • SMRmix for integrative analysis of microbiome beta diversity. ENAR. Nashville, TN. March 2023.
  • SMRmix for integrative analysis of microbiome beta diversity. CMStatistics. London, United Kingdom. December 2022.
  • Integrative analysis of multiple microbiome studies. 2nd International Congress on Spatial Lifecourse Health. Wuhan, China. December 2022.
  • A log-linear model for inference of biases in microbiome studies. Virtual International Indian Statistical Association Meeting. May 2021.
  • Integrative analysis for microbiome beta diversities using SMRmix. Virtual International Biometric Society–ENAR. March 2021.
  • A powerful microbial group association test based on the higher criticism analysis. Virtual Joint Statistical Meetings. August 2020.
  • A powerful microbial group association test based on the higher criticism analysis. Virtual International Biometric Society–ENAR. March 2020.
  • Community level association studies for microbiome data — in face of more complex study design. ASA-BI-NESS Statistics Webinar Series. June 2020.
  • Working with open-source Human Microbiome Project Data: Efficient Data Access and Analysis Workflow. BioC 2019. New York City, NY. June 2019.
  • A benchmark project for differential abundance testing in microbiome studies. Banff International Research Station (BIRS) Workshop. Banff, Alberta, Canada. September 2019.
  • A robust distance-based kernel association test for correlated microbiome data. International Chinese Statistical Association Symposium. Raleigh, NC. June 2019.
  • MiRKAT — a Suite of Methods for Association Testing in Microbiome Profiling Studies incorporating Phylogenetic Structure. R-Ladies Baltimore. May 2018.
  • Generalized Hotelling's Test for Paired Compositional Data. International Chinese Statistical Association Symposium. Chicago, IL. June 2017.
  • Analysis of Genomic Data via Likelihood Ratio Test in Composite Kernel Machine Regression. Joint Statistical Meetings. Seattle, WA. August 2015.
  • Analysis of Genomic Data via Likelihood Ratio Test in Composite Kernel Machine Regression. International Biometric Society–ENAR. Miami, FL. March 2015.
  • Global Analysis of Methylation Profiles via Kernel Machine Regression Framework. International Biometric Society–ENAR. Baltimore, MD. April 2014.
  • Kernel Machine Methods for Joint Testing in Genome Wide Methylation and Genotyping Studies. International Chinese Statistical Association Applied Statistics Symposium. Portland, OR. June 2014.
  • (Poster) A Statistical Approach for Testing Gene by Microbiome Interaction. American Society of Human Genetics. Baltimore, MD. October 2015.
  • (Poster) A Statistical Approach for Testing Gene by Microbiome Interaction. International Genetic Epidemiology Society. Baltimore, MD. October 2015.
  • (Poster) Kernel Machine Methods for Integrative Analysis of Genome-Wide Methylation and Genotyping Studies. American Society of Human Genetics. Boston, MA. October 2013.
  • (Poster) Genome-level analysis of genetic regulation of sex-specific gene expression in mouse liver. Society of Toxicology 48th Annual Meeting. Baltimore, MD. March 2009.

Invited Seminars

  • TBD. Computational Genomics Forum, Mayo Clinic. Rochester, MN. November 2026.
  • Batches, Biases, and Hidden Heterogeneities in Microbiome Sequencing Studies. CMSE Colloquium, Michigan State University. East Lansing, MI. March 2026.
  • Addressing known and unknown heterogeneities in microbiome sequencing studies. Department of Public Health Sciences, University of Chicago. Chicago, IL. October 2025.
  • Addressing known and unknown heterogeneity in microbiome studies. Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania. Philadelphia, PA. September 2025.
  • A Practical Introduction to Microbiome Data Analysis (Invited Training Webinar). Section on Statistical Genetics and Genomics, American Statistical Association. February 2025.
  • Addressing known and unknown heterogeneity in microbiome studies. University of North Carolina at Chapel Hill. Chapel Hill, NC. December 2024.
  • Addressing known and unknown heterogeneity in microbiome studies. The National Institute of Environmental Health Sciences. Durham, NC. December 2024.
  • Identifying unmeasured heterogeneity in microbiome data via QUANtile Thresholding (QuanT). Department of Biostatistics, Vanderbilt University Medical Center. Nashville, TN. April 2024.
  • Microbiome Data Analysis — Promises, Challenges and Cautions. Best Practices in Quantitative Sciences Meeting, Department of Oncology, Johns Hopkins University Medical Center. Baltimore, MD. April 2024.
  • Identifying unmeasured heterogeneity in microbiome data via QUANtile Thresholding (QuanT). Department of Biostatistics, Yale University. New Haven, CT. March 2024.
  • Integrative analysis of multiple microbiome studies. Biostatistics Seminar, Fred Hutchinson Cancer Center. Seattle, WA. December 2022.
  • Gender equity in academic rewards — from a counterfactual lens. 1st Delong International Public Health Forum. Shanghai, China. October 2022.
  • Integrative analysis of multiple microbiome studies. Bioinformatics Program, University of Guelph. Canada. October 2022.
  • Integrative analysis of multiple microbiome studies. Institute for Genome Sciences, University of Maryland. Baltimore, MD. September 2022.
  • Bias in microbiome sequencing studies and ways to handle it. Department of Mathematics, University of Maryland at College Park. College Park, MD. February 2022.
  • SMRmix for integrative analysis of multiple microbiome datasets in HIV studies. Investigator Meeting, Center for AIDS Research, Johns Hopkins University. Baltimore, MD. January 2021.
  • What does your poop tell you and how to listen to it using statistics. UCLA Biomathematics Seminar Series. Virtual. October 2020.
  • A benchmark study for differential abundance analysis in microbiome. Microbiome Working Group, Fred Hutchinson Cancer Research Center. Seattle, WA. February 2020.
  • Kernel based approaches for microbiome association test in face of more-complex study designs. Department of Biostatistics, Ohio State University. Columbus, OH. November 2019.
  • Kernel based approaches for microbiome association test in face of more-complex study designs. Department of Biostatistics, Virginia Commonwealth University. Richmond, VA. October 2019.
  • Generalized Higher Criticism Test for Microbiome Community Level Analysis. Oncology Biostatistics/Bioinformatics Working Group, Johns Hopkins University. Baltimore, MD. March 2019.
  • Community Level Analysis for Microbiome Data. 12th Annual Symposium on Genomics & Bioinformatics, Johns Hopkins University. Baltimore, MD. October 2018.
  • Statistical Inference Methods for High Dimensional Omics Data. Department of Mathematical Sciences, Binghamton University. Binghamton, NY. December 2015.
  • Statistical Inference Methods for High Dimensional Omics Data. Fred Hutchinson Cancer Research Center. Seattle, WA. December 2015.
  • Statistical Inference Methods for High Dimensional Omics Data. Department of Epidemiology and Biostatistics, University of Texas Health Science Center San Antonio. San Antonio, TX. January 2016.
  • Statistical Inference Methods for High Dimensional Omics Data. Department of Epidemiology and Biostatistics, University of Pennsylvania. Philadelphia, PA. January 2016.
  • Statistical Inference Methods for High Dimensional Omics Data. Department of Biostatistics, Johns Hopkins University. Baltimore, MD. January 2016.

Group Members

Current Members

Asmita Roy

Postdoctoral Fellow

July 2024 – Present
Ph.D. in Statistics, Texas A&M University

Daxuan Deng

Postdoctoral Fellow

July 2025 – Present
Ph.D. in Biostatistics, Pennsylvania State University

Shuai Li

PhD Student

Jointly mentored with Dr. Hongkai Ji
Biostatistics, Johns Hopkins University
sli201@jh.edu
Project: Similarities in T-cell receptor profiles

Hanbo Dong

PhD Student

August 2025 – Present
Department of Biostatistics, Johns Hopkins University

Zixu Luo

ScM Student

May 2025 – Present
Department of Biostatistics, Johns Hopkins University
Thesis: Multivariate Latent Class Modeling for Disease Outcomes in Diabetic Patients
Admitted to PhD Program, Fall 2026

Yiheng Wang

ScM Student

Department of Applied Mathematics and Statistics, Johns Hopkins University

Bipasha Akhter

External Advisee

December 2023 – Present
Junior Bioinformatician and Data Analyst
Multi-omics for Mothers and Infants (MOMI) Consortium
Projahnmo Research Foundation, Dhaka, Bangladesh

Former Group Members

  • Radiah Azmyne Khan  ·  External Advisee  ·  Dec 2023–May 2025  ·  → MSc Genome Medicine, University of Oxford
  • Shilan Li  ·  Postdoctoral Fellow  ·  Sep 2023–2024  ·  Ph.D., Georgetown University, 2023
  • Jiayi Xue  ·  ScM Student  ·  Sep 2024–May 2025  ·  → PhD Student, University of Pittsburgh
  • Jiuyao Lu  ·  PhD Student  ·  Sep 2021–May 2024  ·  → PhD Student, Wharton School, University of Pennsylvania
  • Shengtao Wang  ·  ScM Student  ·  Sep 2022–May 2024  ·  → PhD Student, Johns Hopkins University
  • Zhichen (Ella) Xiong  ·  ScM Student  ·  Sep 2023–May 2024  ·  → Statistician, Eli Lilly
  • Darren Lin  ·  ScM Student  ·  Sep 2023–May 2024  ·  → PhD Student, UCLA
  • Mo Li  ·  Postdoctoral Fellow  ·  2021–2023  ·  Assistant Professor of Statistics, University of Louisiana at Lafayette
  • Runzhe Li  ·  PhD in Biostatistics  ·  2021–2023  ·  → Two Sigma
  • Danwei Yao  ·  ScM Student  ·  2022–2023  ·  → PhD Student, Emory University
  • Yuehan Zhang  ·  PhD in Cancer Epidemiology  ·  2020–2022  ·  → Associate/Epidemiologist, Analysis Group
  • Mengyu He  ·  ScM Student  ·  2020–2021  ·  → PhD Student, Biostatistics, Emory University
  • Hyunwook Koh  ·  Postdoctoral Fellow  ·  2018–2020  ·  → Assistant Professor, Applied Mathematics and Statistics, SUNY Korea
  • Haoyu Zhang  ·  PhD Student (co-advised)  ·  2016–2019  ·  Earl Stadtman Principal Investigator, National Cancer Institute
  • Zhenyi Wu  ·  ScM Student  ·  2018–2019  ·  → PhD Student, Statistics, Purdue University
  • Yue Cao  ·  ScM Student  ·  2017–2018  ·  → Senior Portfolio Analyst, Goal Solutions

Fun Group Activities

April 2026  ·  Spring Walk at Cylburn Garden
October 2025  ·  Fall @ Lake Roland Park