Hongkai’s Computational Biology Group

Welcome to Hongkai Ji’s Research Group

We are interested in developing statistical and computational methods for analyzing high-throughput genomic data. We apply these tools to study gene regulatory programs in development and diseases.


1. Congratulations: Yingying Wei received the 2013-14 Margaret Merrell Award, which recognizes outstanding research by a doctoral student in biostatistics.

2. Welcome: Weiqiang Zhou and Fang Du joined our group as postdoctoral research fellows. We also welcome the new members who have joined us since summer 2013: Dan Jiang, Bing He, Zhicheng Ji and Ashwini Patil.

3. Congratulations: George Wu successfully defended his PhD thesis!

4. New papers: Differential principal component analysis (dPCA) appears in PNAS; ChIP-PED appears in Bioinformatics.

5. Congratulations: Our group member Yingying Wei won the 2012 Culley Award and the 2013 ENAR Distinguished Student Paper Award.


Main Projects, Resources and Tools:




Postdoc and graduate student research assistant positions are available until filled. If you are interested in these positions, please email your CV and recommendation letters to hji@jhsph.edu.

A description of the postdoc position is here.


Associate Professor

Department of Biostatistics

Johns Hopkins Bloomberg School of Public Health

615 North Wolfe Street, Room E3638

Baltimore, MD 21205, USA

Phone: (410) 955-3517

Fax: (410) 955-0958

Email: hji@jhsph.edu

1. Develop statistical and computational tools for ChIP-seq and ChIP-chip data analysis:

(1) CisGenome: integrated software for peak calling, annotation, motif analysis, etc.

(2) dPCA: a software tool for analyzing differential binding. It compares the quantitative ChIP-seq signals in multiple ChIP-seq datasets between two biological conditions and accounts for the variability in replicate samples.

(3) hmChIP: a database of public human and mouse ChIP-seq/ChIP-chip data.

(4) iASeq: an R/Bioconductor package for detecting allele-specific binding by jointly analyzing multiple ChIP-seq datasets.

(5) PolyaPeak: a tool for improving ChIP-seq peak calling using peak shape information.

(6) TileMap: a software tool for ChIP-chip peak calling.

(7) TileProbe: a software tool for removing probe effects in Affymetrix tiling array data.

(8) JAMIE: joint analysis of multiple ChIP-chip datasets for improving peak calling.

(9) ChIPXpress: a tool for improving target gene ranking using gene expression data in GEO.

2. Develop tools for gene expression data analysis:

(1) ChIP-PED: an R package for discovering regulatory pathway activities in a large compendium of gene expression data from GEO.

(2) CorMotif: an R/Bioconductor package for jointly analyzing multiple gene expression datasets to simultaneously detect differentially expressed genes and differential expression patterns.

(3) PowerExpress: a tool for finding genes with a user-specified pattern of interest from multiple gene expression experiments.

3. Develop tools for sequence motif analysis:

(1) CisGenome: de novo motif discovery, known motif mapping, and motif enrichment analysis based on matched genomic control regions.

4. Develop new statistical methods for ’omics data integration and data mining:

(1) ChIP-PED: increasing the value of ChIP-seq/ChIP-chip experiments by expanding discoveries to other cell types using large compendiums of publicly available gene expression data in GEO.

(2) dPCA: integrative analysis of quantitative ChIP-seq signals in multiple datasets for detecting binding differences between biological conditions.

(3) iASeq: integrative analysis of multiple ChIP-seq studies to improve inference of allele specificity.

(4) JAMIE: joint analysis of multiple ChIP-chip datasets for improving peak calling.

(5) TileProbe: using publicly available ChIP-chip data in GEO to improve the probe effect model for tiling array data.

(6) CorMotif: integrative analysis of multiple gene expression experiments.

5. Develop data analysis methods and tools for new high-throughput genomic technologies:

(1) Analysis tool for TIP-chip: detecting active transposon elements in the human genome.

6. Decode gene regulatory programs in development and diseases:

(1) Stem cells: roles of MYC [1], Sox17 [2], Gata6, etc. in embryonic stem cells.

(2) Early development: sonic hedgehog signaling pathway in limb bud and neural tube development [3,4,5].

(3) Cancers: B cell lymphoma [1], medulloblastoma [5], leukemia [6], liver cancer.

(4) Other diseases: schizophrenia [7], Lyme disease.

(5) Transcription factors: MYC [1], GLI [3,4,5], Sox17 [2], FoxO [8], Oct4/Sox2 [9], Gata6, KLF9, TCF4.

(6) Epigenetics and epigenomics: histone modifications and DNase hypersensitivity [10].

(7) Yeast metabolic cycle.