Hongkai’s Computational Biology Group
Welcome to Hongkai Ji’s Research Group
We are interested in developing statistical and computational methods for analyzing big and complex data, particularly high-throughput genomic data. We apply these tools to study gene regulatory programs in development and diseases.
News:
1. New paper: A systematic benchmark study of 18 single-cell RNA-seq imputation methods, published in Genome Biology. [Link]
2. New paper: Our Genome Biology paper describes a new method, SCATE, for single-cell ATAC-seq analysis. SCATE allows more accurate estimation of the activity of each individual cis-regulatory element. [Link]
3. Congratulations: Zhicheng Ji has started his new job as an assistant professor at Duke University. [Link]
Main Projects, Resources and Tools:

1. Analytical methods for single-cell genomics:

2. Statistical and computational tools for ChIP-seq, ChIP-chip, DNase-seq, ATAC-seq:
(1) CisGenome: integrated software for peak calling, annotation, motif analysis, etc.
(2) BIRD: genome-wide prediction of chromatin accessibility using RNA-seq or exon array data.
(3) dPCA: a software tool for analyzing differential binding. It compares the quantitative ChIP-seq signals in multiple ChIP-seq datasets between two biological conditions and considers the variability in replicate samples.
(4) hmChIP: a database of public human and mouse ChIP-seq/ChIP-chip data.
(5) iASeq: an R/Bioconductor package for detecting allele-specific binding by jointly analyzing multiple ChIP-seq datasets.
(6) PDDB: a database of predicted regulatory element activities based on BIRD.
(7) PolyaPeak: a tool for improving ChIP-seq peak calling using peak shape information.
(8) TileMap: a software tool for ChIP-chip peak calling.
(9) TileProbe: a software tool for removing probe effects in Affymetrix tiling array data.
(10) JAMIE: joint analysis of multiple ChIP-chip datasets for improving peak calling.
(11) ChIPXpress: improved target gene ranking using gene expression data in GEO.

3. Methods and tools for gene expression data analysis:
(1) BIRD: genome-wide prediction of chromatin accessibility using RNA-seq or exon array data.
(2) GSCA: a software tool with a graphical user interface for mining publicly available gene expression data. It allows one to systematically identify biological contexts associated with user-specified gene set activity patterns.
(3) CorMotif: an R/Bioconductor package for jointly analyzing multiple gene expression datasets to simultaneously detect differentially expressed genes and differential expression patterns.
(4) ChIP-PED: an R package for discovering regulatory pathway activities in a large compendium of gene expression data from GEO.
(5) PowerExpress: a tool for finding genes with a user-specified pattern of interest from multiple gene expression experiments.

4. Tools for sequence motif analysis:
(1) CisGenome: de novo motif discovery, known motif mapping, and motif enrichment analysis based on matched genomic control regions.

5. Statistical methods for ‘omics data integration and data mining:
(1) BIRD: genome-wide prediction of chromatin accessibility using gene expression.
(2) ChIP-PED: increasing the value of ChIP-seq/ChIP-chip experiments by expanding discoveries to other cell types using large compendiums of publicly available gene expression data in GEO.
(3) CorMotif: integrative analysis of multiple gene expression experiments.
(4) dPCA: integrative analysis of quantitative ChIP-seq signals in multiple datasets for detecting binding differences between biological conditions.
(5) GSCA: a software tool with a graphical user interface for mining publicly available gene expression data. It allows one to systematically identify biological contexts associated with user-specified gene set activity patterns.
(6) iASeq: integrative analysis of multiple ChIP-seq studies to improve inference of allele specificity.
(7) JAMIE: joint analysis of multiple ChIP-chip datasets for improving peak calling.
(8) TileProbe: using publicly available ChIP-chip data in GEO to improve the probe effect model for tiling array data.

6. Data analysis methods and tools for new high-throughput genomic technologies:
(1) Analysis tool for TIP-chip: detecting active transposon elements in the human genome.

7. Gene regulatory programs in development and diseases:
(1) Stem cells: roles of MYC [1], Sox17 [2], Gata6, etc. in embryonic stem cells.
(2) Early development: sonic hedgehog signaling pathway in limb bud and neural tube development [3,4,5].
(3) Cancers: B cell lymphoma [1], medulloblastoma [5], leukemia [6], liver cancer.
(4) Other diseases: schizophrenia [7], Lyme disease.
(5) Transcription factors: MYC [1], GLI [3,4,5], Sox17 [2], FoxO [8], Oct4/Sox2 [9], Gata6, KLF9, TCF4.
(6) Epigenetics and epigenomics: histone modifications and DNase hypersensitivity [10].
(7) Yeast metabolic cycle.

Openings: Postdoc and graduate student research assistant positions are available until filled. If you are interested in these positions, please email your CV and recommendation letters to hji@jhu.edu.

HONGKAI JI, Ph.D.
Professor & Director of the Graduate Program
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health
615 North Wolfe Street, Room E3638
Baltimore, MD 21205, USA
Phone: (410) 955-3517
Fax: (410) 955-0958
Email: hji@jhu.edu