We study gene regulation and cellular behavior by developing statistical and experimental methods. Our primary goal is to develop new technologies to map molecular networks, including the RNA-RNA interactome [Nat Comm, 2016], the RNA-chromatin interactome [Curr Biol, 2017][PNAS, 2019], and the protein-protein interactome. Our secondary quest is to model the variation of these networks along three axes, namely developmental time, personal difference, and evolutionary change. Our major tools include epigenomic and single-cell assays, single-molecule imaging, statistical modeling, and large-scale computation.


RNA-RNA interaction

Mapping RNA-RNA interactions in vivo.

RNA-DNA interaction

Finding any RNA attached to any place on the genome.

Single-molecule detection

Single-molecule RNA FISH.

Single-cell analysis

Single-cell analysis of cell fate.

Big data search engine

Online search for epigenomic and transcriptomic big data.


Using deep single-cell RNA-seq of matched sister blastomeres, we found highly reproducible differences among the single cells within early-stage (2- and 4-cell) pre-implantation mouse embryos [Genome Res, 2014, cover]. We developed a time-variant clustering model for the analysis of time-course single-cell gene expression data [PNAS, 2014].
We discovered transposon-mediated re-wiring of transcription networks that govern pre-implantation embryonic development [Genome Res, 2010, cover; Research Highlight in Nature, 2010].
We contributed to initiating "comparative epigenomics", a research field that studies genomic functions by cross-species epigenomic comparison [Cell, 2012].
We pioneered modeling of the impact of epigenome-genome interactions on transcription factor binding and on personal variation [PLoS Comp Biol, 2013].
We contributed to the derivation of the rules of dynamic gene regulation and temporal epigenomic changes [Genome Res, 2013, cover].

Previous work

We brought evolutionary biology ideas into elucidating structures and functions of mammalian Gene Regulatory Networks (GRN). We developed methods to describe the evolutionary changes of different components of mammalian GRNs, including transcription factor binding sites (TFBS) and TFBS modules [Genome Res, 18:1325-1335], co-expression modules of genes [PLoS Comp Biol, 6(3): e1000707] [Nucl Acids Res, 35: W105-W114], protein-protein interactions [Genome Res, 20: 804-815], transcription factor (TF)-DNA interactions [Genome Res, 20: 804-815], and epigenomes [Cell, 149: 1381-1391].
On the theoretical end, we developed an evolutionary model that simultaneously describes the evolutionary changes of multiple components of a GRN [PLoS Comp Biol, 7(6): e1002064]. This model enables the use of multi-species DNA and gene expression data for simultaneous identification of GRNs in every species under consideration.
We reported that close to forty percent of the genes shared by humans, mice and cows have different expression patterns in the early stages of embryonic development. We traced these differences to a set of specific evolutionary changes of the genomes, including insertion of transcription factor binding sites by transposons. See cover article in Genome Research, and research highlight in Nature.
Transcription factor (TF)-DNA interaction is at the core of transcriptional regulation. We developed methods to efficiently calculate TF-DNA binding affinities for a long stretch (200-500 bp) of genomic sequence, taking into account interactions between strong and weak, homotypic and heterotypic TF binding sites [PLoS ONE, 4(12): e8155][BMC Genomics, 9:S18]. These methods enabled the use of high-throughput sequencing data for reconstruction of a transcription network [BMC Genomics, 9:S19] and led to the discovery of a second DNA recognition motif of Nanog [PLoS ONE, 4(12): e8155], which was verified by subsequent studies from our group [Genome Res, 20: 804-815] and others [Nucl Acids Res, 2012].
Extending this work, we developed a thermodynamic model to calculate TF-DNA binding affinity that takes the epigenome into account [PLoS Comp Biol, 2013].
We developed a probabilistic model (mixture of HMMs) to annotate the genome using temporal epigenomic data. This model clusters genomic sequences based on the similarity of temporal changes of multiple epigenomic marks during a cellular differentiation process [Genome Res, 23:352-384]. Also see cover description. With this model, we found that temporal changes of H3K4me2, unmethylated CpG, and H2A.Z were predictive of 5-hmC changes, and we identified a simple rule governing the relationship between 5-hmC and gene expression levels.
We developed a computational tool, perEdit [Bioinformatics, 27: 3427-3429], to assemble both alleles of a personal genome. Personal ChIP-seq and RNA-seq data can then be mapped to the individual genome, thereby identifying individual variation and allelic differences.
We developed statistical methods to model temporal gene expression data, allowing for identifying different temporal expression patterns [Bioinformatics 26: 2944-2951] and dissecting subpopulations of cell types in a heterogeneous cell population [PLoS Comp Biol, 5: e1000607]. We developed a Hidden Branching Process model to cluster time-course data [PNAS, 2014].


Complete list of publications on Google Scholar, NCBI

Selected papers

  • Extracellular RNA in a single droplet of human serum reflects physiologic and disease states. Zixu Zhou, Qiuyang Wu, Zhangming Yan, Haizi Zheng, Chienju Chen, Yuan Liu, Zhijie Qi, Riccardo Calandrelli, Zhen Chen, Shu Chien, H. Irene Su, Sheng Zhong
    PNAS, 2019, 116:19200–19208. Text.
  • Mapping RNA-chromatin interactions by sequencing with iMARGI. Weixin Wu, Zhangming Yan, Tri C. Nguyen, Zhen Chen, Shu Chien, Sheng Zhong
    Nature Protocols, 2019, 14:3243–3272. Text, Protocol, Software, 4DN web portal
  • Genome-wide co-localization of RNA-DNA interactions and fusion RNA pairs. Zhangming Yan, Norman Huang, Weixin Wu, Weizhong Chen, Yiqun Jiang, Jingyao Chen, Xuerui Huang, Xingzhao Wen, Jie Xu, Qiushi Jin, Kang Zhang, Zhen Chen, Shu Chien, Sheng Zhong.
    PNAS, 2019, 116(8):3328-3337. Text, 4DN web portal
  • RNA, action through interactions. Tri C. Nguyen, Kathia Zaleta-Rivera, Xuerui Huang, Xiaofeng Dai, Sheng Zhong.
    Trends in Genetics, 2018, 34:867-882. Text
  • RNAs as proximity labeling media for identifying nuclear speckle positions relative to the genome. Weizhong Chen, Zhangming Yan, Simin Li, Norman Huang, Xuerui Huang, Jin Zhang, Sheng Zhong.
    iScience, 2018, 4:204-215. Text, Cover proposal
  • Systematic mapping of RNA-chromatin interactions in vivo. Bharat Sridhar, Marcelo Rivas-Astroza, Tri C. Nguyen, Weizhong Chen, Zhangming Yan, Xiaoyi Cao, Lucie Hebert, Sheng Zhong.
    Current Biology, 2017, 27(4): 602–609. Text, Data, Protocols, Bioinformatic pipeline, Access the recommendation on F1000Prime
  • Mapping RNA-RNA interactome and RNA structure in vivo by MARIO. Tri C. Nguyen, Xiaoyi Cao, Pengfei Yu, Shu Xiao, Jia Lu, Fernando H. Biase, Bharat Sridhar, Norman Huang, Kang Zhang, Sheng Zhong.
    Nature Communications, 2016, 7:12023. Text, Software, Data
  • The 4D nucleome project. Job Dekker, Andrew S. Belmont, Mitchell Guttman, Victor O. Leshyk, John T. Lis, Stavros Lomvardas, Leonid A. Mirny, Clodagh C. O’Shea, Peter J. Park, Bing Ren, Joan C. Ritland Politz, Jay Shendure, Sheng Zhong & the 4D Nucleome Network.
    Nature, 2017, 549:219–226. Text, Artwork
  • SMARCAD1 contributes to regulation of naïve pluripotency by interacting with histone citrullination. Shu Xiao, Jia Lu, Bharat Sridhar, Xiaoyi Cao, Pengfei Yu, Chieh-Chun Chen, Darina McDee, Laura Sloofman, Yang Wang, Marcelo Rivas-Astroza, Bhanu Prakash V.L. Telugu, Dana Levasseur, Kang Zhang, Han Liang, Jing Crystal Zhao, Tetsuya S. Tanaka, Gary Stormo, Sheng Zhong.
    Cell Reports, 2017, 18:3117-3128. Text, Raw images, Cover proposal
  • Spatiotemporal clustering of epigenome reveals rules of dynamic gene regulation. Pengfei Yu, Shu Xiao, Xiaoyun Xin, Chun-Xiao Song, Wei Huang, Darina McDee, Tetsuya Tanaka, Ting Wang, Chuan He, Sheng Zhong.
    Genome Research, 2013, 23:352-384. Cover article, Abstract, Software, Data; Review
  • Understanding variation in transcription factor binding by modeling transcription factor genome-epigenome interactions. Chieh-Chun Chen, Shu Xiao, Dan Xie, Xiaoyi Cao, Chun-Xiao Song, Ting Wang, Chuan He, Sheng Zhong.
    PLoS Computational Biology, 2013, 9(12): e1003367. Text, Software, Supplementary Figures
  • A likelihood approach to testing hypotheses on the co-evolution of epigenome and genome. Jia Lu, Xiaoyi Cao, Sheng Zhong.
    PLoS Computational Biology, 2018, 14(12):e1006673. Text
  • Comparative epigenomic annotation of regulatory DNA. Shu Xiao, Dan Xie, Xiaoyi Cao, Pengfei Yu, Xiaoyun Xing, Chieh-Chun Chen, Meagan Musselman, Mingchao Xie, Franklin D. West, Harris A. Lewin, Ting Wang, Sheng Zhong.
    Cell, 2012, 149: 1381-1391. Abstract, Data, Comparative Epigenome Browser.
    Reviewed by: J Stem Cell Res Ther, 2012, S10:007. SCIENCE CHINA Life Sciences, 2013, 56(3): 213-219. WIREs Systems Biol Med, 2012, 4(6): 525-545.
  • Towards an evolutionary model of transcription networks. Dan Xie, Chieh-Chun Chen, Xin He, Xiaoyi Cao, Sheng Zhong.
    PLoS Computational Biology, 2011, 7(6): e1002064. Text. Website.
  • Modeling co-expression across species for complex traits: insights to the difference of human and mouse embryonic stem cells. Jun Cai, Dan Xie, Zhewen Fan, John Marden, Wing H. Wong, Sheng Zhong.
    PLoS Computational Biology, 2010, 6(3): e1000707. Text, Data, Software
  • Cross-species de novo identification of cis-regulatory modules with GibbsModule: application to gene regulation in embryonic stem cells. Dan Xie, Jun Cai, Na-Yu Chia, Huck H. Ng and Sheng Zhong.
    Genome Research, 2008, 18:1325-1335. Text. Software
  • Cross-species microarray analysis with the OSCAR system suggests an INSR-Pax6-NQO1 neuro-protective pathway in ageing and Alzheimer's disease. Yue Lu, Xin He and Sheng Zhong.
    Nucleic Acids Research, 2007, 35: W105-W114. Text.
  • Time-variant clustering model for understanding cell fate decisions. Wei Huang, Xiaoyi Cao, Fernando H. Biase, Pengfei Yu, Sheng Zhong.
    PNAS, 2014, 111(44):E4797-E4806. Abstract
  • Network based comparison of temporal gene expression patterns. Wei Huang, Xiaoyi Cao, Sheng Zhong.
    Bioinformatics, 2010, 26(23): 2944-2951. Abstract, Software
  • Dissecting early differentially expressed genes in a mixture of differentiating embryonic stem cells. Feng Hong, Fang Fang, Xuming He, Xiaoyi Cao, Hiram Chipperfield, Dan Xie, Wing H. Wong, Huck H. Ng, Sheng Zhong.
    PLoS Computational Biology, 2009, 5(12): e1000607. Text, Data
  • Reproducibility Probability Score - incorporating measurement variability across laboratories for gene selection. Guixian Lin, Xuming He, Hanlee Ji, Leming Shi, Ronald Davis, Sheng Zhong.
    Nature Biotechnology, 2007, 41:105-115. 24(12): 6-7. Text, Software, Supplementary Material. The article has been reviewed by: Pharmacogenomics, 2007, 8(8): 1037-1049. European Journal of Cancer, 2007, 5(5): 97-104. Current Opinion in Biotechnology Systems Biomedicine: Concepts and Perspectives, Edison Liu, Douglas Lauffenburger (editors), Elsevier, 2009, p.172. WIREs Systems Biol Med, 2012, 4(1): 39-49. WIREs Systems Biol Med, 2012, 4(6): 525-545.
  • EpiAlignment: alignment with both DNA sequence and epigenomic data. Jia Lu, Xiaoyi Cao, Sheng Zhong.
    Nucleic Acids Research, 2019, 47(W1):W11-W19. Text, Software.
  • GITAR: An Open Source Tool for Analysis and Visualization of Hi-C Data. Riccardo Calandrelli, Qiuyang Wu, Jihong Guan, Sheng Zhong.
    Genomics, Proteomics & Bioinformatics, 2018, 16(5):365-372. Text, Software.
  • GIVE: portable genome browsers for personal websites. Xiaoyi Cao, Zhangming Yan, Qiuyang Wu, Alvin Zheng, Sheng Zhong.
    Genome Biology, 2018, 19:92. Text, Software, News & Comments: Nature 549:117, Research Highlight: Genome Biology 19:93, Technical feature: Nature 576:171-172.
  • GeNemo: a search engine for web-based functional genomic data. Yongqing Zhang, Xiaoyi Cao, Sheng Zhong.
    Nucleic Acids Research, 2016, 44: W122-W127. Text, Software
    News coverage: HIT Consultant, Science Daily, MediaPost, HealthDataManagement
  • Enabling interspecies epigenomic comparison with CEpBrowser. Xiaoyi Cao, Sheng Zhong.
    Bioinformatics, 2013, 29(9):1223–1225. Text. Software
  • Mapping personal functional data to personal genomes. Marcelo Rivas-Astroza, Dan Xie, Xiaoyi Cao, Sheng Zhong.
    Bioinformatics, 2011, 27(24):3427-3429. Text. Software

Latest bioRxiv papers

In endothelial cells (ECs) treated with high glucose and TNFα, we employed single-cell RNA-sequencing and the in situ mapping of RNA-genome interactions (iMARGI) assay to delineate temporal changes in the transcriptome and the RNA-chromatin interactome. ECs displayed dramatic and heterogeneous changes in single-cell transcriptomes, accompanied by a dynamic and strong increase in inter-chromosomal RNA-DNA interactions, particularly among super enhancers [Inter-chromosomal RNA-DNA interactions, bioRxiv, 2019].

Textbook

3D Genome: from technologies to visualization (draft). [ISBN: 987-1-17325643-0-5]. Textbook for BENG183, BENG203/CSE283. Suggestions and content contributions are welcome!


Build your own genome browser website.


Internet search for genomic big data.


Analyze RNA interaction data.


Comparative Epigenome Browser.


Sequence mapping on personal genome.


Genome annotation using temporal epigenomic data.

4D Nucleome web portal

Entry to NIH 4D Nucleome network.


We welcome applications for postdoc, lab manager, software engineer, and graduate student positions.

Get in Touch

Powell-Focht Bioengineering Hall 371, University of California San Diego, 9500 Gilman Drive, MC 0412, La Jolla, CA 92093-0412

Lab Phone: (858) 822-5649