RESEARCH SUMMARY

We study causal relationships between gene regulation and cellular behaviors by developing computational and experimental methods for network modeling, stem cell engineering, and epigenomic and single-cell analyses.

We discovered genetic differences in early embryonic development among humans, mice, and cows [Genome Res, 20: 804-815, Research Highlight "Hidden Differences", Nature 464: 1248]. We contributed to the introduction of "comparative epigenomics" - using cross-species epigenomic comparison to annotate the genomes [Cell, 149: 1381-1391]. We contributed to the derivation of the rules of dynamic gene regulation and temporal epigenomic changes [Genome Res, 23:352-384]. We pioneered modeling of the impact of epigenome-genome interactions on transcription factor binding and on personal variation [PLoS Comp Biol, 9(12): e1003367].

 


Evolution of mammalian gene regulatory networks

Gene regulation involves coordinated interactions of many proteins and DNA segments, namely gene regulatory networks (GRNs). We study the structure, dynamics, and evolution of GRNs and how GRNs influence cellular behaviors, including stem cell differentiation and cancer formation.

We brought evolutionary biology ideas into elucidating GRN structures and functions in mammals. We developed methods to describe the evolutionary changes of different components of mammalian GRNs, including transcription factor binding sites (TFBS) and TFBS modules [Genome Res, 18:1325-1335], co-expression modules of genes [PLoS Comp Biol, 6(3): e1000707][Nucl Acids Res, 35: W105-W114], protein-protein interactions [Genome Res, 20: 804-815], transcription factor (TF)-DNA interactions [Genome Res, 20: 804-815], and epigenomes [Cell, 149: 1381-1391]. These studies contributed empirical data and derived initial rules underlying GRN functions.

On the theoretical end, we developed an evolutionary model that simultaneously describes the evolutionary changes of multiple components of a GRN [PLoS Comp Biol, 7(6): e1002064]. This model enables the use of multi-species DNA and gene expression data to identify GRNs simultaneously in every species under consideration.

Personal variation and evolutionary change are linked. We developed a computational tool, perEdit [Bioinformatics, 27: 3427-3429], to assemble both alleles of a personal genome. Personal ChIP-seq and RNA-seq data can then be mapped to the individual genome to identify individual variation and allelic differences.
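The idea of representing both alleles of a personal genome can be illustrated with a minimal sketch. This is not perEdit's algorithm or interface; the function name `build_haplotypes` and the input layout (phased single-nucleotide variants given as tuples) are assumptions made only for illustration.

```python
def build_haplotypes(reference, phased_snvs):
    """Apply phased SNVs to a reference sequence and return two allele sequences.

    phased_snvs is a hypothetical input format: a list of
    (0-based position, reference base, base on allele 1, base on allele 2).
    """
    hap1 = list(reference)
    hap2 = list(reference)
    for pos, ref_base, allele1, allele2 in phased_snvs:
        # Guard against variants that do not match the reference coordinate system.
        if reference[pos] != ref_base:
            raise ValueError(f"reference mismatch at position {pos}")
        hap1[pos] = allele1
        hap2[pos] = allele2
    return "".join(hap1), "".join(hap2)


# Toy example: one heterozygous site (position 2) and one homozygous site (position 5).
hap1, hap2 = build_haplotypes("ACGTACGT", [(2, "G", "G", "A"), (5, "C", "T", "T")])
```

Reads could then be aligned against both sequences rather than a single reference, so that reads carrying the non-reference base are not penalized at heterozygous sites.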

Genetic and epigenetic re-wiring of transcription networks

We reported that close to forty percent of the genes shared by humans, mice, and cows have different expression patterns in the early stages of embryonic development. We traced these differences to a set of specific evolutionary changes of the genomes, including insertion of transcription factor binding sites by transposons. This work suggested that more than one GRN can guide mammalian preimplantation development. See cover article in Genome Research, and research highlight in Nature.

Only a small fraction of the interspecies differences in gene expression could be traced to genomic differences. This prompted us to compare epigenomes. Comparing epigenomes in human, mouse, and pig pluripotent stem cells, we found 5-mC, H3K27ac, and H3K36me3 to be conserved in both negatively and positively selected genomic sequences. We reported the conservation of co-localization of epigenomic marks as an indicator of cis-regulatory sequences. Combined with cell differentiation experiments, we identified a different class of "poised promoters" marked by H2A.Z (repressive) and H3K4me3 (active) [Cell, 149: 1381-1391]. We developed a Comparative Epigenome Browser to allow interactive visualization and analysis of the multi-species epigenomes [Bioinformatics, 29: 1223-1225].

Thermodynamic modeling of interactions among transcription factors, DNA, and epigenome

Transcription factor (TF) - DNA interaction is at the core of transcriptional regulation. We developed methods to efficiently calculate TF-DNA binding affinities for a long stretch (200-500bp) of genomic sequence, taking into account interactions between strong and weak, homotypic and heterotypic TF binding sites [PLoS ONE, 4(12): e8155][BMC Genomics, 9:S18]. These methods enabled the use of high-throughput sequencing data for reconstruction of a transcription network [BMC Genomics, 9:S19] and led to the discovery of a second DNA recognition motif of Nanog [PLoS ONE, 4(12): e8155], which was verified by subsequent studies from our group [Genome Res, 20: 804-815] and others [Nucl Acids Res, 2012].
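To make the notion of a sequence-wide binding affinity concrete, the following is a minimal sketch of a generic thermodynamic occupancy calculation, not the published method. It assumes a made-up 4-bp position weight matrix (`PWM`), a made-up TF activity parameter (`tf_activity`), forward-strand scanning only, and independent sites, so it omits the interactions between overlapping or neighboring sites that the methods above account for. Its only purpose is to show how both strong and weak sites contribute to a single affinity value for a stretch of sequence.

```python
import math

# Hypothetical 4-bp position weight matrix (log-odds scores); not taken from the cited papers.
PWM = [
    {"A": 1.2, "C": -1.0, "G": -1.0, "T": -0.5},
    {"A": -1.0, "C": 1.1, "G": -0.8, "T": -1.0},
    {"A": -0.9, "C": -1.0, "G": 1.3, "T": -1.0},
    {"A": -0.6, "C": -1.0, "G": -1.0, "T": 1.2},
]


def site_scores(seq, pwm):
    """Score every window of the sequence (forward strand only) against the PWM."""
    width = len(pwm)
    return [
        sum(pwm[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    ]


def expected_occupancy(seq, pwm, tf_activity=0.05):
    """Expected number of bound TF molecules, assuming sites bind independently.

    Each window contributes w / (1 + w), where w = tf_activity * exp(score)
    is the statistical weight of the bound state relative to the unbound state.
    """
    total = 0.0
    for score in site_scores(seq, pwm):
        weight = tf_activity * math.exp(score)
        total += weight / (1.0 + weight)
    return total


# A sequence containing the consensus ACGT versus one with only weak sites.
with_motif = expected_occupancy("TTACGTTT", PWM)
without_motif = expected_occupancy("TTTTTTTT", PWM)
```

In this toy setting the weak sites each add a small amount to the total, which is why a sum over all windows can differ from a score based on the single best match.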

We introduced a model to calculate the TF-DNA binding energy in the presence of epigenomic modifications. This model shows theoretically that epigenomic modifications can boost the cooperativity of nearby binding sites and, more importantly, that personal variations of TF binding can be explained by personal epigenome and personal genome data [PLoS Comp Biol, 9(12): e1003367].

Temporal epigenomic changes and dynamics of gene expression

We developed a probabilistic model to annotate the genome using temporal epigenomic data. This model clusters genomic sequences based on the similarity of temporal changes of multiple epigenomic marks during a cellular differentiation process [Genome Res, 23:352-384]. Also see cover description. With this model, we found that temporal changes of H3K4me2, unmethylated CpG, and H2A.Z were predictive of 5-hmC changes, suggesting unmethylated CpG and H3K4me2 as potential upstream signals guiding TETs to specific sequences. Several rules on combinatorial epigenomic changes and their effects on mRNA expression and ncRNA expression were derived, including a simple rule governing the relationship between 5-hmC and gene expression levels.
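The idea of grouping genomic regions by how several marks change over time can be sketched with a much simpler stand-in for the probabilistic model: each region is reduced to the direction of change ('+', '-', or '0') of each mark between consecutive time points, and regions with identical signatures are grouped. The function names, the tolerance `tol`, and the toy signal values below are all assumptions for illustration, not data or code from the paper.

```python
from collections import defaultdict


def change_signature(series_by_mark, tol=0.1):
    """Summarize each mark's time series as a string of '+', '-', or '0' steps."""
    signature = []
    for mark in sorted(series_by_mark):
        values = series_by_mark[mark]
        steps = ""
        for prev, cur in zip(values, values[1:]):
            delta = cur - prev
            if delta > tol:
                steps += "+"
            elif delta < -tol:
                steps += "-"
            else:
                steps += "0"  # change smaller than the tolerance
        signature.append((mark, steps))
    return tuple(signature)


def group_regions(regions, tol=0.1):
    """Group region names that share the same temporal change signature."""
    groups = defaultdict(list)
    for name, series_by_mark in regions.items():
        groups[change_signature(series_by_mark, tol)].append(name)
    return dict(groups)


# Toy signals for three regions at three time points (made-up values).
regions = {
    "r1": {"H3K4me2": [0.1, 0.5, 0.9], "H2A.Z": [0.8, 0.8, 0.2]},
    "r2": {"H3K4me2": [0.0, 0.4, 1.0], "H2A.Z": [0.9, 0.85, 0.1]},
    "r3": {"H3K4me2": [0.9, 0.5, 0.1], "H2A.Z": [0.2, 0.2, 0.2]},
}
groups = group_regions(regions)
```

A hard threshold like this discards the magnitude and uncertainty of each change; a probabilistic clustering model, as described above, is the more principled choice for real data.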

We developed statistical methods to model temporal gene expression data, making it possible to identify different temporal expression patterns [Bioinformatics 26: 2944-2951] and to dissect subpopulations of cell types in a heterogeneous cell population [PLoS Comp Biol, 5: e1000607]. The latter method led to the discovery of a regulatory function of the chromatin remodeling protein SMARCAD1 in embryonic stem cells.

We developed a method to minimize lab-to-lab variation in identifying differentially expressed genes and used it to identify colorectal cancer-specific genes [Nature Biotech, 24(12): 6-7].

Research support from

NIH, NSF, March of Dimes Foundation



Copyright 2012 Zhong Lab. All rights reserved. Last updated: Jan 12, 2012