Chromatin three-dimensional interactions mediate genetic effects on gene expression.

Published on May 3, 2019 in Science (impact factor: 41.037)
DOI: 10.1126/science.aat8266
Olivier Delaneau (Swiss Institute of Bioinformatics; estimated H-index: 28),
Marianna Zazhytska (UNIL: University of Lausanne; estimated H-index: 5),
+ 19 authors,
Emmanouil T. Dermitzakis (Swiss Institute of Bioinformatics; estimated H-index: 84)
INTRODUCTION Genome-wide studies on the genetic basis of gene expression have considerably advanced our understanding of the function of the human genome. Large collections of expression quantitative trait loci (eQTLs; i.e., genetic variations affecting gene expression) are now available across many cell types, tissues, and conditions and are commonly used to better interpret the effects of noncoding genetic variations. Although this constitutes an extraordinary resource to study complex organismal traits and diseases, we still have a poor understanding of how eQTLs affect the regulatory machinery, which regulatory elements (REs) they perturb, and how their effects propagate along regulatory interactions.

RATIONALE In this study, we aimed to characterize the complex and cell type–specific interplay between genetic variation, REs, and gene expression to dissect cis- and trans-regulatory coordination. To this end, we assembled and analyzed a population-scale dataset combining the activity of REs [measured by chromatin immunoprecipitation sequencing (ChIP-seq) for monomethylated histone 3 at lysine 4 (H3K4me1), trimethylated histone 3 at lysine 4 (H3K4me3), and acetylated histone 3 at lysine 27 (H3K27ac)], the expression of genes (using RNA-seq), and genetic variations for 317 lymphoblastoid and 78 fibroblast cell lines, all of European ancestry.

RESULTS First, we show that the regulatory activity is structured in 12,583 well-delimited cis-regulatory domains (CRDs) that respect the local chromatin organization into topologically associating domains (TADs) but constitute a finer level of organization. Our work suggests that three-dimensional (3D) organization in cis can be broadly categorized into functionally linked and unlinked domains, with the most-linked ones corresponding to CRDs. In addition, we found 25,315 significant associations between CRDs located on distinct chromosomes that form 30 trans-regulatory hubs (TRHs). These TRHs are consistent with a higher-order chromatin organization into A and B nuclear compartments and show a signal of allelic coordination, suggesting that some of the trans associations are not transcriptionally mediated and result from a complex and higher-order 3D nucleus organization.

Second, we show that CRDs and TRHs essentially delimit sets of active REs involved in the expression of most genes and provide a dense genome-wide map linking REs and genes. We show that these links vary substantially across cell types and are key factors involved in the cis and trans coexpression of genes.

Third, we show that CRDs are under strong genetic control. We discovered a total of 58,968 chromatin peaks affected by nearby genetic variants (cQTLs), 6157 QTLs that affect the activity of CRDs (aCRD-QTLs), and 110 QTLs that affect the correlation structure within CRDs (sCRD-QTLs). These QTLs tend to locate close to their genomic targets, are enriched within functional regions of the genome, and frequently overlap genetic variants associated with complex traits.

Finally, we show that CRDs and TRHs capture complex regulatory networks along which the effects of eQTLs are propagated and synergized to affect gene expression. Overall, we estimate that 75% of the eQTLs also affect the activity of CRDs and describe four specific types of genetic effects that can be mediated by CRDs: (i) cis-eQTLs affecting distal genes, (ii) multiple cis-eQTLs with independent effects, (iii) multiple rare variants that have a cumulative effect, and (iv) trans-eQTLs.

CONCLUSION We provide a genome-wide map of the coordination between REs and describe how this serves as a backbone for the propagation of noncoding genetic effects in cis and trans onto gene expression. We show how these types of data can reveal higher-order functional attributes of the genome and can serve as an effective prior to boost future association studies at both the discovery and interpretation levels. Overall, our study reveals the complexity and specificity of the cis- and trans-regulatory circuitry and its perturbation by genetic variations.
References (62)
Breast cancer is the most common cancer and also the leading cause of cancer death in women worldwide. In Colombia it is the leading cause of cancer death in women, hence the importance of identifying genes that may be involved in the development and progression of the disease. In this work, a candidate-gene association study was carried out on the DEAR1 gene. The DEAR1 gene encodes a member of the TRIM subfamily of RING finger proteins (TRIM ...
3,678 citations
#1 Kevin Monahan (Columbia University), H-index: 8
#2 Adan Horta (Columbia University), H-index: 3
Last. Stavros Lomvardas (Columbia University), H-index: 25
(3 authors in total)
The genome is partitioned into topologically associated domains and genomic compartments with shared chromatin valence. This architecture is constrained by the DNA polymer, which precludes interactions between genes on different chromosomes. Here we report a marked divergence from this pattern of nuclear organization that occurs in mouse olfactory sensory neurons. Chromatin conformation capture using in situ Hi-C on fluorescence-activated cell-sorted olfactory sensory neurons and their progenito...
32 citations
#1 Benjamin R. Sabari (MIT: Massachusetts Institute of Technology), H-index: 9
#2 Alessandra Dall’Agnese (MIT: Massachusetts Institute of Technology), H-index: 7
Last. Richard A. Young (MIT: Massachusetts Institute of Technology), H-index: 143
(23 authors in total)
INTRODUCTION Mammalian genes that play prominent roles in healthy and diseased cellular states are often controlled by special DNA elements called super-enhancers (SEs). SEs are clusters of enhancers that are occupied by an unusually high density of interacting factors and drive higher levels of transcription than most typical enhancers. This high-density assembly at SEs has been shown to exhibit sharp transitions of formation and dissolution, forming in a single nucleation event and collapsing ...
202 citations
#1 Rachel E. Gate (UCSF: University of California, San Francisco), H-index: 9
#2 Christine S. Cheng (Broad Institute), H-index: 12
Last. Aviv Regev (MIT: Massachusetts Institute of Technology), H-index: 110
(22 authors in total)
Over 90% of genetic variants associated with complex human traits map to non-coding regions, but little is understood about how they modulate gene regulation in health and disease. One possible mechanism is that genetic variants affect the activity of one or more cis-regulatory elements leading to gene expression variation in specific cell types. To identify such cases, we analyzed ATAC-seq and RNA-seq profiles from stimulated primary CD4+ T cells in up to 105 healthy donors. We found that regio...
22 citations
#1 Emily S. W. Wong (EMBL-EBI: European Bioinformatics Institute), H-index: 19
#2 Bianca M. Schmitt (University of Cambridge), H-index: 5
Last. Paul Flicek (EMBL-EBI: European Bioinformatics Institute), H-index: 84
(12 authors in total)
Noncoding regulatory variants play a central role in the genetics of human diseases and in evolution. Here we measure allele-specific transcription factor binding occupancy of three liver-specific transcription factors between crosses of two inbred mouse strains to elucidate the regulatory mechanisms underlying transcription factor binding variations in mammals. Our results highlight the pre-eminence of cis-acting variants on transcription factor occupancy divergence. Transcription factor bindin...
13 citations
#1 Halit Ongen (Swiss Institute of Bioinformatics), H-index: 26
#2 Andrew A. Brown (Swiss Institute of Bioinformatics), H-index: 23
Last. Emmanouil T. Dermitzakis (Swiss Institute of Bioinformatics), H-index: 84
(6 authors in total)
This study presents a new approach to estimate the tissues contributing to the genetic causality for complex traits and diseases. The method assesses tissue sharing of eQTLs among 44 tissues and then uses these tissue-sharing estimates to infer the tissues where trait-associated variants likely exert their function.
59 citations
#2 Lead analysts (Swiss Institute of Bioinformatics), H-index: 2
#8 Alexis Battle (Johns Hopkins University), H-index: 30
Last. Stephen B. Montgomery (Stanford University), H-index: 47
(5 authors in total)
857 citations
#1 Adam G. Larson, H-index: 12
#2 Daniel Elnatan, H-index: 7
Last. Geeta J. Narlikar, H-index: 33
(9 authors in total)
Phosphorylation or DNA binding promotes the physical partitioning of HP1α out of a soluble aqueous phase into droplets, suggesting that the repressive action of heterochromatin may in part be mediated by the phase separation of HP1.
298 citations
#1 Alexandre Fort (University of Geneva), H-index: 10
#2 Nikolaos I. Panousis (Swiss Institute of Bioinformatics), H-index: 9
Last. Olivier Delaneau (Swiss Institute of Bioinformatics), H-index: 28
(7 authors in total)
Motivation: Large genomic datasets combining genotype and sequence data, such as for expression quantitative trait loci (eQTL) detection, require perfect matching between both data types. Results: We describe here MBV (Match BAM to VCF), a method to quickly solve sample mislabeling and detect cross-sample contamination and PCR amplification bias. Availability and Implementation: MBV is implemented in C++ as an independent component of the QTLtools software package, the binary and source codes ...
5 citations
#1 Olivier Delaneau (Swiss Institute of Bioinformatics), H-index: 28
#2 Halit Ongen (Swiss Institute of Bioinformatics), H-index: 26
Last. Emmanouil T. Dermitzakis (Swiss Institute of Bioinformatics), H-index: 84
(6 authors in total)
Population scale studies combining genetic information with molecular phenotypes (for example, gene expression) have become a standard to dissect the effects of genetic variants onto organismal phenotypes. These kinds of data sets require powerful, fast and versatile methods able to discover molecular Quantitative Trait Loci (molQTL). Here we propose such a solution, QTLtools, a modular framework that contains multiple new and well-established methods to prepare the data, to discover proximal an...
39 citations
Cited By (15)
#1 Gabriel E. Hoffman (ISMMS: Icahn School of Medicine at Mount Sinai), H-index: 16
#2 Jaroslav Bendl, H-index: 9
Last. Panos Roussos, H-index: 43
(4 authors in total)
MOTIVATION: Identifying correlated epigenetic features and finding differences in correlation between individuals with disease compared to controls can give novel insight into disease biology. This framework has been successful in analysis of gene expression data, but application to epigenetic data has been limited by the computational cost, lack of scalable software and lack of robust statistical tests. RESULTS: Decorate, differential epigenetic correlation test, identifies correlated epigeneti...
#1 Jennifer Y Tan (UNIL: University of Lausanne), H-index: 7
#2 Ana C Marques, H-index: 19
Pervasive enhancer transcription is at the origin of more than half of all long noncoding RNAs in humans. Transcription of enhancer-associated long noncoding RNAs (elncRNAs) contributes to their cognate enhancer activity and gene expression regulation in cis. Recently, splicing of elncRNAs was shown to be associated with elevated enhancer activity. However, whether splicing of elncRNA transcripts is a mere consequence of accessibility at highly active enhancers or if elncRNA splicing directly impa...
#1 Simone Rubinacci (UNIL: University of Lausanne), H-index: 1
#2 Diogo Ribeiro (UNIL: University of Lausanne), H-index: 1
Last. Olivier Delaneau (UNIL: University of Lausanne), H-index: 1
(4 authors in total)
Low-coverage whole genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined as current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performa...
1 citation
#1 Armando Reyes-Palomares (Complutense University of Madrid), H-index: 6
#2 Mingxia Gu (Stanford University), H-index: 15
Last. Simon Miao (Stanford University)
(14 authors in total)
Environmental and epigenetic factors often play an important role in polygenic disorders. However, how such factors affect disease-specific tissues at the molecular level remains to be understood. Here, we address this in pulmonary arterial hypertension (PAH). We obtain pulmonary arterial endothelial cells (PAECs) from lungs of patients and controls (n = 19), and perform chromatin, transcriptomic and interaction profiling. Overall, we observe extensive remodeling at active enhancers in PAH PAECs...
Last. Dimitrios H Roukos (Academy of Athens), H-index: 4
(4 authors in total)
#1 Edward J. Banigan (MIT: Massachusetts Institute of Technology), H-index: 10
#2 Aafke A. van den Berg (MIT: Massachusetts Institute of Technology), H-index: 1
Last. Leonid A. Mirny (MIT: Massachusetts Institute of Technology), H-index: 60
(5 authors in total)
SMC complexes organize chromatin throughout the cell cycle across many cell types. Experiments indicate that this is achieved by an energy-consuming process known as loop extrusion, in which SMC complexes, such as condensin or cohesin, reel in DNA/chromatin, extruding and progressively growing a DNA/chromatin loop. Theoretical modeling assuming two-sided loop extrusion has successfully reproduced key features of chromatin organization across different organisms. Recent in vitro single-molecule e...
4 citations
#1 Masaru Koido (UTokyo: University of Tokyo), H-index: 2
Last. Piero Carninci, H-index: 93
(11 authors in total)
Transcription is regulated through complex mechanisms involving non-coding RNAs (ncRNAs). However, because transcription of ncRNAs, especially enhancer RNAs, is often low and cell type-specific, its dependency on genotype remains largely unexplored. Here, we developed mutation effect prediction on ncRNA transcription (MENTR), a quantitative machine learning framework reliably connecting genetic associations with expression of ncRNAs, resolved to the level of cell type. MENTR-predicted mutation e...
#1 Haruka Tsuchiya (UTokyo: University of Tokyo), H-index: 5
#2 Mineto Ota (UTokyo: University of Tokyo), H-index: 3
Last. Keishi Fujio (UTokyo: University of Tokyo), H-index: 27
(15 authors in total)
In rheumatoid arthritis (RA), synovial fibroblasts (SFs) produce pathogenic molecules in the inflamed synovium. Despite their potential importance, comprehensive understanding of SFs under inflammatory conditions remains elusive. Here, to elucidate the actions of SFs and their contributions to RA pathogenesis, we stimulated SFs with 8 proinflammatory cytokines and analyzed the outcome using genomic, epigenomic and transcriptomic approaches. We observed stimulated transcription of pathogenic mole...
#1 Miguel Madrid-Mencia (Paul Sabatier University)
#2 Emanuele Raineri, H-index: 18
Last. Vera Pancaldi (Paul Sabatier University)
(4 authors in total)
We introduce an R package and a web-based visualization tool for the representation, analysis and integration of epigenomic data in the context of 3D chromatin interaction networks. GARDEN-NET allows for the projection of user-submitted genomic features on pre-loaded chromatin interaction networks, exploiting the functionalities of the ChAseR package to explore the features in combination with chromatin network topology properties. We demonstrate the approach using published epigenomic and chrom...
The causal relationship between 3D chromatin domains and gene regulation has been of considerable debate in recent years. Initial Hi-C studies profiling the 3D chromatin structure of the genome described evolutionarily conserved Topologically Associating Domains (TADs) that correlated with gene expression. Subsequent evidence from mouse models and human disease directly linked TADs to gene regulation. However, a number of focused genetic and genome-wide studies questioned the relevance of 3D chr...