Chromatin three-dimensional interactions mediate genetic effects on gene expression
INTRODUCTION Genome-wide studies on the genetic basis of gene expression have considerably advanced our understanding of the function of the human genome. Large collections of expression quantitative trait loci (i.e., genetic variations affecting gene expression; eQTLs) are now available across many cell types, tissues, and conditions and are commonly used to better interpret the effects of noncoding genetic variations. Although this constitutes an extraordinary resource to study complex organismal traits and diseases, we still have a poor understanding of how these variants affect the regulatory machinery, which regulatory elements (REs) they perturb, and how their effects propagate along regulatory interactions.

RATIONALE In this study, we aimed to characterize the complex and cell type–specific interplay between genetic variation, REs, and gene expression to dissect cis- and trans-regulatory coordination. To this end, we assembled and analyzed a population-scale dataset combining the activity of REs [measured by chromatin immunoprecipitation sequencing (ChIP-seq) for monomethylated histone 3 at lysine 4 (H3K4me1), trimethylated histone 3 at lysine 4 (H3K4me3), and acetylated histone 3 at lysine 27 (H3K27ac)], the expression of genes (using RNA-seq), and genetic variations for 317 lymphoblastoid and 78 fibroblast cell lines, all of European ancestry.

RESULTS First, we show that regulatory activity is structured in 12,583 well-delimited cis-regulatory domains (CRDs) that respect the local chromatin organization into topologically associating domains (TADs) but constitute a finer level of organization. Our work suggests that three-dimensional (3D) organization in cis can be broadly categorized into functionally linked and unlinked domains, with the most-linked ones corresponding to CRDs. In addition, we found 25,315 significant associations between CRDs located on distinct chromosomes that form 30 trans-regulatory hubs (TRHs).
These TRHs are consistent with a higher-order chromatin organization into A and B nuclear compartments and show a signal of allelic coordination, suggesting that some of the trans associations are not transcriptionally mediated and instead result from a complex, higher-order 3D nuclear organization. Second, we show that CRDs and TRHs essentially delimit sets of active REs involved in the expression of most genes and provide a dense genome-wide map linking REs and genes. We show that these links vary substantially across cell types and are key factors involved in the cis and trans coexpression of genes. Third, we show that CRDs are under strong genetic control. We discovered a total of 58,968 chromatin peaks affected by nearby genetic variants (cQTLs), 6157 QTLs that affect the activity of CRDs (aCRD-QTLs), and 110 QTLs that affect the correlation structure within CRDs (sCRD-QTLs). These QTLs tend to lie close to their genomic targets, are enriched within functional regions of the genome, and frequently overlap genetic variants associated with complex traits. Finally, we show that CRDs and TRHs capture complex regulatory networks along which the effects of eQTLs are propagated and synergized to affect gene expression. Overall, we estimate that 75% of the eQTLs also affect the activity of CRDs and describe four specific types of genetic effects that can be mediated by CRDs: (i) cis-eQTLs affecting distal genes, (ii) multiple cis-eQTLs with independent effects, (iii) multiple rare variants that have a cumulative effect, and (iv) trans-eQTLs.

CONCLUSION We provide a genome-wide map of the coordination between REs and describe how this serves as a backbone for the propagation of noncoding genetic effects in cis and trans onto gene expression. We show how these types of data can reveal higher-order functional attributes of the genome and can serve as an effective prior to boost future association studies at both the discovery and interpretation levels.
Overall, our study reveals the complexity and specificity of the cis- and trans-regulatory circuitry and its perturbation by genetic variations.
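
As an illustration only, the idea of a cis-regulatory domain as a run of neighboring chromatin peaks whose activity covaries across individuals can be sketched with a toy example. This is not the procedure used in the study: the function names `pearson` and `group_adjacent_peaks`, the 0.5 correlation threshold, the simple rule of chaining adjacent peaks, and the data are all hypothetical and chosen for readability.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_adjacent_peaks(activity, min_abs_r=0.5):
    """Chain neighboring peaks into domains (toy rule, an assumption).

    `activity` holds one list per peak, ordered by genomic position,
    with one activity value per individual. A peak joins the current
    domain when its absolute correlation with the previous peak is at
    least `min_abs_r`; otherwise it starts a new domain.
    Returns a list of domains, each a list of peak indices.
    """
    domains = [[0]]
    for i in range(1, len(activity)):
        r = pearson(activity[i - 1], activity[i])
        if abs(r) >= min_abs_r:
            domains[-1].append(i)
        else:
            domains.append([i])
    return domains


# Made-up activity values for 5 peaks measured in 8 individuals.
peaks = [
    [1, 2, 3, 4, 5, 6, 7, 8],          # peak 0
    [3, 5, 7, 9, 11, 13, 15, 17],      # peak 1: tracks peak 0
    [1, 2, 3, 4, 5, 6, 8, 7],          # peak 2: tracks peak 0 closely
    [1, -1, 1, -1, 1, -1, 1, -1],      # peak 3: unrelated pattern
    [13, 7, 13, 7, 13, 7, 13, 7],      # peak 4: tracks peak 3
]

domains = group_adjacent_peaks(peaks)
print(domains)
```

In this toy input, peaks 0 to 2 covary strongly and peaks 3 and 4 covary with each other but not with peak 2, so the sketch reports two domains. A real analysis would need to handle normalization, covariates, and statistical significance, none of which is modeled here.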