Transcriptomic insights into genetic diversity of protein-coding genes in X. laevis

Published on Apr 1, 2017 in Developmental Biology
DOI: 10.1016/j.ydbio.2017.02.019
Virginia Savova (Harvard University), Esther J. Pearl (MBL: Marine Biological Laboratory), + 4 authors, Leonid Peshkin (Harvard University)
We characterize the genetic diversity of Xenopus laevis strains using RNA-seq data and allele-specific analysis. This data provides a catalogue of coding variation, which can be used for improving the genomic sequence, as well as for better sequence alignment, probe design, and proteomic analysis. In addition, we paint a broad picture of the genetic landscape of the species by functionally annotating different classes of mutations with a well-established prediction tool (PolyPhen-2). Further, we specifically compare the variation in the progeny of four crosses: inbred genomic (J)-strain, outbred albino (B)-strain, and two hybrid crosses of J and B strains. We identify a subset of mutations specific to the B strain, which allows us to investigate the selection pressures affecting duplicated genes in this allotetraploid. From these crosses we find the ratio of non-synonymous to synonymous mutations is lower in duplicated genes, which suggests that they are under greater purifying selection. Surprisingly, we also find that function-altering ("damaging") mutations constitute a greater fraction of the non-synonymous variants in this group, which suggests a role for subfunctionalization in coding variation affecting duplicated genes.
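The two summary statistics compared in the abstract (the ratio of non-synonymous to synonymous variants, and the share of non-synonymous variants predicted to be damaging) can be illustrated with a minimal sketch. The class name, field names, and all counts below are hypothetical and chosen only to show the arithmetic; they are not taken from the paper, and the ratio is a raw count ratio rather than a per-site dN/dS estimate.

```python
from dataclasses import dataclass


@dataclass
class VariantCounts:
    """Hypothetical variant tallies for one group of genes."""
    synonymous: int
    nonsyn_benign: int    # non-synonymous, predicted benign
    nonsyn_damaging: int  # non-synonymous, predicted damaging

    @property
    def nonsynonymous(self) -> int:
        return self.nonsyn_benign + self.nonsyn_damaging

    def nonsyn_to_syn_ratio(self) -> float:
        # Raw count ratio; lower values are read as stronger purifying selection.
        return self.nonsynonymous / self.synonymous

    def damaging_fraction(self) -> float:
        # Share of non-synonymous variants predicted to alter function.
        return self.nonsyn_damaging / self.nonsynonymous


# Made-up numbers for illustration only (not data from the study).
duplicated = VariantCounts(synonymous=1000, nonsyn_benign=300, nonsyn_damaging=200)
singleton = VariantCounts(synonymous=1000, nonsyn_benign=600, nonsyn_damaging=200)

for name, group in [("duplicated", duplicated), ("singleton", singleton)]:
    print(name, group.nonsyn_to_syn_ratio(), group.damaging_fraction())
```

With these invented counts the duplicated group has the lower non-synonymous to synonymous ratio but the higher damaging fraction, which is the qualitative pattern the abstract describes.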
References (32)
74 Authors: Adam M. Session (University of California, Berkeley), Yoshinobu Uno (Nagoya University), ..., Daniel S. Rokhsar (University of California, Berkeley)
The two homoeologous subgenomes in the allotetraploid frog Xenopus laevis evolved asymmetrically; one often retained the ancestral state, whereas the other experienced gene loss, deletion, rearrangement and reduced gene expression.
289 Citations
10 Authors: Leonid Peshkin (Harvard University), Martin Wühr (Harvard University), ..., Marc W. Kirschner (Harvard University)
A biochemical explanation of development from the fertilized egg to the adult requires an understanding of the proteins and RNAs expressed over time during embryogenesis. We present a comprehensive characterization of protein and mRNA dynamics across early development in Xenopus. Surprisingly, we find that most protein levels change little and duplicated genes are expressed similarly. While the correlation between protein and mRNA levels is poor, a mass action kinetics model parameterized using ...
96 Citations
8 Authors: Yoichi Matsuda, Yoshinobu Uno, ..., Masanori Taira
Xenopus laevis (XLA) is an allotetraploid species which appears to have undergone whole-genome duplication after the interspecific hybridization of 2 diploid species closely related to Silurana/Xenopus tropicalis (XTR). Previous cDNA fluorescence in situ hybridization (FISH) experiments have identified 9 sets of homoeologous chromosomes in X. laevis, in which 8 sets correspond to chromosomes 1-8 of X. tropicalis (XTR1-XTR8), and the last set corresponds to a fusion of XTR9 and XTR10. In addition...
36 Citations
5 Authors: Jing Wang (Vandy: Vanderbilt University), Leonid Raskin (Vandy: Vanderbilt University), ..., Yan Guo (Vandy: Vanderbilt University)
Motivation: The transition/transversion (Ti/Tv) ratio and heterozygous/nonreference-homozygous (het/nonref-hom) ratio have been commonly computed in genetic studies as a quality control (QC) measurement. Additionally, these two ratios are helpful in our understanding of the patterns of DNA sequence evolution. Results: To thoroughly understand these two genomic measures, we performed a study using 1000 Genomes Project (1000G) released genotype data (N = 1092). An additional two datasets (N = 581 ...
36 Citations
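The two QC ratios named in the entry above can be computed from called variants roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, variants are assumed to be biallelic single-base substitutions over A/C/G/T, and genotypes are assumed to be VCF-style strings such as `0/1` or `1|1`. It is not the procedure used in that study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(snvs) -> float:
    """Transition/transversion ratio over (ref, alt) pairs.

    Raises ZeroDivisionError if there are no transversions.
    """
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = sum(1 for ref, alt in snvs
             if ref.upper() != alt.upper() and not is_transition(ref, alt))
    return ti / tv


def het_to_nonref_hom_ratio(genotypes) -> float:
    """Heterozygous / non-reference-homozygous ratio (assumed VCF-style GT)."""
    het = hom_alt = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if sorted(alleles) == ["0", "1"]:
            het += 1
        elif alleles == ["1", "1"]:
            hom_alt += 1
    return het / hom_alt


# Toy inputs for illustration only.
snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
genotypes = ["0/1", "0/1", "1/1", "0/0", "1|0", "1/1"]
print(ti_tv_ratio(snvs), het_to_nonref_hom_ratio(genotypes))
```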
7 Authors: Shahar Alon (TAU: Tel Aviv University), Sandra C. Garrett (University of Connecticut Health Center), ..., Eli Eisenberg (TAU: Tel Aviv University)
For living cells to create a protein, a genetic code found in its DNA must first be ‘transcribed’ to create a corresponding molecule of messenger RNA (mRNA). DNA and RNA are both made from smaller molecules called nucleotides that are linked together into long chains; the information in both DNA and RNA is contained in the sequence of these molecules. The mRNA nucleotides coding for proteins are ‘translated’ in groups of three, and most of these nucleotide triplets instruct for a specific amino ...
53 Citations
4 Authors: Jing Wang (Vandy: Vanderbilt University), David C. Samuels (Vandy: Vanderbilt University), ..., Yan Guo (Vandy: Vanderbilt University)
Background Characterizing genetic diversity is crucial for reconstructing human evolution and for understanding the genetic basis of complex diseases; however, human population genetics are very complicated. Previously, we proved that based on the Hardy-Weinberg equilibrium, the heterozygous vs. non-reference homozygous single nucleotide polymorphism (SNP) ratio (het/nonref-hom) is two [1]. Later, we found that this ratio is race dependent, with African being the most genetically diverse race an...
1 Citation
3 Authors: Erica E. Davis (Duke University), Stephan Frangakis (Duke University), Nicholas Katsanis (Duke University)
Rapid advances and cost erosion in exome and genome analysis of patients with both rare and common genetic disorders have accelerated gene discovery and illuminated fundamental biological mechanisms. The thrill of discovery has been accompanied, however, with the sobering appreciation that human genomes are burdened with a large number of rare and ultra rare variants, thereby posing a significant challenge in dissecting both the effect of such alleles on protein function and also the biological ...
38 Citations
3 Authors: Anthony Bolger (MPG: Max Planck Society), Marc Lohse (MPG: Max Planck Society), Bjoern Usadel (MPG: Max Planck Society)
Motivation: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data. Results: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks...
10.8k Citations
7 Authors: Martin Wühr (Harvard University), Robert M. Freeman (Harvard University), ..., Marc W. Kirschner (Harvard University)
Summary Background: Mass spectrometry-based proteomics enables the global identification and quantification of proteins and their posttranslational modifications in complex biological samples. However, proteomic analysis requires a complete and accurate reference set of proteins and is therefore largely restricted to model organisms with sequenced genomes. Results: Here, we demonstrate the feasibility of deep genome-free proteomics by using a reference proteome derived from heterogeneous mRNA data. We id...
109 Citations
7 Authors: Anwesha Nag (Harvard University), Virginia Savova (Harvard University), ..., Alexander A. Gimelbrant (Harvard University)
Understanding how genes are activated and silenced is one of the central challenges in modern biology. These processes underpin the development of a fertilized egg into a complex organism, and they can also lead to life-threatening diseases when they go wrong. There are two copies of each gene in a human cell, a maternal copy and a paternal copy, and it is thought that both copies are usually regulated together. However, there are exceptions to this rule: for certain genes only the maternal copy...
49 Citations
Cited By (1)
7 Authors: James Briggs (Harvard University), Caleb Weinreb (Harvard University), ..., Allon M. Klein (Harvard University)
INTRODUCTION Metazoan development represents a big jump in complexity compared with unicellular life in two aspects: cell-type differentiation and cell spatial organization. In vertebrate embryos, many distinct cell types appear within just a single day of life after fertilization. Studying the developmental dynamics of all embryonic cell types is complicated by factors such as the speed of early development, complex cellular spatial organization, and scarcity of raw material for conventional an...
102 Citations