
Sequence logos: a new way to display consensus sequences

Published on Jan 1, 1990 in Nucleic Acids Research 11.56
· DOI: 10.1093/nar/18.20.6097
Thomas D. Schneider
Estimated H-index: 39
,
R M Stephens
Estimated H-index: 1
Abstract
A graphical method is presented for displaying the patterns in a set of aligned sequences. The characters representing the sequence are stacked on top of each other for each position in the aligned sequences. The height of each letter is made proportional to its frequency, and the letters are sorted so the most common one is on top. The height of the entire stack is then adjusted to signify the information content of the sequences at that position. From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content (measured in bits) at every position in a site or sequence. The logo displays both significant residues and subtle sequence patterns.
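The stacking rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `logo_heights`, the example sites, and the four-letter DNA alphabet are assumptions made for the sketch. Information content is taken as the maximum entropy of the alphabet minus the observed Shannon entropy at each position, and any small-sample correction is deliberately left out.

```python
import math
from collections import Counter


def logo_heights(aligned, alphabet="ACGT"):
    """Return one stack per alignment column.

    Each stack is a list of (letter, height_in_bits) pairs ordered from
    bottom to top, so the most frequent letter comes last (on top).
    Assumption: information = log2(len(alphabet)) - Shannon entropy,
    with no small-sample correction.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    max_bits = math.log2(len(alphabet))
    columns = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned)
        total = sum(counts[ch] for ch in alphabet)
        if total == 0:
            # Column holds only symbols outside the alphabet (e.g. gaps).
            columns.append([])
            continue
        freqs = {ch: counts[ch] / total for ch in alphabet if counts[ch] > 0}
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        info = max_bits - entropy  # height of the whole stack, in bits
        # Sort by frequency so the most common letter ends up on top,
        # then scale each letter to its share of the stack height.
        ordered = sorted(freqs.items(), key=lambda item: item[1])
        columns.append([(ch, f * info) for ch, f in ordered])
    return columns


if __name__ == "__main__":
    sites = ["ACGT", "ACGA", "ACTA", "AGGA"]  # hypothetical aligned sites
    for index, stack in enumerate(logo_heights(sites)):
        print(index, [(ch, round(h, 3)) for ch, h in stack])
```

A fully conserved DNA position yields a single letter two bits tall, while a position where all four bases are equally frequent yields a stack of zero height, which matches the abstract's point that stack height conveys information content.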
References (2)
Published on Jan 1, 1985
Kathleen Jensen
Estimated H-index: 5
,
Niklaus Wirth
Estimated H-index: 36
+ 6 Authors, Juris Hartmanis
Estimated H-index: 40
This manual is directed to those who have previously acquired some programming skill. The intention is to provide a means of learning Pascal without outside guidance. It is based on The Programming Language Pascal (Revised Report) [1]--the basic definition of Pascal and concise reference manual for the experienced Pascal programmer. The linear structure of a book is by no means ideal for introducing a language, whether it be a formal or natural one. Nevertheless, it is recommended to follow the ...
619 Citations
Published on Jan 1, 1983
Edward R. Tufte
Estimated H-index: 17
(Princeton University)
So read three of the reviews of Edward Tufte's The Visual Display of Quantitative Information. Since the first edition appeared in 1983, it has been regarded as a timeless classic and bestseller in information graphics. It has since had three "sequels": Envisioning Information (1990), Visual Explanations (1997) and Beautiful Evidence (2006). With his great expertise in information graphics, Edward Tufte is today regarded as one of the foremost pioneers of the field, and he has been awarded over 40 ...
5,044 Citations
Cited By (2273)
Published on Jan 23, 2019 in bioRxiv
Neerja Thakkar (Dartmouth College), Chris Bailey-Kellogg
Estimated H-index: 25
(Dartmouth College)
Repertoire sequencing is enabling deep explorations into the cellular immune response, including the characterization of commonalities and differences among T cell receptor (TCR) repertoires from different individuals, pathologies, and antigen specificities. In seeking to understand the generality of patterns observed in different groups of TCRs, it is necessary to balance how well each pattern represents the diversity among TCRs from one group (sensitivity) vs. how many TCRs from other groups i...
Published on May 15, 2019 in BMC Bioinformatics 2.21
Jaume Bonet
Estimated H-index: 12
(École Polytechnique Fédérale de Lausanne),
Zander Harteveld
Estimated H-index: 1
(École Polytechnique Fédérale de Lausanne)
+ 2 Authors, Bruno E. Correia
Estimated H-index: 14
(École Polytechnique Fédérale de Lausanne)
Background: Large-scale datasets of protein structures and sequences are becoming ubiquitous in many domains of biological research. Experimental approaches and computational modelling methods are generating biological data at an unprecedented rate. The detailed analysis of structure-sequence relationships is critical to unveil governing principles of protein folding, stability and function. Computational protein design (CPD) has emerged as an important structure-based approach to engineer protei...
Published on Mar 1, 2019 in BMC Evolutionary Biology 3.03
Jan Mrázek
Estimated H-index: 18
(University of Georgia),
Anna C. Karls
Estimated H-index: 6
(University of Georgia)
Background: Interactions between transcription factors and their specific binding sites are a key component of regulation of gene expression. Until recently, it was generally assumed that most bacterial transcription factor binding sites are located at or near promoters. However, several recent works utilizing high-throughput technology to detect transcription factor binding sites in bacterial genomes found a large number of binding sites in unexpected locations, particularly inside genes, as opp...
Published on Jan 17, 2019 in Nature Communications 12.35
César Díez-Villaseñor
Estimated H-index: 1
,
F. Rodriguez-Valera
Estimated H-index: 61
Smacoviridae is a family of small (~2.5 Kb) CRESS-DNA (Circular Rep Encoding Single-Stranded (ss) DNA) viruses. These viruses have been found in faeces, were thought to infect eukaryotes and are suspected to cause gastrointestinal disease in humans. CRISPR-Cas systems are adaptive immune systems in prokaryotes, wherein snippets of genomes from invaders are stored as spacers that are interspersed between a repeated CRISPR sequence. Here we report several spacer sequences in the faecal archaeon Ca...
1 Citation
Published on Feb 27, 2019 in Nature Communications 12.35
Cheng Zhu
Estimated H-index: 3
(University of North Carolina at Chapel Hill),
Elena Dukhovlinova
Estimated H-index: 8
(University of North Carolina at Chapel Hill)
+ 13 Authors, Nikolay V. Dokholyan
Estimated H-index: 55
An array of carbohydrates masks the HIV-1 surface protein Env, contributing to the evasion of humoral immunity. In most HIV-1 isolates ‘glycan holes’ occur due to natural sequence variation, potentially revealing the underlying protein surface to the immune system. Here we computationally design epitopes that mimic such surface features (carbohydrate-occluded neutralization epitopes or CONE) of Env through ‘epitope transplantation’, in which the target region is presented on a carrier protein sc...
Published on Jul 1, 2019 in Food Chemistry 4.95
Rosita Russo
Estimated H-index: 6
,
Mariangela Valletta
Estimated H-index: 1
+ 5 Authors, Angela Chambery
Estimated H-index: 23
Probiotic lactic acid bacteria (LAB) are generally employed in food industry because they contribute to nutritional value of fermented foods. Although knowledge of LAB composition is of high relevance for various industrial and biotechnological applications, the comprehensive identification of LAB species is sometimes technically challenging. Recently, MALDI-TOF MS-based methodologies for bacteria detection/identification in clinical diagnostics and agri-food proved to be an attractive ...
Published on May 15, 2019 in Nature Protocols 12.42
Anthony W. Purcell
Estimated H-index: 59
(Monash University, Clayton campus),
Sri-Harsha Ramarathinam
Estimated H-index: 12
(Monash University, Clayton campus),
Nicola Ternette
Estimated H-index: 19
(University of Oxford)
Peptide antigens bound to molecules encoded by the major histocompatibility complex (MHC) and presented on the cell surface form the targets of T lymphocytes. This critical arm of the adaptive immune system facilitates the eradication of pathogen-infected and cancerous cells, as well as the production of antibodies. Methods to identify these peptide antigens are critical to the development of new vaccines, for which the goal is the generation of effective adaptive immune responses and long-lasti...
Published on Nov 1, 2018 in Molecular Plant 9.33
Xuelei Lai
Estimated H-index: 3
(Centre national de la recherche scientifique),
Arnaud Stigliani
Estimated H-index: 2
(Centre national de la recherche scientifique)
+ 5 Authors, François Parcy
Estimated H-index: 33
(Centre national de la recherche scientifique)
Transcription factors (TFs) are key cellular components that control gene expression. They recognize specific DNA sequences, the TF binding sites (TFBSs), and thus are targeted to specific regions of the genome where they can recruit transcriptional co-factors and/or chromatin regulators to fine-tune spatiotemporal gene regulation. Therefore, the identification of TFBSs in genomic sequences and their subsequent quantitative modeling is of crucial importance for understanding and predict...
1 Citation
Published on Jun 1, 2019 in Machine Learning 1.85
Ralf Eggeling
Estimated H-index: 5
(University of Helsinki),
Ivo Grosse
Estimated H-index: 34
(Martin Luther University of Halle-Wittenberg),
Mikko Koivisto
Estimated H-index: 19
(University of Helsinki)
Parsimonious context trees, PCTs, provide a sparse parameterization of conditional probability distributions. They are particularly powerful for modeling context-specific independencies in sequential discrete data. Learning PCTs from data is computationally hard due to the combinatorial explosion of the space of model structures as the number of predictor variables grows. Under the score-and-search paradigm, the fastest algorithm for finding an optimal PCT, prior to the present work, is based on...
Published on May 22, 2019 in Applied Microbiology and Biotechnology 3.34
Anna Coenen
Estimated H-index: 1
,
Sylvia Oetermann
Estimated H-index: 2
,
Alexander Steinbüchel
Estimated H-index: 82
(King Abdulaziz University)
Streptomyces coelicolor A3(2) is a rubber-degrading actinomycete that harbors one gene coding for a latex clearing protein (lcpA3(2)). Within the genome of S. coelicolor A3(2), we identified a gene coding for a novel protein of the TetR family (LcpRBA3(2)) downstream of lcpA3(2) and demonstrated its binding upstream of lcpA3(2). This indicates a role of LcpRBA3(2) in the regulation of lcp expression. LcpRBA3(2) shows no homology to LcpRVH2, a putative regulator of lcp expression in Gordonia poly...