Basic Local Alignment Search Tool
Published on Oct 1, 1990 in Journal of Molecular Biology 4.89 · DOI: 10.1016/S0022-2836(05)80360-2
Stephen F. Altschul (National Institutes of Health; estimated H-index: 46), Warren Gish (National Institutes of Health; estimated H-index: 15), + 2 authors, David J. Lipman (National Institutes of Health; estimated H-index: 44)
Abstract
A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score. Recent mathematical results on the stochastic properties of MSP scores allow an analysis of the performance of this method as well as the statistical significance of alignments it generates. The basic algorithm is simple and robust; it can be implemented in a number of ways and applied in a variety of contexts including straight-forward DNA and protein sequence database searches, motif searches, gene identification searches, and in the analysis of multiple regions of similarity in long DNA sequences. In addition to its flexibility and tractability to mathematical analysis, BLAST is an order of magnitude faster than existing sequence comparison tools of comparable sensitivity.
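To make the MSP score named in the abstract concrete, here is a minimal sketch that computes it exactly for two short sequences by scanning every ungapped diagonal. This is the quantity BLAST approximates, not BLAST's own heuristic search. The function name `msp_score` and the match/mismatch scores (+5/-4) are assumptions chosen for illustration.

```python
def msp_score(a, b, match=5, mismatch=-4):
    """Return the maximal segment pair (MSP) score of sequences a and b.

    An MSP is the highest-scoring pair of equal-length segments, one from
    each sequence, aligned without gaps. The scoring values are
    illustrative assumptions, not taken from the abstract.
    """
    best = 0
    # Each ungapped alignment lies on one diagonal, identified by d = j - i.
    for d in range(-(len(a) - 1), len(b)):
        i = max(0, -d)
        j = i + d
        running = 0
        while i < len(a) and j < len(b):
            running += match if a[i] == b[j] else mismatch
            # A negative running score can never help a later segment,
            # so restart the segment here (maximum-subarray scan).
            if running < 0:
                running = 0
            if running > best:
                best = running
            i += 1
            j += 1
    return best


if __name__ == "__main__":
    print(msp_score("GGACGTCC", "TTACGTAA"))
```

This exhaustive scan takes time proportional to the product of the two sequence lengths, which is why a database search tool needs a faster approximation such as the one the abstract describes.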
References (23)
Published on Jan 1, 1979
John E. Hopcroft (estimated H-index: 57), Jeffrey D. Ullman (estimated H-index: 95)
2,789 Citations
Published on Jan 1, 1989 in Journal of Molecular Biology 4.89
Jean B. Margot (Pennsylvania State University; estimated H-index: 2), G. William Demers (Pennsylvania State University; estimated H-index: 3), Ross C. Hardison (Pennsylvania State University; estimated H-index: 66)
The nucleotide sequence of the entire β-like globin gene cluster of rabbits has been determined. This sequence of a continuous stretch of 44.5 × 10^3 base-pairs (bp) starts about 6 × 10^3 bp upstream from ε (the 5′-most gene) and ends about 12 × 10^3 bp downstream from β (the 3′-most gene). Analysis of the sequence reveals that: (1) the sequence is relatively A + T rich (about 60%); (2) regions with high G + C content are associated with OcC repeats, a short interspersed repeated DNA in...
59 Citations
Published on Aug 1, 1983
David Sankoff (estimated H-index: 56), Joseph B. Kruskal (estimated H-index: 9)
1,530 Citations
Published on Jun 1, 1991 in Journal of Molecular Biology 4.89
Stephen F. Altschul (National Institutes of Health; estimated H-index: 46)
Protein sequence alignments have become an important tool for molecular biologists. Local alignments are frequently constructed with the aid of a "substitution score matrix" that specifies a score for aligning each pair of amino acid residues. Over the years, many different substitution matrices have been proposed, based on a wide variety of rationales. Statistical results, however, demonstrate that any such matrix is implicitly a "log-odds" matrix, with a specific target distribution for align...
546 Citations
Published on Jun 1, 1974 in SIAM Journal on Applied Mathematics 1.70
Peter H. Sellers (estimated H-index: 4)
This paper gives a formal definition of the biological concept of evolutionary distance and an algorithm to compute it. For any set S of finite sequences of varying lengths this distance is a real-valued function on $S \times S$, and it is shown to be a metric under conditions which are wide enough to include the biological application. The algorithm, introduced here, lends itself to computer programming and provides a method to compute evolutionary distance which is shorter than the other metho...
419 Citations
Randall F. Smith (estimated H-index: 2), Temple F. Smith (estimated H-index: 37)
We have developed a computer algorithm that can extract the pattern of conserved primary sequence elements common to all members of a homologous protein family. The method involves clustering the pairwise similarity scores among a set of related sequences to generate a binary dendrogram (tree). The tree is then reduced in a stepwise manner by progressively replacing the node connecting the two most similar termini by one common pattern until only a single common "root" pattern remains. ...
225 Citations
Published on Jul 1, 1984 in Bulletin of Mathematical Biology 1.48
Peter H. Sellers (Rockefeller University; estimated H-index: 11)
A new development is introduced here in the use of dynamic programming in finding pattern similarities in genetic sequences, as was first done by Needleman and Wunsch (1969). A condition of pattern similarity is defined and an algorithm is given which scans any set of similarities and screens out those which fail to meet the condition. When the set to be scanned contains every pair of segments, one from each of two given sequences of lengths m and n (i.e. every possible location for a pattern simi...
77 Citations
Published on Jun 1, 1990 in Annals of Statistics 2.52
Samuel Karlin (estimated H-index: 75), Amir Dembo (estimated H-index: 35), Tsutomu Kawabata (estimated H-index: 10)
89 Citations
Published on Jan 1, 1982 in Nucleic Acids Research 11.56
Walter B. Goad (Los Alamos National Laboratory; estimated H-index: 12), Minoru I. Kanehisa (Los Alamos National Laboratory; estimated H-index: 6)
We present an algorithm--a generalization of the Needleman-Wunsch-Sellers algorithm--which finds within longer sequences all subsequences that resemble one another locally. The probability that so close a resemblance would occur by chance alone is calculated and used to classify these local homologies according to statistical significance. Repeats and inverted repeats may also be found. Results for both random and biological nucleic acid sequences are presented. Fourteen complete genomes are ana...
205 Citations
William R. Pearson (estimated H-index: 45), David J. Lipman (estimated H-index: 44)
We have developed three computer programs for comparisons of protein and DNA sequences. They can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity. The FASTA program is a more sensitive derivative of the FASTP program, which can be used to search protein or DNA sequence data bases and can compare a protein sequence to a DNA sequence data base by translating the DNA data base as it is searched. FASTA includes an ...
9,718 Citations
Cited By (58072)
Published on Jan 1, 2005
Mathieu Giraud (estimated H-index: 10), Pascale Quignon (estimated H-index: 8), + 6 authors, Jacques Nicolas (estimated H-index: 10)
Published on Jan 1, 2003
Rodger Staden (estimated H-index: 27), David Phillip Judge (estimated H-index: 4), James K. Bonfield (estimated H-index: 20)
Methods for managing large scale sequencing projects are available through the use of our GAP4 package and the applications to which it can link are described. This main assembly and editing program, also provides a graphical user interface to the assembly engines: CAP3, FAKII, and PHRAP. Because of the diversity of working practices in the large number of laboratories where the package is used, these methods are very flexible and are readily tailored to suit local needs. For example, the Sanger...
30 Citations
Published on Jan 1, 2003
Conor Neil McCarthy (estimated H-index: 1)
Psychrotrophic bacteria, such as Pseudomonas fluorescens B52, are a major cause of milk spoilage at refrigeration temperature due to the production of lipolytic and proteolytic enzymes. Regulatory mechanisms controlling the production of lipase and protease by the B52 lipA and aprX genes were investigated. Transposon mutagenesis identified the possible involvement of a poly-A polymerase enzyme which destabilises mRNA by 3' polyadenylation. A homologue of the E. coli EnvZ/OmpR two-component senso...
1 Citation
Published on Jan 1, 2002
Roger Unwin (University of Illinois at Urbana–Champaign; estimated H-index: 1), James M. Fenton (University of Illinois at Urbana–Champaign; estimated H-index: 34), + 4 authors, Shankar Subramaniam (estimated H-index: 21)
4 Citations
Published on Jan 1, 2015
N. Ajami (estimated H-index: 1), J. Petrosino (estimated H-index: 1)
The collections of eukaryotic and prokaryotic viral genomes that are found in or on humans are referred to as the human virome. There are an estimated 10^31 viral particles on Earth. Human feces, one of the richest human samples, contain at least 10^9 virus-like particles per gram. Viruses have an evident effect on human health either by shaping the structure and function of bacterial communities (prokaryotic viruses) or by directly infecting human cells resulting in acute, persistent, ...
Published on Jan 1, 2000
Siv G. E. Andersson (Uppsala University; estimated H-index: 50), Kimmo Eriksson (estimated H-index: 23)
Comparative genome sequence data from closely related microbial strains and species are accumulating at a rapid pace. Detailed inspection of this data provides information about the rates and patterns whereby intragenomic rearrangement events occur. This will help us understand the evolutionary forces that determine genomic structures and stabilities in microbial systems. Here, we discuss methods and tools currently available for comparative analysis of genomic architectures and phylogenetic rec...
1 Citation
Published on Jan 1, 2014
Probably the most important technique available to the molecular biologist is DNA sequencing, by which the precise order of nucleotides in a piece of DNA can be determined. The function of a gene can often be deduced from its nucleotide sequence. Initially these techniques were applied to individual genes, but since the early 1990s an increasing number of entire genome sequences have been obtained. Many bioinformatics programs are used during the process of analyzing DNA sequences. Tric...
Published on Jan 1, 2002 in Pharmaceutical Biotechnology
Wolfgang Sadée (University of California, San Francisco; estimated H-index: 42), Richard C. Graul (University of California, San Francisco; estimated H-index: 5), Alan Y. Lee (University of California, San Francisco; estimated H-index: 1)
In summation, classification of membrane transporters is a challenging issue. One must consider not only their primary structure, but one must also ponder their function, topology, evolutionary origins, substrate specificity, and structural architecture. It is not known to what extent convergence and divergence played a role in the evolutionary progression of membrane transporters. A major question remaining to be resolved is whether common ancestry indeed implies similar folding into a common m...
7 Citations
Published on Jan 1, 2009