A clustering method for repeat analysis in DNA sequences.
Abstract
Background: A computational system for analysis of the repetitive structure of genomic sequences is described. The method uses suffix trees to organize and search the input sequences; this data structure has been used previously for efficient computation of exact and degenerate repeats.
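As a rough illustration of the kind of exact-repeat search the abstract refers to, here is a minimal Python sketch. It is not the paper's method: it swaps the suffix tree for a naively built suffix array, and the function names (`suffix_array`, `common_prefix_len`, `exact_repeats`) and the `min_len` parameter are assumptions made for this example. It reports the substrings shared by neighbouring suffixes in sorted order, so it may not list every occurrence of a repeat.

```python
def suffix_array(text):
    """Return suffix start positions sorted by suffix (naive construction)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_len(a, b):
    """Length of the longest common prefix of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def exact_repeats(text, min_len=3):
    """Map each repeated substring (length >= min_len) shared by two
    neighbouring suffixes in sorted order to the positions where it starts."""
    sa = suffix_array(text)
    repeats = {}
    for i, j in zip(sa, sa[1:]):
        k = common_prefix_len(text[i:], text[j:])
        if k >= min_len:
            repeats.setdefault(text[i:i + k], set()).update((i, j))
    return {seq: sorted(pos) for seq, pos in repeats.items()}


if __name__ == "__main__":
    print(exact_repeats("ACGTACGT", min_len=4))
```

The naive construction sorts whole suffixes and is only suitable for short inputs; a suffix tree or a linear-time suffix array construction would be needed at genome scale.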
References (23)
1999 in Nucleic Acids Research (IF: 11.56)
Al Delcher (estimated H-index: 1), Simon Kasif (estimated H-index: 45), + 2 authors, Owen White (estimated H-index: 58)
28 Citations
1994 in Intelligent Systems in Molecular Biology
Pankaj K. Agarwal (estimated H-index: 52), David J. States (estimated H-index: 32)
Over 3.6 million bases of DNA sequence from chromosome III of C. elegans have been determined. The availability of this extended region of contiguous sequence has allowed us to analyze the nature and prevalence of repetitive sequences in the genome of a eukaryotic organism with a high gene density. We have assembled a Repeat Pattern Toolkit (RPT) to analyze the patterns of repeats occurring in DNA. The tools include identifying significant local alignments (utilizing both two-way and three-wa...
28 Citations
2000 in Nature (IF: 41.58)
Arabidopsis Genome Initiative (estimated H-index: 1)
The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by subsequent gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene tran...
6,945 Citations
2000 in Intelligent Systems in Molecular Biology
Stefan Kurtz (Bielefeld University; estimated H-index: 30), Enno Ohlebusch (estimated H-index: 23), + 2 authors, Robert Giegerich (estimated H-index: 35)
The repetitive structure of genomic DNA holds many secrets to be discovered. A systematic study of repetitive DNA on a genomic or inter-genomic scale requires extensive algorithmic support. The REPuter family of programs described herein was designed to serve as a fundamental tool in such studies. Efficient and complete detection of various types of repeats is provided together with an evaluation of significance, interactive visualization, and simple interfacing to other analysis programs.
50 Citations
1996 in Science (IF: 41.06)
Carol J. Bult (University of Illinois at Urbana–Champaign; estimated H-index: 52), Owen White (University of Illinois at Urbana–Champaign; estimated H-index: 30), + 2 authors, J. Craig Venter (University of Illinois at Urbana–Champaign; estimated H-index: 85)
The present application describes the complete 1.66-megabase pair genome sequence of an autotrophic archaeon, Methanococcus jannaschii, and its 58- and 16-kilobase pair extrachromosomal elements. Also described are 1738 predicted protein-coding genes.
2,245 Citations
1999 in Nature (IF: 41.58)
Karen E. Nelson (J. Craig Venter Institute; estimated H-index: 71), Rebecca A. Clayton (J. Craig Venter Institute; estimated H-index: 15), + 26 authors, Karen A. Ketchum (J. Craig Venter Institute; estimated H-index: 22)
Evidence for lateral gene transfer between Archaea and Bacteria from genome sequence of Thermotoga maritima
1,236 Citations
2000 in Nucleic Acids Research (IF: 11.56)
Qiaoping Yuan (J. Craig Venter Institute; estimated H-index: 10), Feng Liang (J. Craig Venter Institute; estimated H-index: 8), + 5 authors, Robin Buell (J. Craig Venter Institute; estimated H-index: 1)
A wealth of molecular resources have been developed for rice genomics, including dense genetic maps, expressed sequence tags (ESTs), yeast artificial chromosome maps, bacterial artificial chromosome (BAC) libraries and BAC end sequence databases. Integration of genetic and physical maps involves labor-intensive empirical experiments. To accelerate the integration of the bacterial clone resources with the genetic map for the International Rice Genome Sequencing Project, we cleaned and filtered th...
37 Citations
1997
Dan Gusfield (University of California, Davis; estimated H-index: 34)
Part I. Exact String Matching: The Fundamental String Problem: 1. Exact matching: fundamental preprocessing and first algorithms 2. Exact matching: classical comparison-based methods 3. Exact matching: a deeper look at classical methods 4. Semi-numerical string matching Part II. Suffix Trees and their Uses: 5. Introduction to suffix trees 6. Linear time construction of suffix trees 7. First applications of suffix trees 8. Constant time lowest common ancestor retrieval 9. More applications of suf...
3,153 Citations
Cited By (118)
2011
Rohani Binti Abu Bakar (estimated H-index: 3), Chu Yuyi (Waseda University), Junzo Watada (Waseda University; estimated H-index: 19)
The primary objective of clustering is to discover a structure in the data by forming some number of clusters or groups. In order to achieve optimal clustering results in current soft computing approaches, two fundamental questions should be considered: (i) how many clusters are actually present in the given data, and (ii) how real or good the clustering itself is. Based on these two fundamental questions, almost every clustering method needs to determine the number of clusters. Yet, it is di...
2004 in Research in Computational Molecular Biology
Pavel A. Pevzner (estimated H-index: 77), Haixu Tang (estimated H-index: 38), Glenn Tesler (estimated H-index: 27)
1 Citation
2009
Chris Duran (estimated H-index: 14), David Edwards (estimated H-index: 59), Jacqueline Batley (University of Queensland; estimated H-index: 41)
The bulk of variation at the nucleotide level is often not visible at the phenotypic level. However, this variation can be exploited using molecular genetic marker systems. Molecular genetic markers represent one of the most powerful tools for genome analysis and permit the association of heritable traits with underlying genomic variation. Molecular marker technology has developed rapidly over the last decade, with the development of high-throughput genotyping methods and the availability of lar...
15 Citations
2009
David Edwards (University of Queensland; estimated H-index: 59), Jason E. Stajich (University of California, Berkeley; estimated H-index: 47), David Hansen (Commonwealth Scientific and Industrial Research Organisation; estimated H-index: 12)
DNA Sequence Databases.- Sequence Comparison Tools.- Genome Browsers.- Predicting Non-coding RNA Transcripts.- Gene Prediction Methods.- Gene Annotation Methods.- Regulatory Motif Analysis.- Molecular Marker Discovery and Genetic Map Visualisation.- Sequence Based Gene Expression Analysis.- Protein Sequence Databases.- Protein Structure Prediction.- Classification of Information About Proteins.- High-Throughput Plant Phenotyping - Data Acquisition, Transformation, and Analysis.- Phenome Analysis...
13 Citations
2008
G Achaz, F Boyer, + 1 author, E Coissac
The importance of genome redundancy has been strongly emphasized in the field of genome dynamics and evolution as well as in medical biology. A repeat is a sequence present twice or more with a high degree of similarity within a larger sequence (e.g. a chromosome) or set of sequences (e.g. a genome with several chromosomes). Each instance of the repeated sub-sequence is called a "copy" of the repeat. We use the term "duplication" to denote any active mechanistic event that creates a repeat. Even...
2011 in Advances in Genetics (IF: 4.69)
Dale J. Hedges (University of Miami; estimated H-index: 31), Victoria P. Belancio (Tulane University; estimated H-index: 17)
Since their initial discovery in maize, there have been various attempts to categorize the relationship between transposable elements (TEs) and their host organisms. These have ranged from TEs being selfish parasites to their role as essential, functional components of organismal biology. Research over the past several decades has, in many respects, only served to complicate the issue even further. On the one hand, investigators have amassed substantial evidence concerning the negative effects t...
15 Citations
2006
Agnès Dettaï (estimated H-index: 3), Jean-Nicolas Volff (University of Würzburg; estimated H-index: 37)
For the last fifteen years, researchers have been using SINE (short interspersed elements; non-autonomous retroposons) insertion polymorphism as characters for phylogeny. Although the collection of these characters is much less straightforward and much more work intensive than for classical sequence data, they are subject to very little homoplasy, and therefore allow more reliable determination of the phylogeny of species. As reversions are very rare, and the ancestral state (absence of the inse...
4 Citations
Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto. 2012