
A clustering method for repeat analysis in DNA sequences.

Published on Aug 1, 2001 in Genome Biology 14.03
DOI: 10.1186/gb-2001-2-8-research0027
Natalia Volfovsky (estimated H-index: 16), Brian J. Haas (estimated H-index: 62), Steven L. Salzberg (estimated H-index: 120)
Abstract
Background: A computational system for analysis of the repetitive structure of genomic sequences is described. The method uses suffix trees to organize and search the input sequences; this data structure has been used previously for efficient computation of exact and degenerate repeats.
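To make the idea of exact-repeat detection concrete, here is a minimal sketch in Python. It is not the authors' method: instead of a suffix tree it uses a naive sorted list of suffixes (a simple suffix array), which finds the same exact repeats but is much slower on large inputs. The function name `longest_exact_repeat` and the example sequence are illustrative assumptions, not taken from the paper.

```python
from __future__ import annotations


def longest_exact_repeat(seq: str) -> tuple[str, list[int]]:
    """Return the longest substring that occurs at least twice in seq,
    together with all of its start positions (occurrences may overlap).

    Assumption: this is a naive suffix-array sketch, not a suffix tree.
    Sorting whole suffixes is quadratic or worse, so it only suits
    short illustrative inputs.
    """
    # Sort suffix start positions by the text of each suffix.
    suffixes = sorted(range(len(seq)), key=lambda i: seq[i:])

    best_len, best_start = 0, 0
    # The longest repeated substring is the longest common prefix
    # of some pair of lexicographically adjacent suffixes.
    for a, b in zip(suffixes, suffixes[1:]):
        k = 0
        while a + k < len(seq) and b + k < len(seq) and seq[a + k] == seq[b + k]:
            k += 1
        if k > best_len:
            best_len, best_start = k, a

    repeat = seq[best_start:best_start + best_len]
    if not repeat:
        return "", []
    # Collect every position where the repeat occurs.
    positions = [
        i for i in range(len(seq) - best_len + 1) if seq.startswith(repeat, i)
    ]
    return repeat, positions


if __name__ == "__main__":
    print(longest_exact_repeat("ACGTTTACGTA"))  # ('ACGT', [0, 6])
```

A suffix tree yields the same answer in linear time and also exposes every repeated substring as an internal node, which is why it suits genome-scale input; the sorted-suffix version above trades that efficiency for brevity.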
References (23)
William C. Nierman (estimated H-index: 56; JCVI: J. Craig Venter Institute), Tamara V. Feldblyum (estimated H-index: 33; JCVI: J. Craig Venter Institute), + 34 authors, Janine R. Maddock (estimated H-index: 32; UM: University of Michigan)
The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction p...
Published on Nov 1, 2000 in Bioinformatics 4.53
Joseph A. Bedell (estimated H-index: 12), Ian Korf (estimated H-index: 37), Warren R. Gish (estimated H-index: 9)
Summary: Identifying and masking repetitive elements is usually the first step when analyzing vertebrate genomic sequence. Current repeat identification software is sensitive but slow, creating a costly bottleneck in large-scale analyses. We have developed MaskerAid, a software enhancement to RepeatMasker that increased the speed of masking more than 30-fold at the most sensitive setting. Availability: On request from the authors (see http://sapiens.wustl.edu/MaskerAid).
Published on Sep 15, 2000 in Nucleic Acids Research 11.15
Qiaoping Yuan (estimated H-index: 10; JCVI: J. Craig Venter Institute), Feng Liang (estimated H-index: 8; JCVI: J. Craig Venter Institute), + 5 authors, Robin Buell (estimated H-index: 1; JCVI: J. Craig Venter Institute)
A wealth of molecular resources have been developed for rice genomics, including dense genetic maps, expressed sequence tags (ESTs), yeast artificial chromosome maps, bacterial artificial chromosome (BAC) libraries and BAC end sequence databases. Integration of genetic and physical maps involves labor-intensive empirical experiments. To accelerate the integration of the bacterial clone resources with the genetic map for the International Rice Genome Sequencing Project, we cleaned and filtered th...
Published on Aug 19, 2000 in ISMB (Intelligent Systems in Molecular Biology)
Stefan Kurtz (estimated H-index: 31; Bielefeld University), Enno Ohlebusch (estimated H-index: 23), + 2 authors, Robert Giegerich (estimated H-index: 37)
The repetitive structure of genomic DNA holds many secrets to be discovered. A systematic study of repetitive DNA on a genomic or inter-genomic scale requires extensive algorithmic support. The REPuter family of programs described herein was designed to serve as a fundamental tool in such studies. Efficient and complete detection of various types of repeats is provided together with an evaluation of significance, interactive visualization, and simple interfacing to other analysis programs.
Published on Jul 1, 2000 in Genome Research 9.94
Long Mao (estimated H-index: 6; Clemson University), Todd Wood (estimated H-index: 7; Clemson University), + 9 authors, Stephen A. Goff (estimated H-index: 18; Novartis)
Transposable elements (TEs) are ubiquitous in all organisms (Burge and Howe 1989; Xiong and Eickbush 1990). In plants, TEs are classified into two main classes (Flavell et al. 1994). Retrotransposons comprise Class I and transpose via an RNA intermediate. Class I TEs include retrotransposons with long terminal repeats (LTRs) such as Ty1/Copia-like and Ty3/Gypsy-like, as well as non-LTR retrotransposons. The class II TEs transpose via a DNA intermediate and in plants have been found mainly in mai...
Published on Mar 24, 2000 in Science 41.04
Mark D. Adams (estimated H-index: 39; Celera Corporation), Susan E. Celniker (estimated H-index: 52; LBNL: Lawrence Berkeley National Laboratory), + 191 authors, Richard F. Galle (estimated H-index: 3; LBNL: Lawrence Berkeley National Laboratory)
The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the ∼120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosom...
Published on Mar 15, 2000 in Nucleic Acids Research 11.15
Timothy D. Read (estimated H-index: 12; JCVI: J. Craig Venter Institute), Robert C. Brunham (estimated H-index: 59; UBC: University of British Columbia), + 22 authors, Kristi Berry (estimated H-index: 5; JCVI: J. Craig Venter Institute)
The genome sequences of Chlamydia trachomatis mouse pneumonitis (MoPn) strain Nigg (1 069 412 nt) and Chlamydia pneumoniae strain AR39 (1 229 853 nt) were determined using a random shotgun strategy. The MoPn genome exhibited a general conservation of gene order and content with the previously sequenced C.trachomatis serovar D. Differences between C.trachomatis strains were focused on an ~50 kb ‘plasticity zone’ near the termination origins. In this region MoPn contained three copies of a novel g...
Published on Mar 10, 2000 in Science 41.04
Hervé Tettelin (estimated H-index: 48; JCVI: J. Craig Venter Institute), Nigel J. Saunders (estimated H-index: 37; University of Oxford), + 39 authors, Robert J. Dodson (estimated H-index: 44; JCVI: J. Craig Venter Institute)
The 2,272,351–base pair genome of Neisseria meningitidis strain MC58 (serogroup B), a causative agent of meningitis and septicemia, contains 2158 predicted coding regions, 1158 (53.7%) of which were assigned a biological role. Three major islands of horizontal DNA transfer were identified; two of these contain genes encoding proteins involved in pathogenicity, and the third island contains coding sequences only for hypothetical proteins. Insights into the commensal and virulence behavior of N. m...
Published on Jan 1, 2000 in Nature 43.07
Arabidopsis Genome Initiative (estimated H-index: 1)
The flowering plant Arabidopsis thaliana is an important model system for identifying genes and determining their functions. Here we report the analysis of the genomic sequence of Arabidopsis. The sequenced regions cover 115.4 megabases of the 125-megabase genome and extend into centromeric regions. The evolution of Arabidopsis involved a whole-genome duplication, followed by subsequent gene loss and extensive local gene duplications, giving rise to a dynamic genome enriched by lateral gene tran...
Published on May 1, 1999 in Nature 43.07
Karen E. Nelson (estimated H-index: 71; JCVI: J. Craig Venter Institute), Rebecca A. Clayton (estimated H-index: 15; JCVI: J. Craig Venter Institute), + 26 authors, Karen A. Ketchum (estimated H-index: 21; JCVI: J. Craig Venter Institute)
The 1,860,725-base-pair genome of Thermotoga maritima MSB8 contains 1,877 predicted coding regions, 1,014 (54%) of which have functional assignments and 863 (46%) of which are of unknown function. Genome analysis reveals numerous pathways involved in degradation of sugars and plant polysaccharides, and 108 genes that have orthologues only in the genomes of other thermophilic Eubacteria and Archaea. Of the Eubacteria sequenced to date, T.maritima has the highest percentage (24%) of genes that are...
Cited By (120)
Published on Dec 1, 2019 in BMC Genomics 3.50
Hong-Rui Zhang (estimated H-index: 1; CAS: Chinese Academy of Sciences), Xian-Chun Zhang (estimated H-index: 16; CAS: Chinese Academy of Sciences), Qiao-Ping Xiang (estimated H-index: 11; CAS: Chinese Academy of Sciences)
It is hypothesized that the highly conserved inverted repeats (IR) structure of land plant plastid genomes (plastomes) is beneficial for stabilizing plastome organization, whereas the mechanism of the occurrence and stability maintenance of the recently reported direct repeats (DR) structure is yet awaiting further exploration. Here we describe the DR structure of the Selaginella vardei (Selaginellaceae) plastome, to elucidate the mechanism of DR occurrence and stability maintenance. The plastom...
Published on May 14, 2019 in Mobile DNA 3.63
Andrei S. Guliaev (estimated H-index: 1; RAS: Russian Academy of Sciences), S. K. Semyenova (estimated H-index: 10; RAS: Russian Academy of Sciences)
Background: Genomes of eukaryotes are inhabited by myriads of mobile genetic elements (MGEs), transposons and retrotransposons, which play a great role in genome plasticity and evolution. A lot of computational tools were developed to annotate them either in genomic assemblies or raw reads using de novo or homology-based approaches. But there has been no pipeline enabling users to get coding and flanking sequences of MGEs suitable for a downstream analysis from genome assemblies.
Hong-Rui Zhang (estimated H-index: 1; CAS: Chinese Academy of Sciences), Qiao-Ping Xiang (estimated H-index: 11; CAS: Chinese Academy of Sciences), Xian-Chun Zhang (estimated H-index: 16; CAS: Chinese Academy of Sciences)
Published on May 1, 2018 in Molecular Ecology Resources 7.05
Céline Van de Paer (estimated H-index: 2; University of Toulouse), Olivier Bouchez (estimated H-index: 22; INRA: Institut national de la recherche agronomique), Guillaume Besnard (estimated H-index: 37; University of Toulouse)
The mitogenome is rarely used to reconstruct the evolutionary history of plants, contrary to nuclear and plastid markers. Here, we evaluate the usefulness of mitochondrial DNA for molecular evolutionary studies in Oleaceae, in which cases of cytoplasmic male sterility (CMS) and of potentially contrasted organelle inheritance are known. We compare the diversity and the evolution of mitochondrial and chloroplast genomes by focusing on the olive complex and related genera. Using high-throughput tec...
Published on Jan 1, 2018 in Methods of Molecular Biology
Katelyn McNair (estimated H-index: 7; SDSU: San Diego State University), Ramy K. Aziz (estimated H-index: 29; Cairo University), + 3 authors, Robert Edwards (estimated H-index: 93; SDSU: San Diego State University)
Phages are complex biomolecular machineries that have to survive in a bacterial world. Phage genomes show many adaptations to their lifestyle such as shorter genes, reduced capacity for redundant DNA sequences, and the inclusion of tRNAs in their genomes. In addition, phages are not free-living, they require a host for replication and survival. These unique adaptations provide challenges for the bioinformatics analysis of phage genomes. In particular, ORF calling, genome annotation, noncoding RN...
Published on Jan 1, 2018 in Methods of Molecular Biology
Danillo Oliveira Alvarenga (estimated H-index: 8; UNESP: Sao Paulo State University), Leandro Marcio Moreira (estimated H-index: 13; UFOP: Universidade Federal de Ouro Preto), + 1 author, Alessandro M. Varani (estimated H-index: 14; UNESP: Sao Paulo State University)
Departamento de Tecnologia Faculdade de Ciencias Agrarias e Veterinarias Universidade Estadual Paulista “Julio de Mesquita Filho”–UNESP
Published on Jun 1, 2017 in Current Biology 9.19
Sergio A. Muñoz-Gómez (estimated H-index: 7; Dal: Dalhousie University), Fabian G. Mejía-Franco (estimated H-index: 1), + 4 authors, Claudio H. Slamovits (estimated H-index: 26; CIFAR: Canadian Institute for Advanced Research)
Summary: Red algal plastid genomes are often considered ancestral and evolutionarily stable, and thus more closely resembling the last common ancestral plastid genome of all photosynthetic eukaryotes [1, 2]. However, sampling of red algal diversity is still quite limited (e.g., [2–5]). We aimed to remedy this problem. To this end, we sequenced six new plastid genomes from four undersampled and phylogenetically disparate red algal classes (Porphyridiophyceae, Stylonematophyceae, Compsopogonophycea...
Published on Feb 3, 2017
Lisui Bao (estimated H-index: 14; UA: University of Alabama), Zhanjiang Liu (estimated H-index: 1; UA: University of Alabama)
Published on Dec 1, 2016 in BMC Genomics 3.50
Helen N. Catanese (estimated H-index: 2; WSU: Washington State University), Kelly A. Brayton (estimated H-index: 32; WSU: Washington State University), Assefaw Hadish Gebremedhin (estimated H-index: 15; WSU: Washington State University)
Background: Short-sequence repeats (SSRs) occur in both prokaryotic and eukaryotic DNA, inter- and intragenically, and may be exact or inexact copies. When heterogeneous SSRs are present in a given locus, we can take advantage of the pattern of different repeats to genotype strains based on the SSRs. Cataloguing and tracking these repeats can be difficult as diverse groups of researchers are involved in the identification of the repeats. Additionally, the task is error-prone when done manually.
Shuaibin Lian (estimated H-index: 1; Xinyang Normal University), Xinwu Chen (estimated H-index: 1; Xinyang Normal University), + 2 authors, Xianhua Dai (estimated H-index: 1; SYSU: Sun Yat-sen University)
It has become clear that repetitive sequences have played multiple roles in eukaryotic genome evolution, including increasing genetic diversity through mutation, changing gene expression, and facilitating the generation of novel genes. However, identifying repetitive elements ab initio can be difficult. Some classical ab initio repeat-finding tools have already been presented and compared, but their completeness and accuracy in detecting repeats are rather poor. To th...