The Repeat Pattern Toolkit (RPT): analyzing the structure and evolution of the C. elegans genome.
Published on Jan 1, 1994 in Intelligent Systems in Molecular Biology
Over 3.6 million bases of DNA sequence from chromosome III of C. elegans have been determined. The availability of this extended region of contiguous sequence has allowed us to analyze the nature and prevalence of repetitive sequences in the genome of a eukaryotic organism with a high gene density. We have assembled a Repeat Pattern Toolkit (RPT) to analyze the patterns of repeats occurring in DNA. The tools include identifying significant local alignments (utilizing both two-way and three-way alignments), dividing the set of alignments into connected components (signifying repeat families), computing evolutionary distance between repeat family members, constructing minimum spanning trees from the connected components, and visualizing the evolution of the repeat families. Over 7000 families of repetitive sequences were identified. The size of the families ranged from isolated pairs to over 1600 segments of similar sequence. Approximately 12.3% of the analyzed sequence participates in a repeat element.
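
The grouping and tree-building steps named in the abstract can be illustrated with a minimal sketch. This is not the RPT implementation: the abstract does not say which algorithms the toolkit uses, so union-find and Kruskal's algorithm are standard stand-ins chosen here. The function names, the segment labels (`s1` to `s5`), and the distance values are all hypothetical. Each alignment is assumed to be a triple of two segment identifiers and a precomputed evolutionary distance.

```python
from collections import defaultdict


def repeat_families(alignments):
    """Group segments into families: connected components of the alignment graph."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Every significant alignment joins the components of its two segments.
    for a, b, _dist in alignments:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    families = defaultdict(set)
    for seg in list(parent):
        families[find(seg)].add(seg)
    return list(families.values())


def minimum_spanning_tree(family, alignments):
    """Kruskal's algorithm over the alignments that lie within one family."""
    edges = sorted((d, a, b) for a, b, d in alignments
                   if a in family and b in family)
    parent = {seg: seg for seg in family}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for d, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:  # keep the edge only if it links two separate subtrees
            parent[ra] = rb
            tree.append((a, b, d))
    return tree


# Hypothetical input: (segment, segment, evolutionary distance).
alignments = [
    ("s1", "s2", 0.10),
    ("s2", "s3", 0.25),
    ("s1", "s3", 0.40),
    ("s4", "s5", 0.05),
]

for family in repeat_families(alignments):
    print(sorted(family), minimum_spanning_tree(family, alignments))
```

In this toy input the five segments fall into two families, one of three members and one isolated pair. The spanning tree of the three-member family keeps its two shortest links and drops the 0.40 edge, which is the sense in which such a tree can suggest a plausible order of divergence among family members.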