einverted
Function
Description
einverted looks for inverted repeats (stem loops) in a nucleotide
sequence.
It will find inverted repeats that include a proportion of mismatches and
gaps (bulges in the stem loop).
Algorithm
It works by finding alignments between the sequence and its reverse
complement that exceed a threshold score. Matches are assigned a positive
score. Mismatches and gaps are assigned a penalty (negative) score, with
gaps penalized more heavily than mismatches. The score of an alignment is
the sum of the value of each match, the penalty of each mismatch and the
larger penalty of each gap. Any region whose score exceeds the threshold
is reported.
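The scoring rule can be illustrated with a minimal Python sketch. It is not einverted's source code: the function name stem_score and the toy stems are hypothetical, and the mismatch (-4) and gap (12) values are assumptions; only the match score of 3 is stated in this document. The two arms are written as they appear in the program output, top arm left to right and bottom arm as its base-pairing partner, with '-' marking a gap.

```python
# Watson-Crick partners for lowercase DNA.
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def stem_score(top, bottom, match=3, mismatch=-4, gap=12):
    """Sum the column scores of two aligned stem arms ('-' marks a gap).

    match=3 is taken from the text; mismatch and gap are assumed values.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned arms must have the same length")
    score = 0
    for x, y in zip(top.lower(), bottom.lower()):
        if x == "-" or y == "-":
            score -= gap              # bulge in the stem
        elif COMPLEMENT.get(x) == y:
            score += match            # complementary base pair
        else:
            score += mismatch         # non-complementary pair
    return score


# Hypothetical toy stems, not taken from any real sequence.
print(stem_score("gattaca", "ctaatgt"))    # 7 matches
print(stem_score("gattaca", "ctagtgt"))    # 6 matches, 1 mismatch
print(stem_score("gat-taca", "ctagatgt"))  # 7 matches, 1 gap
```

With these assumed penalties a single gap costs as much as four matches earn, which is why short stems with bulges rarely reach the threshold.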
einverted uses dynamic programming and thus is guaranteed to find
the optimal alignment, but is slower than, for example, a self-by-self BLAST.
It can find multiple inverted repeats in a sequence.
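The dynamic programming idea can be sketched as a local alignment of the sequence against its own reverse complement. This is an illustration under stated assumptions, not einverted's implementation: the function names are hypothetical, the mismatch (-4) and gap (12) values are assumed, gaps are charged a flat penalty each, and only the single best score is returned rather than a list of non-overlapping repeats.

```python
# Watson-Crick partners; the sketch accepts only a, c, g and t.
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def best_stem_score(seq, match=3, mismatch=-4, gap=12):
    """Best local alignment score between seq and its reverse complement.

    Row i walks the sequence forwards and column j walks it backwards
    (through the reverse complement). Only cells with i + j <= len(seq)
    are filled, so a base is never paired with itself or with a base to
    its left.
    """
    seq = seq.lower()
    rc = reverse_complement(seq)
    n = len(seq)
    best = 0
    prev = [0] * (n + 1)              # DP row for i - 1
    for i in range(1, n + 1):
        cur = [0] * (n + 1)           # DP row for i; unfilled cells stay 0
        for j in range(1, n - i + 1):
            pair = match if seq[i - 1] == rc[j - 1] else mismatch
            cur[j] = max(
                0,                    # start a new local alignment here
                prev[j - 1] + pair,   # extend the stem by one base pair
                prev[j] - gap,        # unpaired base on the forward arm
                cur[j - 1] - gap,     # unpaired base on the reverse arm
            )
            best = max(best, cur[j])
        prev = cur
    return best


# Two 11-base arms around a hypothetical 5-base loop (the loop is made up).
hairpin = "aaaactaaggc" + "ccccc" + "gccttagtttt"
print(best_stem_score(hairpin))  # 33: eleven matches at 3 each
```

The table has on the order of n*n/2 cells for a sequence of length n, which is the cost of the guarantee of optimality compared with a heuristic search.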
einverted does not report overlapping matches.
The original "inverted" program was written to annotate the nematode
genome. Excluding overlapping repeats saved problems with simple repeat
sequences in this genome.
Usage
Command line arguments
Input file format
The input for einverted is a nucleotide sequence.
Output file format
Data files
None.
Notes
Sometimes you can find repeats using the program palindrome that
you can't find with einverted using the default parameters.
This is not due to a problem with either program. It is simply because
some of the shortest repeats that you find with palindrome's
default parameter values score below einverted's default cutoff
score. Decrease the 'Minimum score threshold' (the -threshold option) to
see them.
For example, when palindrome is run with 'em:hsfau1',
it finds the repeat:
64 aaaactaaggc 74
   |||||||||||
98 ttttgattccg 88
einverted will not report this because its score is 33 (11 bases
scoring 3 each, no mismatches or gaps), which is below the default score
cutoff of 50.
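The arithmetic behind this cutoff can be checked with a few lines of Python. The helper names are hypothetical. The match score of 3 and the default threshold of 50 come from the text above; the sketch assumes that a score equal to the threshold is reported, as the -threshold 33 run in this section suggests.

```python
MATCH_SCORE = 3          # per-base match score stated in the text
DEFAULT_THRESHOLD = 50   # default score cutoff stated in the text


def perfect_stem_score(length, match=MATCH_SCORE):
    """Score of a stem with no mismatches or gaps."""
    return length * match


def min_perfect_stem_length(threshold=DEFAULT_THRESHOLD, match=MATCH_SCORE):
    """Shortest perfect stem whose score reaches the threshold (integer ceiling)."""
    return (threshold + match - 1) // match


print(perfect_stem_score(11))                       # 33
print(perfect_stem_score(11) >= DEFAULT_THRESHOLD)  # False: not reported by default
print(min_perfect_stem_length())                    # 17 bases needed at threshold 50
print(min_perfect_stem_length(33))                  # 11 bases needed at threshold 33
```

So with the default settings a perfect repeat needs at least 17 paired bases to be reported, and any mismatch or gap raises that number.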
If einverted is run as:
% einverted em:hsfau1 -threshold 33
then it will find it:
Score 33: 11/11 (100%) matches, 0 gaps
64 aaaactaaggc 74
   |||||||||||
98 ttttgattccg 88
Anything can be considered to be a repeat if you set the score
threshold low enough!
References
Some useful references on inverted repeats:
- Pearson CE, Zorbas H, Price GB, Zannis-Hadjopoulos M. Inverted repeats,
  stem-loops, and cruciforms: significance for initiation of DNA
  replication. J Cell Biochem. 1996 Oct;63(1):1-22.
- Waldman AS, Tran H, Goldsmith EC, Resnick MA. Long inverted repeats
  are an at-risk motif for recombination in mammalian cells.
  Genetics. 1999 Dec;153(4):1873-83. PMID: 10581292; UI: 20050682
- Jacobsen SE. Gene silencing: Maintaining methylation patterns.
  Curr Biol. 1999 Aug 26;9(16):R617-9.
- Lewis S, Akgun E, Jasin M. Palindromic DNA and genome stability.
  Further studies. Ann N Y Acad Sci. 1999 May 18;870:45-57.
  PMID: 10415472; UI: 99343961
- Dai X, Greizerstein MB, Nadas-Chinni K, Rothman-Denes LB.
  Supercoil-induced extrusion of a regulatory DNA hairpin.
  Proc Natl Acad Sci U S A. 1997 Mar 18;94(6):2174-9.
Warnings
None.
Diagnostic Error Messages
None.
Exit status
It always exits with a status of 0.
Known bugs
None.
See also
palindrome also looks for inverted repeats but is much faster
and less sensitive, as it looks for near-perfect repeats.
Author(s)
This program was originally written by
This application was modified for inclusion in EMBOSS by
History
Target users
Comments