einverted

Function

Description

einverted looks for inverted repeats (stem loops) in a nucleotide sequence.

It also finds inverted repeats that include a proportion of mismatches and gaps (bulges in the stem loop).

Algorithm

einverted works by finding alignments between the sequence and its reverse complement whose score reaches a threshold. Matches are assigned a positive score. Mismatches and gaps are assigned penalty (negative) scores, with gaps penalised more heavily than mismatches. The score of an alignment is the sum of its match scores, mismatch penalties and gap penalties. Any region whose score reaches the threshold is reported.
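The scoring rule can be illustrated with a minimal Python sketch. This is not einverted's source code. The match score of 3 is taken from the worked example in the Notes section; the mismatch and gap values, and the names score_stem, MATCH, MISMATCH and GAP, are assumptions chosen only for illustration.

```python
# Illustrative scores. MATCH = 3 agrees with the worked example in the Notes.
# MISMATCH and GAP are assumed values picked for this sketch only.
MATCH = 3
MISMATCH = -4
GAP = -12

COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def score_stem(left, right):
    """Score two aligned stem arms of equal length.

    left is written 5'->3' and right is written 3'->5', so paired bases
    line up column by column. '-' marks a gap (a bulge) in either arm.
    """
    if len(left) != len(right):
        raise ValueError("aligned arms must have the same length")
    total = 0
    for a, b in zip(left, right):
        if a == "-" or b == "-":
            total += GAP          # gap column: large penalty
        elif COMPLEMENT.get(a) == b:
            total += MATCH        # complementary pair: positive score
        else:
            total += MISMATCH     # non-complementary pair: penalty
    return total


print(score_stem("aaaactaaggc", "ttttgattccg"))
print(score_stem("aa-a", "ttct"))
```

With these assumed values a single gap column cancels four matches, which is why short stems with bulges rarely reach the threshold.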

einverted uses dynamic programming and is therefore guaranteed to find the optimal alignment, but it is slower than, for example, a self-by-self BLAST search. It can find multiple inverted repeats in a sequence.
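As a rough illustration of the dynamic programming idea, the Python sketch below computes the best local alignment score between a sequence and its reverse complement. It is a simplified stand-in, not einverted's actual implementation: the linear gap penalty, the default score values, the rule that stops the two arms from crossing, and the function names are all assumptions made for this example. It returns only the single best score and does not report multiple repeats.

```python
def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    comp = {"a": "t", "t": "a", "c": "g", "g": "c"}
    return "".join(comp[base] for base in reversed(seq))


def best_inverted_repeat_score(seq, match=3, mismatch=-4, gap=-12):
    """Best local alignment score of seq against its reverse complement.

    Cell (i, j) aligns seq[i-1] with rc[j-1], i.e. it pairs sequence
    position i-1 with sequence position n-j. Only cells with i + j <= n
    are filled, so the left arm always lies before the right arm.
    """
    rc = reverse_complement(seq)
    n = len(seq)
    prev = [0] * (n + 1)
    best = 0
    for i in range(1, n + 1):
        curr = [0] * (n + 1)
        for j in range(1, n - i + 1):
            pair = match if seq[i - 1] == rc[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment
                prev[j - 1] + pair,  # extend the stem by one base pair
                prev[j] + gap,       # unpaired base in the left arm
                curr[j - 1] + gap,   # unpaired base in the right arm
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Toy sequence: an 11-base arm, a 4-base loop, and the matching arm.
toy = "aaaactaaggc" + "cccc" + "gccttagtttt"
print(best_inverted_repeat_score(toy))
```

The nested loops visit on the order of n squared cells for a sequence of length n, which gives a sense of why an exhaustive search of this kind is slower than a heuristic one.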

einverted does not report overlapping matches.

The original "inverted" program was written to annotate the nematode genome. Excluding overlapping repeats saved problems with simple repeat sequences in this genome.

Usage

Command line arguments


Input file format

The input for einverted is a nucleotide sequence.

Output file format

Data files

None.

Notes

Sometimes the program palindrome finds repeats that einverted does not find with its default parameters.

Neither program is at fault. Some of the shortest repeats that palindrome finds with its default parameter values score below einverted's default cutoff. Decrease the 'Minimum score threshold' (the -threshold option) to see them.

For example, when palindrome is run with 'em:hsfau1', it finds the repeat:

64    aaaactaaggc    74
      |||||||||||
98    ttttgattccg    88

einverted will not report this because its score is 33 (11 bases scoring 3 each, no mismatches or gaps), which is below the default score cutoff of 50.

If einverted is run as:

% einverted em:hsfau1 -threshold 33

then it will find it:

Score 33: 11/11 (100%) matches, 0 gaps
      64 aaaactaaggc 74      
         |||||||||||
      98 ttttgattccg 88      
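The arithmetic behind this example can be checked with a short Python sketch. The match score of 3 and the default cutoff of 50 are the values quoted above. The inclusive comparison in is_reported is an assumption inferred from the fact that a repeat scoring 33 is reported when the threshold is set to 33; the function and variable names are made up for this example.

```python
MATCH_SCORE = 3          # per-base match score quoted in the text
DEFAULT_THRESHOLD = 50   # default score cutoff quoted in the text

# The two arms exactly as printed: the top arm runs 64 -> 74 and the
# bottom arm runs 98 -> 88, so each column is one base pair.
top, top_start, top_end = "aaaactaaggc", 64, 74
bottom, bottom_start, bottom_end = "ttttgattccg", 98, 88

WATSON_CRICK = {("a", "t"), ("t", "a"), ("c", "g"), ("g", "c")}


def is_reported(score, threshold):
    # Assumption: a repeat is reported when its score is at least the
    # threshold, as the run with -threshold 33 suggests.
    return score >= threshold


# Count the columns that form a complementary base pair.
matches = sum((a, b) in WATSON_CRICK for a, b in zip(top, bottom))
score = matches * MATCH_SCORE

print(matches, score)
print(is_reported(score, DEFAULT_THRESHOLD))
print(is_reported(score, 33))
```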

Anything can be considered to be a repeat if you set the score threshold low enough!

References

Some useful references on inverted repeats:

  1. Pearson CE, Zorbas H, Price GB, Zannis-Hadjopoulos M. Inverted repeats, stem-loops, and cruciforms: significance for initiation of DNA replication. J Cell Biochem. 1996 Oct;63(1):1-22
  2. Waldman AS, Tran H, Goldsmith EC, Resnick MA. Long inverted repeats are an at-risk motif for recombination in mammalian cells. Genetics. 1999 Dec;153(4):1873-83. PMID: 10581292; UI: 20050682
  3. Jacobsen SE. Gene silencing: Maintaining methylation patterns. Curr Biol. 1999 Aug 26;9(16):R617-9
  4. Lewis S, Akgun E, Jasin M. Palindromic DNA and genome stability. Further studies. Ann N Y Acad Sci. 1999 May 18;870:45-57. PMID: 10415472; UI: 99343961
  5. Dai X, Greizerstein MB, Nadas-Chinni K, Rothman-Denes LB. Supercoil-induced extrusion of a regulatory DNA hairpin. Proc Natl Acad Sci U S A. 1997 Mar 18;94(6):2174-9

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

The related program palindrome also looks for inverted repeats. It is much faster than einverted but less sensitive, because it looks only for near-perfect repeats.

Author(s)

This program was originally written by

This application was modified for inclusion in EMBOSS by

History

Target users

Comments