etandem

Function

Description

etandem looks for tandem repeats in a nucleotide sequence. It is normally run after equicktandem has been used to identify potential repeat sizes. It calculates a consensus for the repeat region and scores the region as the number of matches to the consensus minus the number of mismatches.

Input sequences are converted into ACGT or N (so ambiguity codes are ignored).
The score is +1 for a match, -1 for a mismatch.
The first copy of a repeat is ignored.
The highest score is kept for each start position and repeat size.
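
The scoring rules above can be illustrated with a minimal Python sketch. It is not etandem's actual implementation: the majority-vote consensus, the treatment of N as an ordinary symbol, and the function names normalise and score_repeat are all assumptions made for illustration.

```python
from collections import Counter


def normalise(seq):
    """Map every character to A, C, G or T; anything else becomes N."""
    return "".join(c if c in "ACGT" else "N" for c in seq.upper())


def score_repeat(seq, start, size, copies):
    """Score `copies` consecutive units of length `size` starting at `start`.

    Returns (consensus, score). Hypothetical helper, not part of EMBOSS.
    """
    seq = normalise(seq)
    units = [seq[start + i * size:start + (i + 1) * size] for i in range(copies)]
    if any(len(unit) < size for unit in units):
        raise ValueError("repeat region runs past the end of the sequence")

    # Assumption: the consensus is the most common base in each column.
    consensus = "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*units)
    )

    score = 0
    for unit in units[1:]:  # the first copy of the repeat is ignored
        for base, cons in zip(unit, consensus):
            score += 1 if base == cons else -1  # +1 match, -1 mismatch
    return consensus, score


if __name__ == "__main__":
    print(score_repeat("ACGTACGTACGT", 0, 4, 3))  # perfect repeat
    print(score_repeat("ACGTACGTACCT", 0, 4, 3))  # one mismatch in copy 3
```

Under these assumptions, three perfect copies of a 4-base unit score 8 (two scored copies of 4 bases), and a single mismatch lowers that to 6.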

The lowest score to be reported is set by the threshold score. The threshold score can be set on the command line using the -threshold qualifier; the default is 20. For perfect repeats, the score is the length of the repeat region, not counting the first copy. Reduce the threshold score a little if you wish to allow mismatches. Each mismatch scores -1 instead of +1, so a repeat with one mismatch scores 2 less than a perfect repeat of the same number of bases.
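
As a worked example of this arithmetic, the following Python sketch computes the score of a repeat from its scored length and mismatch count, and compares it with the threshold. The function names expected_score and is_reported are hypothetical, and the comparison assumes that a score equal to the threshold is still reported.

```python
def expected_score(unit_size, copies, mismatches=0):
    """Score of a repeat: +1 per match, -1 per mismatch, first copy ignored."""
    scored_bases = unit_size * (copies - 1)  # the first copy is not scored
    matches = scored_bases - mismatches
    return matches - mismatches


def is_reported(score, threshold=20):
    """Assumes scores at or above the threshold are reported (default 20)."""
    return score >= threshold


if __name__ == "__main__":
    # Five copies of a 6-base unit: 24 scored bases.
    for mismatches in (0, 1, 3):
        score = expected_score(6, 5, mismatches)
        print(mismatches, score, is_reported(score))
```

With five copies of a 6-base unit, a perfect repeat scores 24, one mismatch gives 22, and three mismatches give 18, which falls below the default threshold of 20. Lowering the threshold to 18 would allow that repeat through.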

Running with a wide range of repeat sizes is inefficient. That is why equicktandem was written - to give a rapid estimate of the major repeat sizes.

Usage

Command line arguments


Input file format

The input for etandem is a nucleotide sequence USA (Uniform Sequence Address).

Output file format

By default etandem writes a 'table' report file.

Data files

None

Notes

Running with a wide range of repeat sizes is inefficient. That is why equicktandem was written - to give a rapid estimate of the major repeat sizes.

References

None.

Warnings

None.

Diagnostics

None.

Exit status

It always exits with status 0.

Known bugs

None.


Authors

This program was originally written by

This application was modified for inclusion in EMBOSS by

History

Target users

Comments