checktrans

Function

Description

checktrans reads a protein sequence containing stops and writes a report of every open reading frame (ORF: a continuous stretch of protein sequence with no stops) that is greater than a minimum size. The default minimum ORF size is 100 residues. The sequences of the reported ORFs are also written out.
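The scan described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the checktrans source code. It assumes that stops are represented by '*' (as in transeq output), that an ORF qualifies when its length is at least the minimum, and that a stretch running to the end of the sequence without a stop is also considered. The function name find_orfs and its parameters are hypothetical.

```python
def find_orfs(protein, min_len=100):
    """Yield (start, end, length) for each stop-free stretch of at
    least min_len residues. Positions are 1-based and inclusive.

    Assumption: '*' marks a stop. A sentinel '*' is appended so that
    the final stretch is examined even if the input does not end in
    a stop; the real program may treat that case differently.
    """
    start = 0  # 0-based index of the first residue of the current stretch
    for pos, residue in enumerate(protein + "*"):
        if residue == "*":
            length = pos - start
            if length >= min_len:
                yield (start + 1, pos, length)
            start = pos + 1  # next stretch begins after this stop


if __name__ == "__main__":
    # Toy input with a small minimum so that something is reported.
    for orf in find_orfs("MKT*AAAAA*GG", min_len=3):
        print(orf)
```

With the toy input, the two stretches of 3 and 5 residues are reported and the trailing 2-residue stretch is skipped.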

The input sequence might typically have been produced by transeq.

Note that if you have translated a nucleic acid sequence in only one frame, checktrans will miss possible ORFs in the other frames. To find all possible ORFs, you have to give checktrans translations in all three forward frames, or in all six frames if the reverse strand is to be searched as well.
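To make the point about frames concrete, the following Python sketch translates a DNA sequence in all six frames using the standard genetic code. It is not transeq and does not reproduce transeq's options or its frame numbering conventions; the frame labels ("+1" to "+3", "-1" to "-3") and all function names are assumptions made for this illustration. Codons containing anything other than A, C, G or T are shown as 'X'.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, offset=0):
    """Translate complete codons starting at the given 0-based offset."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(offset, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return a dict mapping a frame label to its translation."""
    rc = reverse_complement(dna)
    frames = {}
    for k in range(3):
        frames[f"+{k + 1}"] = translate(dna, k)
        frames[f"-{k + 1}"] = translate(rc, k)
    return frames


if __name__ == "__main__":
    for label, protein in six_frames("ATGAAATAG").items():
        print(label, protein)
```

Each of the six translations would then be checked separately, since an ORF visible in one frame generally does not appear in the others.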

Usage

Command line arguments


Input file format

This program reads any protein sequence USA (Uniform Sequence Address). The sequence is expected to contain STOP codons.

Output file format

This program writes three files: the ORF report file (e.g. paamir_1.checktrans), the output sequence file (e.g. paamir_1.fasta) and the feature file (e.g. paamir_1.out3), which is in GFF format by default.

The ORF report file gives the numeric count of the ORF, the position of the terminating STOP codon, the length of the ORF, its start and end positions and the name of the sequence it has been written out as.

The name of each output sequence is constructed from the name of the input sequence followed by an underscore and then the numeric count of the ORF (e.g. 'PAAMIR_1_7').
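The naming rule and the report fields can be illustrated with a short Python sketch. The helper names orf_name and report_row are hypothetical, and the sketch assumes that the terminating STOP sits immediately after the last residue of the ORF; it does not reproduce the actual layout of the report file.

```python
def orf_name(sequence_name, orf_count):
    """Input sequence name, an underscore, then the ORF count."""
    return f"{sequence_name}_{orf_count}"


def report_row(sequence_name, orf_count, start, end):
    """Collect the fields described for the ORF report as a tuple:
    (count, stop position, length, start, end, output name).

    Assumption: start and end are 1-based and inclusive, and the
    terminating STOP is at position end + 1.
    """
    length = end - start + 1
    stop_pos = end + 1
    return (orf_count, stop_pos, length, start, end,
            orf_name(sequence_name, orf_count))


if __name__ == "__main__":
    # Hypothetical positions, chosen only to show the shape of a row.
    print(report_row("PAAMIR_1", 7, 201, 350))
```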

Data files

None.

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

This program always exits with a status of 0.

Known bugs

None.

Author(s)

and modified by to output the sequence data to a single file in the conventional EMBOSS style.

History

Target users

Comments