The home page of REBASE is: http://rebase.neb.com/
This program uses REBASE data to find the recognition sites and/or cut sites of restriction enzymes in a nucleic acid sequence.
This program displays the cut sites on both strands by default. It will optionally also display the translation of the sequence.
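The idea of locating a recognition site and reporting a cut position on each strand can be illustrated with a minimal Python sketch. It is not how remap itself is implemented. The function name 'find_cut_sites' and its offset parameters are hypothetical, and the sketch assumes an unambiguous, palindromic recognition site, so only the forward strand needs to be searched. EcoRI (GAATTC, cutting G^AATTC) is used as the example.

```python
import re


def find_cut_sites(sequence, site, fwd_offset, rev_offset):
    """Return (site_start, forward_cut, reverse_cut) for each match.

    site_start is 1-based. Each cut value is the 1-based number of the base
    (in forward-strand coordinates) immediately before the cut.
    fwd_offset and rev_offset count bases from the start of the site.
    """
    cuts = []
    # A zero-width lookahead lets overlapping matches be found.
    for match in re.finditer(f"(?={re.escape(site)})", sequence.upper()):
        start = match.start()
        cuts.append((start + 1, start + fwd_offset, start + rev_offset))
    return cuts


# EcoRI: forward strand cut after the G (offset 1),
# reverse strand cut after the fifth base of the site (offset 5).
print(find_cut_sites("ttgaattcaa", "GAATTC", 1, 5))
```

A real implementation would also have to handle ambiguity codes, non-palindromic sites and cuts that fall outside the recognition site.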
There are many options to change the style of display to aid in making clear presentations.
One potentially very useful option is '-flatreformat', which displays not only the cut sites that many other restriction cut-site programs show, but also the recognition site.
By default, only one of any group of isoschizomers (enzymes that have the same recognition site and cut positions) is reported; this behaviour can be turned off by setting the qualifier '-limit' to false. The reported enzyme from any one group of isoschizomers (the prototype) is specified in the REBASE database, and the information is held in the data file 'embossre.equ'. You may edit this file to set your own preferred prototype if you wish.
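The effect of reporting only the prototype can be sketched in Python as follows. This is an illustration only: the enzyme names, the 'prototype_of' mapping and the function 'report_hits' are hypothetical, and the sketch does not read or assume the layout of 'embossre.equ'.

```python
def report_hits(hits, prototype_of, limit=True):
    """Collapse isoschizomer hits onto their prototype when limit is true.

    hits is a list of (enzyme, position) pairs; prototype_of maps an
    isoschizomer name to its prototype (hypothetical data, not REBASE).
    """
    if not limit:
        # Report every enzyme, ordered by position and then by name.
        return sorted(hits, key=lambda hit: (hit[1], hit[0]))
    # Replace each enzyme by its prototype; the set removes duplicates.
    collapsed = {(prototype_of.get(enzyme, enzyme), position)
                 for enzyme, position in hits}
    return sorted(collapsed, key=lambda hit: (hit[1], hit[0]))


prototype_of = {"EnzB": "EnzA", "EnzC": "EnzA"}  # hypothetical group
hits = [("EnzA", 10), ("EnzB", 10), ("EnzC", 10), ("EnzD", 42)]

print(report_hits(hits, prototype_of))               # prototypes only
print(report_hits(hits, prototype_of, limit=False))  # all isoschizomers
```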
As well as the display of where enzymes cut in the sequence, remap displays:
You can specify a file of ranges to display in uppercase by giving the '-uppercase' qualifier the value '@' followed by the name of the file containing the ranges (e.g. '-upper @myfile').
The format of the range file is:
An example range file is:
# this is my set of ranges
12   23
 4   5       this is like 12-23, but smaller
67   10348   interesting region
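To make the range file easier to reason about, here is a minimal Python sketch that reads such a file and uppercases the listed positions of a sequence. The parsing rules are assumptions inferred from the example above rather than a statement of remap's own parser: lines beginning with '#' and blank lines are skipped, the first two fields are 1-based inclusive start and end positions, and any remaining text is an annotation. The function names are hypothetical.

```python
def parse_ranges(text):
    """Parse range-file text into (start, end, note) tuples."""
    ranges = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # assumed: comments and blank lines are ignored
        fields = stripped.split(None, 2)
        start, end = int(fields[0]), int(fields[1])
        note = fields[2] if len(fields) > 2 else ""
        ranges.append((start, end, note))
    return ranges


def uppercase_ranges(sequence, ranges):
    """Uppercase each 1-based inclusive range, clipped to the sequence."""
    chars = list(sequence)
    for start, end, _note in ranges:
        for i in range(max(start, 1) - 1, min(end, len(chars))):
            chars[i] = chars[i].upper()
    return "".join(chars)


example = """\
# this is my set of ranges
12   23
 4   5       this is like 12-23, but smaller
67   10348   interesting region
"""

ranges = parse_ranges(example)
print(ranges)
print(uppercase_ranges("acgtacgtacgtacgtacgtacgtacgt", ranges))
```

Ranges that extend past the end of the sequence are simply clipped in this sketch; how remap treats them is not covered here.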
You can specify a file of ranges to highlight in a different colour when outputting in HTML format (using the '-html' qualifier) by giving the '-highlight' qualifier the value '@' followed by the name of the file containing the ranges (e.g. '-highlight @myfile').
The format of this file is very similar to the format of the above uppercase range file, except that the text after the start and end positions is used as the HTML colour name. This colour name is used 'as is' when specifying the colour in HTML in a '<FONT COLOR=xxx>' construct (where 'xxx' is the name of the colour).
The standard names of HTML font colours are given in:
http://www.w3.org/TR/REC-html40/types.html
and
http://www.ausmall.com.au/freegraf/ncolour2.htm
and
http://mindprod.com/htmlcolours.html
(amongst other places).
An example highlight range file is:
# this is my set of ranges
12   23      red
 4   5       darkturquoise
67   10348   #FFE4E1
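As an illustration of how such a file could drive HTML output, the following Python sketch wraps each listed range in a '<FONT COLOR=xxx>' element, using the colour text 'as is'. It is a sketch under assumptions, not remap's own code: the function names are hypothetical, comment and blank lines are assumed to be skipped, positions are taken as 1-based and inclusive, and where ranges overlap the later line wins.

```python
def parse_highlights(text):
    """Parse highlight-file text into (start, end, colour) tuples."""
    highlights = []
    for line in text.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or fields[0].startswith("#"):
            continue  # skip comments, blank lines and lines with no colour
        highlights.append((int(fields[0]), int(fields[1]), fields[2].strip()))
    return highlights


def highlight_html(sequence, highlights):
    """Return the sequence with highlighted runs wrapped in FONT elements."""
    colours = [None] * len(sequence)
    for start, end, colour in highlights:
        for i in range(max(start, 1) - 1, min(end, len(sequence))):
            colours[i] = colour
    out, current = [], None
    for base, colour in zip(sequence, colours):
        if colour != current:
            if current is not None:
                out.append("</FONT>")
            if colour is not None:
                out.append(f"<FONT COLOR={colour}>")
            current = colour
        out.append(base)
    if current is not None:
        out.append("</FONT>")
    return "".join(out)


example = """\
# this is my set of ranges
12   23      red
 4   5       darkturquoise
67   10348   #FFE4E1
"""

highlights = parse_highlights(example)
print(highlight_html("acgtacgtacgtacgtacgtacgt", highlights))
```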
The name of the sequence is displayed, followed by the description of the sequence.
The formatted display of cut sites on the sequence follows, with the six-frame translation below it. The cut sites are indicated by a backslash character '\' that points to the position between the nucleotides where the cuts occur. Cuts by many enzymes at the same position are indicated by stacking the enzyme names on top of each other.
At the end the section header 'Enzymes that cut' is displayed followed by a list of the enzymes that cut the specified sequence and the number of times that they cut. For each enzyme that cuts, a list of isoschizomers of that enzyme (sharing the same recognition site pattern and cut sites) is given.
This is followed by lists of the enzymes that do cut, but which cut less often than the '-mincut' qualifier or more often than the '-maxcut' qualifier.
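The way the '-mincut' and '-maxcut' limits split the cutting enzymes into lists can be sketched in Python as below. The function name, the enzyme names and the cut counts are hypothetical and serve only to show the partitioning logic described above.

```python
def classify_by_cut_count(cut_counts, mincut, maxcut):
    """Split cutting enzymes into reported, too-few and too-many groups.

    cut_counts maps enzyme name to number of cuts (hypothetical data).
    Enzymes with zero cuts are left out here, since they do not cut at all.
    """
    reported, too_few, too_many = {}, {}, {}
    for enzyme, count in cut_counts.items():
        if count == 0:
            continue
        if count < mincut:
            too_few[enzyme] = count
        elif count > maxcut:
            too_many[enzyme] = count
        else:
            reported[enzyme] = count
    return reported, too_few, too_many


cut_counts = {"EnzA": 1, "EnzB": 3, "EnzC": 12, "EnzD": 0}
reported, too_few, too_many = classify_by_cut_count(cut_counts, mincut=2, maxcut=5)
print("Enzymes that cut:", reported)
print("Cut less often than mincut:", too_few)
print("Cut more often than maxcut:", too_many)
```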
Any of the isoschizomers that are excluded from cutting (either through restrictions such as the permitted number of cuts, blunt cutters only, single cutters only, etc., or because their name has not been given in the input list of enzymes) will not be listed.
Then a list is displayed of the enzymes whose names were input and which match the other criteria ('-sitelen', '-blunt', '-sticky', '-ambiguity' or '-commercial') but which do not cut.
Finally, the number of enzymes that were rejected from consideration because they do not match the '-sitelen', '-blunt', '-sticky', '-ambiguity' or '-commercial' criteria is displayed.
The '-flatreformat' qualifier changes the display to emphasise the recognition site of the restriction enzyme, which is indicated by a row of '=' characters. The cut site is pointed to by a '>' or '<' character, and if the cut site is not within or immediately adjacent to the recognition site, they are linked by a row of '.' characters.
The name of the enzyme is displayed above (or below when the reverse sense site is displayed) the recognition site. The name of the enzyme is also displayed above the cut site if this occurs on a different display line to the recognition site (i.e. if it wraps onto the next line of sequence).