Vector sequences should be stored in simple text files with up to 80 characters of data per line. Sequencing vectors are those vectors, such as M13, used to produce templates for sequencing. All other vectors, such as cosmid vectors, that are used to purify and grow the DNA before it is subcloned into sequencing vectors are termed "cloning vectors". The files containing cloning vector sequences that are used by vepe must be arranged so that the cloning site follows the last base in the file. For example (where X is the cloning site):
start of file
acatacatacatatata
acatagatagatacaga
.
.
.
cagatataX
end of file
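If a cloning vector file does not already end at the cloning site, the circular sequence can be rotated before it is given to vepe. The following Python sketch shows one way to do this. It is not part of vepe or nip, the function names are hypothetical, and it assumes the cloning site is supplied as the 1-based position of the first base after the cut.

```python
def rotate_to_cloning_site(sequence, first_base_after_site):
    """Rotate a circular sequence so the cloning site follows the last base.

    first_base_after_site is the 1-based position (in the current
    arrangement) of the first base after the cut. That base becomes the
    first base of the new arrangement, so the cut falls after the last one.
    """
    n = len(sequence)
    if not 1 <= first_base_after_site <= n:
        raise ValueError("position lies outside the sequence")
    k = first_base_after_site - 1
    return sequence[k:] + sequence[:k]


def wrap(sequence, width=80):
    """Split the sequence into lines of at most `width` characters."""
    return "\n".join(sequence[i:i + width] for i in range(0, len(sequence), width))


if __name__ == "__main__":
    toy_vector = "aaccggtt"  # toy circular sequence, cut between bases 4 and 5
    print(wrap(rotate_to_cloning_site(toy_vector, 5)))
```

The rotated text can then be written to a plain file; the 80-character width matches the limit given above.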
For sequencing vectors it is somewhat tedious to calculate the correct cloning site and primer site values for vepe. The numbers can be worked out from listings of the vector sequences, but it is far easier to use the restriction enzyme search in nip. The last section of this note explains how to define the positions of the cloning site and primer in a single search using nip. First we do it in two steps to explain the operations.
The position of the cloning site depends on the ordering of the bases in the particular vector sequence file being used. That is, as the sequences are circular, the file may be arranged to start at any base and still give the same circular sequence. Vepe must be told the correct position of the cloning site and then, relative to that, the position of the first base that will be included in the reading, i.e. the relative position of the first base 3' of the primer.
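To illustrate how the numbering depends on where the file starts, the short Python sketch below converts a position in one arrangement of a circular sequence into the position of the same base in a rearranged file. The function name is hypothetical and all positions are 1-based.

```python
def position_after_rotation(position, new_start, length):
    """Return the new 1-based position of a base after rearranging the file.

    position  : 1-based position of the base in the original file
    new_start : 1-based original position of the base that now starts the file
    length    : total number of bases in the circular sequence
    """
    return (position - new_start) % length + 1


if __name__ == "__main__":
    # Toy circle of 10 bases rearranged to start at original base 4.
    print(position_after_rotation(7, 4, 10))  # base 7 becomes base 4
    print(position_after_rotation(2, 4, 10))  # base 2 wraps round to base 9
```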
For "forward" primers we search for the complement of the primer sequence in the vector. For "reverse" primers we search for the primer sequence in the vector. The relative positions of reverse primers, being to the "left" of the cloning site, have negative values. Below we use EMBL entry M13MP18 as an example. Here the SmaI site is at 6249, the forward primer (ForwardP) is at relative position 41, and the reverse primer (ReverseP) is at relative position -24. The figure was produced by nip.
                              ECORI     BANII
                              .         BSP1286
                              .         HGIAI
                              .         SACI
                              .         . BANI
                              .         . .   AVAI
                              .         . .   BINI
                              .         . .   KPNI
                              .         . .   .NCII
                              .         . .   ..NCII
                              .         . .   ..SMAI
                              .         . .   ...  BAMHI
                              .         . .   ...  XHOII XBAI
                              .         . .   ...  .     . BINI
                                                123456789012
                        123456789012345678901234
ReversePaacagctatgaccatg
acacaggaaacagctatgaccatgattacgaattcgagctcggtacccggggatcctcta
      6210      6220      6230      6240      6250      6260

   SALI
   .ACCI
   ..HINCII  PSTI
   ...       .  BSPMI
   ...       .  .  SPHI
   ...       .  .  . HINDIII    EAEI
34567890123456789012345678901
                             tgaccggcagcaaaatg ForwardP
gagtcgacctgcaggcatgcaagcttggcactggccgtcgttttacaacgtcgtgactgg
      6270      6280      6290      6300      6310      6320
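The relative positions quoted above can be checked with a few lines of Python. This is only a sketch and is not how nip or vepe work internally. It makes two assumptions: that the two sequence lines in the figure cover bases 6201 to 6320, and that the counting convention is the one suggested by the count lines in the figure (the base at the cloning site position counts as 1 to the right, and the base before it counts as -1 to the left). All function and variable names are hypothetical.

```python
# The two sequence lines from the figure, assumed to span bases 6201-6320.
FRAGMENT = (
    "acacaggaaacagctatgaccatgattacgaattcgagctcggtacccggggatcctcta"
    "gagtcgacctgcaggcatgcaagcttggcactggccgtcgttttacaacgtcgtgactgg"
)
FRAGMENT_START = 6201
CLONING_SITE = 6249  # SmaI position quoted in the text

REVERSE_PRIMER = "aacagctatgaccatg"
# The forward primer is drawn beneath the vector strand, base-paired with it,
# so complementing it base by base gives the string to search for.
FORWARD_PRIMER_AS_DRAWN = "tgaccggcagcaaaatg"

COMPLEMENT = str.maketrans("acgt", "tgca")


def find_site(pattern):
    """Return the 1-based (first, last) vector positions of pattern."""
    index = FRAGMENT.find(pattern)
    if index < 0:
        raise ValueError("pattern not found: " + pattern)
    first = FRAGMENT_START + index
    return first, first + len(pattern) - 1


def reverse_primer_offset():
    """Offset of the first base 3' of the reverse primer (a negative value)."""
    _, last = find_site(REVERSE_PRIMER)
    return (last + 1) - CLONING_SITE


def forward_primer_offset():
    """Offset of the first base 3' of the forward primer (a positive value)."""
    first, _ = find_site(FORWARD_PRIMER_AS_DRAWN.translate(COMPLEMENT))
    # Assumed convention: the cloning-site base itself counts as 1, hence + 1.
    return (first - 1) - CLONING_SITE + 1


if __name__ == "__main__":
    print(reverse_primer_offset())  # -24
    print(forward_primer_offset())  # 41
```

Under these assumptions the sketch reproduces the values -24 and 41 given in the text.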