The find matching words routine finds runs of identical characters shared by the two sequences being compared. Its main value is speed: it is hundreds of times faster than the find similar spans function. Because it reports only exact matches it is not very sensitive, but it is useful for long DNA sequences.
The word length is the minimum number of consecutive matching characters. Every run of identical characters that is at least as long as the word length produces a line on the sip plot whose length is proportional to the actual length of the run (see section Sip plot).
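The word matching described above can be sketched in Python as follows. This is only an illustration of the idea, not the code used by the program: the function name find_matching_words, the dictionary-based word index and the 1-based (horizontal, vertical, length) tuples are all assumptions made for this example.

```python
from collections import defaultdict


def find_matching_words(horizontal, vertical, word_length):
    """Return (h_pos, v_pos, length) for each maximal run of identical
    characters that is at least word_length long. Positions are 1-based.
    Hypothetical helper written for illustration only."""
    # Index every word of the vertical sequence by its start position.
    index = defaultdict(list)
    for j in range(len(vertical) - word_length + 1):
        index[vertical[j:j + word_length]].append(j)

    matches = []
    for i in range(len(horizontal) - word_length + 1):
        for j in index.get(horizontal[i:i + word_length], ()):
            # If the preceding characters also match, this hit is part of a
            # run that started earlier on the same diagonal, so skip it.
            if i > 0 and j > 0 and horizontal[i - 1] == vertical[j - 1]:
                continue
            # Extend the run along the diagonal for as long as it matches.
            length = word_length
            while (i + length < len(horizontal)
                   and j + length < len(vertical)
                   and horizontal[i + length] == vertical[j + length]):
                length += 1
            matches.append((i + 1, j + 1, length))
    return matches


if __name__ == "__main__":
    for h, v, n in find_matching_words("aaacgtacgtttt", "ggacgtacgcc", 4):
        print(f"Positions {h} h {v} v and length {n}")
```

Looking up whole words in an index, rather than comparing every pair of positions, is what makes this kind of search fast; the cost is that a single mismatch ends a run, which is why the method is insensitive.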
If the expected number of matches exceeds a maximum value (5000 by default; see section Changing the maximum number of matches), a dialogue box is invoked (see below). This dialogue box displays the expected number of matches and an approximation of the memory required to save them. The user is then given the option of only plotting the results (a temporary result; see section Permanent and temporary results) or of saving the results in memory. For large numbers of matches it is strongly advised that the results are only plotted.
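The text does not say how the expected number of matches is calculated. One simple way such an estimate could be made is sketched below, under the assumption that the two sequences are independent random strings with their observed base compositions. The function names, the formula and the use of the default limit as a constant are assumptions for illustration, not the program's own method.

```python
from collections import Counter

DEFAULT_MAXIMUM_MATCHES = 5000  # default limit quoted in the text above


def expected_word_matches(horizontal, vertical, word_length):
    """Estimate the number of word_length matches expected by chance.
    Hypothetical estimate: assumes independent random sequences."""
    h_counts = Counter(horizontal)
    v_counts = Counter(vertical)
    # Probability that one horizontal character equals one vertical character.
    p_match = sum(
        (h_counts[c] / len(horizontal)) * (v_counts[c] / len(vertical))
        for c in h_counts
    )
    # Number of possible word start positions in each sequence.
    h_starts = max(len(horizontal) - word_length + 1, 0)
    v_starts = max(len(vertical) - word_length + 1, 0)
    return h_starts * v_starts * p_match ** word_length


def exceeds_maximum(expected, maximum=DEFAULT_MAXIMUM_MATCHES):
    """True when the user would be asked whether to plot only or to save."""
    return expected > maximum


if __name__ == "__main__":
    seq = "acgt" * 250
    estimate = expected_word_matches(seq, seq, 4)
    print(estimate, exceeds_maximum(estimate))
```

Note that this counts matching word start positions rather than maximal runs, so it overestimates the number of lines plotted; it is meant only to show why short word lengths on long sequences can produce very large result sets.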
Further operations
horizontal EMBL: hsproperd vertical EMBL: mmproper
word length 8
Number of matches 140
Positions 162 h 4 v and length 14
ttcacccagtatga
Positions 225 h 67 v and length 18
gaagactgctgtctcaac
Positions 509 h 118 v and length 8
ctctgtca
Positions 276 h 118 v and length 9
ctctgtcag
Positions 288 h 130 v and length 8
tgcaggtc
Positions 626 h 131 v and length 8
gcaggtct
Positions 1208 h 144 v and length 8
atggtcag