Dotplots are an excellent way to visualize the regions of similarity between a pair of sequences, including features that sequence alignment alone cannot show. In particular, pairwise alignment methods do not handle repeated regions, inversions or translocations.
A dotplot marks matches between two sequences on a two-dimensional grid. Runs of identity produce diagonal lines on this grid. To illustrate this, two sequences have been provided. One is an original Pygmy Chimp sequence and the other is the same sequence edited to produce an interesting dotplot.
To open these, click here. You may have to click the tab that says Dotplot. You can also open the dotplot in a new window by clicking the new window button.
What you should see now is a 2D grid with a number of diagonal lines. One sequence appears across the top on the x-axis and the second along the left hand side on the y-axis. The overall trend is for a diagonal from the top left to the bottom right.
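The grid described above can be sketched in a few lines of Python. This is only a minimal illustration of the idea, not how the viewer itself is implemented; the function names `dotplot` and `render` and the short example sequence are made up for this sketch.

```python
def dotplot(seq_x, seq_y):
    """Build a grid where grid[i][j] is True when seq_y[i] equals seq_x[j].

    seq_x runs across the top (x-axis), seq_y down the left (y-axis).
    """
    return [[base_y == base_x for base_x in seq_x] for base_y in seq_y]


def render(grid):
    """Draw the grid as text: '*' for a match, '.' for a mismatch."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in grid
    )


# Comparing a sequence with itself gives an unbroken main diagonal
# running from the top left to the bottom right.
grid = dotplot("GATTACA", "GATTACA")
print(render(grid))
```

Every single-letter match is marked here, so the plot also contains scattered off-diagonal points. Those are the short, noisy points that the settings discussed next are meant to suppress.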
Note: You can zoom in and out on the dotplot using the zoom in and zoom out buttons. If you zoom in enough you will be able to see the individual letters of the two sequences on each axis. Also, notice the cross-hair that follows your mouse and gives the position of your pointer in the two sequences. You can use this to note the location in the dotplot of features you are interested in, which will help you later when you start producing alignments.
Try changing the Data Source settings and see how this affects the dotplot. Notice how changing the sensitivity setting increases or reduces the number of short, noisy points in the dotplot. Increasing the window size tends to fill the gaps between smaller diagonals, joining them into longer ones. Reducing it shortens the diagonals and leaves fewer of them. A small window may reveal some interesting detail, but generally it simply adds noise to the plot, so increasing the value removes this noise and leaves the strong diagonals visible.
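The effect of the window size can be sketched as follows. This is a rough illustration under stated assumptions: the names `windowed_dotplot`, `window` and `min_matches` are hypothetical, and the way the viewer's sensitivity setting maps onto a match threshold is assumed here, not taken from the program.

```python
def windowed_dotplot(seq_x, seq_y, window=3, min_matches=3):
    """Return the set of (i, j) positions to mark on the dotplot.

    A position is marked when at least `min_matches` of the `window`
    aligned letters starting at seq_y[i] and seq_x[j] are identical.
    Both parameter names are assumptions made for this sketch.
    """
    dots = set()
    for i in range(len(seq_y) - window + 1):
        for j in range(len(seq_x) - window + 1):
            matches = sum(
                seq_y[i + k] == seq_x[j + k] for k in range(window)
            )
            if matches >= min_matches:
                dots.add((i, j))
    return dots


seq = "ACGTTGCAAC"

# A window of 1 marks every single-letter match, including chance ones.
noisy = windowed_dotplot(seq, seq, window=1, min_matches=1)

# A window of 4 with all 4 letters matching keeps only the main diagonal.
clean = windowed_dotplot(seq, seq, window=4, min_matches=4)

print(len(noisy), "points with window=1")
print(len(clean), "points with window=4")
```

Setting `min_matches` below `window` tolerates a few mismatches inside each window, which is one way that nearby short diagonals can merge into a longer one.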
Parallel diagonals in a dotplot indicate a repeated pattern in the sequence. A reversed diagonal indicates a sequence inversion event: a region of one sequence matches the reverse complement of the corresponding region of the other sequence.
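To see why an inversion only shows up against the reverse complement, here is a small sketch. The helper names `reverse_complement` and `shared_kmers`, the word length of 4 and the toy sequences are all assumptions chosen for illustration; they are not the tutorial's data or the viewer's algorithm.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def shared_kmers(seq_x, seq_y, k):
    """Return (i, j) pairs where seq_y[i:i+k] equals seq_x[j:j+k]."""
    index = {}
    for j in range(len(seq_x) - k + 1):
        index.setdefault(seq_x[j:j + k], []).append(j)
    return [
        (i, j)
        for i in range(len(seq_y) - k + 1)
        for j in index.get(seq_y[i:i + k], [])
    ]


original = "TTACGGTCAACC"

# Simulate an inversion: replace the middle region with its reverse complement.
edited = original[:3] + reverse_complement(original[3:9]) + original[9:]

# Matches in the same orientation, and matches against the reverse complement.
forward = shared_kmers(original, edited, 4)
reverse = shared_kmers(original, reverse_complement(edited), 4)

print("forward matches:", forward)
print("reverse-complement matches:", reverse)
```

In this toy case the inverted region produces no forward matches at all. It is only found when the edited sequence is compared as its reverse complement, and those matches are what a dotplot draws as a reversed diagonal.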
Now that you are familiar with how these two sequences are related you can move on and start aligning them.