Exercise 2: Aligning draft genomes with the Mauve Contig Mover

The Mauve Contig Mover (MCM) algorithm aligns a draft genome to a reference sequence and orders the contigs in the draft genome according to their position along the reference genome. MCM is chosen as the default algorithm when two genomes are selected for alignment and one or both of them is a list of sequences. If one of the two selected files is a single sequence, as in this example, it is automatically used as the reference genome. If both files are sequence lists, you will need to choose the reference genome from the drop-down list.
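
The idea of ordering contigs along a reference can be illustrated with a minimal Python sketch. This is not the MCM implementation: it assumes each contig already has a single best match on the reference, and the names ContigHit, ref_start, strand and order_contigs are hypothetical ones introduced only for this illustration.

```python
from typing import List, NamedTuple, Tuple


class ContigHit(NamedTuple):
    """Hypothetical record: one contig and where it best matches the reference."""
    name: str
    sequence: str
    ref_start: int  # assumed start of the contig's match on the reference
    strand: str     # "+" if the match is forward, "-" if reverse-complemented


_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def order_contigs(hits: List[ContigHit]) -> List[Tuple[str, str]]:
    """Sort contigs by reference position and flip those on the minus strand."""
    ordered = []
    for hit in sorted(hits, key=lambda h: h.ref_start):
        seq = hit.sequence if hit.strand == "+" else reverse_complement(hit.sequence)
        ordered.append((hit.name, seq))
    return ordered


# Toy input: contigs listed in an arbitrary order, as in a draft assembly.
hits = [
    ContigHit("contig_3", "GGGTTT", 9000, "+"),
    ContigHit("contig_1", "AAACCC", 100, "-"),
    ContigHit("contig_2", "ACGTAC", 4500, "+"),
]
ordered = order_contigs(hits)
for name, seq in ordered:
    print(name, seq)
```

In this toy input, contig_1 matches the reference on the minus strand, so it is reverse-complemented before being placed first in the ordered list.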

Select the draft genome NZ_MRBH00000000 and the reference genome NC_009565, then go to Align/Assemble → Align Whole Genomes. In the Mauve options, change the alignment algorithm to MCM algorithm if it is not already selected. Make sure the option to Save ordered contigs is checked. Leave the other settings at their defaults and click OK to start the analysis.

This outputs two documents: a Mauve genome alignment, and a sequence list containing the draft genome contigs sorted according to the order and orientation in which they appear in the alignment.

Open the Mauve alignment document. You will notice that this alignment has more LCBs than the previous whole genome alignment from Exercise 1. The red vertical bars on the NZ_MRBH00000000 sequence denote the boundaries of the individual contigs in this genome.
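
To make the contig boundaries concrete, here is a small Python sketch of how boundary coordinates follow from contig lengths when contigs are joined end to end. It assumes plain concatenation with no spacer between contigs, which may not match what Geneious does internally. The function name contig_boundaries and the example lengths are hypothetical.

```python
from itertools import accumulate
from typing import List, Tuple


def contig_boundaries(lengths: List[int]) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) coordinates of each contig in the concatenation."""
    ends = list(accumulate(lengths))                # running total gives each contig's end
    starts = [1] + [end + 1 for end in ends[:-1]]   # each contig starts right after the previous one
    return list(zip(starts, ends))


# Toy example: three contigs of made-up lengths.
boundaries = contig_boundaries([1200, 800, 2500])
print(boundaries)  # [(1, 1200), (1201, 2000), (2001, 4500)]
```

Under this assumption, each boundary marker in the viewer would sit at the end coordinate of one contig and the start of the next.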

Use the zoom controls above the viewer to inspect some of the LCBs more closely. Click the Zoom In button a few times, and then use the Shift Left button to move to approximately position 300,000 in NC_009565. You should see a large light blue block that has several red vertical lines on the lower block denoting contig boundaries. Right-click on the block and choose View LCB alignment. In the Alignment View you can see that the NZ_MRBH00000000 sequence is composed of 35 concatenated contigs from the draft genome. Individual contigs are denoted by the "Accession" annotation in green.




Now try repeating the same alignment with the progressiveMauve algorithm. You will see that this alignment has many more LCBs and rearrangements than the MCM alignment. This is because when draft genomes are aligned with progressiveMauve, Geneious concatenates the contigs into a single sequence, in the order they appear in the sequence list, before performing the alignment. Because this concatenation step is unlikely to order the contigs correctly, the alignment will probably contain at least as many LCBs as there are contigs in your list (fewer only if adjacent contigs in the list happen to be in the correct order). With the MCM algorithm, however, the contigs are ordered first and a single LCB can often span multiple contigs, so the alignment is much cleaner and has fewer LCBs.

For this reason you should always use the MCM algorithm for pairwise comparisons of draft genomes.
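
The effect of contig order on the number of LCBs can be illustrated with a toy Python model. This is a simplified sketch, not how Mauve computes LCBs: it assumes each contig has a known rank along the reference, ignores orientation and true genomic rearrangements, and only counts runs of contigs that are already adjacent and in reference order. The function name count_collinear_blocks is hypothetical.

```python
from typing import List


def count_collinear_blocks(ranks: List[int]) -> int:
    """Count runs of contigs that sit next to each other in reference order.

    `ranks` lists, in concatenation order, each contig's assumed rank along
    the reference (0 = leftmost contig on the reference).
    """
    if not ranks:
        return 0
    blocks = 1
    for prev, curr in zip(ranks, ranks[1:]):
        if curr != prev + 1:  # a break in reference order starts a new block
            blocks += 1
    return blocks


# Toy example: five contigs in the arbitrary order of a sequence list.
list_order = [3, 0, 4, 1, 2]
print(count_collinear_blocks(list_order))          # 4 blocks when concatenated as listed
print(count_collinear_blocks(sorted(list_order)))  # 1 block once the contigs are reordered
```

In this model, the unordered list yields four blocks because only contigs 1 and 2 happen to be adjacent and in order, while the reordered list collapses into a single block.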

Exercise 3: Converting a Mauve alignment into a standard alignment