Information Panel

Comparing Programs and Methods to Use for Global Multiple Sequence Alignment

Adapted from Bioinformatics: Sequence and Genome Analysis, 2nd edition, by David W. Mount. CSHL Press, Cold Spring Harbor, NY, USA, 2004.

INTRODUCTION

It is difficult to find a globally optimal alignment of more than two sequences (and especially of more than three) that includes matches, mismatches, and gaps and that takes into account the degree of variation in all of the sequences at the same time. Approximate methods are therefore used: progressive global alignment, iterative global alignment, alignments based on locally conserved patterns found in the same order in the sequences, statistical methods that generate probabilistic models of the sequences, and graph-based methods. When 10 or more sequences are being compared, it is common to begin by determining the sequence similarity between every pair of sequences in the set. A variety of methods are then available to cluster the sequences into groups of the most closely related sequences or into a phylogenetic tree. This article discusses several of these methods and provides data that compare their utility under various conditions.
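
The two preliminary steps described above (scoring all pairs of sequences, then clustering them) can be illustrated with a minimal Python sketch. It is not the procedure of any particular program discussed in the article. Two choices here are assumptions made only for illustration: similarity is approximated by an alignment-free Jaccard distance between 3-mer sets rather than by a pairwise alignment score, and clustering uses average linkage (UPGMA). The function names `kmer_distance` and `upgma` and the toy sequences are hypothetical.

```python
from itertools import combinations


def kmer_set(seq, k=3):
    """Return the set of overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=3):
    """Jaccard distance between k-mer sets (0.0 = same set, 1.0 = disjoint).

    Assumption: this stands in for a distance derived from pairwise alignment.
    """
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def upgma(names, seqs, k=3):
    """Cluster sequences by average linkage; return a nested-tuple tree."""
    # Distances between every pair of input sequences.
    dist = {
        frozenset((i, j)): kmer_distance(seqs[i], seqs[j], k)
        for i, j in combinations(range(len(seqs)), 2)
    }

    def average(members_a, members_b):
        # Mean distance over all cross-cluster pairs of original sequences.
        total = sum(dist[frozenset((i, j))] for i in members_a for j in members_b)
        return total / (len(members_a) * len(members_b))

    # Each cluster id maps to (member indices, subtree built so far).
    clusters = {i: ([i], names[i]) for i in range(len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average(clusters[pair[0]][0], clusters[pair[1]][0]),
        )
        members_a, tree_a = clusters.pop(a)
        members_b, tree_b = clusters.pop(b)
        clusters[next_id] = (members_a + members_b, (tree_a, tree_b))
        next_id += 1
    return next(iter(clusters.values()))[1]


# Toy input: A and B differ at one position, C is unrelated.
names = ["A", "B", "C"]
seqs = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC"]
tree = upgma(names, seqs)
print(tree)
```

In this toy case A and B are joined first and C is attached last. A progressive alignment method would typically use such a tree to decide the order in which sequences are added to the alignment.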
