Nucleic Acids Res. 2003 Jul 1;31(13):3518–3524. doi: 10.1093/nar/gkg579

Copyright © 2003 Oxford University Press

Constructing a multiple alignment. (A) Constructing a row of the crude multiple alignment. One of the secondary sequences (e.g. sequence r) consists of two contigs. The pairwise alignments between the reference sequence and the two contigs are shown in a dot-plot format, in which the positions of each local alignment are plotted as a series of diagonal lines. For clarity, the four major local alignments are numbered and enclosed in shaded parallelograms. To construct a row in the crude multiple alignment, the local alignments are pruned so that each position in the reference sequence is aligned at most once. In this illustration, interval a–b is aligned to the reverse complement of B–A, b–c is aligned to B–C, c–d is aligned to C′–D, and e–g is aligned to E–G. This necessitates some pruning since some positions in the reference sequence are aligned more than once, e.g. the positions just before b. Extraneous matches to an improperly masked repetitive element around position f are discarded. Row r of the crude multiple alignment is constructed from the aligned intervals listed above. Gaps within a local pairwise alignment, say between a and b, result in ‘internal gaps’ in row r of the multiple alignment, which are penalized. A region between aligned segments (e.g. region z–a or d–e) is considered an ‘end-gap’ and is not penalized. Note that segment E–D of the secondary sequence appears twice in row r.

(B) Refinement of the multiple alignment. One cycle of the refinement process is shown schematically. The crude multiple alignment is shown as a series of rows with thick lines representing strings of nucleotides; gaps are spaces in the rows. A subalignment between positions i and j is extracted and row r removed. The subalignment and row r are reduced by removing gaps as described in the Methods, and a new alignment is computed between the sequence in row r and the reduced subalignment (without row r). If this process improves the alignment score, then the new subalignment is spliced back into the large alignment. This process is repeated for all sub-regions where the alignment's columns have changed.
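The pruning step in panel (A), in which each reference position ends up aligned at most once, can be illustrated with a minimal sketch. The caption does not state the pruning rule, so the greedy "higher score wins" policy below is an assumption made purely for illustration, as are the names `LocalAlignment`, `uncovered_parts`, and `prune`, the half-open coordinates, and the example scores. It is not the authors' procedure.

```python
from typing import List, NamedTuple, Tuple


class LocalAlignment(NamedTuple):
    """Hypothetical record: a local alignment projected onto the reference."""
    ref_start: int  # inclusive reference coordinate
    ref_end: int    # exclusive reference coordinate
    score: float
    label: str


def uncovered_parts(start: int, end: int,
                    covered: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the pieces of [start, end) not inside any covered interval.

    `covered` must be sorted and pairwise disjoint.
    """
    pieces = []
    cursor = start
    for c_start, c_end in covered:
        if c_end <= cursor:
            continue            # covered interval lies entirely to the left
        if c_start >= end:
            break               # covered interval lies entirely to the right
        if c_start > cursor:
            pieces.append((cursor, c_start))
        cursor = max(cursor, c_end)
        if cursor >= end:
            break
    if cursor < end:
        pieces.append((cursor, end))
    return pieces


def prune(alignments: List[LocalAlignment],
          min_len: int = 1) -> List[LocalAlignment]:
    """Assumed greedy rule: visit alignments from highest to lowest score and
    keep only the reference positions that no earlier alignment has claimed.

    Simplification: a trimmed piece keeps the score of its parent alignment.
    """
    kept: List[LocalAlignment] = []
    covered: List[Tuple[int, int]] = []
    for aln in sorted(alignments, key=lambda a: a.score, reverse=True):
        for s, e in uncovered_parts(aln.ref_start, aln.ref_end, covered):
            if e - s >= min_len:
                kept.append(LocalAlignment(s, e, aln.score, aln.label))
                covered.append((s, e))
        covered.sort()
    return sorted(kept)  # ordered by reference start


# Made-up example: two overlapping alignments plus a weak repeat match.
row = prune([
    LocalAlignment(0, 100, 90.0, "1"),
    LocalAlignment(90, 200, 80.0, "2"),
    LocalAlignment(150, 160, 5.0, "repeat"),
])
```

In this made-up example, alignment 2 is trimmed where it overlaps alignment 1, and the low-scoring repeat match is dropped because every position it covers is already claimed. A real implementation would also need to trim the secondary-sequence coordinates and rescore the trimmed pieces.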