Figure - PMC


2013 Mar 28;29(10):1268–1274. doi: 10.1093/bioinformatics/btt149


© The Author 2013. Published by Oxford University Press.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.


Fig. 2. Algorithm flowchart. A: Encode the FASTA sequences as dinucleotide counts (see Section 3.1.1 for details). B: Identify new clusters and add sequences to previously identified clusters by single linkage using a Euclidean distance cutoff d. C: When previously identified clusters merge at the distance cutoff d, record the content of each cluster before the merge and update the cluster parent–child hierarchy. D: Remove newly identified clusters containing fewer than N sequences. E: Increase the Euclidean distance cutoff d for the next pass. F: Traverse the cluster parent–child merging tree to re-create each cluster in its last state before it merged with another cluster (identification of seed clusters). G: Re-assign sequences to the closest seed clusters (see text for details).
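
Steps A, B, D and E of the flowchart can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes raw, unnormalized counts of overlapping dinucleotides over the alphabet A, C, G, T (the actual encoding is defined in Section 3.1.1), it re-clusters from scratch at every cutoff instead of extending clusters from the previous pass, and it omits the parent–child hierarchy and seed-cluster steps (C, F, G). All function names, and the parameters `cutoffs` and `min_size` (standing in for d and N), are hypothetical.

```python
from itertools import product
import math

# The 16 dinucleotides over A, C, G, T, in a fixed order (AA, AC, AG, ...).
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]
INDEX = {dinuc: i for i, dinuc in enumerate(DINUCLEOTIDES)}


def dinucleotide_counts(seq):
    """Step A (assumed form): count overlapping dinucleotides in one sequence.

    Windows containing symbols other than A, C, G, T are skipped.
    """
    counts = [0] * len(DINUCLEOTIDES)
    s = seq.upper()
    for i in range(len(s) - 1):
        j = INDEX.get(s[i:i + 2])
        if j is not None:
            counts[j] += 1
    return counts


def single_linkage(vectors, d):
    """Step B (simplified): group vectors linked by chains of pairs at distance <= d."""
    parent = list(range(len(vectors)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if math.dist(vectors[i], vectors[j]) <= d:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(vectors)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def clusters_per_pass(seqs, cutoffs, min_size):
    """Steps D and E (simplified): one clustering per cutoff, small clusters dropped."""
    vectors = [dinucleotide_counts(s) for s in seqs]
    return {
        d: [c for c in single_linkage(vectors, d) if len(c) >= min_size]
        for d in cutoffs
    }
```

Calling `clusters_per_pass(seqs, cutoffs=[2, 10], min_size=2)` on a list of sequence strings returns, for each cutoff, the clusters as lists of sequence indices. The pairwise distance loop is quadratic in the number of sequences, which is acceptable for an illustration but not for large data sets.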