2022 Nov 30;18(11):e1010745. doi: 10.1371/journal.pcbi.1010745

Table 1. Summary of sequence data characteristics.

Length = median length of the nucleotide (nt) sequences. HXB2 coords = reference nucleotide coordinates in the HXB2 genome (GenBank accession K03455). Year type: sequences are annotated with the year of sample collection and, in some cases, the date of HIV diagnosis. N = total sample size, including both old and new sequences. Incid = number of sequences in the ‘incident’ subset (most recent year). Subtype classifications were taken from the original data sources when available, or otherwise generated de novo with SCUEAL [60].

| Location | Length (nt) | HXB2 coords | Year type | N | Incid | Date range | Subtypes |
|---|---|---|---|---|---|---|---|
| Washington, USA | 985 | 2,256–3,240 | diagnosis | 6,583 | 253 | 1982–2019 | B (89%), C (4%), A1 (1%), other (10%) |
| | | | collection | 6,583 | 253 | 1999–2019 | |
| Alberta, Canada | 1,017 | 2,253–3,269 | collection | 1,051 | 155 | 2007–2013 | B (77%), C (20%), A1 (3%) |
| Tennessee, USA | 1,296 | 2,253–3,548 | collection | 2,741 | 162 | 2001–2015 | B (93%), C (2%), other (5%) |
| | | | diagnosis | 2,338 | 129 | 1977–2011 | |
| Beijing, China | 1,004 | 2,273–3,276 | collection | 3,916 | 1,196 | 2005–2015 | 01AE (50%), 07BC (25%), B (19%), other (6%) |
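
The Length, N, Incid, and Date range columns can be illustrated with a minimal sketch. It assumes that each record is a (sequence, year) pair and that the ‘incident’ subset is simply every sequence annotated with the most recent year in the data set, as the table legend suggests. The function name `summarize`, the record layout, and the toy data are hypothetical and are not taken from the study.

```python
from statistics import median


def summarize(records):
    """Summarize a list of (sequence, year) pairs.

    Assumption: the 'incident' subset is every record whose year
    equals the most recent year present in the data.
    """
    years = [year for _, year in records]
    latest = max(years)
    return {
        # Median sequence length in nucleotides
        "length": median(len(seq) for seq, _ in records),
        # Total sample size (old and new sequences together)
        "n": len(records),
        # Number of sequences annotated with the most recent year
        "incid": sum(1 for year in years if year == latest),
        # Earliest and latest annotated years
        "date_range": (min(years), latest),
    }


# Toy records for illustration only; these are not study data.
toy = [("ACGT", 2010), ("ACGTAC", 2011), ("ACG", 2011)]
summary = summarize(toy)
print(summary)
```

For a location with two year types (for example, Washington), the same function would be applied twice, once per annotation, which is why such locations occupy two rows in the table.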