(A) Unrooted maximum-likelihood (ML) tree of the complete dataset of 830 HIV-1 group M genomes. (B) Evolutionary distances between the root and all tips of a rooted version of the tree shown in A, plotted against the year in which each sequence was sampled. The root location was determined in TempEst (30) by minimizing the sum of the squared residuals from this exploratory regression. Root-to-tip plots based on ML trees of subsampled datasets are displayed in SI Appendix, Fig. S1. (C) Midpoint-rooted ML tree of a 1,799-nt pol alignment that includes subsampled dataset A, sequences that cluster with the DRC66 lineage, and multiple divergent subtype C-like sequences as summarized in ref. 17, some of which are derived from intersubtype recombinant genomes (e.g., “CU”) or have only partial sequence available for this alignment. Tips of the subtype C-related sequences, including DRC66, are labeled by subtype (marked with * if determined based on partial sequence only, e.g., “C*”), sampling year, sampling country, and GenBank accession number. For sampling country, COD = Democratic Republic of the Congo, BWA = Botswana, SWE = Sweden, ZAF = South Africa. In all three panels, subtypes are color coded according to the color legend. The DRC66 sequence is indicated with a red star.
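The root-to-tip regression in panel B can be illustrated with a minimal sketch. It assumes that root-to-tip distances have already been extracted from the tree for each candidate root; it is not TempEst's implementation, and the function names (`root_to_tip_regression`, `best_root`) and the toy values are hypothetical. The sketch fits an ordinary least-squares line of distance on sampling year and picks the candidate root with the smallest residual sum of squares.

```python
def root_to_tip_regression(years, distances):
    """Ordinary least-squares fit of root-to-tip distance on sampling year.

    Returns (slope, intercept, rss), where rss is the residual sum of squares.
    The slope is a rough estimate of the substitution rate per site per year.
    """
    n = len(years)
    if n != len(distances) or n < 2:
        raise ValueError("need at least two (year, distance) pairs of equal length")
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all sampling years are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(years, distances))
    return slope, intercept, rss


def best_root(years, candidates):
    """Pick the candidate root whose regression has the smallest rss.

    `candidates` maps a hypothetical root label to the list of root-to-tip
    distances obtained when the tree is rooted at that position.
    """
    return min(candidates, key=lambda k: root_to_tip_regression(years, candidates[k])[2])


# Toy data (hypothetical, not taken from the study).
years = [1980, 1990, 2000, 2010]
candidates = {
    "root_a": [0.10, 0.12, 0.14, 0.16],  # perfectly clock-like
    "root_b": [0.10, 0.15, 0.12, 0.17],  # noisier temporal signal
}
slope, intercept, rss = root_to_tip_regression(years, candidates["root_a"])
chosen = best_root(years, candidates)
```

Under these assumptions, the x-intercept of the fitted line (`-intercept / slope`) gives an exploratory estimate of the age of the root.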