Sequence size and HIV clustering. Results of the correlation analysis between the extent of HIV clustering and sequence length are shown. The y-axis shows the proportion of HIV-1C sequences in clusters; the x-axis shows the sequence length. (A–D) Extent of HIV clustering at different bootstrap thresholds for cluster identification: (A) ≥0.70, (B) ≥0.80, (C) ≥0.90, and (D) =1.0. The codes above the graphs indicate the different sets of HIV-1C sequences, including near full-length genome sequences (FG), structural genes, subregions, and their combinations.