Comparative analysis of the association between duplication length and frequency in the PTAP motif of subtypes B and C. Full-length Gag protein sequences of HIV-1 subtypes B and C were downloaded from the HIV LANL sequence database (accessed in June 2017). One sequence per patient was selected for the analysis. Of the 3,895 subtype B and 1,879 subtype C sequences analyzed, 548 and 505, respectively, contained a sequence insertion in the PTAP motif. The number of sequences containing an insertion is plotted against the length of the duplication. The numbers and percentages of sequences with a duplication of 3, 6, 7, 12, and 14 amino acids are shown in the inset. The percentage values represent the proportion of sequences containing a PTAP duplication. The 21-amino-acid windows that encompass the PTAP motif and represent the consensus sequences of subtypes B and C are presented below the plot. The original and duplicated amino acid sequences are indicated. The arrows indicate the span of the PTAP motif with its flanking amino acids, as well as the direction of reverse transcription. The duplicated amino acid residues are highlighted in bold. The core PTAP motifs are underlined.
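The counts quoted in this legend can be turned into per-subtype insertion frequencies with a few lines of Python. The following is a minimal sketch, not the authors' analysis pipeline: the names `CAPTION_COUNTS`, `insertion_percentage`, and `tabulate_lengths` are hypothetical, and the denominator used here (all sequences analyzed for a subtype) is an assumption that may differ from the one behind the inset percentages.

```python
from collections import Counter

# Counts quoted in the legend: (sequences analyzed, sequences with a PTAP insertion).
CAPTION_COUNTS = {"B": (3895, 548), "C": (1879, 505)}


def insertion_percentage(total, with_insertion):
    """Percentage of analyzed sequences that carry an insertion.

    Assumption: the denominator is every sequence analyzed for the subtype.
    """
    return 100.0 * with_insertion / total


def tabulate_lengths(lengths):
    """Hypothetical helper: count sequences per duplication length.

    `lengths` is a list with one duplication length (in amino acids) per
    sequence; the legend itself does not supply these per-sequence values.
    """
    return dict(sorted(Counter(lengths).items()))


for subtype, (total, n_ins) in CAPTION_COUNTS.items():
    pct = insertion_percentage(total, n_ins)
    print(f"Subtype {subtype}: {n_ins}/{total} sequences = {pct:.2f}%")
```

Given a list of per-sequence duplication lengths, `tabulate_lengths` would return the length-versus-count table that a plot of this kind is drawn from.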