Table 2. Amino acid signatures of BCAR and BPANDEMIC datasets.
Location | Gene | gag | gag | gag | gag | pol | pol | pol | vif | vpu | tat | rev | rev | rev
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Location | Protein | P24 | P24 | P24 | P7 | PR | RT | RT | Vif | Vpu | Tat | Rev | Rev | Rev
Location | Position | 27 | 120 | 148 | 12 | 41 | 207 | 211 | 50 | 24 | 23 | 57 | 67 | 102
Ancestors | BCAR | V | S | V | I | K | E | R | R | T | T | A | S | V
Ancestors | BPANDEMIC | V | S | V | T | K | Q | R | R | T | T | A | S | V
Datasets | D (n = 18) | I89 V11 | S78 N17 | V100 | I56 T22 | K100 | E88 K6 | K61 R39 | K89 R11 | T94 S6 | N94 T6 | E88 A6 | S67 P33 | I78 V16
Datasets | BCAR (n = 18) | I61 V39 | S50 N39 | V61 T22 | I39 T33 | K89 R11 | E50 A11 | K50 R50 | K56 R44 | T83 S6 | N44 T44 | A55 E39 | S83 P17 | V100
Datasets | BCAR expanded | I71 V29 | S50 N25 | V68 T18 | I54 T25 | K74 R23 | E48 A25 | K54 R25 | - | - | - | A53 E32 | - | -
Datasets | BCAR expanded (sample size) | n = 28 | n = 28 | n = 28 | n = 28 | n = 197 | n = 197 | n = 197 | - | - | - | n = 59 | - | -
Datasets | BPANDEMIC (n = 450) | V69 I31 | N55 S26 | T66 V27 | T40 I28 | R75 K24 | Q82 E9 | R46 K44 | R76 K24 | S51 T45 | T76 N22 | G40 E27 | P59 S39 | I69 V28
p-values adj. | D/BCAR | - | - | - | - | - | - | 0.7563 | - | - | **0.00391** | **0.0039** | - | **0.0001**
p-values adj. | D/BPANDEMIC | **0.0172** | **0.0461** | **0.0009** | 0.3137 | **0.0073** | **0.0002** | 0.7962 | **0.0004** | **0.0311** | **0.00229** | 0.0508 | 0.1355 | -
p-values adj. | BCAR/BPANDEMIC | 0.1084 | 0.3094 | 0.1084 | 0.0502 | **0.0200** | **0.0008** | 0.9670 | **0.0255** | **0.0015** | **0.0465** | **0.0300** | **0.0423** | **0.0300**
p-values adj. | BCAR exp/BPANDEMIC | **0.0091** | **0.0091** | **0.0376** | **0.0376** | **0.0001** | **0.0001** | **0.0001** | - | - | - | **0.0001** | - | -

The table lists positions whose amino acid composition differs significantly between BCAR and BPANDEMIC strains. For each position, it gives (1) the residue present in the reconstructed BCAR and BPANDEMIC ancestors; (2) the two most common amino acids observed at that position in the Subtype D, BCAR, BCAR Expanded, and BPANDEMIC datasets, each followed by its frequency (%) in that group; and (3) the adjusted p-values. The sample size (n) of each sub-dataset is indicated in the table. Ancestor sequences were reconstructed from FL genomes of the BCAR (n = 18) and BPANDEMIC (n = 69) lineages. G (Glycine), P (Proline), A (Alanine), V (Valine), L (Leucine), I (Isoleucine), M (Methionine), C (Cysteine), F (Phenylalanine), Y (Tyrosine), W (Tryptophan), H (Histidine), K (Lysine), R (Arginine), Q (Glutamine), N (Asparagine), E (Glutamic Acid), D (Aspartic Acid), S (Serine), T (Threonine). p-values < 0.05 are in bold.
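
To make the dataset notation concrete (for example, "I89 V11"), the following is a minimal Python sketch of how such an entry could be derived from one alignment column. The function name `top_two_signature`, the treatment of `-` and `X` as missing data, and the example column are all assumptions made for illustration; they are not taken from the study, whose exact procedure is not described here.

```python
from collections import Counter


def top_two_signature(residues):
    """Format the two most frequent residues as '<aa><percent>' pairs."""
    # Assumption: gaps ("-") and unknown residues ("X") are excluded
    # before frequencies are computed.
    observed = [r for r in residues if r not in {"-", "X"}]
    if not observed:
        return "-"
    counts = Counter(observed)
    total = len(observed)
    # most_common(2) returns at most two (residue, count) pairs,
    # ordered from most to least frequent.
    return " ".join(
        f"{aa}{round(100 * n / total)}" for aa, n in counts.most_common(2)
    )


# Hypothetical column: 16 isoleucines and 2 valines among 18 sequences.
column = ["I"] * 16 + ["V"] * 2
print(top_two_signature(column))  # prints "I89 V11"
```

Because only the two most common residues are reported, the percentages in a cell need not sum to 100 when a third residue is present at that position.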