Relationship between amino acid frequency of circulating viruses and mother-to-child transmission. The odds that the mother's amino acid will be transmitted to the infant are a function of the empirical frequency of that amino acid among circulating viruses. Each plot shows the empirical transmission probability (odds on a log10 scale) of a variant as a function of the frequency of the variant among 929 consensus Gag sequences isolated from chronically infected, antiretroviral-naive patients from Durban, South Africa. Empirical transmission probabilities (solid colored lines) are estimated by counting the proportion transmitted within a continuous sliding window with a 1-log-odds width, centered on the value on the abscissa. All log-odds values are smoothed by adding a pseudocount equal to 1% of the number of Durban sequences used to estimate the cohort frequency. Gray lines represent a linear fit to the sliding-window averages; shaded areas represent 95% confidence intervals estimated by using the percentile-t method on 1,000 multilevel bootstraps. Sites at which a mixture was observed in the infant founder virus were excluded. (A) Among 13,441 nonmixture sites from 29 mothers, the odds of transmission are associated with the frequency of the amino acid (AA) in the Durban cohort. (B) Among 281 sites containing two-amino-acid mixtures from 26 mothers, the probability of transmission is associated with the relative cohort frequency of the amino acid. Transmission probability is calculated with respect to a randomly chosen member of the mixture; the abscissa represents the relative frequency of that amino acid in the Durban cohort compared to the other amino acid in the mixture. For both plots, the slope is significantly greater than 0 (P < 2e−10, as estimated by a multilevel logistic regression model [see Materials and Methods]).
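
The smoothing and sliding-window steps described in this legend can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, and two details are assumptions made for illustration: the pseudocount is added to both the "present" and "absent" counts, and the window is centered on the abscissa value, spanning half a log10-odds unit on each side.

```python
import math

def smoothed_log_odds(count, total, pseudo_frac=0.01):
    """log10 odds of a variant seen `count` times among `total` sequences.

    Assumption: a pseudocount of pseudo_frac * total is added to both the
    'present' and the 'absent' counts, so the result is always finite.
    """
    pseudo = pseudo_frac * total
    return math.log10((count + pseudo) / (total - count + pseudo))

def sliding_window_proportion(x_values, transmitted, center, width=1.0):
    """Proportion of transmitted variants whose abscissa value falls inside
    a window of the given width centered on `center`.

    x_values:    cohort log10 odds for each site (abscissa)
    transmitted: 1 if the maternal amino acid was transmitted, else 0
    Returns None when the window contains no sites.
    """
    half = width / 2.0
    in_window = [t for x, t in zip(x_values, transmitted)
                 if abs(x - center) <= half]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)

def proportion_to_log_odds(p):
    """Convert a proportion strictly between 0 and 1 to log10 odds."""
    return math.log10(p / (1.0 - p))
```

For example, `smoothed_log_odds(0, 929)` stays finite even for a variant never observed in the cohort, which is the practical reason for the pseudocount. Evaluating `sliding_window_proportion` over a grid of centers yields a curve comparable in spirit to the solid colored lines.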