Table 2.
Multilevel multivariable logistic regression model of transmission selection bias
| Feature | γ Estimate¹ | Std. Error | z value | Pr(>\|z\|) | Likelihood Ratio Test² χ² (df) | Likelihood Ratio Test² Pr(>χ²) |
|---|---|---|---|---|---|---|
| (Intercept) | 6.43 | 0.558 | 11.53 | <1E-16 | ||
| Cohort Frequency (cfreq)³ | 1.70 | 0.119 | 14.24 | <1E-16 | ||
| cfreq^2 | 0.24 | 0.019 | 12.28 | <1E-16 | ||
| # Covarying sites | 0.04 | 0.012 | 3.35 | 8.2E-4 | ||
| Susceptible to Recipient HLA | −0.60 | 0.142 | −4.18 | 2.9E-5 | ||
| Donor Esc Polymorphism : Gag⁴,⁵ | 0.00 | 0.253 | 0.00 | 0.998 | 13.3 (3) | 0.004 |
| Donor Esc Polymorphism : Pol | −0.69 | 0.197 | −3.49 | 4.9E-4 | ||
| Donor Esc Polymorphism : Nef | 0.48 | 0.326 | 1.48 | 0.140 | ||
| Risk Index⁵ | 0.15 | 0.084 | 1.74 | 0.081 | 22.2 (3) | 5.9E-5 |
| Risk Index : cfreq⁶ | 0.14 | 0.067 | 2.15 | 0.032 | ||
| Risk Index : cfreq^2 | 0.06 | 0.015 | 3.65 | 2.6E-4 | ||
| ETI | −0.16 | 0.132 | −1.18 | 0.236 | ||
| p17⁷ | 0.22 | 0.228 | 0.97 | 0.333 | ||
| p17 : cfreq | 0.19 | 0.103 | 1.83 | 0.067 | ||
| p24 | 1.72 | 0.285 | 6.03 | 1.7E-9 | ||
| p24 : cfreq | 0.64 | 0.116 | 5.47 | 4.6E-8 | ||
| p15 | 0.65 | 0.241 | 2.71 | 0.007 | ||
| p15 : cfreq | 0.28 | 0.106 | 2.66 | 0.008 | ||
| Protease | 0.62 | 0.307 | 2.03 | 0.042 | ||
| Protease : cfreq | 0.15 | 0.135 | 1.09 | 0.278 | ||
| RT | 0.62 | 0.208 | 2.98 | 0.003 | ||
| RT : cfreq | 0.15 | 0.095 | 1.60 | 0.109 | ||
| Integrase | 0.50 | 0.225 | 2.23 | 0.026 | ||
| Integrase : cfreq | 0.19 | 0.105 | 1.78 | 0.076 | ||
| Nef | 0.97 | 0.236 | 4.12 | 3.8E-5 | ||
| Nef : cfreq | 0.41 | 0.310 | 1.34 | 0.181 | ||
| Nef CD4/MHC Domains | 0.50 | 0.104 | 4.80 | 1.6E-6 | ||
| Nef CD4/MHC Domains : cfreq | 0.52 | 0.133 | 3.88 | 1.0E-4 | ||
| Structural Frequency (sfreq)⁸ | 0.33 | 0.144 | 2.29 | 0.022 | 24.2 (3) | 2.2E-5 |
| sfreq : cfreq | 0.49 | 0.129 | 3.80 | 1.5E-4 | ||
| sfreq : cfreq^2 | 0.13 | 0.029 | 4.45 | 8.6E-6 | ||
| Random Effects⁹ | Std. Dev. | Corr | ||||
| (Intercept) | 0.91 | |||||
| cfreq | 0.08 | −1.00 | ||||
¹ Fixed effect parameters. Model was fit using multilevel logistic regression. Model fit was not improved by the addition of quadratic interaction effects between cohort frequency and protein domains or couple ID. See methods for feature definitions. Compare to Figures 2 and 3 in the main text.
² Likelihood ratio test performed between the full model and a model excluding the grouped set of features.
³ Cohort frequency was standardized (zero mean, unit variance).
⁴ Donor CTL escape features were scaled by 1−cfreq to reflect the probability that de novo escape occurred in the donor.
⁵ Colon (:) signifies a multiplicative interaction.
⁶ Standardized (zero mean, unit variance) donor VL, plus one if the recipient is female or a male with GUI.
⁷ Protein domain features are treated as covariates. It is not clear whether significance implies a different relationship between cohort frequency and odds of transmission, or simply reflects variations in mean donor quasispecies diversity.
⁸ Defined as the expected frequency of an amino acid in the cohort based on the impact of that amino acid on the protein structure (see methods; frequency was standardized). Structural features were evaluated separately from the rest of the model because crystal structures are available for only a subset of sites. Model estimates reflect model fit using all parameters. Likelihood ratio test is against a null model including only the main parameters, but fit on sites with structural information.
⁹ Random effects were applied to each couple. The intercept and the slope of cohort frequency were allowed to vary as a bivariate Gaussian. Maximum likelihood standard deviations are reported. The maximum likelihood covariance term is presented as a correlation.
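
For readers who want to check or reinterpret the tabulated quantities, the following is a minimal sketch in Python (standard library only), not the authors' analysis code. It assumes the z values are Wald statistics with two-sided normal p-values and that the likelihood ratio statistics are referred to a χ² distribution with the stated 3 degrees of freedom. All function names are hypothetical. Reading exp(γ) as an odds ratio further assumes that the feature is a 0/1 indicator with no interaction terms, as appears to be the case for "Susceptible to Recipient HLA".

```python
import math


def odds_ratio(gamma):
    """Odds ratio implied by a coefficient on the log-odds (logit) scale."""
    return math.exp(gamma)


def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def wald_z(estimate, std_error):
    """Wald statistic: coefficient estimate divided by its standard error."""
    return estimate / std_error


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df.

    Uses the closed form available for 3 degrees of freedom only:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


def covariance_from_corr(sd_a, sd_b, corr):
    """Recover a covariance term from two standard deviations and a correlation."""
    return corr * sd_a * sd_b


# Values below are copied from the table; printed inputs are rounded, so
# recomputed statistics can differ slightly from the tabulated ones.
print("OR, Susceptible to Recipient HLA:", odds_ratio(-0.60))
print("Wald z from rounded estimate and SE:", wald_z(-0.60, 0.142))
print("p for tabulated z = -4.18:", two_sided_p(-4.18))
for label, stat in [("Donor Esc Polymorphism", 13.3), ("Risk Index", 22.2), ("sfreq", 24.2)]:
    print("LRT p,", label, ":", chi2_sf_df3(stat))
print("Random-effect covariance:", covariance_from_corr(0.91, 0.08, -1.00))
```

Because the table reports rounded estimates and test statistics, the recomputed values should land close to, but not necessarily exactly on, the tabulated p-values. For other degrees of freedom a general χ² survival function would be needed in place of `chi2_sf_df3`.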