Author manuscript; available in PMC: 2022 May 20.
Published in final edited form as: Stat Med. 2021 Mar 3;40(11):2604–2612. doi: 10.1002/sim.8920

A Top Scoring Pairs Classifier for Recent HIV Infections

Athena Chen 1, Oliver Laeyendecker 2,3, Susan H Eshleman 4, Daniel R Monaco 4, Kai Kammers 5, H Benjamin Larman 4, Ingo Ruczinski 1,*
PMCID: PMC8375573  NIHMSID: NIHMS1733147  PMID: 33660319

Summary

Accurate incidence estimation of HIV infection from cross-sectional biomarker data is crucial for monitoring the epidemic and determining the impact of HIV prevention interventions. A key feature of cross-sectional incidence testing methods is the mean window period, defined as the average duration that infected individuals are classified as recently infected. Two assays available for cross-sectional incidence estimation, the BED capture immunoassay and the Limiting Antigen (LAg) Avidity assay, measure a general characteristic of antibody response; performance of these assays can be affected and biased by factors such as viral suppression, resulting in sample misclassification and overestimation of HIV incidence. As availability and use of antiretroviral treatment increases worldwide, algorithms that do not include HIV viral load and are not impacted by viral suppression are needed for cross-sectional HIV incidence estimation. Using a phage display system to quantify antibody binding to over 3,300 HIV peptides, we present a classifier based on top scoring peptide pairs that identifies recent infections using HIV antibody responses alone. Based on plasma samples from individuals with known dates of seroconversion, we estimated the mean window period for our classifier to be 217 days (95% confidence interval 183–257 days), compared to the estimated mean window period for the LAg-Avidity protocol of 106 days (76–146 days). Moreover, each of the four peptide pairs correctly classified more of the recent samples than the LAg-Avidity assay alone at the same classification accuracy for non-recent samples.

Keywords: HIV infection, incidence estimation, phage immuno-precipitation sequencing, top-scoring pairs

1 |. INTRODUCTION

Accurate methods for estimating the incidence of HIV infection are crucial for monitoring the epidemic and determining the impact of HIV prevention interventions1. The traditional approach for HIV incidence estimation relies on following cohorts of HIV-uninfected individuals and quantifying the rate of new infections. This approach may be impacted by several factors, including selective attrition of those at risk for HIV infection and behavioral changes related to study engagement1. An alternative approach that overcomes some limitations of longitudinal studies is to identify recently infected individuals using biomarkers measured in samples collected from a cross-sectional survey.

HIV incidence can be estimated from a cross-sectional survey using the calculation I = w/(nμ), where w is the number of individuals identified as recently infected, n is the number of individuals not infected with HIV, and μ is the mean window period2. The mean window period for a testing algorithm is the average duration that infected individuals are classified as recently infected. Annual HIV incidence is the measure most often used for public health, epidemiologic, and research purposes; therefore, it is preferable for algorithms to have mean window periods between 180 days and one year3. Further, the probability of an individual being classified as recently infected should converge to zero within 1–2 years of infection, to reduce the likelihood that individuals infected more than 2 years ago are misclassified as recently infected4.
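The incidence calculation above is straightforward to implement; the following Python sketch is illustrative (the function name and the annualization via a 365.25-day year are our assumptions, not taken from the authors' code).

```python
def estimate_incidence(w, n, mean_window_days):
    """Annual HIV incidence from a cross-sectional survey: I = w / (n * mu),
    where mu is the mean window period expressed in years."""
    mu_years = mean_window_days / 365.25  # convert the window period to years
    return w / (n * mu_years)

# Example: 20 recent classifications among 5,000 HIV-uninfected individuals,
# using a 217-day mean window period (hypothetical survey numbers).
annual_incidence = estimate_incidence(w=20, n=5000, mean_window_days=217)
```

A longer mean window period captures more truly recent infections per survey, which is why it tends to improve the precision of the resulting incidence estimate.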

Two assays used for cross-sectional incidence estimation, the BED capture immunoassay (BED assay)5 and the Limiting Antigen (LAg) Avidity assay6, measure a general characteristic of antibody response, such as titer, class, and avidity7. Unfortunately, these characteristics of the antibody response are affected by various factors8, such as infecting viral subtype9 and viral suppression10, resulting in sample misclassification. Misclassifying samples from persons with long-term infection as recently infected can lead to overestimation of HIV incidence11.

The prevalence of HIV subtypes varies around the world, with HIV subtype C being the most prevalent12. HIV subtype D infection is often associated with a delayed or muted antibody response to infection, which can lead to high rates of misclassification with HIV incidence assays9. Viral suppression due to antiretroviral treatment (ART) or innate suppression reduces the circulating antigen load of HIV. This can decrease the antibody response, leading to frequent false-recent misclassification with many serologic incidence assays. Until recently, ART was usually initiated later in HIV infection, after CD4 cell counts declined13. For this reason, testing algorithms for HIV incidence estimation often include low viral load as a surrogate for non-recent infection. Due to increasing recognition that earlier ART initiation yields significant health benefits and reduces HIV transmission, current treatment guidelines now recommend initiation of ART at the time of HIV diagnosis14. As availability and use of ART increases worldwide, a higher proportion of infected individuals will be virally suppressed early in infection. New testing algorithms that do not include HIV viral load and are not impacted by viral suppression are needed for cross-sectional HIV incidence estimation.

In this manuscript we provide a proof of principle that such algorithms exist, describing a novel approach for identifying recent HIV infections using HIV antibody responses alone. First, we used a phage display system to quantify antibody binding to over 3,300 HIV peptides15. We then developed a classification algorithm based on top-scoring pairs (TSP)16,17,18 to categorize individuals with subtype C HIV infection as recently or non-recently infected. The performance of the classifier was validated on an independent data set and was compared to a widely used testing algorithm, the LAg-Avidity protocol. The LAg-Avidity protocol for HIV incidence estimation includes antibody avidity testing using the LAg-Avidity assay and viral load testing.

2 |. METHODS

2.1 |. Phage Immuno-Precipitation Sequencing (PhIP-Seq)

Antibody reactivities can serve as excellent biomarkers of immune responses and environmental exposures for a variety of reasons, including their abundance, accessibility in peripheral blood, and stability ex vivo. Thus, numerous methods have been developed to assess antibody binding specificities. Phage Immuno-Precipitation Sequencing (PhIP-Seq) is the only technique currently available that simultaneously quantifies antibody binding to hundreds of thousands of candidate epitopes, with a per-sample cost and throughput that enables analysis of large collections of clinical specimens19,20,21. PhIP-Seq uses long-mer oligonucleotide library synthesis to encode proteomic-scale peptide libraries for display on bacteriophages. The resulting phage libraries can then be used to immunoprecipitate antibodies that target specific peptides. Antibody reactivity to individual peptides is assessed by deep sequencing DNA from the precipitated phage particles. The VirScan20 assay uses the PhIP-Seq method to quantify antibody binding to over 96,000 peptides spanning the genomes of more than 200 viruses that infect humans. These peptides include 3,384 peptides spanning the HIV genome (Supplementary Table 1).

2.2 |. Samples and Materials

Plasma samples were obtained between 2001 and 2009 from individuals in the Hormonal Contraception and HIV Genital Shedding (GS) study22. Study participants were women from Zimbabwe and Uganda who had known dates of HIV seroconversion. Consistent with local treatment guidelines at the time of the study, ART was recommended for participants with CD4 cell counts below 250 cells/mm3. CD4 cell count and viral load data were also collected in the GS study22. LAg-Avidity assay (SEDIA Biosciences Corporation, Portland, OR and Maxim Biomedical, Bethesda, MD) data for these participants were generated in a previous study23. For each individual, the estimated date of infection was defined as the midpoint between visits with the last negative and first positive HIV antibody test, or fifteen days after documentation of acute infection (HIV RNA positive / HIV antibody negative status). The sample set analyzed was limited to women whose time interval between last negative and first positive visits was three months or less.

We analyzed two sets of samples from individuals with recent (2–6 months) and non-recent (≥ 18 months) HIV infection (Table 1). Because dates of infection were imprecise, samples with reported duration of infection between 6 and 18 months were excluded from the construction of the classifier. The discovery set consisted of samples from GS study participants infected with HIV subtype C who had at least one year of follow-up after seroconversion, with samples collected at three or more visits during the study period. These samples were randomly divided by individual into a training and a test set (Table 1). Overall, the discovery data consisted of read counts for 258 samples from 57 individuals, derived from five multi-well plates. The validation set consisted of challenge samples from the GS study that were not included in the discovery set and were run on separate plates. These samples had characteristics known to complicate cross-sectional incidence estimation, such as viral suppression due to ART (viral load ≤ 400 copies/mL) and low CD4 cell count (≤ 350 cells/mm3). Additional samples from individuals in the discovery set were also analyzed with the validation data; these data were derived from a single additional plate (Supplementary Table 2) and included technical replicates from the discovery data and samples from different study visits.

TABLE 1.

Summary statistics for samples in the training, test, and validation sets. Samples from study visits 2 to 6 months after infection were considered recently infected; samples from visits 18 months or more after infection were considered non-recently infected.

Training Test Validation
Number of persons 38 19 19
Number of samples 176 82 26
 Recent infection 63 27 1
 Non-recent infection 113 55 25
Median number of samples per participant (range) 5 (2–7) 4 (2–7) 1 (1–3)
Median duration of infection (range; years) 2.9 (0.2–8.7) 3.1 (0.2–8.3) 5.1 (0.3–8.8)
 Recent infection 0.3 (0.2–0.5) 0.3 (0.2–0.4) 0.3 (0.3–0.3)
 Non-recent infection 4.5 (1.5–8.7) 4.9 (1.5–8.3) 5.1 (1.6–8.8)
Median age of participant (range) 29 (21–43) 27 (19–44) 33 (21–45)
Median CD4 cell count (range; cells/mm3) 463.5 (59–1113) 494.0 (71–1216) 127.0 (54–945)
Median log10 viral load (range; copies/mL) 4.0 (2.1–5.8) 4.3 (2.2–5.9) 4.4 (2.0–6.1)

Antibody reactivity profiles were previously obtained using the PhIP-Seq assay and VirScan library as described in Eshleman et al15. Briefly, plasma IgG concentrations were quantified using an in-house enzyme-linked immunosorbent assay. For each sample, the T7 bacteriophage VirScan library was added to a sample containing 2 μg of IgG to enable peptide-antibody binding. The antibody-bound phage were precipitated using protein A/G coated magnetic beads, and unbound phage were washed away. The peptide-encoding DNA sequences were PCR amplified, barcoded for sample multiplexing, and sequenced using an Illumina HiSeq instrument19.

The VirScan data were normalized and summarized as peptide logarithmic (base 10) relative fold changes15. Antibody reactivity to Ebola and rabies virus (718 Ebola and 518 rabies virus peptides, respectively) was used to normalize data to adjust for sample-to-sample differences in sequencing depth; these values reflected non-specific antibody binding, since participants did not have these infections. The normalized value for each HIV peptide was then divided by the median of the normalized values for the same peptide observed across the mock immunoprecipitation reactions that were run on the same plate, generating a logarithmic fold change value for each HIV peptide.
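The mock-based step of this normalization can be sketched as follows; the data shapes and the pseudocount of 1 are assumptions for illustration, and the Ebola/rabies depth normalization is taken as already applied.

```python
import numpy as np

def hiv_log_fold_change(hiv_norm, mock_norm):
    """Divide each HIV peptide's depth-normalized value by the median of the
    mock immunoprecipitations run on the same plate, on the log10 scale.
    hiv_norm: (n_peptides, n_samples); mock_norm: (n_peptides, n_mocks).
    The pseudocount of 1 (an assumption) guards against division by zero."""
    mock_median = np.median(mock_norm, axis=1, keepdims=True)
    return np.log10((hiv_norm + 1) / (mock_median + 1))
```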

2.3 |. Construction of the Classifier

Our approach to classifying samples as recent or non-recent extends the k-top scoring pairs (k-TSP) classifier for genomic data16,17,18. To reduce the complexity of the search space, candidate peptides were identified by fitting a simple linear model of log-normalized relative fold changes as a function of log time for each peptide and each study participant using all available samples collected more than 2 months after infection (Supplementary Figure 1). Based on the average antibody response across all individuals, we identified ten peptides with the largest increase and ten peptides with the largest decrease in antibody reactivity over time. Each peptide with increasing antibody response over time was paired with a peptide with decreasing antibody response, resulting in one-hundred peptide pairs. For each sample, we recorded the log relative fold change of the increasing peptide to the decreasing peptide for each pair.
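A minimal sketch of this candidate screen, assuming for simplicity that fold changes are pooled across individuals (the authors fit per-participant models and averaged; names and data shapes here are illustrative):

```python
import numpy as np

def candidate_pairs(log_fc, log_time, n_top=10):
    """Screen peptides by the slope of a linear fit of log fold change on
    log time, then pair each of the n_top most-increasing peptides with
    each of the n_top most-decreasing ones (n_top^2 candidate pairs).
    log_fc: (n_peptides, n_samples); log_time: (n_samples,)."""
    slopes = np.array([np.polyfit(log_time, y, 1)[0] for y in log_fc])
    increasing = np.argsort(slopes)[::-1][:n_top]  # largest positive slopes
    decreasing = np.argsort(slopes)[:n_top]        # largest negative slopes
    return [(int(i), int(d)) for i in increasing for d in decreasing]
```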

Optimal classification cutoffs were derived from the training data by partitioning the range of the log relative fold changes into 1,000 potential cutoffs. We derived precision-recall curves for each peptide pair by calculating the positive predictive value (PPV) and sensitivity for a classifier at each threshold. Prioritizing the probability of being recent given a recent classification, we defined optimal cutoffs to be the cutoffs that achieved the minimum Euclidean distance to perfect classification, 100% PPV and 100% sensitivity. In cases where more than one cutoff satisfied the optimal criteria, the median cutoff was selected (when the total number of optimal cutoffs was even, the first cutoff larger than the median was selected as the optimal cutoff).
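The grid search with the Euclidean-distance criterion and the median tie-break described above might look like the following sketch (the grid construction and variable names are assumptions):

```python
import numpy as np

def optimal_cutoff(scores, is_recent, n_grid=1000):
    """Pick the cutoff minimizing Euclidean distance to perfect
    classification (100% PPV, 100% sensitivity); ties are broken by the
    median rule (with an even count, the first cutoff above the median)."""
    grid = np.linspace(scores.min(), scores.max(), n_grid)
    best, best_dist = [], np.inf
    for c in grid:
        pred = scores > c                      # classified as recent above c
        tp = np.sum(pred & is_recent)
        fp = np.sum(pred & ~is_recent)
        fn = np.sum(~pred & is_recent)
        if tp + fp == 0:
            continue                           # PPV undefined at this cutoff
        ppv = tp / (tp + fp)
        sens = tp / (tp + fn)
        dist = np.hypot(1 - ppv, 1 - sens)
        if dist < best_dist - 1e-12:
            best, best_dist = [c], dist
        elif abs(dist - best_dist) <= 1e-12:
            best.append(c)
    return best[len(best) // 2]                # median / first-above-median
```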

Noting that the pairs with the highest classification accuracy tended to share a peptide, we identified mutually exclusive peptide pairs by iteratively selecting the peptide pair with the highest classification accuracy and removing pairs with overlapping peptides from the list of candidate peptide pairs, until no candidate pairs remained. The area under the precision-recall curve was used as a tiebreaker for pairs with equal classification accuracy. By construction of our candidate peptide pairs, this resulted in an ordered list of 10 mutually exclusive pairs (Supplementary Table 3).
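The greedy selection of mutually exclusive pairs can be sketched as below; for brevity this version breaks ties by list order rather than by area under the precision-recall curve.

```python
def select_disjoint_pairs(pairs, accuracy):
    """Greedily pick the most accurate pair, drop every remaining pair
    sharing a peptide with it, and repeat until none are left.
    pairs: list of (increasing, decreasing) peptide identifiers;
    accuracy: parallel list of classification accuracies."""
    remaining = sorted(zip(pairs, accuracy), key=lambda t: -t[1])
    chosen = []
    while remaining:
        (inc, dec), _ = remaining[0]
        chosen.append((inc, dec))
        # increasing and decreasing peptides form disjoint sets by
        # construction, so overlap can only occur within each role
        remaining = [(p, a) for p, a in remaining[1:]
                     if p[0] != inc and p[1] != dec]
    return chosen
```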

For a fixed k, the k-TSP classifier uses a majority voting system based on the first k peptides in the ordered list of peptide pairs. In the k-TSP classifier, a sample is classified as recent if a majority (> 50%) of the peptide pairs categorized the sample as recent. Classification accuracy on the training and testing data for k = 1, 2,⋯, 10 was used to select k. To investigate the robustness of this model selection procedure, we also employed leave-one-out cross-validation as an alternative strategy to select the number of peptide pairs. In turn, the samples from one of the 57 individuals were removed from the data set and classifiers using 1 to 10 peptide pairs were constructed; these classifiers were evaluated on the samples from the individual who was not included in the classifier construction.
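A sketch of the k-TSP majority vote; the sign convention (a pair votes "recent" when the increasing peptide's log fold change does not exceed the decreasing peptide's by more than its cutoff c) is an illustrative assumption.

```python
def ktsp_classify(log_fc, pairs, cutoffs, k):
    """Strict-majority vote over the first k peptide pairs in the ordered
    list. Each pair (inc, dec) with cutoff c casts a 'recent' vote via the
    indicator I(log_fc[inc] - log_fc[dec] <= c)."""
    votes = sum(log_fc[inc] - log_fc[dec] <= c
                for (inc, dec), c in list(zip(pairs, cutoffs))[:k])
    return votes > k / 2  # recent only on a strict majority
```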

2.4 |. Comparing results from the kTSP classifier to results from the LAg-Avidity protocol

The LAg-Avidity protocol classifies a sample as recently infected if the normalized optical density (ODn) value from the LAg-Avidity assay is ≤ 1.5 and the viral load is ≥ 1,000 copies/mL13. We compared the performance of each peptide pair in the k-TSP classifier to the LAg-Avidity assay result alone using receiver operating characteristic (ROC) curves. Classification accuracy and probability-of-recent-classification curves were used to compare the overall k-TSP classifier against the LAg-Avidity protocol. For each classifier, the probability of classifying a sample as recent was modeled as a function of infection duration using logistic regression. The mean window period for each algorithm was derived by numerical integration of the fitted curve, and 95% confidence intervals for the mean window period were constructed by mapping the 95% confidence bands for the linear component of the model to the logistic scale and integrating the resulting curves. The window samples (reported duration of infection between 6 and 18 months) were included when fitting these probability curves.
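The window-period calculation can be sketched as follows; the iteratively reweighted least squares (IRLS) fit stands in for a standard logistic regression routine, the two-year integration horizon is an assumption, and the confidence-band construction is omitted.

```python
import numpy as np

def mean_window_days(t_days, classified_recent, horizon_days=730):
    """Fit P(recent at time t) = logistic(b0 + b1 * t) by IRLS, then
    integrate the fitted curve from 0 to the horizon with the trapezoid
    rule; the resulting area is the mean window period in days."""
    X = np.column_stack([np.ones(len(t_days)), np.asarray(t_days, float)])
    y = np.asarray(classified_recent, float)
    beta = np.zeros(2)
    for _ in range(50):                        # Newton / IRLS iterations
        z = np.clip(X @ beta, -30, 30)         # guard against overflow
        p = 1 / (1 + np.exp(-z))
        w = p * (1 - p) + 1e-9                 # working weights (ridged)
        beta = beta + np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - p))
    grid = np.linspace(0, horizon_days, 2001)
    pg = 1 / (1 + np.exp(-np.clip(beta[0] + beta[1] * grid, -30, 30)))
    return float(np.sum((pg[:-1] + pg[1:]) / 2 * np.diff(grid)))
```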

3 |. RESULTS

In the training data, the k-TSP classifier using either the top 3 or top 4 scoring peptide pairs yielded the highest classification accuracy of 0.92 (Supplementary Figure 2 and Supplementary Table 4). Notably, at the same respective classification accuracies for non-recent samples under the selected optimal cut-off values, each of the four peptide pairs individually classified more of the recent samples correctly than the LAg-Avidity assay alone (Figure 1). Since the LAg-Avidity assay is widely used for cross-sectional incidence testing, this finding justifies further investigation of the peptide-based approach.

FIGURE 1.

Receiver operating characteristic (ROC) curves for the four peptide pairs in the 4-TSP classifier and the LAg-Avidity assay derived from the training data. The area under the ROC curve for each peptide pair and the LAg-Avidity assay is indicated in the legend as the number in parenthesis. The points on the four peptide pair curves indicate the sensitivity and specificity under the selected peptide pair abundance cutoffs. Under these selected optimal cutoff values, each peptide pair had higher classification sensitivity than the LAg-Avidity assay alone at the same respective specificity.

In the testing data, the 3-TSP and 4-TSP classifiers again performed equally well, each with an accuracy of 0.94. Somewhat surprisingly, classifiers performed slightly better on the test data than on the training data, especially those based on a larger number of peptide pairs (Supplementary Figure 2). This may indicate that additional information about time of infection could be extracted from other peptides in a larger data set, although the differences in accuracy between models are small, as the overall accuracy of all classifiers is high. In the independent validation data, the 4-TSP classifier had a slightly higher classification accuracy than the 3-TSP classifier (Supplementary Table 4). The leave-one-out cross-validation procedure employed as an alternative strategy for selecting the number of peptide pairs yielded identical results (Supplementary Figure 3): the highest average accuracy on the data used to construct the classifiers was observed for models of size 3 and 4, and the highest average accuracy on the held-out individuals was achieved at size 4.

The 4-TSP classifier (Supplementary Figure 4) used here for comparison performed equally well or better than the LAg-Avidity protocol for identifying recent infections in the discovery set (Table 2). Though the 4-TSP classifier performed marginally worse than the LAg-Avidity protocol on the validation set and additional samples by misclassifying one and two non-recent samples respectively, it is much more likely to detect recently infected samples (Figure 2, and Discussion). The mean window period for the 4-TSP classifier is 217 days (95% CI: 183–257 days), compared to the mean window period of 106 days (95% CI: 76–146 days) obtained for the LAg-Avidity protocol for the entire data set (i.e., the training and test data sets combined, plus the samples with reported times of infection between 6 and 18 months). Thus, the 4-TSP classifier has a higher probability of identifying truly recent samples and has a larger mean window period than the LAg-Avidity protocol, suggesting that the 4-TSP classifier may produce more precise incidence estimates without relying on viral load information.

TABLE 2.

Proportion of samples correctly classified by the 4-TSP classifier and the LAg-Avidity protocol. Some samples were missing viral load data and could not be classified as recent or non-recent by the current LAg-Avidity protocol.

4-TSP classifier LAg-Avidity Protocol
correct total percent correct total percent
Training 162 176 92% 139 169 82%
 Recent 49 63 78% 27 57 47%
 Non-recent 113 113 100% 112 112 100%
Test 77 82 94% 61 77 79%
 Recent 22 27 81% 8 24 33%
 Non-recent 55 55 100% 53 53 100%
Validation 24 26 92% 25 26 96%
 Recent 0 1 0% 0 1 0%
 Non-recent 24 25 96% 25 25 100%
Repeated individuals, replicated samples 9 13 69% 11 13 85%
 Recent 0 2 0% 0 2 0%
 Non-recent 9 11 82% 11 11 100%
Repeated individuals, different study visits 10 10 100% 10 10 100%
 Recent 0 0 - 0 0 -
 Non-recent 10 10 100% 10 10 100%

FIGURE 2.

Probability of being classified as recent as a function of duration of infection for the 4-TSP classifier and the LAg-Avidity protocol calculated on the entire data set (training and test data combined, plus the samples with reported times of infection between 6 and 18 months). The 95% confidence bands for the fitted curves are indicated by the shaded regions.

4 |. DISCUSSION

In this manuscript, we present a classifier based on top-scoring pairs of peptide antibody reactivities to distinguish between recent and non-recent HIV subtype C infections. The classifier uses a voting system of four peptide pairs, where a sample is classified as recent if three or more pairs identify the sample as recent. Pair classification of recent infection is based on the relative antibody abundances of the paired peptides. The 4-TSP classifier had a 94% classification accuracy for identifying recent infections in our testing data set, and its performance was validated in another independent data set of HIV subtype C samples. This provides a proof of principle that our approach using HIV antibody responses alone (without HIV viral load or other biomarkers) can identify samples from individuals with recent HIV infection.

The size of our model (i.e., the number of peptide pairs in the classifier) was determined by building models of various sizes on a training data set and evaluating those models on a separate test data set. To investigate the robustness of this model selection procedure, we employed leave-one-out cross-validation as an alternative strategy. As noted in the Results, the leave-one-out cross-validation procedure yielded the same selected size for the classifier. Cross-validation also yielded lower variance estimates for the classification accuracies (e.g., Supplementary Figure 3) and, unlike the training and test data approach, is not affected by the random assignment of individuals to the respective groups. However, cross-validation only aids in determining the number of peptide pairs; it does not yield peptide-pair-specific cut-offs as the training and test data approach does. The cut-offs can be selected post hoc on the full data after the model size has been chosen, although the resulting classification estimates can then be too optimistic (biased upwards) when evaluated on the same data set that was used to establish the cut-offs. In our case, a comparison with the training and test data cut-offs and classification accuracies yielded virtually identical numbers (Supplementary Table 5).

The model search and variable selection strategy in our approach differs from the methods initially proposed by the authors of the top scoring pairs approach, which have previously been used for gene expression studies16,24,25. Instead of the pre-filtering on differentially expressed genes suggested in the freely available software package switchBox25 to reduce the model search space, we pre-selected peptides based on increasing or decreasing abundance profiles as a function of time since infection. Moreover, we wanted to specifically select cutoffs for the pairs’ relative peptide abundances to classify a sample, instead of simply determining which log relative abundance was higher. That is, for each peptide pair with log relative abundances X and Y, respectively, we extended the mathematical indicator I(Y > X) implemented in switchBox to the indicator I(Y > X + c), where c is a real number determined during the model selection procedure. This additional parameter for each peptide pair increases the model search space, but the procedure remains scalable due to the above-mentioned pre-selection strategy.

For an incidence assay to be practical, it is important not to classify samples from persons with long-term infection as recent. Both the 4-TSP classifier and the LAg-Avidity assay achieved 100% specificity in the training data and the test data with the chosen cutoffs (Table 2). We used the ROC curves (Figure 1) to demonstrate that for very high specificities (i.e., very low false positive rates), the peptide pairs had similar or even better sensitivities (higher true positive rates) than the LAg-Avidity assay. With the chosen cutoffs for the peptide pairs, we see higher classification sensitivity than for the LAg-Avidity assay at the same specificity, which is important since the LAg-Avidity assay is the standard for cross-sectional incidence testing. Given this finding, it made sense to proceed and construct a classifier based on the peptide pairs, and to investigate what sensitivity could be achieved while still minimizing the false positive rate. Thus, we wanted to prioritize the probability of being recent given a recent classification (i.e., the positive predictive value), and therefore the precision-recall curves were used for optimal classification cutoff selection. When the minimum Euclidean distance to perfect classification in the ROC curve (100% sensitivity and 100% specificity) was chosen as the criterion for cut-off selection instead, the results were almost identical (data not shown). Moreover, similar to the observation in the ROC curve (Figure 1), each of the four peptide pairs had a higher positive predictive value than the LAg-Avidity assay at the same respective sensitivity (Supplementary Figure 5).

In comparison to the current LAg-Avidity protocol, the 4-TSP classifier showed equal or better classification accuracy for recent HIV infection. Indeed, each peptide pair had a higher classification accuracy than the LAg-Avidity assay alone, with the first peptide pair capturing most of the information about recency of infection. While the LAg-Avidity protocol and 4-TSP classifier performed equally well for identifying non-recent samples, the 4-TSP classifier more accurately identified samples from recently infected individuals. This is reflected by higher probabilities of appearing recent for truly recent samples and a mean window period more than twice as long for the 4-TSP classifier. The 4-TSP classifier is therefore more likely to provide precise annual incidence estimates without relying on viral load as a marker for non-recent infection.

At first glance, results from the validation set suggested that the LAg-Avidity protocol slightly outperformed the 4-TSP classifier. However, the validation set had very few samples from individuals with recent infection. As the LAg-Avidity protocol classifies samples with low viral load as non-recent, it is expected that the LAg-Avidity protocol would perform better on a data set consisting mostly of non-recent infections. The low number of recent samples limited our ability to fully validate the performance of the TSP classifier, and a larger independent data set with more samples from recently infected individuals is needed.

As the discovery data consisted only of samples from individuals with HIV subtype C infection, we expect poorer classification performance for other HIV subtypes, particularly strains with vastly different serologic responses such as HIV subtype D. Indeed, the 4-TSP classifier presented in this paper had poor performance on the subtype D samples that were analyzed on the same plate that contained the subtype C validation samples: 58% (7/12) of the subtype D samples from recently infected individuals and 75% (6/8) of subtype D samples from non-recently infected individuals were correctly classified. An additional limitation of this study is that the discovery and validation data only included data from African women. Thus, the classifier presented here performs well on a somewhat narrow population and should not be regarded as a general classifier. That said, the TSP approach can easily be applied to other populations and HIV subtypes to identify novel peptide signatures for recent infection, and it provides a proof of principle that approaches for cross-sectional HIV incidence estimation that do not include HIV viral load and are not impacted by viral suppression exist.

While the VirScan assay is useful for identifying candidate peptides to discern recent and non-recent infections, it is not practical for large-scale incidence testing in the field. We are currently evaluating the possibility of using the four-peptide-pair model on a multi-peptide enzyme immunoassay (EIA) platform. This commercially available and cost-effective EIA system could ultimately be used in the field as the testing platform for a multi-peptide HIV incidence assay.

Supplementary Material


ACKNOWLEDGMENTS

Funding for this work was provided by grants from the United States National Institute of Allergy and Infectious Diseases (NIAID): R01-AI095068 (Eshleman), UM1-AI068613 (Eshleman), and U24 AI118633 (Larman). Additional support was provided by the Division of Intramural Research, NIAID, NIH.

Financial disclosure

None of the authors has a financial or personal relationship with other people or organizations that could inappropriately influence (bias) their work.

Footnotes

Conflict of interest

The authors declare no potential conflict of interests. The current study used previously published data15 that had been generated on stored samples from individuals who consented to future use of their specimens for research. The institutional review board of the Johns Hopkins University approved the study on cross-sectional incidence testing (NA_00051773). The parent studies from which the samples had originated were approved by local IRBs and the research had been conducted in accordance with the principles expressed by the Declaration of Helsinki.

Data sharing

The data and code are publicly available at https://github.com/athchen/ktsp_paper/ to ensure the reproducibility of our results.

References

1. Mastro TD. Determining HIV incidence in populations: moving in the right direction. J. Infect. Dis. 2013; 207(2): 204–206.
2. Brookmeyer R, Quinn TC. Estimation of current human immunodeficiency virus incidence rates from a cross-sectional survey using early diagnostic tests. Am. J. Epidemiol. 1995; 141(2): 166–172.
3. Brookmeyer R, Laeyendecker O, Donnell D, Eshleman SH. Cross-sectional HIV incidence estimation in HIV prevention research. J. Acquir. Immune Defic. Syndr. 2013; 63(Suppl 2): S233–239.
4. Brookmeyer R. On the statistical accuracy of biomarker assays for HIV incidence. J. Acquir. Immune Defic. Syndr. 2010; 54(4): 406–414.
5. Dobbs T, Kennedy S, Pau CP, McDougal JS, Parekh BS. Performance characteristics of the immunoglobulin G-capture BED-enzyme immunoassay, an assay to detect recent human immunodeficiency virus type 1 seroconversion. J. Clin. Microbiol. 2004; 42(6): 2623–2628.
6. Wei X, Liu X, Dobbs T, et al. Development of two avidity-based assays to detect recent HIV type 1 seroconversion using a multisubtype gp41 recombinant protein. AIDS Res. Hum. Retroviruses 2010; 26(1): 61–71.
7. Murphy G, Parry JV. Assays for the detection of recent infections with human immunodeficiency virus type 1. Euro Surveill. 2008; 13(36).
8. Laeyendecker O, Brookmeyer R, Oliver AE, et al. Factors associated with incorrect identification of recent HIV infection using the BED capture immunoassay. AIDS Res. Hum. Retroviruses 2012; 28(8): 816–822.
9. Longosz AF, Serwadda D, Nalugoda F, et al. Impact of HIV subtype on performance of the limiting antigen-avidity enzyme immunoassay, the Bio-Rad avidity assay, and the BED capture immunoassay in Rakai, Uganda. AIDS Res. Hum. Retroviruses 2014; 30(4): 339–344.
10. Hayashida T, Gatanaga H, Tanuma J, Oka S. Effects of low HIV type 1 load and antiretroviral treatment on IgG-capture BED-enzyme immunoassay. AIDS Res. Hum. Retroviruses 2008; 24(3): 495–498.
11. Hallett TB, Ghys P, Barnighausen T, Yan P, Garnett GP. Errors in ‘BED’-derived estimates of HIV incidence will vary by place, time and age. PLoS ONE 2009; 4(5): e5720.
12. Hemelaar J, Elangovan R, Yun J, et al. Global and regional molecular epidemiology of HIV-1, 1990–2015: a systematic review, global survey, and trend analysis. Lancet Infect. Dis. 2019; 19(2): 143–155.
13. World Health Organization. When and how to use assays for recent infection to estimate HIV incidence at a population level. 2011.
14. World Health Organization. Guideline on when to start antiretroviral therapy and on pre-exposure prophylaxis for HIV. 2015.
15. Eshleman SH, Laeyendecker O, Kammers K, et al. Comprehensive profiling of HIV antibody evolution. Cell Reports 2019; 27: 1422–1433.e4.
16. Geman D, d’Avignon C, Naiman DQ, Winslow RL. Classifying gene expression profiles from pairwise mRNA comparisons. Statistical Applications in Genetics and Molecular Biology 2004; 3: Article 19.
17. Tan AC, Naiman DQ, Xu L, Winslow RL, Geman D. Simple decision rules for classifying human cancers from gene expression profiles. Bioinformatics 2005; 21: 3896–3904.
18. Marchionni L, Afsari B, Geman D, Leek JT. A simple and reproducible breast cancer prognostic test. BMC Genomics 2013; 14: 336.
19. Larman HB, Zhao Z, Laserson U, et al. Autoantigen discovery with a synthetic human peptidome. Nature Biotechnology 2011; 29: 535–541.
20. Xu GJ, Kula T, Xu Q, et al. Comprehensive serological profiling of human populations using a synthetic human virome. Science 2015; 348: aaa0698.
21. Mohan D, Wansley DL, Sie BM, et al. PhIP-Seq characterization of serum antibodies using oligonucleotide-encoded peptidomes. Nature Protocols 2018; 13(9): 1958–1978.
22. Morrison CS, Chen PL, Nankya I, et al. Hormonal contraceptive use and HIV disease progression among women in Uganda and Zimbabwe. J. Acquir. Immune Defic. Syndr. 2011; 57(2): 157–164.
23. Laeyendecker O, Konikoff J, Morrison DE, et al. Identification and validation of a multi-assay algorithm for cross-sectional HIV incidence estimation in populations with subtype C infection. J. Int. AIDS Soc. 2018; 21(2).
24. Afsari B, Braga-Neto UM, Geman D. Rank discriminants for predicting phenotypes from RNA expression. The Annals of Applied Statistics 2014; 8(3): 1469–1491.
25. Afsari B, Fertig EJ, Geman D, Marchionni L. switchBox: an R package for k-Top Scoring Pairs classifier development. Bioinformatics 2015; 31: 273–274.
