Skip to main content
UKPMC Funders Author Manuscripts logoLink to UKPMC Funders Author Manuscripts
. Author manuscript; available in PMC: 2025 Dec 31.
Published in final edited form as: Sci Transl Med. 2025 Nov 26;17(826):eads8680. doi: 10.1126/scitranslmed.ads8680

Epidemiological and antigenic inferences from serological cross-reactivity among arboviruses

Megan O’Driscoll 1,2,*, Nathanaël Hozé 3,4, Noémie Lefrancq 1,5, Gabriel Ribeiro dos Santos 1,6, Damien Hoinard 7, Mohammed Ziaur Rahman 8, Kishor Kumar Paul 8,9, Abu Mohd Naser Titu 10, Mohammad Shafiul Alam 8, Mohammad Enayet Hossain 8, Jessica Vanhomwegen 7, Simon Cauchemez 3, Emily S Gurley 6,8, Henrik Salje 1,*
PMCID: PMC7618549  EMSID: EMS211400  PMID: 41296827

Abstract

Multiplex immunoassays can facilitate the parallel measurement of antibody responses against multiple antigenically related pathogens, generating a wealth of high-dimensional data that depict complex antibody-antigen relationships. In this study, we developed a generalizable analytical framework to maximize inferences from multipathogen serological studies. We fit the model to measurements of immunoglobulin antibody binding against 10 arboviral pathogens from a cross-sectional study in northwest Bangladesh with 1453 participants. We used our framework to jointly infer the prevalence of each pathogen by location and age as well as between-pathogen antibody cross-reactivity. Reconstructing immunological profiles, we found evidence of endemic transmission of Japanese encephalitis virus and recent outbreaks of dengue and chikungunya viruses in this district. Our estimates of antibody cross-reactivity were highly correlated with phylogenetic distances inferred from genetic data [correlation coefficient (r) = 0.94], demonstrating how antigenic landscapes can be inferred from population-level serological studies. Furthermore, we showed how our framework could be used to identify the presence of antigenically related pathogens that were not directly tested for, representing a potential opportunity for the detection of emerging pathogens. The presented analytical framework offers a tool that can be applied to a growing number of multipathogen studies and will help support the integration of serological testing into disease surveillance platforms.

Introduction

Traditional infectious disease surveillance methods rely on testing of symptomatic individuals. However, the number of confirmed cases is often a poor indicator of the true underlying burden of infection in a population because of varying rates of subclinical infections, nonspecific symptoms, and heterogeneity in health-seeking behavior (1, 2). Population serological studies, which test for immune markers that are generated in response to an infection, such as antibodies, can provide a direct measure of the underlying burden of infection in a population (36). These studies can help to fill the gaps left by traditional disease surveillance methods, allowing an improved understanding of population susceptibility, pathogen transmission dynamics, and rates of pathogen severity (79).

Recent advances in high-throughput immunoassay technologies have facilitated the generation of large volumes of immunological data. In particular, multiplex immunoassays, which use color-coded beads to simultaneously measure multiple analytes from a single sample, offer an efficient approach to maximize the information gained from biological samples. In the context of population serological studies, this is leading to a shift in the design of such studies from previously focusing on a single antigen of interest to now being able to simultaneously test for the presence of antibodies against multiple antigens (1018). By allowing efficient characterization of population immune profiles against multiple pathogens or antigens, such approaches are paving the way for integrated serosurveillance efforts, providing important insights into guiding strategies for the control and prevention of infectious agents (1921).

Antibody cross-reactivity has long posed a challenge to the interpretation of serological data in regions where antigenically related pathogens cocirculate (2224). In these contexts, it can be difficult to determine whether antibodies measured against the pathogen of interest were caused by an infection with that specific pathogen or by a related pathogen with similar structural antigenic regions that the antibody can recognize and bind to. Measurements of antibody responses against multiple antigenically related pathogens generate a wealth of high-dimensional data depicting complex antibody-antigen relationships, providing a pathway to characterizing these cross-reactive dynamics (25, 26). For instance, the degree of correlation in antibody responses measured against different antigens at the population level can provide an indication of their antigenic similarity. Analytical methods that quantify and account for these cross-reactive relationships can allow robust characterization of population immune profiles and underlying infection burdens. In addition, cross-reactive relationships provide a potential opportunity to detect emerging pathogens for which it is otherwise not possible to directly test.

Arboviruses, including flaviviruses and alphaviruses, are key examples of diverse viral families that represent ongoing threats to human health. Members of these viral families continue to emerge into the human population, and those that are already established in human transmission cycles continue their geographic expansion, causing increasing burdens to health care systems (27). Monitoring of infection burdens and understanding the antigenic landscape of such viral pathogens are key to global health. In this study, we developed an analytical framework for the analysis of multipathogen serological studies to jointly infer the prevalence of past infections and between-pathogen antibody cross-reactivity. We demonstrate the utility of this framework through application to a cross-sectional serological study of arboviruses in the Chapai Nawabganj district in northwest Bangladesh. Although cases of Japanese encephalitis virus (JEV) are regularly reported in this area and a chikungunya virus (CHIKV) outbreak was documented in recent years, the underlying burden of infection of these and other arboviruses in this district is not well understood (28, 29).

Results

Three circulating viruses explained antibody responses against 10 pathogens in a Bangladeshi population

We analyzed serological samples from 1453 individuals participating in a cross-sectional study conducted in 2014 in the Chapai Nawabganj district, located in northwest Bangladesh (Fig. 1A). Individuals were enrolled from 39 communities across all five subdistricts with an age range of 2 to 90 years. We used an arbovirus multiplex immunoassay, previously validated using sera from experimentally infected horses (30). Concentrations of binding immunoglobulin G (IgG) antibodies were measured against antigens of 10 pathogens, comprising 9 flaviviruses and 1 alphavirus, using a Luminex platform: dengue virus serotypes 1, 2, 3, and 4 (DENV1 to DENV4); JEV; West Nile virus (WNV); yellow fever virus (YFV); Zika virus (ZIKV); tick-borne encephalitis virus (TBEV); and CHIKV. The domain III of the envelope protein (EDIII) was used as the target antigen for each flavivirus, whereas the E2 protein was used for the alphavirus CHIKV. In addition, samples were tested using a commercial DENV enzyme-linked immunosorbent assay (ELISA) that used the whole E protein. Values of median fluorescence intensity against each antigen from the multiplex immunoassay were divided by individual-level responses to a background control (SNAP-tag) to obtain a relative fluorescence intensity (RFI) measurement. RFIs by antigen and upazila (subdistrict) of residence were determined (Fig. 1B), with respective population distributions of antibody concentrations. Simple Pearson correlation coefficients (r values) calculated for RFI titers against each antigen pair revealed high correlations in titers against JEV and WNV (r = 0.85), between DENV1 and DENV3 (r = 0.80), and between TBEV and YFV (r = 0.74) (Fig. 1C and fig. S1). In contrast, we observed low correlation between RFIs measured against CHIKV and all flavivirus antigens, with correlation coefficients ranging from −0.20 to 0.11.

Fig. 1. Multipathogen serological study data.

Fig. 1

(A) Map of Chapai Nawabganj district and its five subdistricts. The inset plot shows the location of the Chapai Nawabganj district in Bangladesh, indicated in red. (B) Heatmap of the log RFI IgG antibody titer values for multiplex antigens and log titers for the DENV ELISA assay. Column panels show the antibody binding concentrations for each of the five subdistricts. Gray bars on the right show the population distribution of log titer values for each antigen. (C) Heatmap of Pearson r correlation coefficients of population log antibody titers for each multiplex antigen pair.

We developed a semimechanistic multivariate mixture model to jointly infer pathogen-specific prevalence of past infections, denoted “prevalence” throughout, and between-pathogen antibody cross-reactivity. Mixture models allowed probabilistic inferences of individual infection statuses rather than binary classification, enabling the propagation of uncertainty from the individual level into population-level estimates. Applying our framework to the arbovirus serological data, we conducted a stepwise variable selection process to determine which pathogens were present, i.e., which have transmitted within the study population, and best explain the population antibody titer distributions against all antigens. Fitting to log antibody data from all 11 antigens (multiplex immunoassay and ELISA) in the same framework, we first assumed only a single pathogen to be present such that antibody titers against the remaining pathogens can only be explained by negative infection statuses with potential cross-reactivity from the present pathogen. Given that classic information criterion metrics such as the Akaike information criterion or deviance information criterion do not perform well in cases of mixture models of varying components, we instead used likelihood increment percentage (LIP) metrics to assess the relative improvements in model fit with increasing model components (Supplementary Materials and Methods) (31). Allowing each pathogen to be present in turn, we retained the present pathogen that resulted in the highest model log-likelihood before adding a second present pathogen. We repeated this process of adding present pathogens until the log-likelihood increment percentage per component (LIPpc) fell below 1%. We found the highest support for the presence of DENV1 in the study population, followed by CHIKV (LIPpc = 9.2%) and JEV (LIPpc = 5.0%) (Fig. 2A). Inclusion of a fourth present pathogen resulted in LIPpc values below 1%, and we therefore assumed that all of the remaining pathogens were absent from the study population. The selected present pathogens from this variable selection process were consistent across the three model versions considered, including a base model where a single value of prevalence is estimated by pathogen, a location-specific model where prevalence is estimated by subdistrict, and a location- and age-specific model where prevalence is estimated by subdistrict and age group (tables S1 to S3). We found that the location- and age-specific model version performed best by log-likelihood, and we therefore focus on the results of this model for the remainder of the manuscript. Model fit from this best-performing model is shown in fig. S2, and parameter estimates are shown in tables S4 to S6. In simulation studies, we also found that model versions that allow pathogen-specific prevalence to vary in the population (e.g., by location or age) are better able to accurately estimate true parameter values compared with those estimating a single population prevalence per pathogen (Supplementary Materials and Methods).

Fig. 2. Prevalence of past infections by space and age.

Fig. 2

(A) Increase in log-likelihood estimates from the full model as additional pathogens are assumed to be present. (B) A comparison of prevalence estimates across models is shown. Points and lines show the median and 95%Crl estimates of prevalence by pathogen, respectively. The gray points and lines show estimates from a simple mixture model (stars) and threshold method (triangles) fit independently to antibody data for each pathogen. Colored points and lines show estimates from the full model framework, progressing from including one present pathogen (blue) to three present pathogens (yellow). (C) Maps of median prevalence estimates by subdistrict for each present pathogen from the final full model. (D) Prevalence estimates by age are shown for each present pathogen. Points and lines show the median and 95%Crl estimates of prevalence from the final multivariate mixture model, respectively. The green lines and ribbons show the median and 95%Crl estimates of population prevalence for DENV1 and CHIKV, respectively. The red line and shaded ribbon show the median and 95%Crl estimates from a catalytic model for JEV, respectively, assuming endemic transmission since its estimated introduction time to the region.

Antibody profiles revealed endemic transmission of JEV and epidemics of DENV and CHIKV

Our final model estimated an overall prevalence of past infections of 17.0% [95% credible interval (95%CrI): 15.3 to 18.8%] for DENV1, 12.9% (95%CrI: 11.3 to 14.6%) for CHIKV, and 25.1% (95%CrI: 22.9 to 27.5%) for JEV in the study population (Fig. 2B). Reconstructing the most probable infection status of each individual, we estimated that 3.5% had been infected by both DENV1 and CHIKV, 2.5% by both DENV1 and JEV, 2.3% by both CHIKV and JEV, and 0.8% by all three. We found that pathogen-specific prevalence estimates did not substantially change as additional present pathogens were added (Fig. 2B). To understand how these estimates would compare with those derived from simpler analytical approaches that are traditionally used, we applied classic single-dimension (1D) mixture models and threshold approaches to antibody data independently for each pathogen. We found prevalence estimates for DENV1 and CHIKV to be consistent across models, whereas classic mixture model and threshold estimates of prevalence for DENV2, DENV4, TBEV, YFV, and ZIKV were close to zero, roughly aligned with inferences from our model framework. However, classic mixture models estimated a prevalence of 10.6% (95%CrI: 8.9 to 12.4%) for DENV3 and 7.5% (95%CrI: 5.7 to 9.5%) for WNV, inconsistent with our model framework, which found these antibody responses to be explained by cross-reactivity from DENV1 and JEV (Fig. 2B). In a simulation study, we further showed how bias in prevalence estimates obtained by classic 1D mixture models increased as both cross-reactivity from a related pathogen and prevalence of the related pathogen increased (Supplementary Materials and Methods).

Our model framework estimated substantial spatial variability in infection burden across the Chapai Nawabganj district. The prevalence of DENV1 ranged from 5.9% (4.0 to 8.3%) in Gomastapur in the north of the district to 30.3% (95%CrI: 25.8 to 34.6%) in Shibganj in the south (Fig. 2C). CHIKV prevalence was estimated to range from 4.4% (95%CrI: 2.1 to 7.8%) in Nachole to 28.4% (95%CrI: 24.3 to 32.7%) in Shibganj, consistent with where an outbreak had previously been detected (29). The prevalence of JEV infection ranged from 18.6% (95%CrI: 14.9 to 23.0%) in Shibganj to 33.7% (95%CrI: 27.4 to 40.4%) in Bholahat (Fig. 2C). Age-specific prevalence of DENV1 and CHIKV appeared constant among those ≥5 years of age (Fig. 2D). The lower prevalence estimated among 0- to 4-year-olds for DENV1 and CHIKV suggests the occurrence of an epidemic for each of these viruses sometime in the 5 years before the serological study (2009 to 2014). In contrast, we observed a pattern of increasing JEV prevalence by age in this district, indicative of endemic transmission (Fig. 2D). To characterize the transmission intensity of JEV, we fit simple catalytic models to age-specific prevalence estimates, assuming endemic transmission of JEV since its introduction to Bangladesh. Patterns in the age-specific prevalence estimates suggested the most likely JEV introduction date to be between 1965 and 1970 (Fig. 2D and fig. S3), roughly aligned with the first detected human cases in Bangladesh in 1977 (32, 33). We estimated that 1.2% (95%CrI: 1.1 to 1.3%) of the susceptible population in this region has been infected each year since JEV’s introduction, corresponding to ~19,179 (95%CrI: 17,347 to 21,128) annual JEV infections in the Chapai district (Fig. 2D). Estimates of transmission intensity of JEV varied from 0.8% (95%CrI: 0.6 to 1.1%) in Nawabganj Sadar to 1.7% (1.4 to 2.0%) in Gomastapur (fig. S4). We found our prevalence estimates to be consistent with or without inclusion of the DENV ELISA assay in model fitting (fig. S5). Estimates of JEV prevalence, however, were slightly higher when the DENV ELISA assay was included in model fitting, consistent with the lower sensitivity reported for the multiplex JEV EDIII antigen used in this study (30) and the high sensitivity of whole–E protein DENV ELISA assays.

Cross-reactivity estimates align with genetic distances between antigens

From this best-performing model with three present pathogens, we estimated the highest antibody cross-reactivity from JEV to WNV with a relative titer increase of 0.71 (95%CrI: 0.68 to 0.74) against WNV among JEV-positive individuals (Fig. 3A). This implied that among those infected with JEV, an average of 71% of their response to the JEV antigen also bound to the WNV antigen. JEV infection was also estimated to induce a 0.16 (0.13 to 0.19) relative titer increase against YFV and a 0.13 (0.09 to 0.16) relative titer increase against DENV1. Among individuals with a past DENV1 infection, we estimated a 0.54 (95%CrI: 0.52 to 0.55) relative titer increase against DENV3 as well as 0.18 (0.16 to 0.20) and 0.19 (0.17 to 0.20) relative titer increases against DENV2 and DENV4, respectively (Fig. 3A). In contrast, we estimated minimal cross-reactivity from individuals with past CHIKV infection against the flavivirus antigens, with median estimates ranging from 0.00 to 0.03 (Fig. 3A). To summarize the antigenic relationships inferred by the model, we applied multidimensional scaling of the estimated antibody responses across infection statuses (Fig. 3B). We observed close antigenic clustering of the DENV serotypes, particularly DENV2 and DENV4, although the positions of these DENV antigens are informed only by the positive sera of a single DENV serotype (DENV1). We also inferred a close antigenic relationship between JEV and WNV, members of the same serocomplex (23), as well as between TBEV and YFV (Fig. 3B). To understand how our inferred antigenic distances compared with phylogenetic distances in the Flaviviridae family, we constructed a maximum-likelihood phylogeny of representative flavivirus EDIII proteins. We found high correlation between our inferred antigenic distances and genetic distances with Pearson r = 0.94 (Fig. 3C). The shortest genetic and antigenic distances were estimated between JEV and WNV and between DENV1 and DENV3, whereas the greatest antigenic and genetic distances were inferred between DENV1 and TBEV and between DENV1 and YFV (Fig. 3C).

Fig. 3. Antigenic relationships.

Fig. 3

(A) Model estimates of between-pathogen antibody cross-reactivity. Colored bars show the median model estimates of relative titer increases from each infecting pathogen against each noninfecting pathogen. Black lines show the 95%Crl estimates. (B) Antigenic cartography map of inferred antigenic distances between each serum sample (gray squares) representing binding IgG antibodies and antigen (purple circles), obtained by multidimensional scaling of the cross-reactivity estimates to a two-dimensional space. (C) Comparison of the inferred antigenic distances and genetic distances inferred using phylogenetic methods (blue points), considering only flavivirus pathogen pairs (i.e., excluding CHIKV), based on EDIII. The blue line and shaded ribbon show the mean and 95% confidence interval model fit of a linear regression with r = 0.94, respectively. (D) Observed (gray bars) and reconstructed log DENV ELISA population titer distributions, where the purple line and shaded ribbon show the model median and 95%Crl estimates, respectively. (E) Reconstructed log DENV ELISA titer distributions by infection status where colored lines and shaded ribbons indicate median and 95%Crl estimates, respectively. The gray dashed line in (D) and (E) shows the manufacturer-recommended threshold for defining DENV seropositivity.

To understand the relationship between each multiplex antigen and the more-sensitive DENV ELISA antigen, we estimated where the antibody titer distributions of each infection status fall on the DENV ELISA assay. The full reconstructed distribution of DENV ELISA titers is shown in Fig. 3D. Of the present pathogens included in our model, we estimated that DENV1-positive individuals had the highest response on the DENV ELISA assay with mean titers of 3.5 (95%CrI: 3.5 to 3.6), shown in Fig. 3E. We also estimated a substantial signal from individuals with past JEV infection to the DENV ELISA assay, with mean titers of 1.8 (95%CrI: 1.7 to 1.9). In contrast, individuals with past CHIKV infection had mean DENV ELISA titers of −0.2 (95%CrI: −0.4 to 0.0), similar to those of individuals negative for all pathogens (mean DENV ELISA titer −0.3; 95%CrI: −0.4 to −0.2) (Fig. 3E). Using the manufacturer-recommended cut-off for defining DENV seropositivity resulted in a prevalence of 22.0% (95%CI: 19.9 to 24.3%), slightly higher than our estimates of DENV1 prevalence of 17.0% (95%CrI: 15.3 to 18.8%), inferred by the full model. Results from our model indicated that the difference in estimates is attributed to individuals previously infected with JEV who have high responses on the DENV ELISA assay (Fig. 3E), giving the appearance of increasing DENV ELISA titers by age (fig. S6). The DENV ELISA assay uses the whole E protein and is therefore more sensitive to cross-reactivity compared with the EDIII used in the multiplex assay. Taking the DENV ELISA results on their own could easily be misinterpreted as endemic transmission of DENV.

Cross-reactive signals allow detection of unobserved pathogens

A key uncertainty that arises in multipathogen serological studies is the potential presence of cross-reactive antibodies induced from unmeasured or unknown pathogens. This raises the potential issue of misattributing antibody responses to related pathogens that have been tested for. We therefore used our analytical method to explore the scenarios in which the presence and prevalence of an unobserved pathogen can be inferred through cross-reactive antibody responses. We conducted simulations of a three-pathogen system, where two pathogens are observed or tested for and one is not. We found that estimates of the prevalence of observed pathogens (O) and the unobserved pathogen (X) can both be accurately inferred at intermediate values of cross-reactivity from X to O, with a low root mean square error (RMSE) (Fig. 4A). At lower values of cross-reactivity from X to O, the prevalence of observed pathogens O can be estimated with good accuracy because of limited interference from X. However, the prevalence of X cannot be well identified when its true prevalence and its cross-reactivity to observed pathogens are both low. Moreover, at higher values of cross-reactivity from X to O, the prevalence of X cannot be accurately reconstructed, with the prevalence of O also becoming unidentifiable as the prevalence of X increases (Fig. 4A).

Fig. 4. Detecting unobserved pathogens.

Fig. 4

(A) Heatmap of the RMSE of prevalence estimates obtained from simulated data for an observed pathogen, O, and an unobserved pathogen, X, with varying cross-reactivity from X to O and varying prevalence of the unobserved pathogen X. (B to D) Comparisons of model estimates of pathogen prevalence when antibody titers against the homologous antigen are observed and included in the model (orange) and when they are excluded/unobserved but reconstructed as an unobserved pathogen X. Results are shown for DENV (B), CHIKV (C), and JEV (D). Points and lines indicate model median and 95%Crl estimates, respectively.

To assess the generalizability of these findings to real data, we refit our final model to the arbovirus antigens, excluding 1 of the 10 antigens and attempting to reconstruct the parameters for this unobserved pathogen. We showed that our model was able to accurately reconstruct the prevalence of both DENV1 and JEV as unobserved pathogens when antibody measurements from these antigens were excluded from model fitting (Fig. 4, B to D). In addition, the model accurately reconstructed estimates of cross-reactivity from the unobserved pathogens to each observed antigen, including close antigenic relationships between JEV and WNV and between DENV1 and DENV3 (fig. S7). In practice, if the DENV1 or JEV antigens are missing, we may likely conclude that DENV3 (instead of DENV1) or WNV (instead of JEV) is present instead. In contrast with the case of DENV1 and JEV, we showed that when the CHIKV antigen is excluded from fitting, the model cannot accurately reconstruct the parameters of CHIKV. This is consistent with the simulation results that demonstrated that the lack of cross-reactivity between CHIKV and flavivirus antigens inhibits its identification, whereas intermediate cross-reactivity between JEV and DENV1 with the other flavivirus antigens enables their accurate reconstruction.

Discussion

The transmission of multiple antigenically related pathogens in a population has long posed a challenge to the interpretation of serological studies of flaviviruses and other antigenically variable pathogens. In contexts where multiple related, cross-reacting pathogens have been transmitted in a population, it can be difficult to understand whether antibodies detected against a specific pathogen were induced through exposure to that pathogen itself or to a related pathogen. In this study, we applied an analytical framework to multipathogen serological data to disentangle the multidimensional antigen-antibody relationships. We used this framework to infer estimates of arboviral prevalence in northwest Bangladesh by location and age, accounting for varying degrees of between-pathogen antibody cross-reactivity.

A key question arising from multipathogen serological studies is determining which pathogens have truly transmitted in the study population. To date in Bangladesh, there has been no evidence of human transmission for YFV or TBEV, whereas human cases of DENV, CHIKV, JEV, and, more recently, ZIKV and WNV have all been reported in the country (28, 29, 32, 3436). Among the pathogens included in the study, we found evidence of past DENV, CHIKV, and JEV transmission in the district of Chapai Nawabganj, whereas no strong evidence for the presence of the remaining pathogens was found. Our prevalence estimates for this district (17, 13, and 25% for DENV, CHIKV, and JEV, respectively) are lower than national estimates for DENV (24%) but higher than national estimates for CHIKV (2.4%) and JEV (3.4%) (7, 37, 38). We showed that simpler methods that do not account for antibody cross-reactivity would have inferred the presence of WNV, whereas, in our model, WNV antibodies were found to be explained by high cross-reactivity from JEV infections. This is consistent with local epidemiological data, with confirmed JEV hospitalized cases regularly reported in this district (28, 33). In contrast, the first diagnosed human case of WNV in Bangladesh was reported in 2019 in Dhaka, 5 years after the completion of the serological study (36). We inferred the highest prevalence of DENV and CHIKV in the more urbanized district of Shibganj, where a past epidemic of CHIKV had been reported, consistent with the Aedes vector’s ability to thrive in urban environments (29). For ZIKV, the first detected human case in Bangladesh was in August 2014, with serological evidence of ZIKV infections in Dhaka since 2013 (34, 39). However, we found no evidence of ZIKV transmission in the district of Chapai Nawabganj as of 2014.

Our analysis explored how a combination of sensitive and specific antigens can be leveraged as a means of detecting the presence and prevalence of unknown or unmeasured pathogens. We demonstrated how signals from related cross-reacting pathogens could make the detection of unobserved pathogens possible under conditions of intermediate cross-reactivity profiles. An important limitation to this is that the relationship of the unobserved pathogen to the measured pathogens, and therefore the reliability of inference, will not be known in many real-world applications. Future work to define robust criteria or strength of evidence for the detection of unobserved pathogens through serological surveillance will be important for public health applications. Routine monitoring of population serological profiles over time could allow real-time detection of previously unidentified pathogens through shifts in baseline cross-reactive profiles. The major emerging pathogens of recent years have all belonged to families of known human pathogens, with coronaviruses, flaviviruses, and influenza viruses being the predominant examples. Our findings highlight how emerging pathogens of known pathogen families could be detected through routine serological testing, supporting the potential benefits of a global immunological observatory and routine serological surveillance (40, 41).

There are a number of limitations that should be considered when interpreting the results of our analyses. Antibody kinetics are not accounted for in our model framework, which relies on cross-sectional serological data representing a snapshot of population antibody titers at a single point in time. Because the timing and sequence of infections cannot be accounted for, our model estimates will therefore reflect population averages across a range of time-since-infection scenarios represented in the study population. If antibodies wane substantially after infection, then the prevalence of infection may be underestimated. Our approach also assumes lifelong homotypic immunity against reinfection and no postinfection boosting. However, in the case of the arboviruses measured in this study, infection is thought to induce long-term homotypic IgG responses (42, 43). In addition, if an infection induces higher antibody responses against a related pathogen than the infecting pathogen, then the infection statuses may be misclassified. We assume that cross-reactive antibodies from infecting pathogens to any non-infecting pathogen combine additively, which may not hold for individuals infected with an increasing number of related pathogens. Our model represents an unsupervised approach to classifying population infection statuses, which has the advantage of not relying on validated sample sets that can be time- and resource-intensive to obtain. Although our extensive simulation study demonstrated the theoretical validity of our analytical framework, future analysis from serological cohort studies where infections are confirmed by polymerase chain reaction or other methods will allow for an improved understanding of model performance.

Despite these limitations, the antigenic distances between assay antigens and monotypic sera inferred by our model were found to be highly consistent with genetic distances between antigens independently inferred from phylogenetic methods. In addition, our inferences of arbovirus infection burden are consistent with known epidemiological indicators of each present pathogen. Previous studies have analyzed serological data considering two related pathogens using additional assumptions of pathogen transmission dynamics (25, 26). The analytical framework presented here is agnostic to any specific pathogen transmission dynamics, allowing it to scale to many pathogens with potentially different transmission dynamics. This framework has been shown to provide robust estimates of the prevalence of past infections in systems of several related pathogens while simultaneously accounting for varying amounts of between-pathogen cross-reactivity. In this study, we demonstrated the utility of this analytical approach to making robust epidemiological inferences of public health relevance. This framework could disentangle high-dimensional antibody-antigen relationships of multiple related pathogens, providing insights into the burden and transmission dynamics, as well as the immune landscape, of arboviral diseases.

Materials and Methods

Study design

We analyzed data from a cross-sectional seroprevalence study conducted in 2014 in the Chapai Nawabganj zila (district) of Bangladesh, located in the Rajshahi division. A total of 1453 individuals were included in the study from 39 communities across the five upazilas (subdistricts) of Chapai Nawabganj. Participant recruitment in each rural community was initiated by identifying the household where the most recent wedding took place and beginning recruitment in the house of their closest neighbor. In each urban community, recruitment was initiated by finding the nearest community center and identifying the closest neighboring household for participant recruitment. All residents of selected houses, regardless of age, were eligible for enrollment in the study. After the first enrolled house-hold, the next five closest neighbors were skipped, and the sixth closest household was approached for enrollment. This process was repeated until individuals from at least 15 different households were enrolled from that community. Individuals unable to give consent because of disability or who had an acute medical condition where blood collection is contraindicated were excluded from study participation. A single 5-ml venous blood sample was collected from each enrolled participant. The age distribution of the study participants compared with the population age distribution for each of the five upazilas is shown in fig. S8. Data on the population age distributions were obtained from the 2011 Bangladesh census (44). The serological study was approved by the ethical review boards at the US Centers for Disease Control and Prevention; International Centre for Diarrhoeal Disease Research, Bangladesh (PR-13001); Johns Hopkins Bloomberg School of Public Health; and Institut Pasteur. Informed consent, and assent when appropriate, was obtained from all study participants.

Multiplex immunoassay

Serum samples were tested for the presence of IgG antibodies against 10 antigens using an in-house microsphere-based multiplex immunoassay. Recombinant EDIII antigens were used for all flavivirus pathogens (DENV1, DENV2, DENV3, DENV4, JEV, WNV, TBEV, YFV, and ZIKV) to improve assay sensitivity and specificity, as previously described (30), and a recombinant E2 glycoprotein was used for CHIKV. The assay was previously validated using 172 negative and 300 positive control sera from horses infected with WNV, JEV, and TBEV (30). Magnetic beads were coupled to each specific antigen, and the assay was conducted as previously described, including determining individual-level background controls using a recombinant human protein SNAP-tag (O6-methylguanine DNA methyl-transferase) (30). Fluorescence intensity and bead color coding were measured using a Luminex 200 system (Bio-Rad Laboratories). For each sample, an RFI value was calculated by dividing the median fluorescence intensity by the fluorescence intensity of the individual-level background control. All analyses of multiplex immunoassay data were conducted using log RFI values. In addition to the 10 antigens measured with the multiplex immunoassay, a Panbio DENV ELISA assay using the whole DENV E protein was also used to measure binding DENV IgG antibody concentrations.

Multivariate mixture model

We developed and applied a semimechanistic multivariate Gaussian mixture model to jointly infer the prevalence of infection of multiple pathogens and between-pathogen antibody cross-reactivity. We assumed the long-term persistence of IgG antibodies after infection and therefore defined positivity on the basis of the presence or absence of serological evidence of a past infection. We considered a system of n = P pathogens, where the true infection status of each individual is either negative, (0), or positive, (1), for each pathogen. For a system of P pathogens with two possible outcomes, there are 2P possible infection status combinations. For each possible infection status combination, c, we define a P-dimensional Gaussian component, with vector of means μc and covariance matrix Σc. The proportion of the study population with infection status combination c, θc, was defined as the conditional probability of having infection status combination c, given the pathogen-specific population prevalence, πp. For example, for infection status combination c = {0,1,0} where individuals have been infected by pathogen B only, the proportion of the population with this infection status is calculated as θc = (1 −πA) (πB) (1 − πC).

We first considered a system of two related pathogens, A and B. For each of the four possible infection status combinations, we defined a bivariate Gaussian component to characterize the antibody titers of this status. For individuals negative for both A and B, {A0, B0}, the Gaussian component is defined with means μA0 and μB0 and standard deviations σA0 and σB0. We allowed for correlation in these negative antibody titers, with correlation coefficient ρ0. The covariance matrix, Σ, for this Gaussian component was then defined as shown in Eq. 1, where the covariance of A and B is Cov(A0,B0)=ρ0σA0σB0

Σ{A0,B0}=(σA02ρ0σA0σB0ρ0σA0σB0σB02) (1)

For individuals infected with pathogen B only, {A0, B1}, we defined the mean rise in antibody titers against pathogen B as μBR such that the mean antibody titers against pathogen B are μB1=μB0+μBR, with standard deviation σB1. We assumed that this bivariate Gaussian component was a linear combination of negative antibody titers against pathogen A and positive antibody titers against pathogen B. In this way, the mean antibody titers against pathogen A were calculated as μA0+ϕBAμB1. Here, ϕBA is the relative increase in antibody titers raised against pathogen A after infection with pathogen B relative to the antibody titers binding to pathogen B. The covariance in antibody titers for this Gaussian component was calculated as shown in Eqs. 2 and 3 using the bilinearity property of covariance. Here, we assumed that the covariance in these antibody titers in the absence of cross-reactivity was zero, Cov (A0, B1 = 0). Under this assumption, the standard deviation for antibody titers against pathogen A can be calculated as σA02+(ϕBAσB1)2. The covariance matrix for this bivariate Gaussian component can subsequently be defined as shown in Eq. 4

Cov(A0+ϕBAB1,B1)=Cov(A0,B1)+ϕBACov(B1,B1) (2)
Cov(A0+ϕBAB1,B1)=ϕBAσB12 (3)
Σ{A0,B1}=(σA02+(ϕBAσB1)2ϕBAσB12ϕBAσB12σB12) (4)

For individuals infected with A only, {A1, B0}, the bivariate Gaussian component was defined with the same approach as before. The Gaussian means were defined as μA0+μAR and μB0+ϕABμA1, respectively, for antibody titers against pathogens A and B, and the standard deviations were defined as σA1 and σB02+(ϕABσA1)2. With the same approach as above (Eqs. 2 and 3), the covariance matrix of this Gaussian component was defined as shown in Eq. 5. We assumed that cross-reactive antibodies cannot decrease antibody titers, i.e., ϕ ≥ 0. In addition, we allowed for asymmetric cross-reactive antibody responses such that ϕBA is independent of ϕAB

Σ{A1,B0}=(σA12ϕABσA12ϕABσA12σB02+(ϕABσA1)2) (5)

Last, for the infection status combination {A1, B0}, where individuals have been infected with both pathogens A and B, we defined the means of the Gaussian component as μA1=μA0+μAR and μB1=μB0+μBR. Without knowing the sequence or timing of infections, we assumed that the means and standard deviations of positive antibody titers were constant across infection statuses. Therefore, the standard deviations of this Gaussian component were simply defined as σA1 and σB1 with correlation coefficient ρ1. The covariance matrix for this Gaussian component was then defined as shown in Eq. 6

Σ{A1,B1}=(σA12ρ1σA1σB1ρ1σA1σB1σB12) (6)

More generally, for a system with n = P pathogens, the number of possible infection status combinations, and therefore the number of P-dimensional Gaussian components used to describe the data, is 2P. To reduce the number of new parameters required when scaling the system to higher dimensions, we assumed that correlation in antibody titers when individuals are negative to all pathogens, ρ0, is constant across pathogen pairs. For instance, the covariance matrix for infection status {A0, B0, C0} in a three-pathogen system is defined as shown in Eq. 7.

Σ{A0,B0,C0}=(σA02ρ0σA0σB0ρ0σA0σC0ρ0σA0σB0σB02ρ0σB0σC0ρ0σA0σC0ρ0σB0σC0σC02) (7)

For infection statuses where individuals have been infected with one pathogen, the covariance in antibody titers against the infecting pathogen and noninfecting pathogens was calculated with the same approach as before (Eqs. 2 and 3). The covariance in antibody titers for pairs of noninfecting pathogens, now in the presence of cross-reactive antibodies from a related pathogen, was calculated as shown in Eqs. 8 to 10, taking the example of infection status {A0, B0, C1}, where individuals are positive for pathogen C only. As before, we assumed antibody titers against infecting and noninfecting pathogens to be independent in the absence of cross-reactivity such that Cov (A0, C1) = 0 and Cov (B0, C1) = 0. The covariance matrix for this infection status is then defined as shown in Eq. 11

Cov(A0,B0C1)=Cov(A0+ϕCAC1,B0+ϕCBC1) (8)
Cov(A0,B0C1)=Cov(A0,B0)+ϕCBCov(A0,C1)+ϕCACov(C1,B0)+ϕCAϕCBCov(C1,C1) (9)
Cov(A0,B0C1)=ρ0σA0σB0+ϕCAϕCBσC12 (10)
Σ{A0,B0,C1}=(σA02+(ϕCAσC1)2ρ0σA0σB0+ϕCAϕCBσC12ϕCAσC12ρ0σA0σB0+ϕCAϕCBσC12σB02+(ϕCBσC1)2ϕCBσC12ϕCAσC12ϕCBσC12σC12) (11)

In the case of infection status combinations where individuals have been infected with more than one pathogen, we assumed that the means and standard deviations of cross-reactive antibody titers against noninfecting pathogens increased additively. For instance, for an infection status {A0, B1, C1}, the mean of the Gaussian component characterizing titers against pathogen A was calculated as μA0+ϕBAμB1+ϕCAμC1. Similarly, the standard deviation for antibody titers against pathogen A was calculated as σA02+(ϕBAσB1)2+(ϕCAσC1)2. The covariance between negative and positive titers was calculated as shown in Eq. 12, additionally accounting for the correlation in antibody titers for the positive pathogens, ρ1. We assumed that this correlation in antibody titers between pairs of positive pathogens was constant across all pathogen pairs and infection status combinations. The covariance matrix for infection status {A0, B1, C1} is shown in Eq. 13

Cov(A0,B1C1)=ϕBAσB12+ϕCAρ1σB1σC1 (12)
Σ{A0,B1,C1}=(σA02+(ϕBAσB1)2+(ϕCAσC1)2ϕBAσB12+ϕCAρ1σB1σC1ϕCAσC12ϕBAσB12+ϕCAρ1σB1σC1σB12ρ1σB1σC1ϕCAσC12ρ1σB1σC1σC12) (13)

More generally, in the case of covariance in antibody titers against pairs of noninfecting pathogens, in the presence of >1 infecting pathogen, the covariance was calculated as shown in Eq. 14, with K total pathogens. In the case of covariance in antibody titers for infecting and noninfecting pathogens, in the presence of >1 infecting pathogen, the covariance was calculated as shown in Eq. 15

Cov(A0,B0C1,D1,K1)=ρ0σA0σB0+n=CKϕnAϕnBσn12+n=CK1m=n+1K(ϕnAϕmB+ϕmAϕnB)ρ0σ02 (14)
Cov(A0,B1C1,D1,K1)=ϕBAσB12+n=CKϕnAρ1σB1σn1 (15)

Present versus absent pathogens

To understand which pathogens are likely to have transmitted in the population, we conducted a stepwise variable selection process, comparing the performance of models that assumed a pathogen to be present versus absent. In the case of pathogens that are assumed to be absent from the study population, the model can be simplified by reducing the number of possible infection status combinations and therefore the number of Gaussian components needed to describe the data. For instance, considering a system of three pathogens, A, B, and C, there are eight possible infection status combinations if all three pathogens have transmitted in the population. In contrast, if pathogen C can be assumed to be absent from the study population (i.e., has never transmitted in the population), then the number of infection status combinations is halved. In this way, the antibody titers against the absent pathogen were still included in model fitting and were characterized only by the negative cross-reactivity Gaussian components/infection statuses. The number of parameters was also reduced, with π and μ1 parameters no longer estimated for each absent pathogen. Cross-reactivity parameters ϕ from the absent pathogens to the present pathogens were also not estimated, but cross-reactivity from the present pathogens to the absent ones was estimated.

Unobserved pathogens

We explored the ability of the model framework to accurately reconstruct the prevalence of past infection of some unmeasured/unobserved pathogen, “pathogen X,” that had not been tested for by the serological assay. Additional Gaussian components were fit to characterize the antibody titer distributions of pathogen X, with 2P+1 possible infection status combinations. Covariance matrices for each infection status combination were of size PP, where P is the number of observed pathogens. The prevalence of the unmeasured pathogen was estimated, as was the cross-reactivity from the unobserved pathogens to the measured pathogens. Because the homologous binding to pathogen X is unknown, we assumed some fixed value of mean titers against pathogen X from which the relative titer increase against other pathogens could be inferred. We assumed that the standard deviation of these positive titers against pathogen X was equal to the standard deviation of positive responses against the measured pathogens.

Model fitting

We applied the model framework to the arbovirus serological data from Chapai Nawabganj. We considered the data as being representative of the 10 pathogens that were tested using the multiplex immunoassay (CHIKV, DENV1, DENV2, DENV3, DENV4, JEV, TBEV, WNV, YFV, and ZIKV) and an additional antigen (DENV ELISA assay), giving a total of 11 antigens. We assumed that the standard deviations of negative antibody responses, σ0, and positive antibody responses, σ1, were each constant across the multiplex immunoassay antigens. We also assumed that correlation in antibody responses against pairs of infecting pathogens, ρ0, was 0. We fit the multivariate mixture model framework jointly to data from all 11 antigens and conducted a stepwise variable selection process to identify which of the 10 pathogens were likely to be present in the study population. We considered a “base” model version that estimates a single prevalence per pathogen, as well as location-specific and location- and age-specific versions. In the location-specific versions, pathogen-specific prevalence was allowed to vary by each of the five subdistricts of Chapai Nawabganj. For the location- and age-specific version, we further allowed pathogen prevalence to vary by age group. We assumed all other parameters to be constant across the study population, allowing only pathogen-specific prevalence to vary in the population. Model run times with increasing present pathogens are shown in fig. S9. All code used for model fitting is available at https://doi.org/10.5281/zenodo.17340210, with additional resources at https://github.com/meganodris/MultiSero.

Variable selection process

We started by assuming that only 1 of the 10 pathogens was present in the study population such that antibody titers for the remaining 9 pathogens must be explained by negative infection statuses and cross-reactive antibodies from the single present pathogen. We fit this model, assuming each of the 10 pathogens to be present in turn, and identified the model with the highest log-likelihood from this step. The present pathogen from the best-performing model was then retained in the next step, where a second present pathogen was added. We repeated this iterative process, at each step identifying the additional present pathogen that improved the likelihood the most. We continued the process until the LIPpc fell below a threshold of 1%. Because of the limitations of traditional information criterion approaches for the comparison of finite mixture models of varying complexity, we focused on the LIP metric (31, 45). We used the LIPpc to assess the improvement in model fit across models with an increasing number of present pathogens and therefore increasing numbers of Gaussian components used to explain the data. Here, the increase in log-likelihood for model j (logLikj), relative to the log-likelihood of simpler model j – 1, (logLikj–1), was considered, accounting for the additional Gaussian components, g, fitted by model j, gj, as shown in Eq. 16

LIPpcj=100(12logLikj2logLikj1)gjgj1 (16)

DENV ELISA assay

We included the DENV ELISA as an additional antigen/dimension in the full model framework. Because of the inclusion of DENV antigens in the multiplex immunoassay, we did not consider the DENV ELISA to be an independent pathogen that can be present. We instead allowed the DENV ELISA antibody titers to be explained by the same Gaussian components describing the multiplex immunoassay antibody responses. We assumed no covariance between antibody titers measured on the DENV ELISA and on the multiplex immunoassay. We estimated the mean, μE, and standard deviation, σE, of DENV ELISA titers for infection status combinations that were negative for all pathogens and those positive for a single pathogen. For infection statuses that are positive for >1 pathogen, we assumed that the mean DENV ELISA titers were equal to the maximum μE of the Gaussian components describing the respective monotypic infection statuses. The standard deviation of DENV ELISA titers for an infection status positive to >1 pathogen was calculated as the joint standard deviation of the respective monotypic infection statuses. For example, the standard deviation of DENV ELISA titers for an infection status positive for pathogens A and B, σEA1B1, was calculated as σEA12+σEB12.

Reconstructing unobserved pathogens

To test the ability of our model to reconstruct the prevalence of past infection of an unobserved pathogen, we excluded antibody titer data from a single pathogen from model fitting. In particular, of the pathogens that were determined to be present in the study population from the variable selection process, we excluded the multiplex antigen of each in turn. We included the DENV ELISA assay in model fitting and inferred the prevalence of each observed and unobserved pathogen by subdistrict. We assumed a fixed value of 2 for mean positive titer responses against pathogen X for inference of the cross-reactivity responses from pathogen X to observed pathogens.

Likelihood and priors

We assumed that the antibody titers for each infection status combination followed a multivariate Gaussian distribution, with the probability of observing antibody titers x for individual i, given infection status combination c, as shown in Eq. 17. Here, x was an n = P vector of antibody titer measurements to each pathogen, and 𝒩 represents a probability density function of a normal (Gaussian) distribution. The full model log-likelihood was calculated as the sum of the log probabilities across individuals, i, and across all possible infection status combinations, c, weighted by the proportion of the study population with infection status combination c, θc, as shown in Eq. 18. The log-likelihood of the location and age-specific model was calculated similarly to before, as shown in Eq. 19, but now with location- and age-specific θc,l,a derived from the location- and age-specific prevalence estimates πp,l,a, as previously described

N(xμc,Σc)=12πP/2|Σc|1/2exp(12(xμc)TΣc1(xμc)) (17)
logLik=ilncθcN(xiμc,Σc) (18)
logLik=ilncθc,l,aN(xiμc,Σc) (19)

Statistical analysis

Parameter priors are shown in Table 1, which were assumed to be constant across pathogens. We fit the model in a Bayesian frame-work with Hamiltonian Monte Carlo No-U-Turn sampling using cmdStanR (46). Each model was fit with three chains for 3000 iterations in addition to 3000 warm-up samples. Model convergence was assessed by visual inspection of chain mixing and by R-hat convergence diagnostic across all parameters. Median and 95%CrI estimates were calculated for each parameter.

Table 1. Multivariate mixture model parameter priors.

Parameter priors and limits used in model fitting.

Parameter Parameter description Limits Prior
π Prevalence [0,1] Beta(1, 5)
μ0 Mean of negative titers [–1, inf] Normal(0, 0.1)
μR Mean increase in positive titers [0, inf] Normal(3, 0.1)
σ0 SD of negative titers [0, inf] Normal(0.4, 0.05)
σ1 SD of positive titers [0, inf] Normal(0.7, 0.05)
ϕ Cross-reactivity [0, inf] Exp(8)
ρ0 Correlation in negative titers [0,1] Beta(2, 2)
μE Mean of ELISA titers [–1, inf] Normal(2, 1)
σE SD of ELISA titers [0, inf] Normal(0.5, 0.2)

Supplementary Material

MDAR Reproducibility Checklist
Supplementary Materials

Acknowledgments

Funding

We acknowledge funding from European Research Council grant nos. 101170844 and 804744 (to H.S.); the National Institutes of Health (grant number 1R01AI175941-01) (to H.S.); the Wellcome Trust (xSTAR, grant number 228185/Z/23/Z) (to H.S.); the Laboratoire d’Excellence Integrative Biology of Emerging Infectious Diseases program (grant ANR-10-LABX-62-IBEID) (to S.C.); the European Union’s Horizon 2020 research and innovation programme under VEO grant agreement no. 874735, U01 NIH-PICREID (U01AI151758) (to S.C.); and the INCEPTION project (PIA/ANR16-CONV-0005) (to S.C.).

Footnotes

Author contributions: M.Z.R., K.K.P., A.M.N.T., M.S.A., M.E.H., E.S.G., and H.S. designed the study. M.O’D. and H.S. designed the analytical methodology. J.V. and D.H. performed the multiplex assay experiments. M.O’D. performed the analysis. N.L. conducted the phylogenetic analysis. M.O’D. and H.S. wrote the manuscript. M.O’D., N.H., N.L., G.R.D.S., D.H., M.Z.R., K.K.P., A.M.N.T., M.S.A., M.E.H., J.V., S.C., E.S.G., and H.S. reviewed and edited the manuscript.

Competing interests: The authors declare that they have no competing interests.

Data and materials availability

All data associated with this study are present in the paper or the Supplementary Materials. Anonymized data and code to reproduce the analysis can be found at https://doi.org/10.5281/zenodo.17340210.

References And Notes

  • 1.Jiles R, Klevens M, Hughes E. In: Viral Infections of Humans: Epidemiology and Control. Kaslow RA, Stanberry LR, Le Duc JW, editors. Springer; 2014. Surveillance and seroepidemiology; pp. 63–79. [Google Scholar]
  • 2.Peppa M, Edmunds WJ, Funk S. Disease severity determines health-seeking behaviour amongst individuals with influenza-like illness in an internet-based cohort. BMC Infect Dis. 2017;17:238. doi: 10.1186/s12879-017-2337-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Stringhini S, Wisniak A, Piumatti G, Azman AS, Lauer SA, Baysson H, De Ridder D, Petrovic D, Schrempft S, Marcus K, Yerly S, et al. Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Geneva, Switzerland (SEROCoV-POP): A population-based study. Lancet. 2020;396:313–319. doi: 10.1016/S0140-6736(20)31304-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Pollán M, Pérez-Gómez B, Pastor-Barriuso R, Oteo J, Hernán MA, Pérez-Olmeda M, Sanmartín JL, Fernández-García A, Cruz I, Fernández de Larrea N, Molina M, et al. ENE-COVID Study Group, Prevalence of SARS-CoV-2 in Spain (ENE-COVID): A nationwide, population-based seroepidemiological study. Lancet. 2020;396:535–544. doi: 10.1016/S0140-6736(20)31483-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Buss LF, Prete CA, Jr, Abrahim CMM, Mendrone A, Jr, Salomon T, de Almeida-Neto C, França RFO, Belotti MC, Carvalho MPSS, Costa AG, Crispim MAE, et al. Three-quarters attack rate of SARS-CoV-2 in the Brazilian Amazon during a largely unmitigated epidemic. Science. 2021;371:288–292. doi: 10.1126/science.abe9728. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Uyoga S, Adetifa IMO, Karanja HK, Nyagwange J, Tuju J, Wanjiku P, Aman R, Mwangangi M, Amoth P, Kasera K, Ng’ang’a W, et al. Seroprevalence of anti-SARS-CoV-2 IgG antibodies in Kenyan blood donors. Science. 2021;371:79–82. doi: 10.1126/science.abe1916. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Salje H, Paul KK, Paul R, Rodriguez-Barraquer I, Rahman Z, Alam MS, Rahman M, Al-Amin HM, Heffelfinger J, Gurley E. Nationally-representative serostudy of dengue in Bangladesh allows generalizable disease burden estimates. eLife. 2019;8:e42869. doi: 10.7554/eLife.42869. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Azman AS, Lauer SA, Bhuiyan TR, Luquero FJ, Leung DT, Hegde ST, Harris JB, Paul KK, Khaton F, Ferdous J, Lessler J, et al. Vibrio cholerae O1 transmission in Bangladesh: Insights from a nationally representative serosurvey. Lancet Microbe. 2020;1:e336–e343. doi: 10.1016/S2666-5247(20)30141-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.O’Driscoll M, Ribeiro Dos Santos G, Wang L, Cummings DAT, Azman AS, Paireau J, Fontanet A, Cauchemez S, Salje H. Age-specific mortality and immunity patterns of SARS-CoV-2. Nature. 2021;590:140–145. doi: 10.1038/s41586-020-2918-0. [DOI] [PubMed] [Google Scholar]
  • 10.Chan Y, Martin D, Mace KE, Jean SE, Stresman G, Drakeley C, Chang MA, Lemoine JF, Udhayakumar V, Lammie PJ, Priest JW, et al. Multiplex serology for measurement of IgG antibodies against eleven infectious diseases in a national serosurvey: Haiti 2014-2015. Front Public Health. 2022;10:897013. doi: 10.3389/fpubh.2022.897013. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.O’Hearn AE, Voorhees MA, Fetterer DP, Wauquier N, Coomber MR, Bangura J, Fair JN, Gonzalez R-J, Schoepp RJ. Serosurveillance of viral pathogens circulating in West Africa. Virol J. 2016;13:163. doi: 10.1186/s12985-016-0621-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.De Thoisy A, Woudenberg T, Pelleau S, Donnadieu F, Garcia L, Pinaud L, Tondeur L, Meola A, Arowas L, Clement N, Backovic M, et al. COVID-Oise and SeroPed study teams, Seroepidemiology of the seasonal human coronaviruses NL63, 229E, OC43 and HKU1 in France. Open Forum Infect Dis. 2023;10:ofad340. doi: 10.1093/ofid/ofad340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Woudenberg T, Pelleau S, Anna F, Attia M, Donnadieu F, Gravet A, Lohmann C, Seraphin H, Guiheneuf R, Delamare C, Stefic K, et al. Humoral immunity to SARS-CoV-2 and seasonal coronaviruses in children and adults in north-eastern France. EBioMedicine. 2021;70:103495. doi: 10.1016/j.ebiom.2021.103495. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Arnold BF, Martin DL, Juma J, Mkocha H, Ochieng JB, Cooley GM, Omore R, Goodhew EB, Morris JF, Costantini V, Vinjé J, et al. Enteropathogen antibody dynamics and force of infection among children in low-resource settings. eLife. 2019;8:e45594. doi: 10.7554/eLife.45594. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Moss DM, Priest JW, Hamlin K, Derado G, Herbein J, Petri WA, Jr, Lammie PJ. Longitudinal evaluation of enteric protozoa in Haitian children by stool exam and multiplex serologic assay. Am J Trop Med Hyg. 2014;90:653–660. doi: 10.4269/ajtmh.13-0545. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Rakotondramanga JM, Vigan-Womas I, Steinhardt LC, Harimanana A, Ravaoarisoa E, Rasoloharimanana TL, Razanatsiorimalala S, Wesolowski A, Randrianarivelojosia M, Roche B, Garchitorena A. Identification of factors associated with residual malaria transmission using school-based serological surveys in settings pursuing elimination. Malar J. 2022;21:242. doi: 10.1186/s12936-022-04260-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Jones FK, Bhuiyan TR, Muise RE, Khan AI, Slater DM, Hutt Vater KR, Chowdhury F, Kelly M, Xu P, Kováč P, Biswas R, et al. Identifying recent cholera infections using a multiplex bead serological assay. mBio. 2022;13:e0190022. doi: 10.1128/mbio.01900-22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Bailly S, Rousset D, Fritzell C, Hozé N, Ben Achour S, Berthelot L, Enfissi A, Vanhomwegen J, Salje H, Fernandes-Pellerin S, Saout M, et al. Spatial distribution and burden of emerging arboviruses in French Guiana. Viruses. 2021;13:1299. doi: 10.3390/v13071299. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Wiens KE, Jauregui B, Arnold BF, Banke K, Wade D, Hayford K, Costero-Saint Denis A, Hall RH, Salje H, Rodriguez-Barraquer I, Azman AS, et al. Collaboration on Integrated Biomarkers Surveillance, Building an integrated serosurveillance platform to inform public health interventions: Insights from an experts’ meeting on serum biomarkers. PLoS Negl Trop Dis. 2022;16:e0010657. doi: 10.1371/journal.pntd.0010657. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Pan American Health Organization. Toolkit for integrated serosurveillance of communicable diseases in the Americas. 2022. www.paho.org/en/documents/toolkit-integrated-serosurveillance-communicable-diseases-americas .
  • 21.Arnold BF, Scobie HM, Priest JW, Lammie PJ. Integrated serologic surveillance of population immunity and disease transmission. Emerg Infect Dis. 2018;24:1188–1194. doi: 10.3201/eid2407.171928. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Gaspar-Castillo C, Rodríguez MH, Ortiz-Navarrete V, Alpuche-Aranda CM, Martinez-Barnetche J. Structural and immunological basis of cross-reactivity between dengue and Zika infections: Implications in serosurveillance in endemic regions. Front Microbiol. 2023;14:1107496. doi: 10.3389/fmicb.2023.1107496. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Rathore APS, St John AL. Cross-reactive immunity among flaviviruses. Front Immunol. 2020;11:334. doi: 10.3389/fimmu.2020.00334. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Hay JA, Routledge I, Takahashi S. Serodynamics: A primer and synthetic review of methods for epidemiological inference using serological data. Epidemics. 2024;49:100806. doi: 10.1016/j.epidem.2024.100806. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Hozé N, Salje H, Rousset D, Fritzell C, Vanhomwegen J, Bailly S, Najm M, Enfissi A, Manuguerra J-C, Flamand C, Cauchemez S. Reconstructing Mayaro virus circulation in French Guiana shows frequent spillovers. Nat Commun. 2020;11:2842. doi: 10.1038/s41467-020-16516-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Hozé N, Diarra I, Sangaré AK, Pastorino B, Pezzi L, Kouriba B, Sagara I, Dabo A, Djimdé A, Thera MA, Doumbo OK, et al. Model-based assessment of chikungunya and O’nyong-nyong virus circulation in Mali in a serological cross-reactivity context. Nat Commun. 2021;12:6735. doi: 10.1038/s41467-021-26707-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Pierson TC, Diamond MS. The continued threat of emerging flaviviruses. Nat Microbiol. 2020;5:796–812. doi: 10.1038/s41564-020-0714-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Hossain MJ, Gurley ES, Montgomery S, Petersen L, Sejvar J, Fischer M, Panella A, Powers AM, Nahar N, Uddin AKMR, Rahman ME, et al. Hospital-based surveillance for Japanese encephalitis at four sites in Bangladesh, 2003-2005. Am J Trop Med Hyg. 2010;82:344–349. doi: 10.4269/ajtmh.2010.09-0125. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Haque F, Rahman M, Banu NN, Sharif AR, Jubayer S, Shamsuzzaman A, Alamgir A, Erasmus JH, Guzman H, Forrester N, Luby SP, et al. An epidemic of chikungunya in northwestern Bangladesh in 2011. PLOS ONE. 2019;14:e0212218. doi: 10.1371/journal.pone.0212218. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Beck C, Desprès P, Paulous S, Vanhomwegen J, Lowenski S, Nowotny N, Durand B, Blaise-Boisseau S, Guitton E, Yamanaka T, Zientara S, et al. A high-performance multiplex immunoassay for serodiagnosis of flavivirus-associated neurological diseases in horses. Biomed Res Int. 2015;2015:678084. doi: 10.1155/2015/678084. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Grimm KJ, Houpt R, Rodgers D. Model fit and comparison in finite mixture models: A review and a novel approach. Front Educ. 2021;6:613645 [Google Scholar]
  • 32.Khan AM, Khan AQ, Dobrzynski L, Joshi GP, Myat A. A Japanese encephalitis focus in Bangladesh. J Trop Med Hyg. 1981;84:41–44. [PubMed] [Google Scholar]
  • 33.Duque MP, Naser AM, Paul KK, Rahman M, Alam MS, Al-Amin HM, Rahman MZ, Hossain ME, Paul RC, Luby SP, Cauchemez S, et al. Informing an investment case for Japanese encephalitis vaccine introduction in Bangladesh. medRxiv. 2024;24303865 doi: 10.1126/sciadv.adp1657. [Preprint] [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Muraduzzaman AKM, Sultana S, Shirin T, Khatun S, Islam M, Rahman M. Introduction of Zika virus in Bangladesh: An impending public health threat. Asian Pac J Trop Med. 2017;10:925–928. doi: 10.1016/j.apjtm.2017.08.015. [DOI] [PubMed] [Google Scholar]
  • 35.Hossain MS, Noman AA, Mamun SMAA, Mosabbir AA. Twenty-two years of dengue outbreaks in Bangladesh: Epidemiology, clinical spectrum, serotypes, and future disease risks. Trop Med Health. 2023;51:37. doi: 10.1186/s41182-023-00528-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Moral S. New virus detected in Bangladesh. Prothom Alo English. 2019 September 26; https://en.prothomalo.com/bangladesh/New-virus-detected-in-Bangladesh . [Google Scholar]
  • 37.Allen SW, Ribeiro Dos Santos G, Paul KK, Paul R, Rahman MZ, Alam MS, Rahman M, Al-Amin HM, Vanhomwegen J, Weaver SC, Smull T, et al. Results of a nationally representative seroprevalence survey of chikungunya virus in Bangladesh. J Infect Dis. 2024;230:e1031–e1038. doi: 10.1093/infdis/jiae335. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Duque MP, Paul KK, Sultana R, Ribeiro Dos Santos G, O’Driscoll M, Naser AM, Rahman M, Alam MS, Al-Amin HM, Rahman MZ, Hossain ME, et al. National burden and optimal vaccine policy for Japanese encephalitis virus in Bangladesh. medRxiv. 2025;25330995 doi: 10.1016/S1473-3099(25)00590-0. [Preprint] [DOI] [PubMed] [Google Scholar]
  • 39.GeurtsvanKessel CH, Islam Z, Islam MB, Kamga S, Papri N, van de Vijver DAMC, Reusken C, Mogling R, Heikema AP, Jahan I, Pradel FK, et al. Zika virus and Guillain-Barré syndrome in Bangladesh. Ann Clin Transl Neurol. 2018;5:606–615. doi: 10.1002/acn3.556. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Metcalf CJE, Farrar J, Cutts FT, Basta NE, Graham AL, Lessler J, Ferguson NM, Burke DS, Grenfell BT. Use of serological surveys to generate key insights into the changing global landscape of infectious disease. Lancet. 2016;388:728–730. doi: 10.1016/S0140-6736(16)30164-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Mina MJ, Metcalf CJE, McDermott AB, Douek DC, Farrar J, Grenfell BT. A Global lmmunological Observatory to meet a time of pandemics. eLife. 2020;9:e58989. doi: 10.7554/eLife.58989. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Nitatpattana N, Kanjanopas K, Yoksan S, Satimai W, Vongba N, Langdatsuwan S, Nakgoi K, Ratchakum S, Wauquier N, Souris M, Auewarakul P, et al. Long-term persistence of Chikungunya virus neutralizing antibodies in human populations of North Eastern Thailand. Virol J. 2014;11:183. doi: 10.1186/1743-422X-11-183. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.García G, González N, Pérez AB, Sierra B, Aguirre E, Rizo D, Izquierdo A, Sánchez L, Díaz D, Lezcay M, Pacheco B, et al. Long-term persistence of clinical symptoms in dengue-infected persons and its association with immunological disorders. Int J Infect Dis. 2011;15:e38–e43. doi: 10.1016/j.ijid.2010.09.008. [DOI] [PubMed] [Google Scholar]
  • 44.Bangladesh Bureau of Statistics. 2011. https://bbs.gov.bd/site/page/47856ad0-7e1c-4aab-bd78-892733bc06eb/Population-and-Housing-Census .
  • 45.McArdle JJ, Ferrer-Caja E, Hamagami F, Woodcock RW. Comparative longitudinal structural analyses of the growth and decline of multiple intellectual abilities over the life span. Dev Psychol. 2002;38:115–142. [PubMed] [Google Scholar]
  • 46.Stan Development Team. RStan: The R interface to Stan, version 2. 2016 [Google Scholar]
  • 47.Smith DJ, Lapedes AS, de Jong JC, Bestebroer TM, Rimmelzwaan GF, Osterhaus ADME, Fouchier RAM. Mapping the antigenic and genetic evolution of influenza virus. Science. 2004;305:371–376. doi: 10.1126/science.1097211. [DOI] [PubMed] [Google Scholar]
  • 48.Edgar RC. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–1797. doi: 10.1093/nar/gkh340. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Nguyen L-T, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32:268–274. doi: 10.1093/molbev/msu300. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

MDAR Reproducibility Checklist
Supplementary Materials

Data Availability Statement

All data associated with this study are present in the paper or the Supplementary Materials. Anonymized data and code to reproduce the analysis can be found at https://doi.org/10.5281/zenodo.17340210.

RESOURCES