Abstract
Objectives
Pertussis is an important contributor to respiratory morbidity and mortality and remains underdiagnosed, even as its re-emergence has become a global public health concern in recent years. This multicenter retrospective observational study aimed to develop a diagnostic prediction model based on laboratory blood parameters to identify children with pertussis.
Methods
A total of 545 children with suspected pertussis seen at one hospital between January 2024 and June 2024 were identified and randomly split into a training group (n = 381) and a validation group (n = 164). A model was built using least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression to identify potential predictors of pertussis. Model performance was assessed using receiver operating characteristic (ROC) curves, calibration plots and decision curve analysis (DCA). External validation was conducted on an independent cohort from another hospital (n = 594).
Results
Based on LASSO and multivariate analysis, a model was formulated incorporating four predictors: age, CRP level, lymphocyte count, and hematocrit. The area under the ROC curve (AUC) was 0.741 for the training cohort, 0.694 for the internal validation cohort and 0.675 for the external validation cohort. Calibration curves and DCA supported the fit and clinical application value of this nomogram.
Conclusions
The model developed in this multicenter cohort study shows moderate accuracy in predicting the probability of pertussis in children (AUC 0.675–0.741 across cohorts). It may help clinicians when making diagnostic decisions.
Keywords: Diagnosis, Laboratory parameters, Model, Multicenter, Pertussis
Introduction
Pertussis, also known as whooping cough, is one of the Class B notifiable infectious diseases in China [1]. It is caused by the Gram-negative bacillus Bordetella pertussis (B. pertussis) and spreads via airborne droplets [2]. Typical pertussis is characterized by a paroxysmal spasmodic cough and a croupy inspiratory whoop at the end of coughing. Most infected individuals are minors, with preschool children forming the majority. Since the Tdap (tetanus-diphtheria-acellular pertussis) vaccine was included in the childhood immunization program, the incidence of pertussis has decreased significantly, and the disease was once considered to be effectively controlled [3].
However, since the 1980s, pertussis outbreaks have been reported in some developed countries with high vaccine coverage, and the same public health issue has emerged in China after 2011 [4]. According to surveillance data from the Chinese Center for Disease Control and Prevention, 41,124 cases of pertussis were reported in 2023 and 38,295 cases in 2022, compared with only 4,475 cases in 2020 [5]. Analysis suggests that waning vaccine-induced immunity and exposure to B. pertussis are the main reasons [6]. In addition, because comprehensive active surveillance for pertussis is lacking and atypical cases are underdiagnosed, the incidence of pertussis in China is still significantly underestimated [7]. Pertussis therefore deserves sufficient attention and appropriate preventive measures.
Pertussis can be difficult to diagnose because its signs and symptoms are often similar to those of other respiratory illnesses. Previous studies have explored the diagnostic role of several infection biomarkers in pertussis, such as lymphocyte percentage, white blood cell count, absolute lymphocyte count, absolute neutrophil count and neutrophil-to-lymphocyte ratio (NLR) [8, 9]. However, individual indicators have limited sensitivity and specificity and are not always reliable.
Nomograms are increasingly used in the diagnosis and prognosis of diseases, mainly because they can improve diagnostic precision. In this study, clinical data of children with suspected pertussis were therefore reviewed to develop and validate a diagnostic prediction nomogram for pertussis. We conducted the study at two centers to enhance the generalizability and validity of our findings. This nomogram, based on blood parameters, is intended to guide the early identification of pertussis and to strengthen the management of infectious diseases, especially in underdeveloped areas.
Methods
Study design and participants
All children younger than 14 years with suspected pertussis who attended the outpatient department of either of two hospitals between January 2024 and June 2024 were identified in this multicenter retrospective study. Cases were identified according to the WHO definition [10]. A suspected case was a person with a cough lasting at least 2 weeks, or a cough during an outbreak, without any other diagnosis and with at least one of the following: (1) paroxysmal cough; (2) inspiratory whoop; (3) vomiting after coughing or vomiting without other cause; (4) apnea (only applicable to infants under 1 year of age); (5) clinician suspicion of whooping cough. Patients with incomplete laboratory data were excluded. In total, we identified 545 suspected cases of pertussis from the First Affiliated Hospital with Nanjing Medical University, and all patients were randomly assigned to the training or validation cohort at a ratio of 7:3. In addition, 594 children with suspected pertussis from another hospital formed the testing cohort for external validation. The training cohort was used to screen variables and build the model, while the validation and testing cohorts were used to evaluate the trained model. A polymerase chain reaction (PCR) test for B. pertussis on nasopharyngeal swabs or upper respiratory tract aspirates was performed in all cases at the time blood was drawn for the complete blood count. Patients who met the clinical diagnostic criteria for pertussis in the “Suggestions for Diagnosis and Treatment of Pertussis in Children in China” and tested positive by B. pertussis PCR were diagnosed with pertussis [11]. The study was approved by the Ethics Committees of the First Affiliated Hospital with Nanjing Medical University (2024-SR-908) and Children’s Hospital of Nanjing Medical University (202411002-1), and the requirement for informed consent was waived.
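The 7:3 random assignment described above can be illustrated with a short sketch. The text does not state how the split was implemented, so the seed, the function name `split_cohort`, and the choice to round the training size down are assumptions; with 545 children, rounding down yields the reported group sizes of 381 and 164.

```python
import random


def split_cohort(patient_ids, train_parts=7, total_parts=10, seed=2024):
    """Randomly split patient IDs into training and validation lists.

    The seed and the floor rule for the training size are illustrative
    assumptions; the paper only states a random 7:3 assignment.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = len(ids) * train_parts // total_parts  # integer floor
    return ids[:n_train], ids[n_train:]


# Hypothetical IDs 0..544 stand in for the 545 children at the first center.
train_ids, validation_ids = split_cohort(range(545))
print(len(train_ids), len(validation_ids))
```

Fixing the seed keeps the assignment reproducible, which matters when variable screening and model fitting are repeated on the same training cohort.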
Data collection
The data for this study were obtained from the electronic medical record system. The variables of interest included patient demographics, complete blood count (the first blood test performed in the outpatient department because of suspected pertussis), C-reactive protein (CRP) level and B. pertussis PCR results. The NLR, platelet-to-lymphocyte ratio (PLR), monocyte-to-lymphocyte ratio (MLR), systemic immune-inflammation index (SII; platelet × neutrophil/lymphocyte), and systemic inflammation response index (SIRI; neutrophil × monocyte/lymphocyte) were calculated from the complete blood count parameters. The children were divided into positive and negative groups according to the B. pertussis PCR results.
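The derived indices follow directly from the definitions given above. The sketch below computes them from absolute counts; the function name and the example counts are hypothetical and are not taken from the study data.

```python
def inflammation_indices(neutrophils, lymphocytes, monocytes, platelets):
    """Compute NLR, PLR, MLR, SII and SIRI from absolute counts (x10^9/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "NLR": neutrophils / lymphocytes,               # neutrophil-to-lymphocyte ratio
        "PLR": platelets / lymphocytes,                 # platelet-to-lymphocyte ratio
        "MLR": monocytes / lymphocytes,                 # monocyte-to-lymphocyte ratio
        "SII": platelets * neutrophils / lymphocytes,   # platelet x neutrophil / lymphocyte
        "SIRI": neutrophils * monocytes / lymphocytes,  # neutrophil x monocyte / lymphocyte
    }


# Hypothetical counts for illustration only.
example_indices = inflammation_indices(
    neutrophils=4.5, lymphocytes=3.0, monocytes=0.6, platelets=300
)
for name, value in example_indices.items():
    print(f"{name}: {value:.2f}")
```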
Statistical analysis
Statistical analyses were performed with IBM SPSS version 21.0 (SPSS Inc., Chicago, IL, USA). Categorical variables were presented as frequencies (percentages), and continuous variables were presented as mean ± standard deviation or median (interquartile range), as appropriate. We used Pearson's chi-squared test or Fisher’s exact test to compare categorical variables and the independent-samples t-test or Mann-Whitney U test to compare continuous variables. Laboratory factors with P < 0.1 between the positive and negative groups were first converted to binary factors at the optimal cutoff value; least absolute shrinkage and selection operator (LASSO) regression and logistic regression were then used to select the final variables and establish the model. Model performance was assessed through the area under the receiver operating characteristic (ROC) curve, calibration curves, decision curve analysis (DCA) and generalization to the validation cohorts. All tests were two-tailed, and P < 0.05 was considered statistically significant.
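The text does not name the criterion used to choose the "optimal cutoff value". A minimal sketch, assuming the common choice of maximizing the Youden index (sensitivity + specificity − 1), is given below; the function names and the toy values are hypothetical.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing the Youden index for the rule value >= cutoff.

    labels: 1 = PCR positive, 0 = PCR negative.
    Assumption: higher values indicate disease; negate the values for
    markers that are lower in positive cases.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are required")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= cutoff)
        j = tp / positives - fp / negatives  # sensitivity - (1 - specificity)
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def dichotomize(values, cutoff):
    """Convert a continuous marker to a 0/1 factor at the given cutoff."""
    return [1 if v >= cutoff else 0 for v in values]


# Toy lymphocyte counts (x10^9/L) and PCR results, for illustration only.
toy_values = [2.1, 2.5, 2.8, 3.0, 3.6, 3.9, 4.2, 5.0]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j_index = youden_cutoff(toy_values, toy_labels)
binary_factor = dichotomize(toy_values, cutoff)
print(cutoff, round(j_index, 2), binary_factor)
```

Dichotomizing simplifies a nomogram but discards information within each category, which is one reason the cutoff criterion is worth stating explicitly.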
Results
The study included 1139 children with suspected pertussis. Children from our hospital were randomly divided into a training cohort and a validation cohort at a ratio of 7:3, while those from the other hospital formed the testing cohort for external validation. Patient characteristics for the training (n = 381), internal validation (n = 164), and external validation (n = 594) cohorts are described in Table 1. The three cohorts were comparable in gender and age, although several clinical symptoms and laboratory parameters differed significantly among them. The children in the training cohort were divided into positive and negative groups according to PCR results. Further analysis revealed statistically significant differences in blood parameters between these groups, such as CRP levels and blood cell counts (Table 2).
Table 1.
Baseline demographic and clinical characteristics of patients included in the three cohorts
| Characteristics | Training cohort (n = 381) | Internal validation cohort (n = 164) | External validation cohort (n = 594) | P1-value | P2-value |
|---|---|---|---|---|---|
| Gender (male, %) | 187 (49.1%) | 90 (54.9%) | 296 (49.8%) | 0.214 | 0.437 |
| Age (years) | 6.0 (4.0, 8.0) | 7.0 (4.0, 9.0) | 6.0 (4.0, 8.0) | 0.221 | 0.364 |
| Clinical symptoms | |||||
| Fever, n | 123 | 145 | 306 | < 0.001 | < 0.001 |
| Cough, n | 258 | 11 | 191 | < 0.001 | < 0.001 |
| Throat pain, n | 75 | 0 | 79 | < 0.001 | < 0.001 |
| Abdominal pain, n | 7 | 1 | 4 | 0.446 | 0.185 |
| Diarrhea, n | 27 | 7 | 33 | 0.212 | 0.390 |
| Vomit, n | 25 | 1 | 29 | 0.003 | 0.012 |
| Dizziness, n | 6 | 1 | 5 | 0.681 | 0.459 |
| Muscle pain, n | 6 | 4 | 10 | 0.498 | 0.766 |
| Fatigue, n | 29 | 2 | 19 | 0.003 | < 0.001 |
| CRP (mg/L) | 1.5 (0.5, 5.1) | 1.5 (0.5, 4.8) | 2.2 (0.8, 5.5) | 0.487 | < 0.001 |
| WBC (×10⁹/L) | 9.1 (7.3, 11.1) | 8.8 (7.3, 11.5) | 8.8 (7.2, 11.0) | 0.771 | 0.896 |
| Lymphocyte (×10⁹/L) | 3.08 (2.49, 3.69) | 3.12 (2.19, 4.25) | 2.96 (2.30, 4.10) | 0.979 | 0.977 |
| Monocyte (×10⁹/L) | 0.61 (0.47, 0.79) | 0.56 (0.44, 0.75) | 0.56 (0.45, 0.77) | 0.920 | 0.013 |
| Neutrophil (×10⁹/L) | 4.95 (3.47, 6.52) | 4.69 (3.43, 6.21) | 4.65 (3.37, 6.52) | 0.899 | 0.374 |
| Lymphocyte (%) | 37.5 (26.4, 46.1) | 36.1 (26.8, 46.2) | 34.7 (27.8, 42.5) | 0.824 | 0.190 |
| Monocyte (%) | 6.60 (5.40, 8.58) | 6.20 (5.10, 8.00) | 6.50 (5.08, 8.00) | 0.596 | 0.009 |
| Neutrophil (%) | 52.7 (42.8, 64.4) | 52.7 (44.0, 65.0) | 54.2 (45.7, 62.1) | 0.920 | 0.842 |
| RBC (×10¹²/L) | 4.67 (4.48, 4.91) | 4.60 (4.35, 4.84) | 4.69 (4.45, 4.87) | 0.061 | 0.001 |
| Hemoglobin (g/L) | 131 (125, 137) | 129 (122, 135) | 130 (124, 135) | 0.253 | 0.002 |
| Hematocrit (%) | 39.70 (38.03, 41.60) | 39.90 (37.50, 41.80) | 40.10 (38.50, 41.80) | 0.086 | 0.151 |
| MCV (fL) | 84.7 (82.9, 87.0) | 87.0 (84.2, 89.5) | 86.9 (83.9, 89.2) | 0.510 | < 0.001 |
| MCH (pg) | 28.00 (27.20, 28.70) | 28.20 (27.40, 29.00) | 28.00 (27.10, 28.73) | 0.069 | 0.025 |
| MCHC (g/L) | 330 (325, 335) | 323 (318, 329) | 323 (318, 328) | 0.197 | < 0.001 |
| RDW (%) | 12.90 (12.50, 13.30) | 13.00 (12.70, 13.50) | 13.25 (12.90, 13.60) | 0.004 | < 0.001 |
| PLT (×10⁹/L) | 294 (249, 338) | 300 (251, 351) | 280 (248, 320) | 0.004 | 0.015 |
| MPV (fL) | 9.10 (8.43, 9.80) | 8.50 (8.00, 9.30) | 8.70 (7.98, 9.30) | 0.520 | < 0.001 |
| PDW (%) | 15.90 (15.40, 16.10) | 15.90 (15.70, 16.30) | 16.05 (15.78, 16.33) | 0.055 | < 0.001 |
| NLR | 1.56 (1.07, 2.25) | 1.38 (0.93, 2.43) | 1.46 (0.95, 2.44) | 0.805 | 0.428 |
| PLR | 96 (76, 119) | 98 (75, 130) | 90 (70, 121) | 0.082 | 0.192 |
| MLR | 0.20 (0.15, 0.27) | 0.18 (0.12, 0.28) | 0.19 (0.13, 0.28) | 0.898 | 0.018 |
| SII | 464 (308, 665) | 438 (275, 695) | 399 (279, 693) | 0.418 | 0.228 |
| SIRI | 0.93 (0.61, 1.60) | 0.77 (0.44, 1.57) | 0.90 (0.47, 1.44) | 0.794 | 0.016 |
Data are expressed as number (%), number, or median (interquartile range)
P1, Training cohort vs. Internal Validation cohort; P2, Training cohort vs. Internal Validation cohort vs. External Validation cohort
Abbreviations: CRP C-reactive protein, WBC White blood cell, RBC Red blood cell, MCV Mean red blood cell volume, MCH Mean red blood cell hemoglobin, MCHC Mean red blood cell hemoglobin concentration, RDW Red blood cell distribution width, PLT Platelet, MPV Mean platelet volume, PDW Platelet distribution width, NLR Neutrophil to lymphocyte ratio, PLR Platelet to lymphocyte ratio, MLR Monocyte to lymphocyte ratio, SII platelet × neutrophil/lymphocyte, SIRI neutrophil × monocyte/lymphocyte
Table 2.
Comparison of characteristics between the PCR-negative and PCR-positive groups in the training cohort
| Characteristics | Negative (n = 267) | Positive (n = 114) | P value |
|---|---|---|---|
| Gender (male, %) | 138 (51.7%) | 49 (43.0%) | 0.120 |
| Age (years) | 6.0 (4.0, 8.0) | 7.0 (6.0, 10.0) | < 0.001 |
| CRP (mg/L) | 1.7 (0.6, 7.7) | 1.5 (0.5, 3.4) | 0.003 |
| WBC (×10⁹/L) | 8.5 (7.0, 11.0) | 9.4 (8.0, 12.2) | 0.001 |
| Lymphocyte (×10⁹/L) | 2.74 (1.99, 3.98) | 3.80 (2.89, 5.10) | < 0.001 |
| Monocyte (×10⁹/L) | 0.56 (0.44, 0.75) | 0.59 (0.44, 0.76) | 0.531 |
| Neutrophil (×10⁹/L) | 4.53 (3.24, 6.16) | 4.77 (3.77, 6.26) | 0.324 |
| Lymphocyte (%) | 35.3 (23.9, 45.1) | 41.0 (30.4, 48.8) | 0.002 |
| Monocyte (%) | 6.50 (5.40, 8.35) | 5.60 (4.83, 7.35) | 0.001 |
| Neutrophil (%) | 54.2 (43.2, 66.1) | 50.3 (41.6, 59.9) | 0.013 |
| RBC (×10¹²/L) | 4.54 (4.31, 4.82) | 4.68 (4.45, 4.88) | 0.012 |
| Hemoglobin (g/L) | 128 ± 10 | 131 ± 11 | 0.018 |
| Hematocrit (%) | 39.4 ± 3.2 | 40.6 ± 3.2 | 0.001 |
| MCV (fL) | 86.8 (84.0, 89.1) | 87.9 (84.7, 90.5) | 0.022 |
| MCH (pg) | 28.10 (27.30, 29.00) | 28.50 (27.43, 29.10) | 0.158 |
| MCHC (g/L) | 323 (319, 330) | 323 (317, 328) | 0.047 |
| RDW (%) | 13.10 (12.70, 13.50) | 12.90 (12.70, 13.30) | 0.224 |
| PLT (×10⁹/L) | 292 (243, 346) | 320 (277, 358) | 0.007 |
| MPV (fL) | 8.50 (8.00, 9.40) | 8.55 (8.10, 9.10) | 0.693 |
| PDW (%) | 15.90 (15.70, 16.30) | 16.00 (15.73, 16.28) | 0.758 |
| NLR | 1.50 (0.96, 2.75) | 1.23 (0.89, 1.87) | 0.005 |
| PLR | 105 (78, 136) | 81 (62, 104) | < 0.001 |
| MLR | 0.20 (0.13, 0.31) | 0.14 (0.11, 0.22) | < 0.001 |
| SII | 455 (280, 755) | 378 (249, 605) | 0.047 |
| SIRI | 0.83 (0.45, 1.78) | 0.69 (0.44, 1.22) | 0.060 |
Data are expressed as number (%), median (interquartile range), or mean ± standard deviation
Abbreviations: CRP C-reactive protein, WBC White blood cell, RBC Red blood cell, MCV Mean red blood cell volume, MCH Mean red blood cell hemoglobin, MCHC Mean red blood cell hemoglobin concentration, RDW Red blood cell distribution width, PLT Platelet, MPV Mean platelet volume, PDW Platelet distribution width, NLR Neutrophil to lymphocyte ratio, PLR Platelet to lymphocyte ratio, MLR Monocyte to lymphocyte ratio, SII platelet × neutrophil/lymphocyte, SIRI neutrophil × monocyte/lymphocyte
In the training cohort, 17 continuous factors with P < 0.1 between the positive and negative groups (Table 2) were first converted to binary factors at the optimal cutoff value and then entered into the LASSO regression. At the optimal lambda value, only eight predictive variables with non-zero coefficients were selected, representing the factors most strongly associated with pertussis diagnosis (Fig. 1). Finally, multivariate regression analysis confirmed that age [odds ratio (OR) = 2.654, 95% confidence interval (CI): 1.363–5.167, P = 0.004], CRP (OR = 0.437, 95% CI: 0.216–0.885, P = 0.021), lymphocyte count (OR = 3.496, 95% CI: 2.104–5.808, P < 0.001), and hematocrit (OR = 2.266, 95% CI: 1.356–3.786, P = 0.002) were independent factors associated with pertussis. The OR of each variable is shown in Table 3. A model was then built based on these results.
Fig. 1.
Feature selection using the LASSO binomial regression model. A The partial likelihood deviance (binomial deviance) curve plotted versus log (lambda). B LASSO regression coefficients as a function of lambda. C Coefficients of the selected features
Table 3.
Univariate and multivariate logistic regression analysis for training cohort patients
| Variables | Univariate OR | Univariate 95% CI | Univariate P value | Multivariate OR | Multivariate 95% CI | Multivariate P value |
|---|---|---|---|---|---|---|
| Age | 2.790 | 1.501–5.185 | 0.001 | 2.654 | 1.363–5.167 | 0.004 |
| CRP | 0.285 | 0.148–0.548 | < 0.001 | 0.437 | 0.216–0.885 | 0.021 |
| Lymphocyte | 3.806 | 2.376–6.097 | < 0.001 | 3.496 | 2.104–5.808 | < 0.001 |
| Monocyte (%) | 0.425 | 0.271–0.665 | < 0.001 | |||
| RBC | 2.193 | 1.370–3.512 | 0.001 | |||
| Hematocrit | 2.543 | 1.584–4.085 | < 0.001 | 2.266 | 1.356–3.786 | 0.002 |
| MCHC | 0.489 | 0.290–0.824 | 0.007 | |||
| PLR | 0.326 | 0.207–0.513 | < 0.001 | |||
Age, CRP, Lymphocyte, Monocyte (%), RBC, Hematocrit, MCHC and PLR were identified based on LASSO regression
Abbreviations: CRP C-reactive protein, RBC Red blood cell, MCHC Mean red blood cell hemoglobin concentration, PLR Platelet to lymphocyte ratio
Model: Logit P = 0.967 × Age − 0.828 × CRP + 1.252 × Lymphocyte + 0.818 × Hematocrit − 4.894
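To make the model equation concrete, the sketch below converts the logit into a predicted probability and recovers each odds ratio as the exponential of its coefficient. The coefficients and intercept are those of the equation above; the 0/1 coding of the dichotomized predictors is an assumption, because the text reports neither the cutoffs nor the coding, so absolute probabilities from this sketch should not be read as the output of the published nomogram.

```python
import math

# Coefficients and intercept copied from the model equation in the text.
COEFFICIENTS = {
    "age": 0.967,
    "crp": -0.828,
    "lymphocyte": 1.252,
    "hematocrit": 0.818,
}
INTERCEPT = -4.894


def pertussis_logit(coded):
    """Linear predictor for a dict of coded predictor values."""
    return INTERCEPT + sum(beta * coded[name] for name, beta in COEFFICIENTS.items())


def pertussis_probability(coded):
    """Inverse-logit transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-pertussis_logit(coded)))


def odds_ratios():
    """Odds ratio implied by each coefficient: OR = exp(beta)."""
    return {name: math.exp(beta) for name, beta in COEFFICIENTS.items()}


# Hypothetical child; 1 = above cutoff, 0 = below cutoff (assumed coding).
example_child = {"age": 1, "crp": 0, "lymphocyte": 1, "hematocrit": 1}
print(round(pertussis_logit(example_child), 3))
print({name: round(value, 3) for name, value in odds_ratios().items()})
```

Under this reading, exponentiating the CRP, lymphocyte and hematocrit coefficients reproduces the multivariate odds ratios in Table 3 (0.437, 3.496 and 2.266) to within rounding, while the age coefficient gives about 2.63, slightly below the tabulated 2.654.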
The ROC analysis is shown in Fig. 2A-C, with an AUC of 0.741 (95% CI: 0.691–0.792) in the training cohort, 0.694 (95% CI: 0.590–0.797) in the internal validation cohort, and 0.675 (95% CI: 0.625–0.724) in the external validation cohort. The sensitivity and specificity were 64.9% and 70.8%, respectively, in the training cohort; 79.5% and 40.8% in the internal validation cohort; and 68.5% and 56.6% in the external validation cohort. The calibration curves showed fairly good agreement in the training cohort and generally good agreement in the internal and external validation cohorts (Fig. 2D-F). DCA revealed that using the model to diagnose pertussis yielded a substantial net benefit for nearly all threshold probabilities in all cohorts (Fig. 2G-I). These results suggest that the model has potential clinical applicability in the diagnosis and treatment of pertussis.
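The performance measures reported above can be illustrated with a small, dependency-free sketch. It computes the AUC as the probability that a positive case scores higher than a negative case, sensitivity and specificity at a threshold, and the decision-curve net benefit using the standard definition TP/n − FP/n × pt/(1 − pt). The function names and toy scores are hypothetical, and this is not the software used in the study.

```python
def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, tn, fn) when score >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        if s >= threshold:
            if y == 1:
                tp += 1
            else:
                fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(scores, labels, threshold):
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    return tp / (tp + fn), tn / (tn + fp)


def net_benefit(scores, labels, threshold):
    """Decision-curve net benefit at threshold probability pt in (0, 1)."""
    tp, fp, _, _ = confusion_counts(scores, labels, threshold)
    n = len(labels)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy predicted probabilities and PCR results, for illustration only.
toy_scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80]
toy_outcomes = [0, 0, 1, 0, 0, 1, 1, 1]
toy_auc = auc(toy_scores, toy_outcomes)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_outcomes, 0.5)
toy_nb = net_benefit(toy_scores, toy_outcomes, 0.25)
print(toy_auc, toy_sens, toy_spec, round(toy_nb, 3))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of threshold probabilities and comparing it with the treat-all and treat-none strategies.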
Fig. 2.
Performance of the model in different cohorts. A ROC for training cohort; B ROC for internal validation cohort; C ROC for external validation cohort. D Calibration curve of training cohort; E Calibration curve of internal validation cohort; F Calibration curve of external validation cohort. G DCA for training cohort; H DCA for internal validation cohort; I DCA for external validation cohort
Discussion
Pertussis is a highly contagious respiratory infection found worldwide, with epidemic peaks every 2–5 years, and it affects individuals across all age groups [12]. It is particularly devastating for infants, who have the highest age-specific incidence and account for nearly all pertussis hospital admissions and fatalities [13]. This may be due to the lack of vaccination during the first three months of life and the insufficient titer of maternally transmitted antibodies for effective protection. The most frequent complications of pertussis are pulmonary, neurologic, and nutritional [14]. A prospective multicenter surveillance study in Germany showed that the overall rate of major complications in infants and children with pertussis was 6% [15]. In severe cases, children may die from respiratory failure or acute pertussis encephalopathy. Son et al. conducted a multinational survey among Asian children and adolescents and found evidence of substantial B. pertussis circulation, indicating that 1 in 20 individuals had recently been infected, irrespective of vaccination status [16]. Early detection, diagnosis and treatment are essential to reduce the severity and duration of pertussis, thereby improving patients’ quality of life.
Currently, the diagnosis of pertussis encompasses both clinical and laboratory assessments. However, the reduction in typical pertussis cases after vaccination has made early detection challenging for clinicians using current clinical case definitions, often resulting in undiagnosed or misdiagnosed cases. Laboratory testing is therefore particularly crucial for cases presenting with atypical symptoms. Laboratory diagnostic methods include pathogen isolation, nucleic acid amplification tests, and detection of specific antibodies in serum. Culture and isolation of B. pertussis remains the gold standard for diagnosing pertussis but has significant limitations. As a fastidious bacterium, B. pertussis requires special culture media and conditions, and its slow growth rate makes culture time-consuming and unable to meet the need for early diagnosis and treatment in clinical practice [17]. Serological antibody detection also faces limitations, primarily related to timing: antibodies may not be detectable until some time after exposure to the antigen. Additionally, vaccination against pertussis can increase antibody levels, complicating the interpretation of test results [18]. PCR detection has emerged as the preferred method for diagnosing pertussis internationally, offering a short turnaround time, high sensitivity, and a low risk of contamination [19]. However, this method is relatively expensive and requires specialized laboratory equipment and technical expertise, which can be difficult to access in resource-limited settings.
To address this, we developed a predictive model for pertussis in children that can assist clinicians in diagnosing pertussis. Timely management of patients predicted to be positive by our model could play a crucial role in preventing the widespread transmission of pertussis. In our study, we used ROC curves and calibration plots to evaluate diagnostic effectiveness and assessed the clinical net benefit through DCA. The model displayed fairly good discriminative ability in the training cohort, with an AUC of 0.741. The calibration curves showed fairly good agreement in the training cohort and generally good agreement in the internal and external validation cohorts.
Our study identified older age, higher lymphocyte count, higher hematocrit, and lower CRP as significant predictors of pertussis in children. We found that children older than 5 years were more susceptible to pertussis. Under the current immunization schedule in China, infants receive one dose of the combined Tdap vaccine at 3 months, 4 months, 5 months, and 18 months of age [4]. The immunity provided by the vaccine may weaken over time, leaving older children vulnerable to pertussis. In addition, close contact between children in confined spaces such as childcare facilities or schools increases the risk of pertussis infection.
Blood cell markers were useful for predicting pertussis. Previous studies have identified several biomarkers related to pertussis diagnosis, including white blood cells, lymphocytes, and neutrophils [20]. In this retrospective analysis of the laboratory characteristics of pertussis, we also found that lymphocyte count was an independent factor associated with pertussis. Pertussis toxin (PT) secreted by B. pertussis is a key factor in promoting lymphocyte proliferation [21]. PT activates a variety of cell signaling pathways through ADP ribosylation, promoting the release of lymphocytes from the bone marrow and spleen into the blood and thereby increasing peripheral blood lymphocytes [22]. In addition, pertussis infection triggers a strong immune response, especially the activation of CD4+ T lymphocytes.
Gao et al. found that hematocrit was an independent factor associated with pertussis, which is consistent with our results [23]. In response to infection, red blood cell production may increase. This physiological adaptation could be an attempt to enhance oxygen delivery, which is essential for supporting the immune system and facilitating tissue repair. Moreover, severe infections can trigger the release of inflammatory mediators that increase blood vessel permeability [24]. This allows fluid to seep into the surrounding tissues, leading to a relative increase in the concentration of formed blood elements and, consequently, a relative rise in hematocrit. In line with the results of previous studies [25], CRP was not significantly elevated in children with pertussis in our study. Because CRP is a traditional marker of bacterial infection, the mechanism underlying this finding remains unclear and requires additional investigation.
Tozzi et al. developed an algorithm based on clinical symptoms, including physician suspicion of pertussis, whooping, cyanosis and absence of fever, which had high predictive value for laboratory-confirmed pertussis, but its sensitivity was only 46.30% [26]. Daluwatte et al. also developed a machine learning model based on signs and symptoms, which predicted pertussis with an area under the precision-recall curve of 0.24 [27]. The diagnostic efficacy of these models was limited, and they were not applicable to asymptomatic patients with pertussis. The model based on laboratory data established by Gao et al., which includes WBC, CRP and PDW-MPV-R, performed well, with an AUC of 0.773 for the training cohort and 0.804 for the validation cohort, indicating satisfactory discriminative ability [23]. However, the lack of multicenter clinical validation limited the external utility of that model. Another single-center retrospective study proposed a machine learning model combining pertussis symptoms and routine blood test results [28]. Its AUC was 0.69, which was comparable to that in our study (0.741 for the training cohort, 0.694 for the internal validation cohort, and 0.675 for the external validation cohort). However, the authors transformed the continuous variables included in the study (such as white blood cell count, lymphocyte count and CRP) into binary variables based on cutoff values from published literature, which might affect the diagnostic efficacy of the model. To our knowledge, ours is the first multicenter diagnostic model for pertussis based on laboratory data, and it could assist the diagnosis and treatment of pertussis, especially in underdeveloped areas of China and in low-income developing countries in Asia.
Our model also has some limitations in clinical application. First, it serves only as a reference and cannot confirm the presence of pertussis as definitively as PCR tests. Second, retrospective data collection may have introduced selection bias and confounding factors. The constraints imposed by the sample size and potential data bias may lead to model instability or suboptimal performance. Third, our study enrolled only Chinese patients, and the generalizability of our findings to non-Asian populations cannot be determined. Fourth, our model incorporates only laboratory data, which vary across disease stages; this may affect the accuracy of the model. Further studies are required to address these limitations and improve the model’s external utility.
Conclusion
A novel model composed of age, CRP, lymphocyte count and hematocrit can help identify pertussis in children quickly, as shown in these multicenter cohorts. Based on its observed performance, this approach may support the diagnosis of pertussis in low-resource areas where laboratory confirmation is unavailable.
Authors’ contributions
J Z, L J W, and M W contributed equally to this work. J P L and J Z designed the study and wrote the manuscript. All the authors contributed to the generation, collection, assembly, and analysis and/or interpretation of data. X J C, L J W, and M W revised the manuscript. All the authors have read and approved the final manuscript.
Funding
None.
Data availability
The datasets are available from the corresponding author upon reasonable request.
Declarations
Ethics approval and consent to participate
The study was performed in accordance with the Declaration of Helsinki and was approved by the local ethics committees of the First Affiliated Hospital with Nanjing Medical University (2024-SR-908) and Children’s Hospital of Nanjing Medical University (202411002-1). The requirement for obtaining informed consent from participants was waived by the ethics committees of the First Affiliated Hospital with Nanjing Medical University and Children’s Hospital of Nanjing Medical University because personally identifiable information was not used in this study.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Jun Zhou, Lijun Wu and Min Wang contributed equally to this work.
Contributor Information
Xiangjun Cheng, Email: cxj-1983@163.com.
Jingping Liu, Email: liujp0983@njmu.edu.cn.
References
- 1. Zheng J, Zhang N, Shen G, Liang F, Zhao Y, He X, Wang Y, He R, Chen W, Xue H, Shen Y, Fu Y, Zhang W, Zhang L, Bhatt S, Mao Y, Zhu B. Spatiotemporal and seasonal trends of class A and B notifiable infectious diseases in China: retrospective analysis. JMIR Public Health Surveill. 2023;9:e42820.
- 2. Domenech de Cellès M, Rohani P. Pertussis vaccines, epidemiology and evolution. Nat Rev Microbiol. 2024;22(11):722–35.
- 3. Pertussis vaccines: WHO position paper - September 2015. Releve epidemiologique hebdomadaire. 2015;90(35):433–58.
- 4. Liu Y, Yu D, Wang K, Ye Q. Global resurgence of pertussis: a perspective from China. J Infect. 2024;89(5):106289.
- 5. Pertussis reported cases and incidence. World Health Organization. 2024. https://immunizationdata.who.int/global/wiise-detail-page/pertussis-reported-cases-and-incidence?CODE=CHN&YEAR=.
- 6. Li L, Deng J, Ma X, Zhou K, Meng Q, Yuan L, Shi W, Wang Q, Li Y, Yao K. High prevalence of macrolide-resistant Bordetella pertussis and ptxP1 genotype, mainland China, 2014–2016. Emerg Infect Dis. 2019;25(12):2205–14.
- 7. Zhang Y, Chen Z, Zhao J, Zhang N, Chen N, Zhang J, Li S, He Q. Increased susceptibility to pertussis in adults at childbearing age as determined by comparative seroprevalence study, China 2010–2016. J Infect. 2019;79(1):1–6.
- 8. Birru F, Al-Hinai Z, Awlad Thani S, Al-Mukhaini K, Al-Zakwani I, Al-Abdwani R. Critical pertussis: a multi-centric analysis of risk factors and outcomes in Oman. Int J Infect Dis. 2021;107:53–8.
- 9. Ganeshalingham A, Wilde J, Anderson BJ. The neutrophil-to-lymphocyte ratio in Bordetella pertussis infection. Pediatr Infect Dis J. 2017;36(11):1100–2.
- 10. Pertussis. Vaccine Preventable Diseases Surveillance Standards. World Health Organization. 2018. https://www.who.int/publications/m/item/vaccine-preventable-diseases-surveillance-standards-pertussis.
- 11. Subspecialty Group of Infectious Diseases, the Society of Pediatrics, Chinese Medical Association; Editorial Board, Chinese Journal of Pediatrics. Recommendation for diagnosis and treatment of Chinese children with pertussis. Chin J Pediatr. 2017;55(8):568–72.
- 12. Domenech de Cellès M, Magpantay FM, King AA, Rohani P. The pertussis enigma: reconciling epidemiology, immunology and evolution. Proc Biol Sci. 2016;283(1822).
- 13. Decker MD, Edwards KM. Pertussis (Whooping cough). J Infect Dis. 2021;224(12 Suppl 2):S310–20.
- 14. Olsen M, Thygesen SK, Østergaard JR, Nielsen H, Henderson VW, Ehrenstein V, Nørgaard M, Sørensen HT. Hospital-diagnosed pertussis infection in children and long-term risk of epilepsy. JAMA. 2015;314(17):1844–9.
- 15. Heininger U, Klich K, Stehr K, Cherry JD. Clinical findings in Bordetella pertussis infections: results of a prospective multicenter surveillance study. Pediatrics. 1997;100(6):E10.
- 16. Son S, Thamlikitkul V, Chokephaibulkit K, Perera J, Jayatilleke K, Hsueh PR, Lu CY, Balaji V, Moriuchi H, Nakashima Y, Lu M, Yang Y, Yao K, Kim SH, Song JH, Kim S, Kim MJ, Heininger U, Chiu CH, Kim YJ. Prospective multinational serosurveillance study of Bordetella pertussis infection among 10- to 18-year-old Asian children and adolescents. Clin Microbiol Infect. 2019;25(2):250.e251-250.e257.
- 17. van der Zee A, Schellekens JF, Mooi FR. Laboratory diagnosis of pertussis. Clin Microbiol Rev. 2015;28(4):1005–26.
- 18. May ML, Evans J, Riley J, Lambkin G, Robson JM. Development and validation of an Australian in-house anti-pertussis toxin IgG and IgA enzyme immunoassay. Pathology. 2013;45(2):172–80.
- 19. Luis BAL, Guerrero Almeida ML, Ruiz-Palacios GM. A place for Bordetella pertussis in PCR-based diagnosis of community-acquired pneumonia. Infect Dis (London England). 2018;50(3):232–5.
- 20. George TI. Malignant or benign leukocytosis. Hematol Am Soc Hematol Educ Program. 2012;2012:475–84.
- 21. Gregg KA, Merkel TJ. Pertussis toxin: a key component in pertussis vaccines? Toxins. 2019. doi:10.3390/toxins11100557.
- 22. Locht C, Antoine R. The history of pertussis toxin. Toxins (Basel). 2021;13(9).
- 23. Gao Q, Xu D, Guan X, Jia P, Lei X. Development and validation of a diagnostic prediction model for children with pertussis. Sci Rep. 2024;14(1):17154.
- 24. Kotepui M, Kotepui KU. Prevalence and laboratory analysis of malaria and dengue co-infection: a systematic review and meta-analysis. BMC Public Health. 2019;19(1):1148.
- 25. Selbuz S, Çiftçi E, Özdemir H, Güriz H, İnce E. Comparison of the clinical and laboratory characteristics of pertussis or viral lower respiratory tract infections. J Infect Dev Ctries. 2019;13(9):823–30.
- 26. Tozzi AE, Gesualdo F. A data driven clinical algorithm for differential diagnosis of pertussis and other respiratory infections in infants. PLoS One. 2020;15(7):e0236041.
- 27. Daluwatte C, Dvaretskaya M, Ekhtiari S, Hayat P, Montmerle M, Mathur S, Macina D. Development of an algorithm for finding pertussis episodes in a population-based electronic health record database. Hum Vaccin Immunother. 2023;19(1):2209455.
- 28. Iaco MC-D, Gesualdo KA, Pandolfi F, Croci E, Tozzi I. Machine learning clinical decision support systems for surveillance: a case study on pertussis and RSV in children. Front Pediatr. 2023;11:1112074.