PLOS Digit Health. 2025 Aug 8;4(8):e0000984. doi: 10.1371/journal.pdig.0000984

Expression of Concern: Imbalanced class distribution and performance evaluation metrics: A systematic review of prediction accuracy for determining model performance in healthcare systems

The PLOS Digital Health Editors
PMCID: PMC12333973  PMID: 40779442

The PLOS Digital Health Editors issue this Expression of Concern to inform readers of the following issues that came to light after this article’s [1] publication.

  • Two articles cited in [1] and included in the systematic review were retracted due to concerns about potential manipulation of the publication process. Reference 40 [2,3] was retracted before [1] was published, and Reference 49 [4,5] was retracted after [1] was published. A member of the PLOS Digital Health Editorial Board advised that the retracted references are not essential to support the research reported in [1].

  • The references cited in Table 1 of [1] do not correspond to the description in the table. This issue was not fully resolved in post-publication discussions.

  • The References list includes incomplete information for several cited works. These issues were not fully resolved in post-publication discussions.

  • The article [1] does not fully comply with PRISMA reporting guidelines or include a completed PRISMA checklist. In post-publication discussions the corresponding author provided a completed PRISMA checklist (S1 File) and the following methodological information, but even with these updates the methodology is not reported in sufficient detail to meet PLOS reporting requirements and enable replication of the literature search:
    • The search strategy used the following sources: PubMed; Google Scholar; Web of Science; Scopus.
    • The search strategy keywords included: predictive modeling in healthcare systems; machine learning prediction accuracy score; disease diagnosis with machine learning; machine learning prediction of disease (chronic kidney, hypertension, breast cancer); machine learning model performance evaluations; fraud detection with machine learning; detection of spam messages with machine learning; machine learning prediction with balanced accuracy score; dealing with class imbalance in machine learning; etc.
    • Searches were restricted to work published in or after 2016.
    • Eligibility Criteria: Eligible studies involved a sample population in healthcare settings, used machine learning models as the intervention with a focus on imbalanced class distribution, and reported an evaluation metric outcome such as prediction accuracy score as the determining factor for model performance evaluation, in healthcare settings where class imbalance is a naturally recurring phenomenon.
    • Study Selection Process: Authors (MOA and TF) examined articles selected based on title/abstract screening, full-text review and appropriate method use.
    • Disagreement Resolution Process: Resolution of disagreements involved thorough consultation among all authors.
    • Standardized Data Extraction Process: A standardized data extraction form (S2 File) was used to extract relevant study details including research type, methodology, evaluation metric used and score value obtained.
    • Risk of Bias: This study reports on synthesis-level risk of bias arising from the selective availability of publications involving machine learning with class-imbalanced datasets and from selective reporting of evaluation metrics.
    • Limitations and Implications: This review assesses the use of prediction accuracy score as the determining factor for selecting the best model, in the context of healthcare and other real-world application settings where datasets with imbalanced class distributions exist. The proposed model for performance evaluation is applicable within the stated context. The implications of this review emphasize the use of the proposed model in settings where false alarms pose a greater challenge because of their negative effect on larger segments of the population.
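
For readers unfamiliar with the underlying issue, the following minimal sketch shows why a prediction accuracy score can be a misleading criterion on a class-imbalanced dataset. It is not taken from the article [1] or from the authors' materials: the 95/5 class split, the always-negative classifier, and the helper names `accuracy` and `balanced_accuracy` are assumptions made purely for illustration.

```python
# Illustrative sketch only: the data and the "classifier" are hypothetical.

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        # Indices of all samples that truly belong to this class.
        idx = [i for i, t in enumerate(y_true) if t == label]
        hits = sum(y_pred[i] == label for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical screening dataset: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority (negative) class.
y_pred = [0] * 100

print(f"accuracy          = {accuracy(y_true, y_pred):.2f}")
print(f"balanced accuracy = {balanced_accuracy(y_true, y_pred):.2f}")
```

Under these assumptions the trivial classifier scores high on plain accuracy while detecting none of the positive cases, whereas the balanced accuracy score falls to the level expected from an uninformative model.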

Supporting information

S1 File. PRISMA checklist.
(DOCX)

S2 File. Standardized data extraction form.
(DOCX)

References

  • 1. Owusu-Adjei M, Ben Hayfron-Acquah J, Frimpong T, Abdul-Salaam G. Imbalanced class distribution and performance evaluation metrics: A systematic review of prediction accuracy for determining model performance in healthcare systems. PLOS Digit Health. 2023;2(11):e0000290. doi: 10.1371/journal.pdig.0000290
  • 2. Karthick K, Aruna SK, Samikannu R, Kuppusamy R, Teekaraman Y, Thelkar AR. Implementation of a Heart Disease Risk Prediction Model Using Machine Learning. Comput Math Methods Med. 2022;2022:6517716. doi: 10.1155/2022/6517716 [Retracted]
  • 3. Computational and Mathematical Methods in Medicine. Retracted: Implementation of a Heart Disease Risk Prediction Model Using Machine Learning. Comput Math Methods Med. 2023;2023:9764021. doi: 10.1155/2023/9764021
  • 4. Mehbodniya A, Alam I, Pande S, Neware R, Rane KP, Shabaz M, et al. Financial Fraud Detection in Healthcare Using Machine Learning and Deep Learning Techniques. Security and Communication Networks. 2021;2021:1–8. doi: 10.1155/2021/9293877
  • 5. Retracted: Financial Fraud Detection in Healthcare Using Machine Learning and Deep Learning Techniques. Security and Communication Networks. 2023;2023:1–1. doi: 10.1155/2023/9758612


Articles from PLOS Digital Health are provided here courtesy of PLOS
