Figure - PMC

Skip to main content

An official website of the United States government

Here's how you know

Here's how you know

Official websites use .gov
A .gov website belongs to an official government organization in the United States.

Secure .gov websites use HTTPS
A lock ( ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.

View full-text article in PMC

. 2023 Mar 8;2:e41264. doi: 10.2196/41264

Search in PMC
Search in PubMed
View in NLM Catalog
Add to search

©Elvira Nurmambetova, Jie Pan, Zilong Zhang, Guosong Wu, Seungwon Lee, Danielle A Southern, Elliot A Martin, Chester Ho, Yuan Xu, Cathy A Eastwood. Originally published in JMIR AI (https://ai.jmir.org), 08.03.2023.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR AI, is properly cited. The complete bibliographic information, a link to the original publication on https://www.ai.jmir.org/, as well as this copyright and license information must be included.

PMC Copyright notice

A visual illustration of the sequential forward selection process for identifying feature subsets that maximize the performance of the ML pipeline. The candidate feature subsets were evaluated by using 5-fold cross-validation. For each subset of document types, 40 experiments were conducted with all possible combinations of 5 folds, 2 vectorizers, and 4 ML models. FPR: false positive rate. ML: machine learning; TF-IDF: term frequency-inverse document frequency; TPR: true positive rate.