
Figure 1.

Evaluating the Performance of the Multimodal ChatGPT-4V in Interpreting Radiological Images for Diagnosis and Formulating Treatment Plans. (A) Performance of ChatGPT-4V on USMLE-style questions. The shapes represent different imaging modalities, whereas the colors represent different anatomical regions. The annotations on the right show the accuracy calculated under three different criteria. From top to bottom, ChatGPT-4V is considered to have answered correctly only if all 3 attempts for a given question are correct; if at least 2 of the 3 attempts are correct; and if at least 1 of the 3 attempts is correct. (B) Difference in accuracy of the multimodal ChatGPT-4V on USMLE-style questions when given both image and history, history only, or image only. ***P≤0.001; ****P≤0.0001. (C) Scores of the multimodal ChatGPT-4V on USMLE-style questions involving treatment plan formulation. The score for each question is recorded as the average of the scores given by three reviewers. (D) ROC curve of the multimodal ChatGPT-4V in diagnosing the presence of abnormalities on chest radiographs from the ChestX-ray8 database. Each attempt is scored 1 if ChatGPT-4V judges an abnormality to be present and 0 if it does not; the scores from the three attempts are summed to make a comprehensive judgment. AUC, area under the receiver operating characteristic curve.
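
The three accuracy criteria described for panel (A) amount to thresholding the number of correct attempts per question. A minimal Python sketch, using entirely hypothetical attempt data (not values from the study):

```python
import numpy as np

# Hypothetical data: each row records whether the three attempts at one
# USMLE-style question were correct (1) or incorrect (0).
attempts = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
])

correct_per_question = attempts.sum(axis=1)

# The three criteria from panel (A), top to bottom:
acc_all_3  = np.mean(correct_per_question == 3)  # all 3 attempts correct
acc_2_of_3 = np.mean(correct_per_question >= 2)  # at least 2 of 3 correct
acc_1_of_3 = np.mean(correct_per_question >= 1)  # at least 1 of 3 correct

print(acc_all_3, acc_2_of_3, acc_1_of_3)  # 0.25 0.5 0.75
```

Because the three criteria are nested, the reported accuracies are necessarily non-decreasing from the strictest (all 3 correct) to the most lenient (at least 1 correct).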
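
Likewise, the scoring scheme described for panel (D) turns three binary calls per radiograph into a 0–3 score that can serve as the decision variable for an ROC curve. A sketch under the same caveat (hypothetical labels and calls; the authors' actual tooling is not stated, and scikit-learn is assumed here for the ROC computation):

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical data: ground-truth labels (1 = abnormality present,
# 0 = normal) and three binary calls per chest radiograph
# (1 = "abnormal", 0 = "no abnormality").
y_true = np.array([1, 1, 0, 0, 1, 0])
calls = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
])

# Summing the three attempts gives the comprehensive 0-3 score.
y_score = calls.sum(axis=1)

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(f"AUC = {roc_auc_score(y_true, y_score):.3f}")
```

Using the summed score rather than a single attempt gives the ROC curve intermediate operating points (classifying a radiograph as abnormal when at least 1, 2, or 3 of the attempts say so), rather than the single point a lone binary call would yield.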