| Aspect | Broad medical coding study | Nephrology-specific study |
|---|---|---|
| Focus | Broad medical coding (ICD-9-CM, ICD-10-CM, and CPT) | Nephrology-specific ICD-10 coding |
| AI models | GPT-3.5, GPT-4, Gemini Pro, and Llama2-70b Chat | ChatGPT 3.5 and 4.0 |
| Input format | Official code descriptions | Clinical scenarios mimicking pre-visit testing |
| Task | Generate exact matching codes (see the first sketch below) | Identify the single most appropriate ICD-10 code (see the second sketch below) |
| Prompt design | Standardized for code generation | Simple, clinically relevant |
| Top performance | GPT-4: 45.9% (ICD-9-CM), 33.9% (ICD-10-CM), 49.8% (CPT) | ChatGPT 4.0: 99% (nephrology ICD-10) |
| Performance range | Below 3% to below 50% | 87–99% |
| Code types | Multiple (ICD-9-CM, ICD-10-CM, and CPT) | Single (ICD-10) |
| Specialty focus | General medical | Nephrology-specific |
| Main finding | Base LLMs inadequate for medical coding | AI shows high potential for specialty-specific coding |
| Accuracy factors | Code frequency, code length, description conciseness | Specialty focus, clinical context, latest AI versions |
| Conclusion | Further research needed on complex ICD structures | AI can reduce administrative burden in specialty coding, demonstrated via nephrology pre-visit testing scenarios |
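To make the first study's task concrete, the following is a minimal sketch of how an exact-match code-generation benchmark might be run. It assumes the official OpenAI Python client (openai >= 1.0); the prompt wording, the model identifier, and the two sample description-to-code pairs are illustrative assumptions, not the study's actual materials.

```python
# Minimal sketch of an exact-match coding benchmark: the model is shown an
# official code description and must return the exact billing code.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical sample of (official description -> expected code) pairs.
codes = {
    "Type 2 diabetes mellitus without complications": "E11.9",
    "Essential (primary) hypertension": "I10",
}

def query_code(description: str, code_system: str = "ICD-10-CM") -> str:
    """Ask the model to generate the exact code for an official description."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": f"You are a medical coder. Reply with only the exact "
                        f"{code_system} code for the given description."},
            {"role": "user", "content": description},
        ],
        temperature=0,  # deterministic output for benchmarking
    )
    return response.choices[0].message.content.strip()

# Exact-match scoring: the generated code must equal the true code verbatim.
correct = sum(query_code(desc) == code for desc, code in codes.items())
print(f"Exact-match accuracy: {correct / len(codes):.1%}")
```

The strict verbatim comparison mirrors why base models score poorly here: a near-miss code (for example, a correct category with the wrong final digit) counts as a failure.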
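The nephrology study's scenario-based task can be sketched in the same way, reusing the `client` from the previous example. The clinical vignette and prompt wording below are invented for illustration and are not cases from the study.

```python
# Sketch of the scenario-based prompt style: a short pre-visit-testing vignette
# is given, and the model must select the single most appropriate ICD-10 code.
scenario = (
    "A 58-year-old with long-standing type 2 diabetes presents for pre-visit "
    "testing. eGFR is 38 mL/min/1.73 m2, stable over six months."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You are a nephrology coder. Return the single most "
                    "appropriate ICD-10 code for the scenario, code only."},
        {"role": "user", "content": scenario},
    ],
    temperature=0,
)
print(response.choices[0].message.content.strip())
```

Framing the task as choosing one best code for a realistic specialty scenario, rather than reproducing an arbitrary code verbatim, is consistent with the much higher accuracy range reported in the nephrology study.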