Table 2.
Evaluation criteria for assessing LLM-generated clinical reports within the HealthProcessAI framework.
| Criterion | Weight | Description | Validation method* |
|---|---|---|---|
| Clinical accuracy | 25% | Correctness of medical interpretations and terminology usage | Expert clinical review |
| Process mining understanding | 20% | Accurate interpretation of analytical results | Technical validation |
| Actionable insights | 20% | Quality and feasibility of clinical recommendations | Implementation assessment |
| Statistical interpretation | 15% | Correct analysis of quantitative findings | Statistical validation |
| Report structure & clarity | 10% | Organization and readability | Communication analysis |
| Evidence-based reasoning | 10% | Use of clinical evidence and literature | Evidence synthesis evaluation |
*The current validation relies on automated LLM evaluation rather than clinical validation. Each criterion's weight reflects its relative importance to clinical utility and interpretability. The validation methods listed are domain-specific assessments, including clinical expert review, technical accuracy checks, and implementation feasibility testing.
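
As an illustration of how the weights in Table 2 could be combined, the following minimal sketch computes a composite report score as a weighted sum of per-criterion scores. The weights are taken from the table; the weighted-sum aggregation, the criterion key names, the function name `weighted_score`, and the 1 to 5 scoring scale in the example are assumptions made for illustration and are not specified by the table.

```python
# Weights from Table 2. The dictionary keys are illustrative names
# (assumptions), not identifiers defined by the framework.
WEIGHTS = {
    "clinical_accuracy": 0.25,
    "process_mining_understanding": 0.20,
    "actionable_insights": 0.20,
    "statistical_interpretation": 0.15,
    "report_structure_clarity": 0.10,
    "evidence_based_reasoning": 0.10,
}


def weighted_score(scores: dict) -> float:
    """Return the weighted sum of per-criterion scores.

    Assumes every criterion is scored on the same numeric scale, so the
    composite lands on that scale as well (the weights sum to 1).
    """
    missing = sorted(WEIGHTS.keys() - scores.keys())
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Hypothetical scores for one report on an assumed 1-5 scale.
example_scores = {
    "clinical_accuracy": 4,
    "process_mining_understanding": 5,
    "actionable_insights": 3,
    "statistical_interpretation": 4,
    "report_structure_clarity": 5,
    "evidence_based_reasoning": 2,
}
composite = weighted_score(example_scores)
print(f"Composite score: {composite:.2f}")
```

Because the six weights sum to 100%, a report that receives the same score on every criterion keeps that score as its composite, while uneven scores are pulled toward the heavily weighted criteria such as clinical accuracy.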