Method overview
(A) Experimental design for selecting radiology reports and comparing how metrics and radiologists evaluate them.
(B) For a given test report and a particular metric, selecting the report from the training report corpus that has the highest metric score with respect to the test report.
(C) Conducting radiologist evaluation of the selected high-scoring report relative to the test report: radiologists count the clinically significant and clinically insignificant errors in the selected report across six error categories.
(D) Determining the alignment between metric scores and radiologist scores assigned to the same reports using the Kendall rank correlation coefficient.
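
Steps (B) and (D) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `token_f1` is a hypothetical stand-in for a real report metric, the function names, example reports, scores, and error counts are invented for illustration, and the tau-b variant of the Kendall coefficient is an assumption, since the caption does not say which variant is used.

```python
from itertools import combinations
from math import sqrt


def token_f1(candidate, reference):
    """Hypothetical stand-in metric: F1 over unique lowercase tokens."""
    cand = set(candidate.lower().split())
    ref = set(reference.lower().split())
    overlap = len(cand & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def retrieve_best(test_report, train_corpus, metric=token_f1):
    """Step (B): training report with the highest metric score for this test report."""
    return max(train_corpus, key=lambda report: metric(report, test_report))


def kendall_tau_b(x, y):
    """Step (D): Kendall tau-b, the tie-corrected variant (assumed here)."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: excluded from both denominator terms
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif (dx > 0) == (dy > 0):
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_y_only) * (untied + tied_x_only))
    return (concordant - discordant) / denom if denom else float("nan")


# Illustrative data only (not from the study).
train_corpus = [
    "mild cardiomegaly with pulmonary edema",
    "no pneumothorax or pleural effusion",
    "left lower lobe consolidation",
]
best = retrieve_best("no pleural effusion or pneumothorax", train_corpus)

metric_scores = [0.9, 0.7, 0.4, 0.2]  # one metric score per selected report
error_counts = [1, 2, 4, 3]           # radiologist error totals for the same reports
tau = kendall_tau_b(metric_scores, error_counts)
```

In this toy data the coefficient is negative, meaning higher metric scores tend to go with fewer radiologist-identified errors. How the per-category error counts from (C) are combined into a single radiologist score is not specified in the caption, so the totals above are only a placeholder.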