Table 3.
Performance comparison of the TrOCR, TrOCR-ctx, ABINet, and PP-OCRv2 models on the UoS_Data_Rescue, CORD, SROIE, PubTabNet, and ICDAR 2015 datasets, evaluated on segmented text lines. Metrics are ROUGE-L, Word Error Rate (WER), Character Error Rate (CER), Exact Match (EM), and F1-score at the character and token levels.
| OCR Model | ROUGE-L | WER | CER | EM | F1-score (Char) | F1-score (Token) |
|---|---|---|---|---|---|---|
| *UoS_Data_Rescue* | | | | | | |
| TrOCR | 0.849 | 0.055 | 0.047 | 0.825 | 0.963 | 0.945 |
| TrOCR-ctx | 0.857 | 0.049 | 0.035 | 0.847 | 0.966 | 0.951 |
| ABINet | 0.545 | 0.557 | 0.346 | 0.432 | 0.681 | 0.449 |
| PP-OCRv2 | 0.812 | 0.348 | 0.178 | 0.646 | 0.825 | 0.666 |
| *CORD* | | | | | | |
| TrOCR | 0.898 | 0.168 | 0.056 | 0.802 | 0.946 | 0.834 |
| TrOCR-ctx | 0.957 | 0.034 | 0.016 | 0.833 | 0.985 | 0.986 |
| ABINet | 0.520 | 0.356 | 0.304 | 0.574 | 0.710 | 0.644 |
| PP-OCRv2 | 0.789 | 0.114 | 0.144 | 0.746 | 0.897 | 0.886 |
| *SROIE* | | | | | | |
| TrOCR | 0.919 | 0.053 | 0.044 | 0.830 | 0.984 | 0.947 |
| TrOCR-ctx | 0.940 | 0.033 | 0.014 | 0.849 | 0.988 | 0.967 |
| ABINet | 0.872 | 0.629 | 0.497 | 0.301 | 0.507 | 0.381 |
| PP-OCRv2 | 0.882 | 0.432 | 0.235 | 0.493 | 0.777 | 0.577 |
| *PubTabNet* | | | | | | |
| TrOCR | 0.878 | 0.141 | 0.069 | 0.748 | 0.940 | 0.859 |
| TrOCR-ctx | 0.913 | 0.091 | 0.067 | 0.789 | 0.965 | 0.909 |
| ABINet | 0.315 | 0.813 | 0.450 | 0.153 | 0.598 | 0.197 |
| PP-OCRv2 | 0.833 | 0.131 | 0.097 | 0.700 | 0.915 | 0.877 |
| *ICDAR 2015* | | | | | | |
| TrOCR | 0.777 | 0.245 | 0.102 | 0.744 | 0.904 | 0.750 |
| TrOCR-ctx | 0.776 | 0.250 | 0.102 | 0.749 | 0.905 | 0.755 |
| ABINet | 0.151 | 0.741 | 0.624 | 0.259 | 0.436 | 0.259 |
| PP-OCRv2 | 0.665 | 0.374 | 0.178 | 0.626 | 0.831 | 0.627 |
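The WER and CER columns follow the standard edit-distance definitions: Levenshtein distance between hypothesis and reference, normalized by reference length, at the word and character level respectively. A minimal sketch of these two metrics (assuming whitespace word tokenization; the exact tokenization and normalization used in the evaluation above may differ):

```python
def levenshtein(a, b):
    """Edit distance between sequences a and b (insert/delete/substitute, cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                 # deletion
                cur[j - 1] + 1,              # insertion
                prev[j - 1] + (ca != cb),    # substitution (free if chars match)
            ))
        prev = cur
    return prev[-1]

def cer(reference, hypothesis):
    """Character Error Rate: char-level edit distance / reference length."""
    return levenshtein(list(reference), list(hypothesis)) / max(len(reference), 1)

def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    return levenshtein(ref, hyp) / max(len(ref), 1)
```

Because the denominator is the reference length, both rates can exceed 1.0 when the hypothesis contains many insertions, which is why CER is typically lower than WER for the same output: a single wrong character flips a whole word.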