2020 Jun 19;15(6):e0234925. doi: 10.1371/journal.pone.0234925

Table 3. Identifying CTRI matches for the 581 CTG records with unknown CTRI matches.

| The nature of the records | Number of records | Percentage of 581 |
|---|---|---|
| 1. Using fuzz and the title field, 581 CTRI matches were picked up for the 581 CTG records. The model predicted 434 of these to be true. | 434 | 74.7% |
| 2. Assuming that fuzz picked up 80% of the true matches, the number of correct matches would be 543. | 543 | 93.5% |
| 3. Incorrect matches, as predicted by fuzz. | 38 | 6.5% |
| 4. Using tfidf and the title field, 581 CTRI matches were picked up for the 581 CTG records. The model predicted 441 of these to be true. | 441 | 75.9% |
| 5. Assuming that tfidf picked up 81% of the true matches, the number of correct matches would be 544. | 544 | 93.6% |
| 6. Incorrect matches, as predicted by tfidf. | 37 | 6.4% |
| 7. CTRI matches picked up by both fuzz and tfidf. | 304 | 52.3% |
| 8. Of the 304 CTRI matches picked up by both fuzz and tfidf, the model predicted 288 to be true. | 288 | 49.6% |
| 9. Incorrect matches from the common cases. | 293 | 50.4% |
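
The arithmetic behind rows 2, 3, 5, 6 and 9 can be reproduced with a short sketch. It assumes that the extrapolated counts are obtained by dividing the model-confirmed count by the assumed share of true matches and rounding half up, and that the "incorrect" counts are the remainder of the 581 records. The function names `extrapolate` and `pct` are illustrative and do not come from the paper.

```python
from decimal import Decimal, ROUND_HALF_UP

TOTAL = 581  # CTG records with unknown CTRI matches (from the table)


def extrapolate(predicted_true: int, assumed_share: str, total: int = TOTAL):
    """Scale the model-confirmed count up by an assumed share of true matches.

    Assumption: round half up to a whole record, then take the remainder
    of `total` as the estimated number of incorrect matches.
    """
    estimate = (Decimal(predicted_true) / Decimal(assumed_share)).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    correct = int(estimate)
    return correct, total - correct


def pct(count: int, total: int = TOTAL) -> float:
    """Percentage of `total`, to one decimal place."""
    return round(100 * count / total, 1)


# Rows 1-3 (fuzz): 434 confirmed, assumed to be 80% of the true matches.
fuzz_correct, fuzz_incorrect = extrapolate(434, "0.80")

# Rows 4-6 (tfidf): 441 confirmed, assumed to be 81% of the true matches.
tfidf_correct, tfidf_incorrect = extrapolate(441, "0.81")

# Row 9: the tabulated 293 equals 581 minus the 288 matches confirmed in row 8.
row9 = TOTAL - 288

print(fuzz_correct, fuzz_incorrect, pct(fuzz_correct), pct(fuzz_incorrect))
print(tfidf_correct, tfidf_incorrect, pct(tfidf_correct), pct(tfidf_incorrect))
print(row9, pct(row9))
```

`Decimal` with explicit half-up rounding is used because 434 / 0.80 is exactly 542.5, and Python's built-in `round` would round that half down to the even value 542 instead of the tabulated 543.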