2019 Aug 7;10:839. doi: 10.3389/fphar.2019.00839

Table 2.

Error analysis results for the different types of error occurring on the test dataset.

| Sl. No. | Source of error | True value in data | Observed value in data | Error percentage |
|---|---|---|---|---|
| 1. | Entity detection error | 633,074* | 582,428 | 8.00% |
| 2. | Entity absent in text | 633,074* | 615,650 | 2.75% |
| 3. | Failure to detect entity | 633,074* | 609,413 | 3.73% |
| 4. | Entity normalization error | | | |
| 4a. | Gene normalization error | 42,607 | 50,336 | 18.14% |
| 4b. | Disease normalization error | 71,704 | 92,481 | 28.97% |
| 4c. | Drug normalization error | 11,033 | 14,563 | 31.99% |

$$\text{Error percentage} = \frac{\lvert \text{Observed value} - \text{True value} \rvert}{\text{True value}} \times 100 \tag{1}$$

PharmGKB was used as the gold-standard dataset for all comparisons. *In the total PGx corpus extracted from MEDLINE. The error percentage was calculated according to formula (1).
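As a rough illustration, formula 1 can be applied to the true and observed values in Table 2 with a short script. This is a minimal sketch, not the authors' code: the names `ROWS`, `error_percentage`, and `check_rows` are hypothetical, and the 0.01 percentage-point tolerance is an assumption chosen to absorb small differences between rounding and truncation to two decimals.

```python
# Rows of Table 2: (source of error, true value, observed value, reported %).
# The figures are copied from the table; the variable names are illustrative.
ROWS = [
    ("Entity detection error", 633_074, 582_428, 8.00),
    ("Entity absent in text", 633_074, 615_650, 2.75),
    ("Failure to detect entity", 633_074, 609_413, 3.73),
    ("Gene normalization error", 42_607, 50_336, 18.14),
    ("Disease normalization error", 71_704, 92_481, 28.97),
    ("Drug normalization error", 11_033, 14_563, 31.99),
]


def error_percentage(observed: float, true: float) -> float:
    """Formula 1: |observed - true| / true * 100."""
    if true == 0:
        raise ValueError("true value must be non-zero")
    return abs(observed - true) / true * 100


def check_rows(rows, tolerance=0.01):
    """Recompute each row and flag whether it matches the reported value.

    `tolerance` is in percentage points (an assumed value, see lead-in).
    Returns a list of (label, computed, reported, ok) tuples.
    """
    results = []
    for label, true, observed, reported in rows:
        computed = error_percentage(observed, true)
        ok = abs(computed - reported) <= tolerance
        results.append((label, computed, reported, ok))
    return results


if __name__ == "__main__":
    for label, computed, reported, ok in check_rows(ROWS):
        status = "ok" if ok else "MISMATCH"
        print(f"{label:30s} computed={computed:6.2f}%  reported={reported:5.2f}%  {status}")
```

Because the formula takes the absolute difference, rows where the observed count exceeds the true count (the normalization errors) and rows where it falls short (the detection errors) are handled the same way.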