Table 3.
Algorithm | Lineage generation | # of SHMs in 125-bp BCR sequence | TP | TN | FP | FN | ACC | PPR |
---|---|---|---|---|---|---|---|---|
BRILIA human | Germline | 0 | 0 | 51,544 | 1,031 | 0 | 0.98 | 0.00 |
BRILIA human | 1st | 5 | 1,230 | 50,072 | 629 | 644 | 0.98 | 0.66 |
BRILIA human | 2nd | 10 | 2,433 | 48,146 | 676 | 1,320 | 0.96 | 0.78 |
BRILIA human | 3rd | 15 | 3,373 | 46,141 | 755 | 2,306 | 0.94 | 0.82 |
BRILIA human | 4th | 20 | 4,376 | 44,310 | 633 | 3,256 | 0.93 | 0.87 |
BRILIA human | 5th | 25 | 5,588 | 42,370 | 612 | 4,005 | 0.91 | 0.90 |
VQUEST + JA human | Germline | 0 | 0 | 46,724 | 502 | 0 | 0.99 | 0.00 |
VQUEST + JA human | 1st | 5 | 913 | 44,296 | 812 | 758 | 0.97 | 0.53 |
VQUEST + JA human | 2nd | 10 | 1,636 | 41,973 | 844 | 1,645 | 0.95 | 0.66 |
VQUEST + JA human | 3rd | 15 | 2,146 | 39,634 | 919 | 2,745 | 0.92 | 0.70 |
VQUEST + JA human | 4th | 20 | 2,365 | 36,937 | 1,010 | 4,007 | 0.89 | 0.70 |
VQUEST + JA human | 5th | 25 | 2,324 | 33,983 | 1,114 | 5,410 | 0.85 | 0.68 |
partis human | Germline | 0 | 0 | 50,685 | 654 | 0 | 0.99 | 0.00 |
partis human | 1st | 5 | 1,160 | 48,182 | 1,149 | 665 | 0.96 | 0.50 |
partis human | 2nd | 10 | 2,325 | 45,946 | 1,695 | 1,334 | 0.94 | 0.58 |
partis human | 3rd | 15 | 3,505 | 43,251 | 2,228 | 2,001 | 0.92 | 0.61 |
partis human | 4th | 20 | 4,655 | 40,405 | 2,828 | 2,686 | 0.89 | 0.62 |
partis human | 5th | 25 | 5,648 | 37,713 | 3,154 | 3,441 | 0.87 | 0.64 |
BRILIA mouse | Germline | 0 | 0 | 42,144 | 804 | 0 | 0.98 | 0.00 |
BRILIA mouse | 1st | 5 | 976 | 41,062 | 410 | 500 | 0.98 | 0.70 |
BRILIA mouse | 2nd | 10 | 1,976 | 39,518 | 443 | 1,011 | 0.97 | 0.82 |
BRILIA mouse | 3rd | 15 | 2,746 | 37,891 | 527 | 1,784 | 0.95 | 0.84 |
BRILIA mouse | 4th | 20 | 3,531 | 36,427 | 421 | 2,569 | 0.93 | 0.89 |
BRILIA mouse | 5th | 25 | 4,579 | 34,951 | 374 | 3,044 | 0.92 | 0.92 |
VQUEST + JA mouse | Germline | 0 | 0 | 38,602 | 209 | 0 | 0.99 | 0.00 |
VQUEST + JA mouse | 1st | 5 | 743 | 36,843 | 459 | 592 | 0.97 | 0.62 |
VQUEST + JA mouse | 2nd | 10 | 1,407 | 35,471 | 522 | 1,315 | 0.95 | 0.73 |
VQUEST + JA mouse | 3rd | 15 | 1,864 | 33,978 | 639 | 2,270 | 0.92 | 0.74 |
VQUEST + JA mouse | 4th | 20 | 2,176 | 32,294 | 680 | 3,328 | 0.90 | 0.76 |
VQUEST + JA mouse | 5th | 25 | 2,298 | 30,090 | 739 | 4,346 | 0.86 | 0.76 |
Accuracy (ACC) is computed as (TP + TN)/(TP + TN + FP + FN), where T = true, F = false, P = positive, and N = negative. PPR is computed as TP/(TP + FP). Before the SHM accuracy statistics are determined, the nucleotides in the CDR3 are pooled by sequence lineage generation (e.g., 1st descendants of the germline B cells). The total nucleotide count differs among algorithms because their annotation coverage differs.
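
As a quick check of the two formulas, the following minimal sketch recomputes ACC and PPR from the four counts in a single table row. The function name `shm_stats` and the choice to return NaN when a denominator is zero are illustrative assumptions, not part of the original analysis; the counts are those of the BRILIA human 1st-generation row.

```python
import math


def shm_stats(tp, tn, fp, fn):
    """Return (ACC, PPR) for one row of confusion counts.

    ACC = (TP + TN) / (TP + TN + FP + FN)
    PPR = TP / (TP + FP)
    Returning NaN for a zero denominator is an assumption made here.
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else math.nan
    ppr = tp / (tp + fp) if (tp + fp) else math.nan
    return acc, ppr


# Counts from the BRILIA human, 1st-generation row of the table.
acc, ppr = shm_stats(tp=1230, tn=50072, fp=629, fn=644)
print(f"ACC = {acc:.2f}, PPR = {ppr:.2f}")  # prints "ACC = 0.98, PPR = 0.66"
```

The germline rows have TP = 0 and FP > 0, so their PPR is 0.00 while ACC stays high, which is why the two columns should be read together.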