2020 Jan 31;36(9):2731–2739. doi: 10.1093/bioinformatics/btaa065

Fig. 1.

Length distribution of proteins in datasets relevant to the comparison of HHsearch and LAMPA. The plot depicts protein sizes in six datasets, labeled A to F, that are used or cited in this study. (A) 6271 SCOP domains used for HHsearch training (range: 21–1504 aa); (B) 2985 RefSeq virus polyproteins (range: 1001–8572 aa); (C) 431 RefSeq virus polyproteins that include 507 regions exclusively annotated by LAMPA (range: 1039–8572 aa); (D) 507 hit regions generated by LAMPA from 431 RefSeq polyproteins (range: 88–2172 aa); (E) 507 domains tentatively demarcated around LAMPA hits (range: 164–732 aa); and (F) 41 designed sizes of each of three proteins, 123 in total, tested in computational experiments (range: 10–100,000 aa).