2020 Dec 8;49(4):e20. doi: 10.1093/nar/gkaa1158

Table 3.

Spurious arrays in the CRISPRCasFinder datasets (27). To detect arrays that are likely false-positive candidates, we first filtered out arrays that are not covered by the parameter settings of the different tools (i.e. ≤3 repeats, or repeat length <23 or >55). The number of remaining arrays is listed in the row "Standard arrays". Arrays were labeled as likely negative (see Methods) if they had at least one of the following four defects: 1) damaged repeats, 2) high similarity between spacers, 3) uneven distribution of spacer lengths and 4) high similarity between repeats and spacers. The following rows list how many of these likely negative arrays had each specific deficiency (see text for details). Note that some arrays have several defects and are therefore counted in more than one of these rows. The FDR is calculated by treating all likely negatives as true negatives, as we do not have other information. These FDR values are, however, likely overestimates. Nevertheless, they indicate the scale of the problem.

                                      CRT      CRISPRCasFinder   Detect
Arrays declared positive              1866     3263              764
Standard arrays                       1358     1395              346
Likely negative arrays                602      628               110
  of which with damaged repeats       165      189               40
  with similar spacers                57       0                 13
  with uneven spacer lengths          473      508               83
  with repeats similar to spacers     23       4                 11
FDR                                   44.3%    45.0%             31.8%
Likely positive arrays                756      767               236
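
The caption does not spell out the FDR formula. The reported values are consistent with dividing the number of likely negative arrays by the number of standard arrays, and the likely positive counts match the difference between the two. The minimal Python sketch below checks this reading against the table. The dictionary layout and the function name `fdr_percent` are illustrative assumptions, not part of the original analysis.

```python
# Counts copied from Table 3. The data layout and names are illustrative.
TABLE3 = {
    "CRT": {"standard": 1358, "likely_negative": 602},
    "CRISPRCasFinder": {"standard": 1395, "likely_negative": 628},
    "Detect": {"standard": 346, "likely_negative": 110},
}


def fdr_percent(standard, likely_negative):
    """Share of standard arrays labeled likely negative, in percent.

    Assumption: this ratio is inferred from the table values; the caption
    only says that likely negatives are treated as true negatives.
    """
    return 100.0 * likely_negative / standard


for tool, counts in TABLE3.items():
    fdr = fdr_percent(counts["standard"], counts["likely_negative"])
    likely_positive = counts["standard"] - counts["likely_negative"]
    print(f"{tool}: FDR {fdr:.1f}%, likely positive arrays {likely_positive}")
```

Because an array can carry several defects, the four defect rows are not expected to sum to the likely negative count; for CRT, for example, they sum to 718 against 602 likely negative arrays.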