TABLE 1.
Family/group^a | CL and/or Anabaena sp. PCC 7120 CL members^c | Other members of the CL^d | L end (excluding variants)^e | Inverted R end (excluding variants)^f | E value of IRs^g | Typical DR (no. of bp) | Tentative target and IS insertion point (/)^i | Basis for identification of ends^j | Supplemental figure reference |
---|---|---|---|---|---|---|---|---|---|
IS4/IS50 | IS(alr1332), 5 MITEs | 4 Sy7002 | CTACG GTGTACACACAAGTC CAAGT AAAGT | CTACC GTGTACACACAAGTA CTATC TGCAA | 5E−9 | 10 or 11 | GC-rich | A, IRs, DRs | S1 CL 1 |
IS4/IS4Sa | IS(alr5204) | 2 Av, 7 Np | CAGAA GTGTTGAATG TTAAG AAAAA GATGA GAAAA ATAGC | CAAAA ATGTTGAAAG CTGAT ACAAA ATTTT ACATA GTTAG | 0.058 | Imperfect 9 | AT-rich | A, IRs | S1 CL 2 |
IS4/ISPepr1 | IS(all7115) | 7 Np | CAATACCTTA GCCAAAATAA GAGCA TAAAG AGGTA GGGCG | CAATACCTCT GCCAAATTACGAGGG TTCAA CACCC GTAGA | 0.015 | 6 | Usually AT-rich | A, IRs, DRs, Table 2 | S1 CL 3 |
IS5/IS1031 | CL 1: IS(all0016-15), IS(all2693-92), IS(alr3610-11), IS(all4400, all4399), IS(alr4438-39), IS(all4817-16), IS(alr5157-58), IS(all7002, all7001) | | GAGG CTATTTATAAAGTAAATCTA AAGGA GAGCT ATCAG | GAGA CCATTTATAAAGTAAATCTT TAGAC GACTA GACGA | 6E−8 | 3 | TWA (one TCA) | A, IRs, DRs, Table 2 | S2 CL 1 |
IS5/IS1031 | CL 2: IS(alr7025) | Am, 3 Np | GAGG GTGTTTGAAAAGTAG GGGAT GTTGT AAAAA AACTC CCTCG GTATA | GAGG ATGTT TGAAAAGTTA TAGGG AGTCA AAATT AAGCC AATCG CTTCA | 6E−6 | 3 | TWA | A, IRs, DRs | S2 CL 2 |
IS110/(−) | IS1594: IS(alr0249), IS(all0306), IS(all0732), IS(all1099), IS(alr1212), IS(all1986), IS(all2065), IS(alr3571), IS(alr3636), IS(all3682), IS(all3734), IS(all4756) | Np | TGTAT ATTAA AAGAA GTGGT AGACC GTCGC | AGCGA CTGTC TTGAA AGTCA AGCGA TCGTT | NSS | 0 | CCT/AC, CC/TAC, or C/CTAC | A, K (rRNA), Fig. 1B | S3 |
IS200-IS605/IS1341 | IS891: IS(all3986), IS(alr4104), IS(all5207), IS(alr7228), IS(alr7231), IS(all8010) | Ns, 3 Lyn | GAGCC GTGAA GCGTA AAGCC CCCGT ATTTT | TTGAC ATCCT CCCCC GTTTA GAAAA CGGGG | 0.032 | 0 | TTAC/ | K (see text), TGTCAA at R terminus, Table 2 | S4 CL 1 |
IS200-IS605/IS1341 | IS891-related CL 2: IS(all0315-14), IS(alr1157), IS(all4465), IS(alr7325) | Av | CAAGA AACTG GGTCT AAAGC CCCGT CCTTG | TTGAC ACTCT CCGCC CTATA AGTGC GGAGA | NSS | 0 | TTAC/ | A, TGTCAA at R terminus, Table 2 | S4 CL 2 |
IS200-IS605/IS1341 | IS891-related CL 3: IS(all2167, alr2168), partial IS(all1608) | 9 Te | CAAAA GAATG GGATA CAAGC CCCGT CGTTC TAGGA CGGCT | TTGAC ATACT CACCG ACCTA AAGGT GCGGT GATTC TTGAC | NSS | 0 | TGAC/ | A | S4 CL 3 |
IS200-IS605/IS1341 | IS891-related CL 4: IS(alr1531) (left end unclear) | Av, 7 Te | TGGTA AAATG TGAGG TATGGAAAAA GCCTA CCGCT ACCGA | TTGAC ATCCT CACCG CCCTG AAAGT GCGGT GATTC CTAAG | NSS | Unclear | Unclear | A, TGTCAA at R terminus, L terminus unclear | S4 CL 4 |
IS200-IS605/IS1341, (−) | IS891-related CL 5: IS(all7148, alr7149), IS(all7008, alr7009) | Av, Cy7424 | AGTTT CTCAA AAATA TATTG ATGTT AGACG | TTGAC ACTCT CGCCG CTAAC CGCAA AGCAG | NSS | 0 | TTAC? | A, TGTCAA at R terminus | S4 CL 5 |
IS200-IS605/IS608 | IS891-related CL 6: IS(all3371, alr3372), IS(all7085, alr7086) (approximately, respectively, ISNsp2 and ISNsp3 of ISfinder) | 2 Np | GAGTC GTGAT GCGTA AAGCC CCCAA TTATG | TTGAG CCACT CCCCC GTTTT GAAAA CGGGG | NSS | 0 | TTAC | A, TCAA at R terminus | S4 CL 6 |
IS200-IS605^b | IS(alr1015), IS(alr4734) | Cy8801, Cy0110, Mc; ISLjo5_ a1 | GTAGG GTGGG CAATG CCCACCAAAA ATATT ATGTA AAAAT | GTAGG GTGGG CATTG CCCACCAATT ATCTC ATTAT GTAGT | 4E−9 | 3 | | A, IRs | S4 CL 7 |
IS630/(−) | IS895: IS(alr0552-53), IS(alr1726-27), IS(alr1853-54), IS(all1972-71), IS(all2067-66), IS(alr2773-74), IS(alr4628), IS(all4868-67) | | TAGGAATCCT ATTTGATTTG TGAAC AAGAC CAAGA | TAGGAATCCT ATTTGATTTT TGAAT AAGTT CCGTA | 8E−10 | 2 | TA | A, IRs | S5 CL 1 |
IS630/(−) | IS895-related CL 2: IS(asl1992), IS(asr3082) (diverges from others close to R end) | Av, Np, 2 Cw, 4 Am | ACCAA TTTAAATTAG AGACAGGGCAGATGA GGTAA | ACCAA ATTAAATGGT GTTTAGGGCAGATAG GGCAT | 0.003 | 2 | ATAT | A, IRs | S5 CL 2 |
IS630/(−) | IS895-related CL 3: IS(alr0018-19), IS(all0363-62), IS(alr1858-59) | Cw, Np, Cy7425 | TAGCGTTTAC CAGTA TAATGAAGTACACTAATTAA AATAA | TAGCGTTTCT CAGTC TGGTGAAGTACAGTAACAAG AATGG | 0.004 | 2 | TA | A, IRs, Table 2 | S5 CL 3 |
IS630/(−) | IS895-related CL 4: IS(alr5227-28) | Av, 7 Ns | AGTAGGTAGG CACGAAAAAA CCAAA CTATGTGAAGATAAG TAAAG ACGGAGAATA AATCT | AGTAGGTGGG TGGGAAAAGT CCCAA GTATGTAACGAAACA TTAAG TAATAGAATT AGAGT | 0.011 | 2 | TA | A, IRs | S5 CL 4 |
IS630/(−) | IS895-related CL 5: IS(all7564-63) | 17 Cw | GTACA GGTCGGCGTAAATAA ACAGACCATA | GTACA CCTCGGCGTAAATCA GCAGACCATT | 2E−6 | 2 | CTAG | A, IRs, Table 2 (unchanged reading frame) | S5 CL 5 |
IS892 (unclassified) [aka IS(all7268)]^k | IS892: IS(all7005-04), IS(all7106-05), IS(all7112-11), IS(all7178-77), IS(all7303-02), IS(alr7323), IS(all7376-75), IS(alr8510), IS(alr8566, asr8501, alr8502) | | CTAGCGTGGC AAAACTTACT AGAGA GGAGC AGAGA TCCTG | CTAGCGTGGCAAAACTTACTAGAGA AGACG ACTCT CTAGA | 2E−12 | 8^h | AT-rich | A, IRs, DRs, Table 2 | S6 CL 1 |
IS982/(−) | CL 1: IS(alr0999), IS(all2664), IS(alr2683), IS(alr2694), IS(alr3384), IS(all3624) ISNsp1 [aka IS(alr1569)] | | ACGTG ATGTG CGACTTATTG TTTCGTTACA CAATT GAGGT | ACGCC ATGTGCGACTTAATA TTCTGTAACA AGATC GTCGA | 6E−5 | mostly 6 | AT-rich | A, IRs, DRs, Table 2 | S7 CL 1 |
IS982/(−) | CL 2: IS(asl0588), IS(alr0590) | 2 Gvi | ACGTGAGTTC GACGG GTTAA TTTAG GTGAA | ACGTGAGTTCGACGA ACTAA AAAAC AGCAG | 2E−6 | 8 or less | AT-rich | A, IRs, DRs | S7 CL 2 |
IS982/(−) | CL 3: IS(all8559), IS(alr4082), IS(asr7385) (a fragment) | ISRmsp1 | ATTTA GGGTT TGTGC GAGCC AACTA TTTGA | TACGC CTTAT GTGAA TTAAG CCGGA GGATG | NSS | 1? | AT-rich | A, Table 2 | S7 CL 3 |
ISAzo13/(−) | CL 1: IS(alr7562), IS(all8069) | ISStau6 | GAGAACTGCACAGAA TGATTGATCC TATGA TCAGA GAAAG | GAGAACTCCACAAAA AAGATGATCC AATAG CTATG CTGGT | 0.015 | 3 | AT-rich | A, IRs, DRs | S8 CL 1 |
ISAzo13/(−) | CL 2: ISNsp4 [aka IS(alr8019)], IS(all2145) (truncated at its R end); IS(asr7385) is an R-end fragment | 8 Np | AGGCA TCATGTAAAAATAAC TTGAACGATT TACCG AATAG TTAGA | AGGAG TTATGTAAAAATAAC CTGAACAATT AAGTG CCTAC TTTGG | 2E−6 | 3 | TWA in AT-rich region | IRs, Table 2 | S8 CL 2 |
ISL3/(−) | CL 1: IS(alr1609), IS(alr2698), IS(all7161), IS(alr7305) (approximating ISAsp1), IS(alr7349) (truncated), IS(alr7350) | 3 Np | GGTTCTTTCG GATATTTTATGGAGAAAGCA AAAAG TAATG AAAATTAATG | GGTTCTTGCG CCTGTTTTATGGAGAATTAA TACTA AAGTG CCAGTTTAAT | E−4 | Up to 8 perfect, often imperfect | AT-rich | A, IRs, DRs | S9 CL 1 |
ISL3/(−) | CL 2: IS(alr7386′, alr7003′, asr7006; alr7007), IS(alr8016-17) | 2 Np | GGTTCTTGGCAACTTTTGGTGATCTTGGTT GGGGAAAGGCAGAAGGCAGG AGTCAGAAAG ATATA ATTGA | GGTTCTTGGCAACTTTTGGTGATCTTGGTTGGGGAAAGGCAGAGGGCAGA GGGCAGAGGG CAGAA GGCAG | 5E−23 | Up to 8 perfect, often imperfect | AT-rich | A, IRs, DRs | S9 CL 2 |
^a ISs bearing the following transposase ORFs cannot (yet) be excised computationally: IS5 family, all2152; IS200/IS605 family, alr1531, alr1685, alr2719, all4675, all5207 (see Fig. S4 in the supplemental material, cluster [CL] 1), all7008 and alr7009 (Fig. 4E and F), alr7153 (perhaps in the IS607 family), all7158 (closely related to the IS891 transposase gene), all7245, asl7246, alr7329, all8070, alr8071; and unclassified by ISfinder: alr1015 and alr4734 (see the text and Fig. S4, CL 7); IS481 family, all3630; IS607 family, asr7146, alr7147, asr7152; IS630 family, asl1657, alr1926, asr3082 (see Fig. S5, CL 3); IS982 family, asl0588 (very short, but retains IR and DR; see Fig. S7, CL 2); asr7385 (fragment), alr4082, and all8559 (see Fig. S7, CL 3); IS1182 family, alr9024; ISAs1 family, all8064, all8065 (see the text); ISAzo13 family, all2145 (see Fig. S8, CL 2), asr7385; ISH3 family, all7244; not classified: alr7163 (see the text). −, not assigned by ISfinder.
^b Nunvar et al. (22).
^c Cluster (CL) identifications are in boldface.
^d The number of ORFs found in non-Anabaena strains that bear members of a cluster is given. Strains are abbreviated as follows: Am, Acaryochloris marina MBIC 11017; Acma, A. marina Acma49; Av, Anabaena variabilis ATCC 29413; Cy7425, Cyanothece sp. strain PCC 7425; Cw, Crocosphaera watsonii WH8501; Cy0110, Cy7424, and Cy8801, Cyanothece sp. strains PCC 0110, PCC 7424, and PCC 8801, respectively; Gvi, Gloeobacter violaceus strain PCC 7421; Lyn, Lyngbya sp. strain PCC 8106; Mc, Microcoleus chthonoplastes strain PCC 7420; Np, Nostoc punctiforme strain PCC 73102; Ns, Nodularia spumigena CCY9414; Sy7002, Synechococcus sp. strain PCC 7002; Te, Thermosynechococcus elongatus. For other ISs, see ISfinder.
^e Identities to the inverted R end are underlined. Boldface indicates palindromic sequence.
^f Identities to the L end are underlined. Boldface indicates palindromic sequence.
^g NSS, not significantly similar.
Duplicated sequence is underlined.
^j A, alignment; K, in known sequence.
^k aka, also known as.
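As an illustration of how the "L end" and "Inverted R end" columns can be read against each other, the following minimal Python sketch counts position-by-position identities between the two sequences of one row. It assumes that "inverted" means the reverse complement of the right terminus, which is consistent with the IS891 rows, where the inverted R end begins TTGACA and the basis column reports TGTCAA at the R terminus. The function names are hypothetical, the comparison is ungapped, and it does not reproduce the alignment method or the E values reported in the table.

```python
# Illustrative sketch only: helper names are hypothetical and not from the source.
# Assumption: the "Inverted R end" column holds the reverse complement of the
# right terminus, so it can be compared directly with the L end.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove the spacing used in the table cells and normalize case."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


def positional_identities(l_end: str, inverted_r_end: str) -> tuple[int, int]:
    """Count ungapped, position-by-position identities.

    Returns (number of matching positions, number of positions compared).
    Only the shared length of the two sequences is compared.
    """
    a, b = clean(l_end), clean(inverted_r_end)
    compared = min(len(a), len(b))
    matches = sum(x == y for x, y in zip(a, b))
    return matches, compared


# Values copied from the IS895 row (IS630 family, S5 CL 1) of the table.
IS895_L = "TAGGAATCCT ATTTGATTTG TGAAC AAGAC CAAGA"
IS895_INV_R = "TAGGAATCCT ATTTGATTTT TGAAT AAGTT CCGTA"

if __name__ == "__main__":
    matches, compared = positional_identities(IS895_L, IS895_INV_R)
    print(f"IS895: {matches} of {compared} positions identical")
    # Recover the actual right-terminal hexamer from an inverted R end prefix.
    print(reverse_complement("TTGACA"))
```

An ungapped count is the simplest possible comparison; rows whose ends differ by insertions or deletions would need a gapped alignment to give a meaningful figure.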