Table 1.
Analysis of Alpha Satellite Sequences Found in High Copy Number in the CPO XmnI Monomer Data Set
Id | Sequence | Number | Forward (%) |
---|---|---|---|
1 | Consensus C1 | 2983 | 46 |
2 | C158G | 848 | 48 |
3 | C116T | 568 | 41 |
4 | C114Del | 508 | 1* |
5 | C137A-CC149AA | 455 | 34 |
6 | T101Del | 323 | 98* |
7 | C2A-G17Del | 250 | 66 |
8 | C2A-G17Del-C158G | 208 | 70 |
9 | C114Del-C158G | 145 | 0* |
10 | A3741T-G64A-C158G | 136 | 15 |
11 | A40C-C42G | 116 | 44 |
12 | C116T-C158G | 112 | 46 |
13 | C2A | 103 | 73 |
14 | T121A | 100 | 43 |
15 | C137A-C158G | 100 | 51 |
16 | A3741T-G64A | 100 | 24 |
17 | C137A-CC149AA-C114Del | 89 | 1* |
18 | C2A-G17Del-C114Del | 81 | 1* |
19 | T38G | 77 | 29 |
20 | A110G | 76 | 56 |
21 | A86T | 74 | 39 |
22 | T80Del-T101Del | 67 | 100* |
23 | A41G | 65 | 38 |
24 | T101Del-C158G | 62 | 98* |
25 | G17Del | 59 | 47 |
26 | G17C | 58 | 54 |
27 | C144A | 57 | 46 |
28 | C2A-C158G | 54 | 54 |
29 | C137A | 54 | 65 |
30 | A40C-C42G-G28T | 53 | 49 |
The sequences are named according to the “Id” column. The “Sequence” column indicates how each sequence variant differs from the consensus sequence of the C1 family, using standard notation. The “Number” column gives the number of identical copies of the sequence in the monomer data set. The “Forward (%)” column gives the percentage of reads obtained in the forward orientation (i.e., the orientation of our reference sequence). A strong bias in read orientation reveals an artifactual sequence; such values are marked with an asterisk.
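
The orientation check described in the legend can be sketched as follows. This is a minimal illustration, not the authors' procedure: the legend does not state the criterion used, so the cutoffs of 2% and 98% are assumptions chosen only because they reproduce the asterisks in Table 1. The names `Variant` and `is_orientation_biased` are hypothetical, and only a few rows of the table are copied in.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    name: str           # "Sequence" column
    copies: int         # "Number" column
    forward_pct: float  # "Forward (%)" column


def is_orientation_biased(forward_pct: float,
                          low: float = 2.0,
                          high: float = 98.0) -> bool:
    """Return True if almost all reads share one orientation.

    The cutoffs are assumptions, not values stated in the source.
    """
    return forward_pct <= low or forward_pct >= high


# A few rows copied from Table 1.
rows = [
    Variant("C158G", 848, 48),
    Variant("C114Del", 508, 1),
    Variant("T101Del", 323, 98),
    Variant("A110G", 76, 56),
    Variant("T80Del-T101Del", 67, 100),
]

# Names of the variants that would carry an asterisk under these cutoffs.
flagged = [v.name for v in rows if is_orientation_biased(v.forward_pct)]
print(flagged)
```

Under these assumed cutoffs, moderately skewed variants in the table (for example 15% or 24% forward) are not flagged, which matches the absence of an asterisk on those rows.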