Table 2.
chr | HORmonᵃ centromere | HiCATᵇ centromere | SRF (k=171) chromosomeᶜ | SRF/171 assemblyᵈ | SRF/171 HiFi readsᵈ | SRF/101 Illuminaᵈ | SRF (k=171) HPRC assemblyᵉ |
---|---|---|---|---|---|---|---|
1 | 6 | 2 | 6 (4.2); 11 (0.5) | 6 (2.0) | 6 (3.5) | 2 (2.2) | 6 [89] |
2 | 4 | 4 | 4 (2.3) | 4 (2.3) | 4 (2.2) | 4 (2.2) | 4 [94] |
3 | 17 | 17 | 17 (1.4) | 17 (1.4) | 17 (1.4) | 17 (1.4) | 17 [94] |
4 | 19 | 19 | 19 (3.5) | 19 (3.5) | 19 (2.9) | 19 (3.4) | 19 [94] |
5 | 6 | 12 | 8 (2.5) | 8 (1.8) | 4 (1.9) | missing | 8 [43]; 4 [37] |
6 | 18 | 18 | 18 (2.0) | 18 (2.0) | 18 (2.0) | 18 (2.0) | 18 [93] |
7 | 6 | 6 | 6 (3.3) | 6 (3.2) | 6 (3.2) | 6 (3.2) | 6 [92]; 12 [2] |
8 | 11 | 15 | 7 (1.1) | 7 (1.1) | 7 (1.0) | 11 (1.0) | 7 [61]; 8 [33] |
9 | 7 | 11 | 4 (1.8) | 4 (1.4) | 11 (2.0) | 4 (1.7) | 4 [77]; 11 [17] |
10 | 8 | 6 | 8 (2.1) | 8 (2.1) | 8 (1.7) | 8 (2.1) | 6 [66]; 8 [28] |
11 | 5 | 5 | 5 (3.4) | 5 (3.3) | 5 (3.4) | 5 (3.4) | 5 [94] |
12 | 8 | 8 | 8 (2.6) | 8 (2.6) | 8 (2.6) | 8 (2.6) | 8 [94] |
13 | 11 | 7 | 4 (0.4) | 4 (0.4) | 7 (1.5) | 7 (1.5) | 4 [55]; 11 [23]; 7 [16] |
14 | 8 | 8 | 8 (2.6) | missing | missing | missing | missing |
15 | 11 | 15 | 11 (0.8); 20 (0.5) | 11 (0.8) | 11 (0.8) | 11 (0.8) | 11 [94] |
16 | 10 | 10 | 10 (2.0) | 10 (1.9) | 10 (1.9) | missing | 10 [94] |
17 | 16 | 14 | 16 (3.3) | 16 (3.3) | 16 (3.5) | 16 (3.5) | 16 [56]; 13 [38] |
18 | 12 | 12 | 8 (3.6) | 8 (3.8) | 12 (4.9) | missing | 12 [66]; 8 [19] |
19 | 2 | 2 | 4 (0.4); 2 (0.4) | missing | 13 (0.5) | missing | 13 [29]; 32 [4] |
20 | 16 | 16 | 16 (2.1) | 16 (2.1) | 16 (2.1) | 8 (0.5) | 16 [94] |
21 | 11 | 11 | 11 (0.3) | missing | missing | missing | missing |
22 | 8 | 8 | 8 (2.9); 20 (0.5) | 8 (2.8) | 8 (2.6) | 8 (2.9) | 8 [94] |
X | 12 | 12 | 12 (3.1) | 12 (3.1) | 12 (3.1) | 12 (3.1) | 12 [76] |
Y | 34 | No Y | 34 (0.3) | 34 (0.3) | No Y | No Y | 34 [18] |
ᵃ αHOR lengths, in monomer units, in the CHM13 v2.0 genome, retrieved from Kunyavskaya et al. (2022).
ᵇ Length of the “top 1” αHOR on each chromosome, retrieved from Gao et al. (2022). Both HORmon and HiCAT are applicable to extracted centromeric sequences only.
ᶜ SRF applied to each CHM13 chromosome separately. In the format “m (L)”, m denotes the length of an HOR in monomer units and L is its span on the CHM13 assembly in megabases.
ᵈ SRF applied to the CHM13 assembly, PacBio High-Fidelity (HiFi) reads and Illumina short reads, respectively; k=101 was used for the Illumina reads. The CHM13 reads do not contain chrY.
ᵉ SRF applied to 94 phased haploid assemblies produced by the Human Pangenome Reference Consortium (HPRC). In the format “m [n]”, m is the HOR length in monomer units and n is the number of samples with the HOR according to manual inspection.
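
The two cell formats described in the notes, “m (L)” and “m [n]”, can be read programmatically. The following is a minimal Python sketch, not part of SRF or of the original analysis; the names `HorCall` and `parse_cell` are hypothetical, and the sketch assumes that multiple HORs in one cell are separated by semicolons and that “missing” and “No Y” mean no call, as in the table above.

```python
import re
from typing import List, NamedTuple, Optional


class HorCall(NamedTuple):
    """One HOR entry from a table cell (hypothetical helper type)."""
    monomers: int              # m: HOR length in monomer units
    span_mb: Optional[float]   # L from "m (L)": span in megabases, if given
    samples: Optional[int]     # n from "m [n]": number of samples, if given


# Matches "m", "m (L)" or "m [n]", e.g. "6", "6 (4.2)" or "6 [89]".
_ENTRY = re.compile(r"^(\d+)(?:\s*\(([\d.]+)\)|\s*\[(\d+)\])?$")

# Cell values that indicate no HOR call (assumption based on the table).
_NO_CALL = {"", "missing", "No Y"}


def parse_cell(cell: str) -> List[HorCall]:
    """Parse one table cell into a list of HOR calls."""
    cell = cell.strip()
    if cell in _NO_CALL:
        return []
    calls = []
    for part in cell.split(";"):
        match = _ENTRY.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognized table entry: {part!r}")
        m, span, n = match.groups()
        calls.append(
            HorCall(
                monomers=int(m),
                span_mb=float(span) if span is not None else None,
                samples=int(n) if n is not None else None,
            )
        )
    return calls
```

For example, `parse_cell("6 (4.2); 11 (0.5)")` yields two calls with monomer lengths 6 and 11, while `parse_cell("missing")` yields an empty list, so rows where a method reports nothing can be handled uniformly when comparing columns.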