Author manuscript; available in PMC: 2022 Jun 10.
Published in final edited form as: Science. 2022 Mar 31;376(6588):44–53. doi: 10.1126/science.abj6987

Table 1.

Comparison of GRCh38 and T2T-CHM13v1.1 human genome assemblies.

Summary GRCh38 T2T-CHM13 ±%
Assembled bases (Gbp) 2.92 3.05 +4.5%
Unplaced bases (Mbp) 11.42 0 −100.0%
Gap bases (Mbp) 120.31 0 −100.0%
# Contigs 949 24 −97.5%
Ctg NG50 (Mbp) 56.41 154.26 +173.5%
# Issues 230 46 −80.0%
Issues (Mbp) 230.43 8.18 −96.5%

Gene Annotation

# Genes 60,090 63,494 +5.7%
   protein coding 19,890 19,969 +0.4%
# Exclusive genes 263 3,604
   protein coding 63 140
# Transcripts 228,597 233,615 +2.2%
   protein coding 84,277 86,245 +2.3%
# Exclusive transcripts 1,708 6,693
   protein coding 829 2,780

Segmental duplications (SDs)

% SDs 5.00% 6.61%
SD bases (Mbp) 151.71 201.93 +33.1%
# SDs 24,097 41,528 +72.3%

RepeatMasker

% Repeats 51.89% 53.94%
Repeat bases (Mbp) 1,516.37 1,647.81 +8.7%
   LINE 626.33 631.64 +0.8%
   SINE 386.48 390.27 +1.0%
   LTR 267.52 269.91 +0.9%
   Satellite 76.51 150.42 +96.6%
   DNA 108.53 109.35 +0.8%
   Simple repeat 36.5 77.69 +112.9%
   Low complexity 6.16 6.44 +4.6%
   Retroposon 4.51 4.65 +3.3%
   rRNA 0.21 1.71 +730.4%

GRCh38 summary statistics exclude “alts” (110 Mbp), patches (63 Mbp), and Chromosome Y (58 Mbp). Assembled bases: all non-N bases. Unplaced bases: bases not assigned to, or not positioned within, a chromosome. # Contigs: GRCh38 scaffolds were split at three consecutive Ns to obtain contigs. NG50: the contig length such that contigs of this length or greater contain half of the 3.05 Gbp human genome size. # Exclusive genes/transcripts: for GRCh38, GENCODE genes/transcripts not found in CHM13; for CHM13, extra putative paralogs that are not in GENCODE. Segmental duplication analysis is from (42). RepeatMasker analysis is from (49).
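
The contig and NG50 definitions in the footnote can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `split_scaffold` and `ng50` are hypothetical, the sketch assumes that "three consecutive Ns" means a run of three or more Ns, and shorter N runs are left inside the contig. The genome size passed to `ng50` would be the 3.05 Gbp stated in the footnote; the usage lines below use toy values only.

```python
import re


def split_scaffold(seq, min_n_run=3):
    """Split a scaffold into contigs at runs of >= min_n_run consecutive Ns.

    Assumption: runs shorter than min_n_run stay inside the contig.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_n_run)
    # Drop empty pieces produced by leading or trailing N runs.
    return [piece for piece in pattern.split(seq) if piece]


def ng50(contig_lengths, genome_size):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of genome_size.

    Returns None if the contigs never reach half of genome_size.
    """
    half = genome_size / 2
    running_total = 0
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= half:
            return length
    return None


# Toy usage (illustrative values, not data from the table).
contigs = split_scaffold("ACGTNNNACGNNGT")
print(contigs)                             # ['ACGT', 'ACGNNGT']
print(ng50([50, 40, 30, 20, 10], 200))     # 30
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses a fixed genome size, so the same reference size (here 3.05 Gbp) applies to both assemblies and the two values are directly comparable.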