Author manuscript; available in PMC: 2024 Mar 1.
Published in final edited form as: Nature. 2023 Aug 23;621(7978):344–354. doi: 10.1038/s41586-023-06457-y

Table 1 | Comparison of GRCh38-Y and T2T-Y.

Metric                          GRCh38-Y                T2T-Y               %Δ

Assembly
  Total bases                   57,264,655              62,460,029          +9.1
  Assigned bases                57,227,415              62,460,029          +9.1
  Unlocalized bases             37,240                  0
  Num. gaps                     56                      0
  Num. N-bases                  30,812,366              0

Annotation
  Num. genes                    589                     693                 +17.7
    Protein coding              66                      106                 +60.6
  Num. additional genes         6                       110
    Protein coding              1                       41
  Num. transcripts              681                     883                 +29.7
    Protein coding              372                     488                 +31.2
  Num. additional transcripts   4                       206
    Protein coding              4                       120

Ampliconic gene copy numbers
  BPY2                          4 (3, 0)                4 (3, 0)            0
  CDY                           26 (4, 0)               26 (4, 0)           0
  DAZ                           4 (4, 0)                4 (4, 0)            0
  HSFY                          8 (2, 0)                8 (2, 0)            0
  PRY                           8 (2, 0)                8 (2, 0)            0
  RBMY                          32 (6, 4)               34 (6, 4)           +3.3
  TSPY                          25 (7, 0)               66 (45, 0)          +164.0
  VCY                           2 (2, 0)                2 (2, 0)            0
  XKRY                          8 (0, 2)                8 (0, 2)            0

Haplogroup
  Haplogroup                    R-L20 (R1b1a2a1a2b1a1)  J-L816 (J1a2b3a1)
  Ancestry                      European                Ashkenazi Jewish

Repetitive bases
  SINE                          2,625,350               4,385,917           +67.1
  Retroposon                    18,506                  18,500              −0.0
  LINE                          6,378,323               6,456,888           +1.2
  LTR                           4,604,368               4,613,537           +0.2
  DNA/Rolling-circle            2,626,425               4,387,030           +67.0
  Satellite                     1,578,773               14,522,636          +819.9
  Simple repeat                 1,124,311               21,568,381          +1,818.4
  Other                         705,062                 972,612             +37.9

All repeat classes              17,501,283              53,004,524          +202.9
% repetitive (non-N)            66.3                    84.9                +28.1

Accessible with short-reads     13,785,359              14,363,623          +4.2

Annotation statistics for GRCh38-Y are taken from the RefSeq (v110) annotation, and T2T-Y statistics are taken from a lifted and curated combination of RefSeq (v110) and GENCODE (v35) annotations. Num. additional genes/transcripts are those found exclusively in one assembly compared to the other. Ampliconic gene copy numbers are shown as X(Y,Z) where X = total number of annotated genes; Y = protein-coding genes; and Z = transcribed pseudogenes. %Δ is the percent change from GRCh38-Y to T2T-Y. Blank spaces indicate not applicable.
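
The %Δ column and the X (Y, Z) copy-number notation described in the footnote can be reproduced with a few lines of Python. This is a minimal sketch, not code from the paper: the helper names `percent_change` and `parse_copy_number` are hypothetical, and rounding to one decimal place is an assumption inferred from the values shown in the table.

```python
import re


def percent_change(old: float, new: float) -> float:
    """Percent change from `old` (GRCh38-Y) to `new` (T2T-Y).

    Rounding to one decimal place is an assumption based on the table.
    """
    return round((new - old) / old * 100, 1)


def parse_copy_number(cell: str) -> dict:
    """Parse an ampliconic copy-number cell written as 'X (Y, Z)'.

    X = total annotated genes, Y = protein-coding genes,
    Z = transcribed pseudogenes (as defined in the table footnote).
    """
    match = re.fullmatch(r"\s*(\d+)\s*\(\s*(\d+)\s*,\s*(\d+)\s*\)\s*", cell)
    if match is None:
        raise ValueError(f"not in 'X (Y, Z)' form: {cell!r}")
    total, coding, pseudogenes = (int(g) for g in match.groups())
    return {
        "total": total,
        "protein_coding": coding,
        "transcribed_pseudogenes": pseudogenes,
    }


# Total bases row: 57,264,655 -> 62,460,029
print(percent_change(57_264_655, 62_460_029))  # 9.1

# TSPY row: %Δ is computed on the total annotated gene count (X)
grch38_tspy = parse_copy_number("25 (7, 0)")
t2t_tspy = parse_copy_number("66 (45, 0)")
print(percent_change(grch38_tspy["total"], t2t_tspy["total"]))  # 164.0
```

Only the first number in each copy-number cell enters the percent change; the protein-coding and transcribed-pseudogene counts in parentheses are reported for reference.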