Author manuscript; available in PMC: 2024 Oct 4.
Published in final edited form as: Annu Rev Genomics Hum Genet. 2024 Aug 6;25(1):77–104. doi: 10.1146/annurev-genom-021623-081639

Table 1.

Summary of variants in the euchromatic portion of a human genome

| Variant type | Average number of sites (thousands)^a | Average sum of variant length (Mbp)^b | Percentage of diploid genome^c |
|---|---|---|---|
| All | 5,045.39 | 44.24 | 0.763 |
| SNV (including MNPs) | 3,992.73 | 3.99 | 0.069 |
| Indel | 1,021.73 | 3.63 | 0.063 |
| SV^d | 30.93 | 36.62 | 0.631 |
|   STR | 2.65 | 0.19 | 0.003 |
|   VNTR | 12.58 | 1.36 | 0.023 |
|   Other low complexity | 2.58 | 0.13 | 0.002 |
|   SD | 0.55 | 6.25 | 0.108 |
|   Mobile element | 6.18 | 1.91 | 0.033 |
|     LINE1 | 0.98 | 0.91 | 0.016 |
|     ERV | 0.64 | 0.27 | 0.005 |
|     Alu | 3.49 | 0.48 | 0.008 |
|     SVA | 1.07 | 0.25 | 0.004 |
|   Inversion | 0.15 | 23.2 | 0.400 |
|   Unclassified/mixed | 6.23 | 3.58 | 0.062 |

Abbreviations: ERV, endogenous retrovirus; indel, insertion or deletion; LINE1, long interspersed element 1; MNP, multiple-nucleotide polymorphism; SD, segmental duplication; SINE, short interspersed element; SNV, single-nucleotide variant; STR, short tandem repeat; SV, structural variant; SVA, SINE-VNTR-Alu; VCF, Variant Call Format; VNTR, variable number tandem repeat.

a. The average number of sites of a given variant type observed within each genome.

b. The average total length of variant sites.

c. The percentage of a diploid genome that each variant type represents, assuming a 5.8-Gb diploid euchromatic genome length. For all variants except inversions, the values exclude heterochromatin because of uncertainty around assembly and alignment. Inversion estimates are from Porubsky et al. (115) and are not necessarily restricted to euchromatic sequence.
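The percentage column can be recomputed from the length column as a quick check. The following is a minimal sketch, assuming the percentage is simply the summed variant length divided by the 5.8-Gb (5,800-Mbp) diploid euchromatic length stated above. The constant and function names are illustrative, and only a few rows of Table 1 are copied in.

```python
# Diploid euchromatic genome length assumed in the footnote: 5.8 Gb = 5,800 Mbp.
DIPLOID_EUCHROMATIC_MBP = 5800.0

# Average sum of variant length (Mbp) per genome, copied from selected rows of Table 1.
variant_length_mbp = {
    "All": 44.24,
    "SNV (including MNPs)": 3.99,
    "Indel": 3.63,
    "SV": 36.62,
    "Inversion": 23.2,
}


def percent_of_diploid_genome(length_mbp, genome_mbp=DIPLOID_EUCHROMATIC_MBP):
    """Return the share of the diploid genome covered by variants, in percent."""
    return 100.0 * length_mbp / genome_mbp


if __name__ == "__main__":
    for name, length in variant_length_mbp.items():
        print(f"{name}: {percent_of_diploid_genome(length):.3f}")
```

Under this assumption, the recomputed values for these rows match the tabulated percentages to three decimal places.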

d. SVs include all structural variants; the remaining rows are SV subclasses. Unclassified/mixed denotes a class of SVs for which reliable annotation could not be given. SV counts, excluding inversions, were calculated from Minigraph (89) VCF files released as part of a paper by Liao et al. (93), provided by Heng Li and Wen-Wei Liao. Small-variant numbers are also from Liao et al. (93) and were calculated using PacBio HiFi sequencing data and DeepVariant (113).
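Because the rows beneath SV are subclasses, their counts and lengths should add up to the SV row, and the four mobile element families should add up to the mobile element row. The following is a minimal sketch of that consistency check using values copied from Table 1. The variable and function names are illustrative, and the 0.02 tolerance is an assumption chosen to absorb rounding in the published two-decimal values.

```python
# Each entry is (average sites in thousands, average summed length in Mbp), from Table 1.
sv_total = (30.93, 36.62)
sv_subclasses = {
    "STR": (2.65, 0.19),
    "VNTR": (12.58, 1.36),
    "Other low complexity": (2.58, 0.13),
    "SD": (0.55, 6.25),
    "Mobile element": (6.18, 1.91),
    "Inversion": (0.15, 23.2),
    "Unclassified/mixed": (6.23, 3.58),
}
mobile_element_total = sv_subclasses["Mobile element"]
mobile_element_families = {
    "LINE1": (0.98, 0.91),
    "ERV": (0.64, 0.27),
    "Alu": (3.49, 0.48),
    "SVA": (1.07, 0.25),
}


def column_sums(rows):
    """Sum the site counts and the lengths over a dict of (count, length) rows."""
    counts = sum(count for count, _ in rows.values())
    lengths = sum(length for _, length in rows.values())
    return counts, lengths


def consistent(parts, total, tol=0.02):
    """True if the summed parts match the reported total within a rounding tolerance."""
    counts, lengths = column_sums(parts)
    return abs(counts - total[0]) <= tol and abs(lengths - total[1]) <= tol


if __name__ == "__main__":
    print("SV subclasses match SV row:", consistent(sv_subclasses, sv_total))
    print("Mobile element families match row:",
          consistent(mobile_element_families, mobile_element_total))
```

The subclass site counts sum to 30.92 thousand against a reported 30.93 thousand, a difference consistent with rounding of the individual rows.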