2021 Jul 21;37(24):4756–4763. doi: 10.1093/bioinformatics/btab489

Table 1.

Storage size required for various compressed reference haplotype formats

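| Dataset | vcf.gz (MB) | jlso (MB) | bref3 (MB) | m3vcf.gz (MB) | imp5 (MB) |
|---|---|---|---|---|---|
| sim 10K | 49 | 6 | 18 | 7 | 57 |
| sim 100K | 472 | 56 | 81 | 29 | 565 |
| sim 1M | 4660 | 776 | 415 | NA | 5650 |
| 1000G chr10 | 459 | 188 | 342 | 116 | 621 |
| 1000G chr20 | 200 | 89 | 154 | 52 | 279 |
| HRC chr10 | 3346 | 1328 | 1156 | 529 | 3636 |
| HRC chr20 | 1510 | 706 | 554 | 253 | 1610 |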

Note: vcf.gz is the standard compressed VCF format; jlso is used by MendelImpute; bref3 is used by Beagle 5.1; m3vcf.gz is used by Minimac 4; and imp5 is used by Impute 5. For all jlso files, we set the maximum number of unique haplotypes per window to d_max = 1000. We could not generate the m3vcf.gz file for the sim 1M panel (shown as NA) because doing so required too much memory (RAM).
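
The sizes in Table 1 are easier to compare across panels when each one is expressed as a fraction of the vcf.gz size for the same dataset. The following is a minimal sketch of that arithmetic, not part of the original analysis. The names `FORMATS`, `SIZES_MB`, and `relative_sizes` are hypothetical and chosen for illustration. The numbers are copied from the table, with `None` standing in for the NA entry.

```python
# Sizes in MB, copied from Table 1. None marks the m3vcf.gz entry reported as NA.
# Dataset labels have their spacing normalized (e.g. "HRC chr10").
FORMATS = ["vcf.gz", "jlso", "bref3", "m3vcf.gz", "imp5"]
SIZES_MB = {
    "sim 10K": [49, 6, 18, 7, 57],
    "sim 100K": [472, 56, 81, 29, 565],
    "sim 1M": [4660, 776, 415, None, 5650],
    "1000G chr10": [459, 188, 342, 116, 621],
    "1000G chr20": [200, 89, 154, 52, 279],
    "HRC chr10": [3346, 1328, 1156, 529, 3636],
    "HRC chr20": [1510, 706, 554, 253, 1610],
}


def relative_sizes(sizes, baseline="vcf.gz"):
    """Return each format's size as a fraction of the baseline format's size."""
    base_idx = FORMATS.index(baseline)
    result = {}
    for dataset, row in sizes.items():
        base = row[base_idx]
        # Missing entries stay None so they are not mistaken for zero.
        result[dataset] = {
            fmt: (None if mb is None else mb / base)
            for fmt, mb in zip(FORMATS, row)
        }
    return result


if __name__ == "__main__":
    for dataset, ratios in relative_sizes(SIZES_MB).items():
        cells = ", ".join(
            f"{fmt}={'NA' if r is None else format(r, '.2f')}"
            for fmt, r in ratios.items()
        )
        print(f"{dataset}: {cells}")
```

A ratio below 1 means the format is smaller than the corresponding vcf.gz file, and a ratio above 1 means it is larger. For example, the jlso file for sim 10K is 6/49, or about 0.12, of the vcf.gz size.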