2019 Sep 30;48(D1):D964–D970. doi: 10.1093/nar/gkz822

Table 1.

Information provided in the OGRDB standardized genotype

Field	Description
sequence_id	Identifier of the allele (either the IMGT name, or the name assigned by the submitter to an inferred gene)
sequences	Overall number of sequences assigned to this allele
closest_reference	For inferred alleles, the closest reference gene and allele, as inferred by the tool
closest_host	For inferred alleles, the closest reference gene and allele that is in the subject's inferred genotype
nt_diff	For inferred alleles, the number of nucleotides that differ between this sequence and the closest reference gene and allele
nt_diff_host	For inferred alleles, the number of nucleotides that differ between this sequence and the closest reference gene and allele that is in the subject's inferred genotype
nt_substitutions	For inferred alleles, comma-separated list of nucleotide substitutions (e.g. G112A) between the sequence and the closest reference gene and allele. IMGT numbering is used for V-genes, and numbering from the start of the coding sequence is used for D- or J-genes.
aa_diff	For inferred alleles, the number of amino acids that differ between this sequence and the closest reference gene and allele
aa_substitutions	For inferred alleles, the list of amino acid substitutions (e.g. A96N) between the sequence and the closest reference gene and allele. IMGT numbering is used for V-genes, and numbering from the start of the coding sequence is used for D- or J-genes.
unmutated_sequences	The number of sequences exactly matching this unmutated sequence
assigned_unmutated_frequency	The number of sequences exactly matching this allele divided by the number of sequences assigned to this allele, multiplied by 100 (i.e. expressed as a percentage)
unmutated_umis	The number of molecules (identified by Unique Molecular Identifiers) exactly matching this unmutated sequence (if UMIs were used)
allelic_percentage	The number of sequences exactly matching the sequence of this allele divided by the number of sequences exactly matching any allele of this specific gene, multiplied by 100 (i.e. expressed as a percentage)
unmutated_frequency	The number of sequences exactly matching this sequence divided by the number of sequences exactly matching any allele of any gene, multiplied by 100 (i.e. expressed as a percentage)
unique_vs	The number of V allele calls (i.e. unique allelic sequences) found associated with this allele
unique_ds	The number of D allele calls (i.e. unique allelic sequences) found associated with this allele
unique_js	The number of J allele calls (i.e. unique allelic sequences) found associated with this allele
unique_cdr3s	The number of unique CDR3s found associated with this allele
unique_vs_unmutated	The number of V allele calls (i.e. unique allelic sequences) associated with unmutated sequences of this allele
unique_ds_unmutated	The number of D allele calls (i.e. unique allelic sequences) associated with unmutated sequences of this allele
unique_js_unmutated	The number of J allele calls (i.e. unique allelic sequences) associated with unmutated sequences of this allele
unique_cdr3s_unmutated	The number of unique CDR3s associated with unmutated sequences of this allele
haplotyping_gene	The gene or genes from which haplotyping was inferred, where haplotyping is possible (e.g. IGHJ6)
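
The three percentage fields in Table 1 (assigned_unmutated_frequency, allelic_percentage and unmutated_frequency) share a numerator and differ only in their denominator. The following is a minimal Python sketch of that arithmetic, not the implementation used by OGRDB or by any inference tool. The function names, the dictionary layout of the input rows and the counts in the example are hypothetical, and the gene is assumed to be the part of the allele name before the `*` character.

```python
from collections import defaultdict


def gene_of(allele_name):
    # Assumption: allele names follow a "GENE*ALLELE" pattern, so the gene
    # is everything before the first "*".
    return allele_name.split("*", 1)[0]


def genotype_percentages(rows):
    """Compute the three percentage fields for each allele.

    rows: list of dicts with keys sequence_id, sequences and
    unmutated_sequences (hypothetical layout mirroring Table 1).
    """
    unmutated_per_gene = defaultdict(int)
    unmutated_total = 0
    for row in rows:
        unmutated_per_gene[gene_of(row["sequence_id"])] += row["unmutated_sequences"]
        unmutated_total += row["unmutated_sequences"]

    def percent(numerator, denominator):
        # Return 0.0 rather than failing when a denominator is zero.
        return 100 * numerator / denominator if denominator else 0.0

    result = {}
    for row in rows:
        unmutated = row["unmutated_sequences"]
        result[row["sequence_id"]] = {
            # Exact matches relative to all sequences assigned to this allele.
            "assigned_unmutated_frequency": percent(unmutated, row["sequences"]),
            # Exact matches relative to exact matches of any allele of this gene.
            "allelic_percentage": percent(
                unmutated, unmutated_per_gene[gene_of(row["sequence_id"])]
            ),
            # Exact matches relative to exact matches of any allele of any gene.
            "unmutated_frequency": percent(unmutated, unmutated_total),
        }
    return result


# Illustrative counts only; these are not taken from any real genotype.
rows = [
    {"sequence_id": "IGHV1-69*01", "sequences": 200, "unmutated_sequences": 50},
    {"sequence_id": "IGHV1-69*02", "sequences": 400, "unmutated_sequences": 150},
    {"sequence_id": "IGHV3-23*01", "sequences": 1000, "unmutated_sequences": 300},
]
percentages = genotype_percentages(rows)
```

With these illustrative counts, IGHV1-69*01 has an assigned_unmutated_frequency of 25, an allelic_percentage of 25 (50 of the 200 exact matches to IGHV1-69 alleles) and an unmutated_frequency of 10 (50 of 500 exact matches overall).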

Providing statistics for each allele in the personalized genotype (both reference alleles and novel alleles) allows the novel inferences to be considered in the context of overall gene usage (usage frequency, exact unmutated matches, association with distinct CDR3s and so on). It also provides useful aggregate information on overall gene usage.
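
As an illustration of how a reader might view novel alleles in this context, the sketch below pulls the inferred alleles out of a genotype table and reports their substitutions alongside their usage counts. It rests on several assumptions that are not taken from the OGRDB specification: the genotype is supplied as tab-separated text with a header row using the Table 1 field names, inferred alleles are recognised by a non-empty closest_reference field, and substitutions follow the "G112A" pattern with an integer position. The function names, the allele name IGHV1-69*01_G112A and all counts are hypothetical.

```python
import csv
import io
import re

# Matches a substitution written like "G112A":
# original residue, integer position, new residue.
SUBSTITUTION = re.compile(r"^([A-Za-z])(\d+)([A-Za-z])$")


def parse_substitutions(field):
    """Split a comma-separated substitution list into (from, position, to) tuples."""
    parsed = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        match = SUBSTITUTION.match(item)
        if match is None:
            raise ValueError(f"unrecognised substitution: {item!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def novel_alleles_in_context(tsv_text):
    """Return one summary dict per inferred allele in a tab-separated genotype."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    report = []
    for row in reader:
        # Assumption: only inferred alleles have closest_reference populated.
        if not row.get("closest_reference"):
            continue
        report.append({
            "sequence_id": row["sequence_id"],
            "closest_reference": row["closest_reference"],
            "substitutions": parse_substitutions(row["nt_substitutions"]),
            "sequences": int(row["sequences"]),
            "unmutated_sequences": int(row["unmutated_sequences"]),
            "unique_cdr3s": int(row["unique_cdr3s"]),
        })
    return report


# Illustrative two-row genotype: one reference allele and one hypothetical novel allele.
example = (
    "sequence_id\tsequences\tclosest_reference\tnt_substitutions\t"
    "unmutated_sequences\tunique_cdr3s\n"
    "IGHV1-69*01\t200\t\t\t50\t180\n"
    "IGHV1-69*01_G112A\t120\tIGHV1-69*01\tG112A\t90\t75\n"
)
report = novel_alleles_in_context(example)
```

Only the second row is returned, because the reference allele has an empty closest_reference field. Its usage counts sit next to the parsed substitution, so the support for the inference can be read at a glance.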