Author manuscript; available in PMC: 2015 May 11.

Published in final edited form as: Annu Rev Genomics Hum Genet. 2013 Jul 15;14:301–323. doi: 10.1146/annurev-genom-091212-153455

Genomic landscape of the MHC. The classical MHC is shown on the short arm of chromosome 6 (base pair positions 29,640,000–33,120,000 from the Genome Reference Consortium Human Build 37, hg19), comprising the class I, II, and III regions. Transcription and chromatin states are illustrated for CD20⁺ normal human B cells using data from the ENCODE project (29). Constitutive expression of many MHC genes occurs in this cell type. Transcribed regions are shown by strand orientation for polyA⁺ RNA >200 nucleotides long from whole cells quantified by RNA-seq (red) based on short reads generated by the Illumina GAIIx platform. Separate tracks are shown for short total RNA (20–200 nucleotides long) (blue), with directional reads from the 5′ ends sequenced on an Illumina GAIIx. Chromatin accessibility is shown for the same cells based on DNase I hypersensitivity analyzed by DNase-seq (black), and is a useful guide to the location of putative regulatory regions. Data are also shown for a specific chromatin modification (H3K27ac) (green) for these cells analyzed by ChIP-seq. H3K27ac is an activating acetylation mark useful, for example, in identifying active enhancers. In terms of the recombination landscape of the MHC, data are shown for the deCODE recombination map (69) (dark brown), representing calculated rates of recombination (sex-averaged) using 10-kb windows. Vertebrate conserved elements are shown based on analysis of 46 species with prediction using PhastCons (107) (light brown). Sequence-level variation is shown for simple nucleotide polymorphisms, that is, single-nucleotide substitutions and small insertions and deletions (indels) found with at least 1% frequency in dbSNP. Variants are denoted in black, except for synonymous variants (green), nonsynonymous variants (red), splice-site variants (red), and untranslated-region variants (blue).
Remarkably high levels of polymorphism are seen, notably in classical HLA genes where variation is enriched in coding exons involved in defining the antigen-binding cleft. Structural genomic variants are also shown from the Database of Genomic Variants (54) involving segments of DNA larger than 1 kb. Copy number variants (CNVs) and indels are illustrated relative to the reference where gain in size (blue), loss in size (red), or both gain and loss in size (brown) have been reported. Structural variation is common in the MHC, including the RCCX module in the MHC class III region (comprising a number of genes, including RP-C4A/B-CYP21-TNXB), which may be duplicated or triplicated and present in different configurations, including two versions of the C4 gene (50). Other structurally complex sites include the HLA-DRB1 hypervariable region, which has five major haplogroups comprising variable numbers of functional genes and pseudogenes. All data tracks were downloaded from the UCSC Genome Browser (http://genome.ucsc.edu) (64).
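As a rough orientation to the scale of the region in this figure, the following minimal Python sketch computes the span of the quoted hg19 coordinates and tiles it into 10-kb windows of the size the legend gives for the recombination map. The helper names (`region_string`, `region_length`, `windows`) are hypothetical, and treating the quoted positions as a simple half-open interval is an assumption made only for illustration; the sketch does not reproduce how the tracks themselves were generated.

```python
# Coordinates quoted in the legend (GRCh37/hg19, chromosome 6).
MHC_CHROM = "chr6"
MHC_START = 29_640_000
MHC_END = 33_120_000
WINDOW_BP = 10_000  # window size stated for the recombination map


def region_string(chrom, start, end):
    """Format a region as 'chrom:start-end' with thousands separators."""
    return f"{chrom}:{start:,}-{end:,}"


def region_length(start, end):
    """Span in base pairs, treating [start, end) as half-open (an assumption)."""
    return end - start


def windows(start, end, size):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    for window_start in range(start, end, size):
        yield window_start, min(window_start + size, end)


if __name__ == "__main__":
    print(region_string(MHC_CHROM, MHC_START, MHC_END))
    print(region_length(MHC_START, MHC_END), "bp")
    print(len(list(windows(MHC_START, MHC_END, WINDOW_BP))), "windows")
```

Under these assumptions the displayed interval spans about 3.48 Mb, which corresponds to 348 non-overlapping 10-kb windows.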