2024 Jul 10;111(8):1700–1716. doi: 10.1016/j.ajhg.2024.06.007

Figure 1.

The genetic architecture of MUC5AC in 206 human haplotypes

(A) Recombination-aware phylogenetic analysis of ∼27 kbp neutral sequence (5.592 kbp from introns 31–48 and 21 kbp from 3′ flanking sequence) from 206 human haplotypes of MUC5AC with two chimpanzee haplotypes as outgroup. () = central node with 100% bootstrap support. H1–H3 correspond to three major haplogroups; P1–P6 correspond to protein groups (consistent with C).

(B) Frequency of population-specific haplotypes found in the three common phylogenetic haplogroups of MUC5AC. H1–H3 correspond to the three major haplogroups.

(C) Protein predictions for haplotypes of MUC5AC. Diagrams represent protein domains with the large central exon of MUC5AC, modeled after Guo et al.⁷ Colors correspond to protein groups visualized in (A). CysD corresponds to cys domains and PTS corresponds to proline-, serine-, and threonine-rich domains.

(D) Distributions of absolute serine and threonine (S/T) count across VNTR domains within the four most common protein groups of MUC5AC.

(E) Distributions of percent S/T content within VNTR domains for the four most common protein groups of MUC5AC.
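The two quantities plotted in (D) and (E) can be illustrated with a minimal sketch: the absolute S/T count is the number of serine and threonine residues in a domain, and the percent S/T content is that count divided by the domain length. The function name `st_stats` and the toy sequence are hypothetical and are not taken from the study; how the authors delimited VNTR domains is not described here.

```python
from typing import Tuple


def st_stats(domain: str) -> Tuple[int, float]:
    """Return (absolute S/T count, percent S/T) for one amino acid sequence."""
    seq = domain.upper()
    # Count serine (S) and threonine (T) residues.
    count = seq.count("S") + seq.count("T")
    # Percent S/T relative to domain length; an empty sequence yields 0.0.
    percent = 100.0 * count / len(seq) if seq else 0.0
    return count, percent


# Hypothetical toy sequence, not a real MUC5AC VNTR domain.
toy_domain = "TTSTTSAP"
toy_count, toy_percent = st_stats(toy_domain)
print(toy_count, toy_percent)
```

Two domains with the same percent S/T can differ greatly in absolute count when their lengths differ, which may be why the two measures are shown in separate panels.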

(F) Logo plot of the 130 8-mer amino acid motif variants used in MUC5AC VNTR domains. Colors correspond to biochemical groupings of amino acids.

(G) Heatmap of 8-mer motif utilization across 206 protein variants of human MUC5AC, colored vertically by protein group identities. Heatmap constructed with normalization within motifs (columns) and hierarchical clustering of haplotypes (rows) and motifs (columns). See Figure S2 for an extended version that includes the matched motifs (columns).
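As a rough illustration of "normalization within motifs (columns)", the sketch below counts 8-mers per sequence and scales each motif column by its maximum across sequences. Several things here are assumptions rather than the authors' method: motifs are tallied by tiling the sequence in consecutive non-overlapping windows, the normalization is max-scaling (the caption does not say which scheme was used), the toy motifs and haplotype names are invented, and the hierarchical clustering step is omitted.

```python
from collections import Counter
from typing import Dict, List


def motif_counts(vntr: str, k: int = 8) -> Counter:
    """Count k-mers read in consecutive, non-overlapping windows (assumed tiling)."""
    return Counter(vntr[i:i + k] for i in range(0, len(vntr) - k + 1, k))


def normalize_columns(rows: List[Dict[str, int]]) -> List[Dict[str, float]]:
    """Scale each motif (column) by its maximum count across all rows."""
    motifs = sorted({m for row in rows for m in row})
    # Every motif appears at least once in some row, so each maximum is >= 1.
    col_max = {m: max(row.get(m, 0) for row in rows) for m in motifs}
    return [{m: row.get(m, 0) / col_max[m] for m in motifs} for row in rows]


# Hypothetical toy sequences, not real MUC5AC haplotypes.
toy = {
    "hapA": "TTSTTSAP" * 3 + "TTSTTSVP",
    "hapB": "TTSTTSAP" + "TTSTTSVP" * 2,
}
rows = [motif_counts(seq) for seq in toy.values()]
matrix = normalize_columns(rows)
for name, row in zip(toy, matrix):
    print(name, row)
```

Normalizing within columns puts rare and common motifs on the same scale, so a motif used only a few times can still show contrast between haplotypes in the heatmap.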