Abstract
Balancing selection has maintained human leukocyte antigen (HLA) allele diversity, but it is unclear whether this selection is symmetric (all heterozygotes are comparable and all homozygotes are comparable in terms of fitness) or asymmetric (distinct heterozygote genotypes display greater fitness than others). We tested the hypothesis that HLA is under asymmetric balancing selection in populations by estimating allelic branch lengths from genetic sequence data encoding peptide-binding domains. Significant deviations indicated changes in the ratio of terminal to internal branch lengths. Such deviations could arise even if no individual alleles present a strikingly altered branch length (e.g. if there is an overall distortion, with all or many terminal branches being longer than expected). DQ and DP loci were also analyzed as haplotypes. Using allele frequencies for 419 distinct populations in 10 geographical regions, we examined population differentiation in alleles within and between regions, and the relationship between allelic branch length and frequency. The strongest evidence for asymmetrical balancing selection was observed for HLA-DRB1, HLA-B and HLA-DPA1, with significant deviation (P ≤ 1.1 × 10−4) in about half of the populations. There were significant results at all loci except HLA-DQB1/DQA1. We observed moderate genetic variation within and between geographic regions, similar to the rest of the genome. Branch length was not correlated with allele frequency. In conclusion, sequence data suggest that balancing selection in HLA is asymmetric (some heterozygotes enjoy greater fitness than others). Because HLA polymorphism is crucial for pathogen resistance, this may manifest as a frequency-dependent selection with fluctuation in the fitness of specific heterozygotes over time.
INTRODUCTION
The human major histocompatibility complex (MHC) is an important gene-dense region spanning over 4 Mb of DNA on chromosome 6p21.3 (1). The MHC exhibits the strongest linkage disequilibrium observed in the human genome and almost half of expressed loci in this region have functions related to immune activation and response (2–5). These include the class I and II human leukocyte antigen (HLA) membrane glycoproteins, which play a critical role in activating the adaptive immune response. HLA classes I and II are expressed on antigen-presenting cells and present intracellular (viral) and extracellular (bacterial) peptides, respectively, for T-cell recognition. HLA loci are highly polymorphic. For example, HLA-B has over 2000 known alleles, encoding over 1500 unique sequences. Evidence suggests that in addition to novel mutations, intergenic and interallelic recombination and gene conversion are responsible for generating novel HLA alleles (6–9).
Symmetric overdominance in HLA
HLA has been shaped both by demographic history and selection, and particularly, balancing selection has been shown to maintain HLA diversity (10–12). It has long been proposed that balancing selection could be pathogen-driven through a fitness advantage conferred by the ability to present a wider range of antigens to T-cells (13,14). Balancing selection is an evolutionary process that maintains allelic diversity over time. The conventional model of balancing selection in the MHC (symmetric overdominance) assumes that all HLA alleles are equivalent in the sense that every heterozygote has the same fitness and every homozygote has the same, somewhat lower, fitness (15). The underlying assumption is that heterozygotes may have greater fitness than homozygotes because their immune systems can recognize a wider range of pathogens. Hedrick and Thomson (16) concluded that balancing selection inferred from the Ewens–Watterson test must be symmetric. However, this purely symmetrical model is over-simplified and the reality is most likely closer to some combination of heterozygote advantage and selective advantage in specific geographic regions. The equality of fitnesses of all heterozygotes and of all homozygotes in the model of symmetric overdominance is assumed because there is no definitive evidence to the contrary despite the many allele-specific disease associations known for HLA loci. This conventional model is purely for convenience and unlikely to be valid. In reality, the fitness of particular homozygotes or heterozygotes is likely to vary, due to the distribution of pathogens and allele-specific disease associations that pertain to particular environmental and behavioral contexts.
Evidence against symmetric overdominance
HLA class I diversity is higher in populations that have been subjected to a higher diversity of pathogens (17). In addition, individuals heterozygous at HLA loci are reported to exhibit increased survival and resistance to pathogens, but these studies may not always have accounted for allele frequencies (18–22). Furthermore, a recent theoretical model suggests that overdominance could only have maintained HLA diversity under both high degenerate pathogen recognition and low intersection advantage (23). Degenerate pathogen recognition refers to the ability of different alleles at a given HLA locus to encode multiple molecules that recognize the same pathogen (23). Intersection advantage is the selective fitness provided by carrying two different alleles that recognize the same pathogen compared with carrying only one allele that recognizes that pathogen (23). Although degenerate pathogen recognition is probably high, evidence suggests that intersection advantage is unlikely to be low (23–28).
Other mechanisms of balancing selection in HLA
Negative frequency-dependent selection has also been proposed as a model of balancing selection that could maintain HLA diversity (15,29). Also referred to as rare-allele advantage or diversifying frequency-dependent selection, it assumes that rare alleles may provide higher fitness through increased resistance to novel pathogens, and that the frequencies of these rare alleles will increase as exposure to the pathogen increases in the population. This cycle is driven by host–pathogen coevolution. Allele-based fitness functions indicate that frequency-dependent selection is the driving force in maintaining HLA diversity (30–32). However, frequency-dependent selection and overdominance are not mutually exclusive. Furthermore, HLA diversity could potentially be maintained by fluctuating selection, where varying spatial and temporal patterns of the frequency and strains of pathogens create a short-term directional selection that is independent of host–pathogen coevolution (33,34).
Our objective was to apply a sequence-based approach to test the hypothesis that HLA class I and class II alleles are under asymmetrical balancing selection in single populations (Tables 1 and 2). In addition, we tested the hypotheses that the branch lengths of HLA alleles are correlated with allele frequency within populations, and that the genetic variation of HLA alleles within and between geographical regions is greater than that observed in the rest of the genome.
Table 1.
Analysis of distinct populations
Geographical regions (N = 10) | Number of populations analyzed (N = 419) | Number of individuals included in populations (N = 50 303) |
---|---|---|
SSA | 34 | 4198 |
NAF | 18 | 1489 |
EUR | 88 | 14 708 |
SWA | 37 | 3038 |
SEA | 56 | 6491 |
OCE | 61 | 3596 |
AUS | 4 | 895 |
NEA | 63 | 10 696 |
NAM | 20 | 1810 |
SAM | 38 | 3382 |
Table 2.
Analysis of HLA alleles and haplotypes
Locus | Location on Chr 6p21.3 (base pair, Build 37) | Number of sequences (N = 1112) | Number of base pair analyzed | Number of genealogies (N = 1223) | Percent of genealogies with significant deviation (%) |
---|---|---|---|---|---|
HLA-A | 29 910 247–29 913 661 | 155 | 610 | 129 | 11.6 |
HLA-C | 31 236 529–31 239 855 | 70 | 578 | 92 | 34.8 |
HLA-B | 31 321 649–31 324 989 | 286 | 557 | 120 | 61.7 |
HLA-DRB1 | 32 546 546–32 557 562 | 179 | 274 | 224 | 53.6 |
HLA-DQA1 | 32 605 183–32 611 429 | 8 | 249 | 167 | 4.2 |
HLA-DQB1 | 32 627 657–32 634 466 | 30 | 270 | 177 | 2.3 |
HLA-DPA1 | 33 032 346–33 048 555 | 9 | 246 | 34 | 44.1 |
HLA-DPB1 | 33 043 703–33 057 473 | 64 | 264 | 123 | 2.4 |
HLA-DQA1/DQB1 | | 169 | 519 | 130 | 0 |
HLA-DPA1/DPB1 | | 142 | 510 | 27 | 7.4 |
This study examined 801 HLA alleles (unique exonic sequences encoding peptide-binding domains) and 311 HLA haplotypes in 419 distinct populations, using publicly available sequence information for each allele (60).
Because the data were incomplete, there were fewer than 419 populations for each locus. The threshold for statistical significance after correcting for multiple testing was P ≤ 1.1 × 10−4.
RESULTS
A genealogy that deviated significantly from the expected shape (P ≤ 1.1 × 10−4) indicated a change in the ratio of terminal to internal branch lengths relative to a neutral genealogy, which is consistent with symmetrical balancing selection (Fig. 1); in these cases, the deviation indicates asymmetrical balancing selection. Such deviations could plausibly arise even if no individual alleles presented a strikingly altered branch length (e.g. if there is an overall distortion, with all or many terminal branches being longer than expected). Some HLA alleles had a significantly higher ratio of terminal to internal branch lengths than expected under symmetrical balancing selection (Fig. 2 and Table 2). Over all populations, this phenomenon was most pronounced for the HLA-B and HLA-DRB1 loci, with P ≤ 1.1 × 10−4 in over half of populations. At the HLA-DPA1 and HLA-C loci, over a third of the populations exhibited P ≤ 1.1 × 10−4. Conversely, very few of the populations displayed genealogies that deviated from expected topologies for the HLA-DQA1, HLA-DQB1 and HLA-DPB1 loci. At the HLA-A locus, about a tenth of the populations demonstrated significant deviation. There were significant deviations from symmetric balancing selection at all loci except HLA-DQB1/DQA1. In particular, consider the HLA-B locus in the Northern Irish population, RSD = 6.92 (P = 1 × 10−5). Results were similar for the HLA-C (RSD = 6.71, P = 1 × 10−5) and HLA-DRB1 (RSD = 7.84, P = 1 × 10−5) loci in this population. Weaker evidence was observed at the HLA-A locus (RSD = 3.21, P = 1.5 × 10−2) in this population. There were some regional trends in the proportion of RSD values that were significant for certain loci. Populations in Sub-Saharan Africa (SSA), North Africa (NAF), Europe (EUR) and Northeast Asia (NEA) tended to have a larger proportion of significant P-values at HLA-DRB1.
Populations in SSA, EUR, Southwest Asia (SWA) and NEA tended to have a larger proportion of significant P-values at HLA-B. Populations in EUR and SWA tended to have a larger proportion of significant P-values at HLA-C. Populations in South America (SAM) tended to have a larger proportion of significant P-values at HLA-DPA1, HLA-DPB1, HLA-DQA1 and HLA-DQB1.
Figure 1.
(A) Example of a genealogy that did not deviate from the expected shape: HLA-DRB1 in the Tolai population from Papua New Guinea (RSD = 2, P = 0.12). (B) Example of a genealogy that significantly deviated from the expected shape: HLA-DRB1 in the Zuni population from NAM (RSD = 4.2, P = 2 × 10−4). RSD represents the proportion of allelic terminal branch lengths in the population. P-values indicate the significance of the deviation from the expected RSD value under the null (1.25). HLA, human leukocyte antigen.
Figure 2.
Distribution of P-values quantifying the significance of the deviation of observed RSD values from those expected in a model of symmetrical balancing selection: (A) HLA-A, (B) HLA-C, (C) HLA-B, (D) HLA-DRB1, (E) HLA-DQA1, (F) HLA-DQB1, (G) HLA-DPA1, (H) HLA-DPB1, (I) HLA-DQA1/DQB1 haplotype, (J) HLA-DPA1/DPB1 haplotype. The threshold for statistical significance after correcting for multiple testing was P ≤ 1.1 × 10−4. HLA, human leukocyte antigen.
Branch length was not correlated with allele frequencies in populations (r2 < 0.10, see Supplementary Material, Table S1 and Fig. S1). Population differentiation between regions and within regions was generally low to moderate (FST < 0.15, Fig. 3 and Supplementary Material, Table S2).
Figure 3.
Within Region FST compared with Between Region FST: (A) HLA-A (155 alleles), (B) HLA-C (70 alleles), (C) HLA-B (286 alleles), (D) HLA-DRB1 (179 alleles), (E) HLA-DQA1 (8 alleles), (F) HLA-DQB1 (30 alleles), (G) HLA-DPA1 (9 alleles), (H) HLA-DPB1 (64 alleles). FST, Wright's fixation index; HLA, human leukocyte antigen.
DISCUSSION
Results from this study provide strong evidence supporting different levels of fitness conferred by different HLA alleles in most populations, with the exception of HLA-DQB1/DQA1. This evidence is strongest for the HLA-B locus, which is the most polymorphic HLA locus, both worldwide (with 2123 distinct alleles identified) and in this study (286 alleles reported). Among HLA loci, the reported within-population variation is also highest at HLA-B (with values of 92.5 and 92.8%) (10,35), and we observed a mean Total FST value of up to 0.217 (at the HLA-B*07:04 allele) for this locus. This extensive molecular diversification may in part be due to asymmetric balancing selection favoring genotypic novelty over the course of the expansion of the human population.
It is important to note that HLA allele frequency data show a strong influence of population history rather than selection (36). Results from the current empirical study provide evidence for asymmetrical balancing selection, but identifying when this asymmetry arose is beyond the scope of this study. The asymmetry reflects the whole history of the alleles. This paper does not conclude that the different degrees of asymmetry found between populations are independent of one another. They partly reflect the ancestral history of humans and even primate ancestors.
RSD estimates have been shown to be significantly biased upward in distance-based genealogies when the per-base-pair recombination rate is >0.008 (37). Several recent genomewide estimates of recombination rates report values for HLA in ranges compatible with applying the RSD statistic to test deviation from expectations under a model of symmetric overdominance. For example, there are no recombination hotspots located in the classical HLA loci, and the recombination rates in the classical HLA loci are very low (38). It has been reported that there is no recombination in HLA-C, HLA-DQA1 and HLA-DQB1, and that the recombination rate in HLA-DRB1 is very low (38). The MHC region contains short recombination hotspots separated by large recombination coldspots: 80% of the recombination in the MHC region occurs in 10% of the sequence, in contrast to the rest of the genome, where 50% of the recombination occurs in <10% of the sequence (39). However, a previous simulation study of multiallelic balancing selection with recombination indicated that even very low levels of recombination may have the potential to bias genealogy shape (40). Intragenic recombination and gene conversion have been previously reported in HLA (41,42). Therefore, it is plausible that the asymmetry in the trees observed in the current study may be partly due to recombination. Although recombination may have contributed to the results of this study, selection may also have been important.
Overall, our results are in concordance with recent studies of selection at the HLA loci. A large meta-analysis of HLA allele frequencies in 70 000 individuals recently confirmed the role of balancing selection in maintaining HLA variation in the classical HLA loci (11). HLA-DQA1 and HLA-C showed the strongest evidence of balancing selection, whereas HLA-DPB1 was compatible with a neutral genealogy (11). A recent study of sequence divergence in 23 500 individuals supported these findings, and also suggested that asymmetrical balancing selection may be present (12). Balancing selection has been inferred for the HLA-A and HLA-B loci in previous studies as well (38).
There is limited evidence from studies of humans to support the role of asymmetrical balancing selection in HLA evolution. The intense evolutionary arms race between the human immunodeficiency virus (HIV) and humans is an interesting case of HLA evolution (43). The HLA A2/6802 supertype (HLA-A*02:02, HLA-A*02:05, HLA-A*02:14 and HLA-A*68:02) has been shown to be associated with negative seroconversion in HIV-exposed East African female commercial sex workers and infants born to HIV-positive mothers (44,45). HLA-B*57:01 is strongly associated with low viral load and protection from disease (46,47). HLA supertypes are functional classifications based on overlapping peptide-binding specificities (48). A population-based study of American men infected with HIV provided further evidence for negative frequency-dependent selection (i.e. rare-allele advantage) in HLA class I loci (49). The frequency of HLA supertypes and the presence of HLA-B*57:01 were correlated with HIV viral load at the set point (49).
In addition, evidence from animal populations also supports a role for asymmetrical balancing selection in maintaining diversity in the MHC. Recent studies have reported the first evidence supporting MHC-specific overdominance in free-living animal populations (50,51). Earlier studies of free-living animal populations suggested that MHC diversity is maintained by divergent allele advantage, which assumes that heterozygotes carrying highly divergent alleles have increased fitness compared with heterozygotes carrying similar alleles and homozygotes (52–54). Animal studies indicate that heterozygote superiority is driven by increased T-cell repertoire diversity and T-cell avidity (55).
Recent animal studies of MHC heterozygosity attributed increased survival in heterozygotes to dominance rather than overdominance (24,56). Under a dominant model, heterozygotes are more fit than homozygotes not carrying the resistance allele and have the same fitness as homozygotes carrying the resistance allele. The shape of HLA-DRB1 genealogies in humans and several animals deviates significantly from the expectation under overdominance and supports divergent allele advantage (52).
We found that the classical HLA alleles have low-to-moderate population differentiation within and between regions, similar to the rest of the genome. An FST value between 0.15 and 0.25 is considered to be large and an FST value >0.25 is considered to be extremely large. The genomewide level of population differentiation has previously been reported to be low, with FST values of 0.071 between Europeans and Africans, 0.083 between Africans and Chinese/Japanese and 0.052 between Chinese/Japanese and Europeans (57). The frequency of HLA alleles is known to vary widely between populations, allowing them to be used as ancestry informative markers to control for population structure in genetic studies of association. Therefore, it is somewhat unexpected that results from this study demonstrate low-to-moderate population differentiation between regions and within regions (FST < 0.15, Fig. 3). Nevertheless, there were some larger regional Total FST values (FST > 0.25, Table 3).
Table 3.
HLA alleles with regional Total Wright's fixation index (FST) > 0.25
Locus | Allele | Regional Total FST | Region |
---|---|---|---|
HLA-A | 68:02 | 0.35 | SAM |
HLA-A | 24:02 | 0.29 | SEA |
HLA-A | 24:02 | 0.27 | SAM |
HLA-B | 07:04 | 0.40 | NEA |
HLA-B | 07:04 | 0.25 | SEA |
HLA-B | 37:01 | 0.27 | SEA |
HLA-B | 39:02 | 0.26 | NAM |
HLA-DRB1 | 04:11 | 0.43 | NAM |
HLA-DRB1 | 04:11 | 0.39 | SAM |
HLA-DRB1 | 12:02 | 0.29 | OCE |
HLA-DRB1 | 04:04 | 0.25 | SEA |
Upon careful examination, it appears that the population structure in HLA is more localized than that which could be captured by our global Within Region FST and Between Region FST estimates. Some HLA alleles had much smaller Within Region FST and Between Region FST estimates compared with regional Total FST values (Table 3). For example, HLA-DRB1*04:11 had regional Total FST values of 0.43 and 0.39 in North America (NAM) and SAM, respectively, whereas the Within Region FST and Between Region FST values were 0.087 and 0.168, respectively. There were also several alleles with relatively large Between Region FST or Within Region FST values (FST > 0.15, Fig. 3 and Table 4). For example, the largest FST value was for the HLA-A*34:01 allele (Between Region FST = 0.309).
Table 4.
HLA alleles with values of Within Region Wright's fixation index (FST) or Between Region FST greater than 0.15
Locus | Allele | Within Region FST | Between Region FST |
---|---|---|---|
HLA-A | 34:01 | | 0.309 |
HLA-B | 07:04 | 0.217 | |
HLA-B | 35:07 | 0.205 | |
HLA-B | 35:19 | 0.152 | |
HLA-DRB1 | 08:03 | 0.158 | |
HLA-DRB1 | 04:11 | | 0.168 |
HLA-DPA1 | 01:03 | 0.169 | |
HLA-DPA1 | 02:02 | 0.158 | |
HLA-DPB1 | 05:01 | 0.291 | |
HLA-DPB1 | 04:02 | 0.261 | |
In conclusion, our study is the largest sequence-based study of balancing selection in HLA to date and provides strong evidence that balancing selection in HLA is not symmetric. Further research is required to identify which type(s) of asymmetrical balancing selection may be playing a role in maintaining the diversity in HLA loci.
MATERIALS AND METHODS
HLA frequency data
This analysis included allele and haplotype frequencies for 419 population samples from around the world with at least 20 individuals each (total N = 50 303 individuals) (Table 1). The data were available from the dbMHC Anthropology database (www.ncbi.nlm.nih.gov/gv/mhc) and Supplementary Material compiled by Solberg et al. (11) from a systematic review of literature data sets, the 12th and 13th International Histocompatibility Workshops and the AlleleFrequencies.net database (www.pypop.org/popdata) (58,59). Five duplicated data sets and 70 admixed and migrant population samples were excluded from analysis. Population samples from the same region were included in this study, as densely sampled populations from a specific area can be quite different from each other (11).
HLA sequence data
The HLA Informatics Group and European Bioinformatics Institute, in collaboration with the international ImMunoGeneTics project (IMGT), have established the IMGT/HLA database (60). Because clinicians regularly test HLA genes to match organ transplants, all HLA sequences are maintained in a global repository and novel alleles must conform to nomenclature regulations (6,61). The exonic sequences of 5000 classical HLA alleles were publicly available from the IMGT/HLA database as part of release version 3.4.0, of which 801 were analyzed in this study (Table 2). An additional 311 HLA haplotypes were also analyzed. The total number of allele and haplotype sequences analyzed was 1112. Official G groups were used for ambiguous alleles.
Statistical analysis
Allelic genealogies for each locus in single populations were generated from DNA sequences that encode the peptide-binding domains for the classical HLA loci: exons 2 and 3 for class I loci (HLA-A, HLA-B and HLA-C) and exon 2 for class II loci (HLA-DRB1, HLA-DPB1, HLA-DPA1, HLA-DQB1 and HLA-DQA1). Each genealogy represented one locus and one population and provided terminal branch lengths for alleles that were present in that single population. Because HLA-DQB1/DQA1 and HLA-DPB1/DPA1 gene products form heterodimers, DQ and DP loci were also analyzed as haplotypes. In total, 1223 genealogies were generated (Table 2). The neighbor-joining distance method in PAUP was used to make the genealogies (v 4.0b10 for Unix; http://paup.csit.fsu.edu). The neighbor-joining method iteratively tests all pairs of neighbors (alleles or clusters of alleles) while minimizing the sum of the branch lengths to compute pair-wise distances that are then used to create the tree (62,63). Trees were rooted at the midpoint between the two most distant alleles in the tree, based on branch lengths. The RSD statistic compares the ratio of terminal to internal branches. Thus, it is sensitive to the location of the root, and the choice of rooting can introduce a bias. Rooting the tree at the midpoint would break the longest branch of the tree and minimize its contribution to deviation from the neutral expectation for RSD. Therefore, rooting the tree at the midpoint is a conservative choice.
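As a rough illustration of the pair-selection step in neighbor joining (this is not the PAUP implementation used in the study), the sketch below computes the standard Q criterion on a small distance matrix and returns the pair that would be joined first. The function name `nj_pick_pair` and the toy distances are hypothetical and chosen only for illustration.

```python
def nj_pick_pair(dist):
    """Return (i, j, q) for the pair minimizing the neighbor-joining Q criterion.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            # Keep the first pair with the smallest Q value seen so far.
            if best is None or q < best[2]:
                best = (i, j, q)
    return best


# Hypothetical distances among five alleles (illustrative values only).
toy = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
first_pair = nj_pick_pair(toy)
```

A full neighbor-joining run would replace the chosen pair with a new internal node, recompute distances to that node and repeat until the tree is resolved; only the first selection is shown here.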
An appropriate model of nucleotide substitution was identified based on the analysis of sequence data for exon 2 in HLA-DRB1 and exons 2 and 3 in HLA-B for the Irish population (jModelTest v 0.1.1) (64–66). Three models were considered: the Jukes–Cantor correction, the Kimura two-parameter and the general time-reversible models. The Jukes–Cantor correction assumes that mutations occur at a constant rate and that each nucleotide is equally likely to mutate to any other (62,67). The Kimura two-parameter model extends the Jukes–Cantor correction by accounting for different rates of transitions and transversions (68). Generally, transitions (changes within purines [A, G] or pyrimidines [C, T]) are more common than transversions (changes between purines and pyrimidines). The general time-reversible model accounts for different frequencies of nucleotides and different rates for each pair of nucleotide substitutions (symmetric substitution matrix) (69). A likelihood ratio test was used to compare the Jukes–Cantor and Kimura two-parameter models, and Akaike and Bayesian information criteria were examined to select the best-fitting model. The likelihood ratio test comparing the Jukes–Cantor and Kimura two-parameter models of nucleotide substitution indicated that the Kimura two-parameter model, with unequal base frequencies, fit better than the Jukes–Cantor model (P < 1 × 10−6). The general time-reversible model was selected because it fit even better than the Kimura two-parameter model, based on smaller estimates of Akaike and Bayesian information criteria (data not shown).
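To make the difference between the two simpler models concrete, the following sketch computes the textbook Jukes–Cantor and Kimura two-parameter distances for a pair of aligned sequences. It is not the jModelTest procedure used for model selection here; the function name `substitution_distances` is hypothetical, and the sketch assumes sequences of equal length containing only A, C, G and T (no gaps or ambiguity codes).

```python
import math

PURINES = {"A", "G"}


def substitution_distances(seq1, seq2):
    """Return (jukes_cantor, kimura_2p) distances for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = len(seq1)
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            continue
        # A<->G and C<->T are transitions; all other changes are transversions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    p = (transitions + transversions) / n  # proportion of differing sites
    P = transitions / n                    # proportion of transitions
    Q = transversions / n                  # proportion of transversions
    jc = -0.75 * math.log(1 - 4 * p / 3)
    k2p = -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
    return jc, k2p


# One transition (C -> T) in ten sites.
jc_d, k2p_d = substitution_distances("ACGTACGTAC", "ACGTACGTAT")
```

When all observed differences are transitions, as in this example, the Kimura two-parameter distance is slightly larger than the Jukes–Cantor distance because it corrects for transitions separately.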
We compared the observed shape of the trees with the expected shape of the tree under the neutral model of coalescent expectation. The neutral model of coalescent expectation can be used to represent symmetrical balancing selection because a genealogy under symmetric balancing selection has the same properties as a neutral genealogy, but with a different effective population size (70). In this study, a neutral genealogy refers to all homozygotes having equal fitness, and lower fitness than heterozygotes, and all heterozygotes having equal fitness. Deviation was quantified with the population genetics statistic RSD, which was based on the number of alleles (n), the sum of the terminal branch lengths (L) and tree depth (D) (71). RSD was equal to L × (1 − 1/n)/D. The expectation of RSD was 1.25 for genealogy topologies under the neutral model of coalescence. The assumption is that if all classes of heterozygotes are equally fit, then selection should not favor maintaining certain alleles in the population over longer spans of time than other alleles; alleles should have similar branch lengths, and trees should conform to the neutral model of coalescence. However, if specific classes of heterozygotes are more fit than others, then alleles that were favored by selection, and were therefore maintained in the population for longer spans of time, will have accumulated more mutations and hence are associated with longer branch lengths. The significance of the deviation between observed and expected genealogy was assessed by calculating two-tailed P-values empirically based on at least 10 000 permutations under the neutral model of coalescent expectation [Analyses of Phylogenetics and Evolution (APE v 2.6–2) package in R (R v2.12.1; http://www.r-project.org)] (72).
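The RSD calculation itself is simple arithmetic, and a minimal sketch may help. The first function below applies the formula as stated, RSD = L × (1 − 1/n)/D. The second shows one plausible way to obtain a two-tailed empirical P-value from a set of RSD values generated under the null; the exact definition used in the study is not given, so treating "extreme" as "at least as far from the expected value of 1.25 as the observed value" is an assumption. Both function names are hypothetical, and the null values would in practice come from simulated neutral genealogies, which are not generated here.

```python
def rsd(terminal_branch_lengths, tree_depth):
    """RSD = L * (1 - 1/n) / D, with L the sum of terminal branch lengths."""
    n = len(terminal_branch_lengths)
    total_terminal = sum(terminal_branch_lengths)
    return total_terminal * (1 - 1 / n) / tree_depth


def two_tailed_empirical_p(observed, null_values, expected=1.25):
    """Fraction of null values at least as far from `expected` as `observed`.

    Assumption: this symmetric-distance definition of "two-tailed" is
    illustrative; the source does not specify the exact rule.
    """
    obs_dev = abs(observed - expected)
    extreme = sum(1 for v in null_values if abs(v - expected) >= obs_dev)
    return extreme / len(null_values)


# Toy genealogy: four alleles, arbitrary branch-length units.
example_rsd = rsd([2, 3, 5, 4], tree_depth=8)

# Toy null distribution (a real analysis would use >= 10 000 values).
example_p = two_tailed_empirical_p(1.75, [1.0, 1.2, 1.25, 1.3, 2.0])
```

With at least 10 000 null values, the smallest non-zero P-value this estimator can return is 1 × 10−4 or lower, depending on the number of null values used.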
To address multiple testing concerns for the genealogies (N = 1223), we determined significance (P ≤ 1.1 × 10−4) by controlling the false discovery rate, using the Benjamini–Hochberg method at a conservative level of 0.05% (73). Setting a significance threshold with the Bonferroni correction for the number of distinct populations (N = 419) provides a similar cutoff (P ≤ 1.19 × 10−4).
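The two thresholds mentioned above can be reproduced in outline as follows. This is a generic sketch of the Bonferroni and Benjamini–Hochberg procedures rather than the code used in the study; the function names are hypothetical, and the toy P-values are illustrative. For the analysis described here, the Benjamini–Hochberg level q would be 0.0005 (0.05%) applied to the 1223 genealogy P-values.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff under the Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg_threshold(p_values, q):
    """Largest sorted p-value p_(k) with p_(k) <= (k / m) * q, or None."""
    m = len(p_values)
    cutoff = None
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= k / m * q:
            cutoff = p  # later (larger) qualifying p-values overwrite this
    return cutoff


# Bonferroni cutoff for 419 populations at alpha = 0.05.
bonf = bonferroni_threshold(0.05, 419)

# Toy example at q = 0.05: only the two smallest p-values pass.
bh = benjamini_hochberg_threshold([0.001, 0.008, 0.039, 0.041, 0.6], q=0.05)
```

All tests with P-values at or below the returned Benjamini–Hochberg cutoff are declared significant; a return value of `None` means no test passes.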
Population differentiation, or genetic variation, of HLA alleles within and between regions was examined with Wright's fixation index (FST) (74). The Total FST for each allele was calculated as v/(m × [1 − m]), where v is the variance of allele frequencies and m is the mean allele frequency. Regional Total FST values were calculated for each allele by geographical region [SSA, NAF, EUR, SWA, Southeast Asia (SEA), Oceania (OCE), Australia (AUS), NEA, NAM, SAM] as v/(m × [1 − m]), where v is the variance of allele frequencies in the specified region and m is the mean allele frequency in the specified region. The Within Region FST was the mean of the 10 regional Total FST values. The Between Region FST was calculated by calculating the Total FST using only mean allele frequencies from each of the 10 geographical regions. An FST value of zero represents freely interbreeding populations and the maximum value of 1 represents fixed differences between populations.
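The three FST quantities can be sketched as below, reading the Total FST expression in its standard form, v divided by m(1 − m). Two points are assumptions rather than details from the study: the population variance (dividing by the number of populations) is used for v, and an allele that is absent or fixed in every population is assigned an FST of zero. The function names and the toy frequencies are hypothetical.

```python
from statistics import mean, pvariance


def total_fst(freqs):
    """Total FST = v / (m * (1 - m)) for one allele across populations."""
    m = mean(freqs)
    if m == 0 or m == 1:
        return 0.0  # assumption: no differentiation if absent or fixed everywhere
    return pvariance(freqs) / (m * (1 - m))  # assumption: population variance


def within_and_between_fst(freqs_by_region):
    """Return (Within Region FST, Between Region FST) for one allele.

    freqs_by_region maps a region label to that region's population frequencies.
    """
    regional = [total_fst(f) for f in freqs_by_region.values()]
    within = mean(regional)  # mean of the regional Total FST values
    between = total_fst([mean(f) for f in freqs_by_region.values()])
    return within, between


# Toy allele frequencies in two hypothetical regions.
toy_freqs = {"R1": [0.1, 0.3], "R2": [0.4, 0.4]}
toy_within, toy_between = within_and_between_fst(toy_freqs)
```

With these definitions, an allele fixed in one population and absent in another gives a Total FST of 1, matching the maximum described above.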
Correlation between HLA allelic branch lengths and population frequencies was tested with Spearman's correlation test.
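For readers unfamiliar with the test, Spearman's correlation is the Pearson correlation computed on ranks. The sketch below is a plain-Python illustration with average ranks for ties; it is not the routine used in the study, the function names are hypothetical, and it does not compute a P-value. It will fail with a division by zero if either input is constant.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Toy branch lengths and frequencies with a perfectly monotone relationship.
rho = spearman_rho([0.01, 0.02, 0.03, 0.04], [0.10, 0.20, 0.30, 0.40])
```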
Hierarchical analysis of molecular variance (AMOVA) tests were used to partition measures of total variability into variability among (i) geographic regions, (ii) populations within geographic regions and (iii) individuals within populations [Analysis of Ecological Data (ADE4 v 1.5-0) package in R] (75). AMOVA tests are essentially nested analysis of variance tests based on differences between molecular sequences, represented by Euclidean distances. Estimates were calculated for the sums of squares and mean squares, as well as components of covariance and their contribution to the total covariance. Estimated standard errors for our AMOVA results were based on 1000 permutations.
SUPPLEMENTARY MATERIAL
FUNDING
This work was supported by the NIH/NIGMS (grant number R01-GM040282 to M.S.) and the NIH/NIAID (grant number U01-AI067068 to S.J.M.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences, the National Institute of Allergy and Infectious Diseases or the National Institutes of Health.
ACKNOWLEDGEMENTS
We would like to thank N. Matzke, T. Heath and E. Durand for their help.
Conflict of Interest statement. None declared.
REFERENCES
- 1. Beck S., Geraghty D., Inoko H., Rowen L., Aguado B., Bahram S., Campbell R.D., Forbes S.A., Guillaudeux T., Hood L., et al. Complete sequence and gene map of a human major histocompatibility complex. The MHC sequencing consortium. Nature. 1999;401:921–923. doi:10.1038/44853.
- 2. Miretti M.M., Walsh E.C., Ke X., Delgado M., Griffiths M., Hunt S., Morrison J., Whittaker P., Lander E.S., Cardon L.R., et al. A high-resolution linkage-disequilibrium map of the human major histocompatibility complex and first generation of tag single-nucleotide polymorphisms. Am. J. Hum. Genet. 2005;76:634–646. doi:10.1086/429393.
- 3. Walsh E.C., Mather K.A., Schaffner S.F., Farwell L., Daly M.J., Patterson N., Cullen M., Carrington M., Bugawan T.L., Erlich H., et al. An integrated haplotype map of the human major histocompatibility complex. Am. J. Hum. Genet. 2003;73:580–590. doi:10.1086/378101.
- 4. de Bakker P.I.W., McVean G., Sabeti P.C., Miretti M.M., Green T., Marchini J., Ke X., Monsuur A.J., Whittaker P., Delgado M., et al. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat. Genet. 2006;38:1166–1172. doi:10.1038/ng1885.
- 5. Stewart C.A., Horton R., Allcock R.J., Ashurst J.L., Atrazhev A.M., Coggill P., Dunham I., Forbes S., Halls K., Howson J.M., et al. Complete MHC haplotype sequencing for common disease gene mapping. Genome Res. 2004;14:1176–1187. doi:10.1101/gr.2188104.
- 6. Robinson J., Mistry K., McWilliam H., Lopez R., Parham P., Marsh S.G. The IMGT/HLA database. Nucleic Acids Res. 2011;39:D1171–D1176. doi:10.1093/nar/gkq998.
- 7. Richman A.D., Herrera L.G., Nash D., Schierup M.H. Relative roles of mutation and recombination in generating allelic polymorphism at an MHC class II locus in Peromyscus maniculatus. Genet. Res. 2003;82:89–99. doi: 10.1017/S0016672303006347.
- 8. von Salome J., Gyllensten U., Bergstrom T.F. Full-length sequence analysis of the HLA-DRB1 locus suggests a recent origin of alleles. Immunogenetics. 2007;59:261–271. doi: 10.1007/s00251-007-0196-8.
- 9. von Salome J., Kukkonen J.P. Sequence features of HLA-DRB1 locus define putative basis for gene conversion and point mutations. BMC Genomics. 2008;9:228. doi: 10.1186/1471-2164-9-228.
- 10. Meyer D., Single R.M., Mack S.J., Erlich H.A., Thomson G. Signatures of demographic history and natural selection in the human major histocompatibility complex loci. Genetics. 2006;173:2121–2142. doi: 10.1534/genetics.105.052837.
- 11. Solberg O.D., Mack S.J., Lancaster A.K., Single R.M., Tsai Y., Sanchez-Mazas A., Thomson G. Balancing selection and heterogeneity across the classical human leukocyte antigen loci: a meta-analytic review of 497 population studies. Hum. Immunol. 2008;69:443–464. doi: 10.1016/j.humimm.2008.05.001.
- 12. Buhler S., Sanchez-Mazas A. HLA DNA sequence variation among human populations: molecular signatures of demographic and selective events. PLoS ONE. 2011;6:e14643. doi: 10.1371/journal.pone.0014643.
- 13. Doherty P.C., Zinkernagel R.M. A biological role for the major histocompatibility antigens. Lancet. 1975;305:1406–1409. doi: 10.1016/S0140-6736(75)92610-0.
- 14. Doherty P.C., Zinkernagel R.M. Enhanced immunological surveillance in mice heterozygous at the H-2 gene complex. Nature. 1975;256:50–52. doi: 10.1038/256050a0.
- 15. Takahata N., Nei M. Allelic genealogy under overdominant and frequency-dependent selection and polymorphism of major histocompatibility complex loci. Genetics. 1990;124:967–978. doi: 10.1093/genetics/124.4.967.
- 16. Hedrick P.W., Thomson G. Evidence for balancing selection at HLA. Genetics. 1983;104:449–456. doi: 10.1093/genetics/104.3.449.
- 17. Prugnolle F., Manica A., Charpentier M., Guégan J.F., Guernier V., Balloux F. Pathogen-driven selection and worldwide HLA class I diversity. Curr. Biol. 2005;15:1022–1027. doi: 10.1016/j.cub.2005.04.050.
- 18. Thursz M.R., Thomas H.C., Greenwood B.M., Hill A.V.S. Heterozygote advantage for HLA class-II type in hepatitis B virus infection. Nat. Genet. 1997;17:11–12. doi: 10.1038/ng0997-11.
- 19. Carrington M., Nelson G.W., Martin M.P., Kissner T., Vlahov D., Goedert J.J., Kaslow R., Buchbinder S., Hoots K., O'Brien S.J. HLA and HIV-1: heterozygote advantage and B*35-Cw*04 disadvantage. Science. 1999;283:1748–1752. doi: 10.1126/science.283.5408.1748.
- 20. Jeffery K.J., Siddiqui A.A., Bunce M., Lloyd A.L., Vine A.M., Witkover A.D., Izumo S., Usuku K., Welsh K.I., Osame M., et al. The influence of HLA class I alleles and heterozygosity on the outcome of human T cell lymphotropic virus type I infection. J. Immunol. 2000;165:7278–7284. doi: 10.4049/jimmunol.165.12.7278.
- 21. Hraber P., Kuiken C., Yusim K. Evidence for human leukocyte antigen heterozygote advantage against hepatitis C virus infection. Hepatology. 2007;46:1713–1721. doi: 10.1002/hep.21889.
- 22. Lipsitch M., Bergstrom C., Antia R. Effect of human leukocyte antigen heterozygosity on infectious disease outcome: the need for allele-specific measures. BMC Med. Genet. 2003;4:2. doi: 10.1186/1471-2350-4-2.
- 23. Stoffels R.J., Spencer H.G. An asymmetric model of heterozygote advantage at major histocompatibility complex genes: degenerate pathogen recognition and intersection advantage. Genetics. 2008;178:1473–1489. doi: 10.1534/genetics.107.082131.
- 24. Penn D.J., Damjanovich K., Potts W.K. MHC heterozygosity confers a selective advantage against multiple-strain infections. Proc. Natl Acad. Sci. USA. 2002;99:11260–11264. doi: 10.1073/pnas.162006499.
- 25. McClelland E.E., Penn D.J., Potts W.K. Major histocompatibility complex heterozygote superiority during coinfection. Infect. Immun. 2003;71:2079–2086. doi: 10.1128/IAI.71.4.2079-2086.2003.
- 26. Wedekind C., Walker M., Little T.J. The course of malaria in mice: major histocompatibility complex (MHC) effects, but no general MHC heterozygote advantage in single-strain infections. Genetics. 2005;170:1427–1430. doi: 10.1534/genetics.105.040683.
- 27. Wedekind C., Walker M., Little T.J. The separate and combined effects of MHC genotype, parasite clone, and host gender on the course of malaria in mice. BMC Genet. 2006;7:55. doi: 10.1186/1471-2156-7-55.
- 28. Ilmonen P., Penn D.J., Damjanovich K., Morrison L., Ghotbi L., Potts W.K. Major histocompatibility complex heterozygosity reduces fitness in experimentally infected mice. Genetics. 2007;176:2501–2508. doi: 10.1534/genetics.107.074815.
- 29. Slade R.W., McCallum H.I. Overdominant vs. frequency-dependent selection at MHC loci. Genetics. 1992;132:861–864. doi: 10.1093/genetics/132.3.861.
- 30. Boer R., Borghans J., Boven M., Keşmir C., Weissing F. Heterozygote advantage fails to explain the high degree of polymorphism of the MHC. Immunogenetics. 2004;55:725–731. doi: 10.1007/s00251-003-0629-y.
- 31. Borghans J., Beltman J., Boer R. MHC polymorphism under host-pathogen coevolution. Immunogenetics. 2004;55:732–739. doi: 10.1007/s00251-003-0630-5.
- 32. Borghans J., Keşmir C., Boer R. MHC diversity in individuals and populations. In: Flower D., Timmis J., editors. In Silico Immunology. New York, NY: Springer; 2007. pp. 177–195.
- 33. Spurgin L.G., Richardson D.S. How pathogens drive genetic diversity: MHC, mechanisms and misunderstandings. Proc. Roy. Soc. Lond. B Biol. Sci. 2010;277:979–988. doi: 10.1098/rspb.2009.2084.
- 34. Hedrick P.W. Pathogen resistance and genetic variation at MHC loci. Evolution. 2002;56:1902–1908. doi: 10.1111/j.0014-3820.2002.tb00116.x.
- 35. Sanchez-Mazas A. An apportionment of human HLA diversity. Tissue Antigens. 2007;69(Suppl. 1):198–202. doi: 10.1111/j.1399-0039.2006.00802.x.
- 36. Blagitko N., O'hUigin C., Figueroa F., Horai S., Sonoda S., Tajima K., Watkins D., Klein J. Polymorphism of the HLA-DRB1 locus in Colombian, Ecuadorian, and Chilean Amerinds. Hum. Immunol. 1997;54:74–81. doi: 10.1016/S0198-8859(97)00005-0.
- 37. Schierup M.H., Hein J. Consequences of recombination on traditional phylogenetic analysis. Genetics. 2000;156:879–891. doi: 10.1093/genetics/156.2.879.
- 38. Kong A., Thorleifsson G., Gudbjartsson D.F., Masson G., Sigurdsson A., Jonasdottir A., Walters G.B., Jonasdottir A., Gylfason A., Kristinsson K.T., et al. Fine-scale recombination rate differences between sexes, populations and individuals. Nature. 2010;467:1099–1103. doi: 10.1038/nature09525.
- 39. McVean G.A., Myers S.R., Hunt S., Deloukas P., Bentley D.R., Donnelly P. The fine-scale structure of recombination rate variation in the human genome. Science. 2004;304:581–584. doi: 10.1126/science.1092500.
- 40. Schierup M.H., Mikkelsen A.M., Hein J. Recombination, balancing selection and phylogenies in MHC and self-incompatibility genes. Genetics. 2001;159:1833–1844. doi: 10.1093/genetics/159.4.1833.
- 41. Bergstrom T.F., Josefsson A., Erlich H.A., Gyllensten U. Recent origin of HLA-DRB1 alleles and implications for human evolution. Nat. Genet. 1998;18:237–242. doi: 10.1038/ng0398-237.
- 42. Takahata N., Satta Y. Footprints of intragenic recombination at HLA loci. Immunogenetics. 1998;47:430–441. doi: 10.1007/s002510050380.
- 43. Carrington M., O'Brien S.J. The influence of HLA genotype on AIDS. Annu. Rev. Med. 2003;54:535–551. doi: 10.1146/annurev.med.54.101601.152346.
- 44. MacDonald K.S., Fowke K.R., Kimani J., Dunand V.A., Nagelkerke N.J., Ball T.B., Oyugi J., Njagi E., Gaur L.K., Brunham R.C., et al. Influence of HLA supertypes on susceptibility and resistance to human immunodeficiency virus type 1 infection. J. Infect. Dis. 2000;181:1581–1589. doi: 10.1086/315472.
- 45. MacDonald K.S., Embree J.E., Nagelkerke N.J., Castillo J., Ramhadin S., Njenga S., Oyugi J., Ndinya-Achola J., Barber B.H., Bwayo J.J., et al. The HLA A2/6802 supertype is associated with reduced risk of perinatal human immunodeficiency virus type 1 transmission. J. Infect. Dis. 2001;183:503–506. doi: 10.1086/318092.
- 46. Gao X.J., O'Brien T.R., Welzel T.M., Marti D., Qi Y., Goedert J.J., Phair J., Pfeiffer R., Carrington M. HLA-B alleles associate consistently with HIV heterosexual transmission, viral load, and progression to AIDS, but not susceptibility to infection. AIDS. 2010;24:1835–1840. doi: 10.1097/QAD.0b013e32833c3219.
- 47. Gao X., Bashirova A., Iversen A.N., Phair J., Goedert J.J., Buchbinder S., Hoots K., Vlahov D., Altfeld M., O'Brien S.J., et al. AIDS restriction HLA allotypes target distinct intervals of HIV-1 pathogenesis. Nat. Med. 2005;11:1290–1292. doi: 10.1038/nm1333.
- 48. Sette A., Sidney J. Nine major HLA class I supertypes account for the vast preponderance of HLA-A and -B polymorphism. Immunogenetics. 1999;50:201–212. doi: 10.1007/s002510050594.
- 49. Trachtenberg E., Korber B., Sollars C., Kepler T.B., Hraber P.T., Hayes E., Funkhouser R., Fugate M., Theiler J., Hsu Y.S., et al. Advantage of rare HLA supertype in HIV disease progression. Nat. Med. 2003;9:928–935. doi: 10.1038/nm893.
- 50. Kekalainen J., Vallunen J.A., Primmer C.R., Rattya J., Taskinen J. Signals of major histocompatibility complex overdominance in a wild salmonid population. Proc. Biol. Sci. 2009;276:3133–3140. doi: 10.1098/rspb.2009.0727.
- 51. Oliver M.K., Telfer S., Piertney S.B. Major histocompatibility complex (MHC) heterozygote superiority to natural multi-parasite infections in the water vole (Arvicola terrestris). Proc. Roy. Soc. Lond. B Biol. Sci. 2009;276:1119–1128. doi: 10.1098/rspb.2008.1525.
- 52. Richman A.D., Herrera L.G., Nash D. MHC class II beta sequence diversity in the deer mouse (Peromyscus maniculatus): implications for models of balancing selection. Mol. Ecol. 2001;10:2765–2773. doi: 10.1046/j.0962-1083.2001.01402.x.
- 53. Satta Y. Effects of intra-locus recombination of HLA polymorphism. Hereditas. 1997;127:105–112. doi: 10.1111/j.1601-5223.1997.00105.x.
- 54. Wakeland E.K., Boehme S., She J.X., Lu C.C., McIndoe R.A., Cheng I., Ye Y., Potts W.K. Ancestral polymorphisms of MHC class-II genes – divergent allele advantage. Immunol. Res. 1990;9:115–122. doi: 10.1007/BF02918202.
- 55. Messaoudi I., Guevara Patino J.A., Dyall R., LeMaoult J., Nikolich-Zugich J. Direct link between MHC polymorphism, T cell avidity, and diversity in immune defense. Science. 2002;298:1797–1800. doi: 10.1126/science.1076064.
- 56. Worley K., Collet J., Spurgin L.G., Cornwallis C., Pizzari T., Richardson D.S. MHC heterozygosity and survival in red junglefowl. Mol. Ecol. 2010;19:3064–3075. doi: 10.1111/j.1365-294X.2010.04724.x.
- 57. Durbin R.M., Abecasis G.R., Altshuler D.L., Auton A., Brooks L.D., Gibbs R.A., Hurles M.E., McVean G.A. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073. doi: 10.1038/nature09534.
- 58. Gonzalez-Galarza F.F., Christmas S., Middleton D., Jones A.R. Allele frequency net: a database and online repository for immune gene frequencies in worldwide populations. Nucleic Acids Res. 2011;39:D913–D919. doi: 10.1093/nar/gkq1128.
- 59. Meyer D., Single R.M., Mack S.J., Lancaster A., Nelson M.P., Erlich H., Fernandez-Vina M., Thomson G. Immunobiology of the Human MHC: Proceedings of the 13th International Histocompatibility Workshop and Conference. In: Hansen J.A., editor. Seattle, WA: IHWG Press; 2007. pp. 653–704.
- 60. Robinson J., Malik A., Parham P., Bodmer J.G., Marsh S.G. IMGT/HLA database – a sequence database for the human major histocompatibility complex. Tissue Antigens. 2000;55:280–287. doi: 10.1034/j.1399-0039.2000.550314.x.
- 61. Sheldon S., Poulton K. HLA typing and its influence on organ transplantation. Methods Mol. Biol. 2006;333:157–174. doi: 10.1385/1-59745-049-9:157.
- 62. Saitou N., Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 1987;4:406–425. doi: 10.1093/oxfordjournals.molbev.a040454.
- 63. Hartl D.L., Clark A.G. Principles of Population Genetics. Sunderland, MA: Sinauer Associates, Inc.; 2007. pp. 317–376.
- 64. Posada D. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 2008;25:1253–1256. doi: 10.1093/molbev/msn083.
- 65. Bos D.H., Posada D. Using models of nucleotide evolution to build phylogenetic trees. Dev. Comp. Immunol. 2005;29:211–227. doi: 10.1016/j.dci.2004.07.007.
- 66. Guindon S., Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 2003;52:696–704. doi: 10.1080/10635150390235520.
- 67. Jukes T.H., Cantor C.R. Evolution of protein molecules. Mamm. Prot. Metab. 1969;III:21–32.
- 68. Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 1980;16:111–120. doi: 10.1007/BF01731581.
- 69. Tavare S. Some probabilistic and statistical problems in the analysis of DNA sequences. In: Miura R.M., editor. Some Mathematical Questions in Biology-DNA Sequence Analysis. Vol. 17. Providence, RI: American Mathematical Society; 1986. pp. 57–86.
- 70. Takahata N. A simple genealogical structure of strongly balanced allelic lines and transspecies evolution of polymorphism. Proc. Natl Acad. Sci. USA. 1990;87:2419–2423. doi: 10.1073/pnas.87.7.2419.
- 71. Uyenoyama M.K. Genealogical structure among alleles regulating self-incompatibility in natural populations of flowering plants. Genetics. 1997;147:1389–1400. doi: 10.1093/genetics/147.3.1389.
- 72. Paradis E., Claude J., Strimmer K. APE: analyses of phylogenetics and evolution in R language. Bioinformatics. 2004;20:289–290. doi: 10.1093/bioinformatics/btg412.
- 73. Benjamini Y., Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B. 1995;57:289–300.
- 74. Wright S. The genetical structure of populations. Ann. Eugenics. 1951;15:323–354. doi: 10.1111/j.1469-1809.1949.tb02451.x.
- 75. Dray S., Dufour A.B. The ade4 package: implementing the duality diagram for ecologists. J. Stat. Softw. 2007;22:1–20.