Abstract
Searching for genetic determinants of human longevity has been challenged by the rarity of data sets with large numbers of individuals who have reached extreme old age, inconsistent definitions of the phenotype, and the difficulty of defining appropriate controls. Meta-analysis – a statistical method to summarize results from different studies – has become a common tool in genetic epidemiology to accrue large sample sizes for powerful genetic association studies. In conducting a meta-analysis of studies of human longevity, however, particular attention must be paid to the definition of cases and controls (including their health status) and to the effect of possible confounders such as sex and ethnicity upon the genetic effect to be estimated. We show examples of how a meta-analysis can inflate the false negative rate of genetic association studies or bias estimates of the association between a genetic variant and extreme longevity.
Keywords: human longevity, meta-analysis, odds-ratios
Introduction
The hunt for genetic and non-genetic factors that promote human longevity continues to fascinate and engage aging researchers. While substantial progress has been made in the epidemiology of extreme human longevity, particularly regarding evidence supporting Fries’s “compression of morbidity” hypothesis in the oldest centenarians [1–4], the discovery of genetic factors that promote longevity and extreme longevity has been challenged by the rarity of the phenotype, the need for large samples to reach the stringent levels of statistical significance required in genome-wide association studies and, in the case of association studies, the lack of clarity in the definition of both cases and controls. These challenges are related and often work against each other. For example, we have shown that the heritability of longevity expressed as sibling relative risk increases with more and more extreme definitions of longevity [5]. While using an extreme definition of longevity should result in a more heritable phenotype, accruing a sufficiently large sample of very old individuals is very difficult: it has taken the New England Centenarian Study more than 20 years to accrue almost 200 supercentenarians, those who have survived to age 110 years and older [6].
In order to boost their statistical power, some studies relax their definition of longevity [7], or the definition of controls, and more recently meta-analysis has emerged as a way to increase statistical power by aggregating results from many smaller studies. While the method of meta-analysis has many important properties and can be useful to increase evidence in support of or against a hypothesis, it can often produce misleading results. Here we discuss some of the major challenges of meta-analysis of longevity studies.
Meta-analyses of Genome-Wide Association Studies of Longevity
Definition of Genetic Effect
Genetic association studies of longevity typically use a case-control study design, in which cases are individuals who have reached some defined old age and controls are often a random sample from the population, under the assumption that the phenotype is so rare that a control is demographically unlikely to eventually survive to the age of interest. The genetic effect of an allele g that can be estimated with data collected using this study design is the odds ratio for extreme longevity, comparing carriers and non-carriers of the g allele. This odds ratio (OR) is defined as
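$$\mathrm{OR} = \frac{p(EL \mid g) \,/\, [1 - p(EL \mid g)]}{p(EL \mid \bar{g}) \,/\, [1 - p(EL \mid \bar{g})]} \tag{1}$$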
where EL denotes “extreme longevity”, p (EL | g) is the probability of achieving extreme longevity in carriers of the g allele, and p (EL | ḡ) is the probability of achieving extreme longevity in non-carriers of the g allele. This odds ratio is equivalent to
$$\theta = \frac{p(g \mid EL) \,/\, [1 - p(g \mid EL)]}{p(g \mid AL) \,/\, [1 - p(g \mid AL)]} \tag{2}$$
where p (g | EL) is the prevalence of the g allele in cases, AL denotes “average longevity”, and p (g | AL) is the prevalence of the g allele in controls. The parameter θ in Equation (2) is the “exposure odds ratio”, which is the estimable quantity in a case-control study design, and it can be converted into the parameter of interest in Equation (1), i.e. the “disease odds ratio” or, in this case, the “EL odds ratio”, by using Bayes’ theorem [8].
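As a numerical illustration of this equivalence, the following minimal Python sketch computes both odds ratios from the same set of probabilities. The allele frequency and the two probabilities of extreme longevity are hypothetical values chosen only for illustration, and controls are assumed to be all individuals who do not reach extreme longevity.

```python
import math


def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def el_odds_ratio(p_el_g, p_el_not_g):
    """Equation (1): odds ratio for EL, carriers versus non-carriers."""
    return odds(p_el_g) / odds(p_el_not_g)


def exposure_odds_ratio(p_g, p_el_g, p_el_not_g):
    """Equation (2): odds ratio for carrying g, cases versus controls.

    Assumption: controls (AL) are everyone who does not reach EL.
    """
    # Joint probabilities of carrier status and outcome.
    g_el = p_g * p_el_g
    not_g_el = (1.0 - p_g) * p_el_not_g
    g_al = p_g * (1.0 - p_el_g)
    not_g_al = (1.0 - p_g) * (1.0 - p_el_not_g)
    # Bayes' theorem: carrier prevalence among cases and among controls.
    p_g_el = g_el / (g_el + not_g_el)
    p_g_al = g_al / (g_al + not_g_al)
    return odds(p_g_el) / odds(p_g_al)


# Hypothetical inputs (not taken from any study).
P_G, P_EL_G, P_EL_NOT_G = 0.30, 0.010, 0.005
or_el = el_odds_ratio(P_EL_G, P_EL_NOT_G)
theta = exposure_odds_ratio(P_G, P_EL_G, P_EL_NOT_G)
print(f"EL odds ratio: {or_el:.4f}; exposure odds ratio: {theta:.4f}")
```

The two quantities agree for any choice of inputs, because the allele frequency cancels when the ratio is formed.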
Many factors can affect the magnitude and statistical significance of the genetic effect θ, particularly definitions of the case (EL), of the control (AL), genetic confounders such as ethnicity that may affect the prevalence of the g allele in cases and controls, and of course the sample size. Regarding controls, they should ideally be matched by birth year to avoid unmeasurable confounding due to secular effects. However, even the longest running longitudinal studies have not been able to adhere to such an inclusion criterion for controls. Thus, most studies settle for controls who have not reached a certain age. As discussed later, this definition can also be problematic.
Meta-analysis of results from different studies has become a standard procedure in genetic association studies to remedy the limited sample sizes of individual studies. The underlying assumption of this approach is that “more is always better” and aggregating results from different studies will strengthen the results and increase the statistical significance of the true positive associations. Using real data, we show however that this approach can lead to more false negatives, not fewer.
Meta-Analysis: Non-ignorable Assumptions
A meta-analysis is a statistical method to summarize results from different studies that has become extremely common in genetic epidemiology [9]. A meta-analysis of genetic effects estimated through case-control studies will typically receive as input the estimated odds ratios and standard errors from each study and aggregate the results using some form of weighted average (fixed-effects meta-analysis). Weights defined as the inverse of the squared standard errors (that is, the inverse of the estimated variances) of the log odds ratios are used in the “inverse-variance weighting” method of meta-analysis, which appears to be the most common approach in genetic epidemiology [9]. An alternative weighting system is used in Mantel-Haenszel (MH) meta-analysis, which results in a more robust estimate and standard error when some of the studies are small [10] or when the genetic variant is rare. The key assumption underlying a fixed-effects meta-analysis is that the different studies estimate the same population parameter and that differences in the study-specific effect sizes are the result of sampling variability [11]. Heterogeneity among studies can be accounted for by using a random-effects meta-analysis, which essentially assumes a hierarchical model to describe the within-study variability and the variability between effects estimated from different studies. Although a random-effects meta-analysis accommodates more variability, it still assumes that there is a well-defined “population parameter” (odds ratio) to be estimated. If heterogeneity of the studies originates from study-specific genetic effects, then a meta-analysis can be inconclusive or produce misleading results.
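The two weighting schemes can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in any of the cited studies: the random-effects step uses the DerSimonian-Laird moment estimator, which is one common choice among several, and the odds ratios and standard errors are hypothetical.

```python
import math


def fixed_effect(log_ors, ses):
    """Inverse-variance weighted average of log odds ratios."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total
    return pooled, math.sqrt(1.0 / total)


def random_effects(log_ors, ses):
    """DerSimonian-Laird random-effects estimate (needs two or more studies).

    Returns the pooled log odds ratio, its standard error, and the
    estimated between-study variance tau2.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_fe, _ = fixed_effect(log_ors, ses)
    # Cochran's Q: dispersion of study effects around the pooled effect.
    q = sum(w * (y - pooled_fe) ** 2 for w, y in zip(weights, log_ors))
    c = total - sum(w ** 2 for w in weights) / total
    tau2 = max(0.0, (q - (len(log_ors) - 1)) / c)
    # The between-study variance is added to each within-study variance.
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, log_ors)) / re_total
    return pooled, math.sqrt(1.0 / re_total), tau2


# Hypothetical study results: odds ratios and standard errors of log(OR).
ORS = [1.60, 1.35, 1.10, 1.05]
SES = [0.15, 0.12, 0.10, 0.08]
LOG_ORS = [math.log(x) for x in ORS]

fe_est, fe_se = fixed_effect(LOG_ORS, SES)
re_est, re_se, tau2 = random_effects(LOG_ORS, SES)
print(f"fixed effects:  OR = {math.exp(fe_est):.2f} (SE of log OR {fe_se:.3f})")
print(f"random effects: OR = {math.exp(re_est):.2f} (SE of log OR {re_se:.3f})")
```

Both estimators return a single pooled odds ratio, so a trend in the study-specific effects, such as the hypothetical decline from 1.60 to 1.05 above, is visible only when the individual estimates are inspected.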
Published Meta-Analyses of Genome-Wide Association Studies of Longevity
A comprehensive review by Broer and van Duijn [12] lists just 5 genome wide association studies of longevity [13–17], and one of extreme longevity [18] published up to 2012. The article by Newman et al. [13] reported a meta-analysis of 4 genome-wide association studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium. The meta-analysis included 1,836 individuals who survived to age 90 and older (cases) and 1,955 individuals who died at ages between 50 and 80 years (controls). Despite what was a large sample size for a study of longevity, this meta-analysis did not identify any genome-wide significant associations, not even for variants in the well replicated APOE locus on chromosome 19 that has achieved genome-wide significance levels with much smaller studies of centenarians [18]. Additional meta-analyses that expanded upon findings from the CHARGE consortium were reported by Deelen et al. [19] and Broer et al. [20]. The meta-analysis in [19] included 14 different studies and increased the overall sample size of the discovery cohort to 7,729 long lived individuals and 16,121 controls. However, the large sample size was reached by using a “relaxed definition” of longevity as survival to age 85 and older, and only SNPs in the APOE locus reached genome-wide significance in the discovery cohort. The meta-analysis in [20] used data from new studies that joined the CHARGE consortium to synthesize the results of genome-wide association studies of longevity defined as surviving beyond age 90. This analysis also identified only SNPs in the APOE locus with genome-wide significance.
The lack of novel genetic findings linked to human longevity from these meta-analyses has been claimed as evidence of a weak genetic contribution to the phenotype of extreme human longevity and has fed the myth, based upon twin studies of survival to the mid octogenarian years, that the heritability of survival to extreme old age is approximately 25% [6, 20, 21]. However, increasing the sample size through a meta-analysis does not necessarily increase the statistical power if including a large number of heterogeneous studies decreases the signal-to-noise ratio. We argue that an inconsistent definition of the phenotype across studies is a possible source of heterogeneity of the study-specific effects that may reduce the usefulness of a meta-analysis. In fact, we and others have shown that the sibling relative risk of extreme longevity defined as living beyond a threshold age increases as the threshold age increases [5, 22, 23], and we have advocated for more standard definitions of extreme longevity by choosing the threshold age based, for example, on percentile survival from some standard life tables. A consistent and specific definition of extreme longevity should result in a definition of cases that is more comparable across studies of longevity conducted in different time periods and different countries. In fact, a focused meta-analysis of 4 studies of centenarians reported in [24] showed replication of a large number of the longevity variants in [18] when the studies used a consistent and sufficiently selective definition of longevity. Similarly, in a recent meta-analysis of 4 genome-wide association studies with more than 6,000 controls and 2,086 very old individuals (age range 95–119), in which extreme longevity was defined as surviving to or beyond the oldest 1% of individuals from the 1900 birth year cohort, we identified 37 SNPs in chromosomes 7, 12, and 19 that reached genome-wide significance (p < 5E-8) ([25], under review).
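A percentile-based threshold of this kind can be read directly off a cohort life table. The sketch below is only an illustration of the idea: the function name and the survivor counts are hypothetical, and a real analysis would take the counts from a published cohort life table for the birth year and sex of interest.

```python
def threshold_age(survivors, top_fraction):
    """Return the youngest age reached by at most `top_fraction` of a cohort.

    `survivors` maps age x to the number of cohort members alive at age x
    (the l_x column of a cohort life table); the youngest age listed is
    taken as the radix.
    """
    ages = sorted(survivors)
    radix = survivors[ages[0]]
    for age in ages:
        if survivors[age] / radix <= top_fraction:
            return age
    return None  # the table does not extend far enough


# Hypothetical l_x values, for illustration only.
TOY_COHORT = {
    0: 100000, 80: 30000, 90: 8000,
    95: 2500, 98: 1100, 99: 900, 100: 400,
}
print(threshold_age(TOY_COHORT, 0.01))  # oldest 1% of this toy cohort
```

Applying the same percentile to the life table of each contributing cohort yields study-specific threshold ages that correspond to a comparable degree of rarity.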
While a loss of statistical power increases the false negative rate of a meta-analysis, more critical to the field of longevity have been applications of meta-analysis that produce misleading results. We show two important examples, with findings related to FOXO3A and APOE.
A Suboptimal Use of Meta-Analysis: FOXO3A and Longevity
Broer et al. [20] conducted a large meta-analysis of 19 studies that investigated the association between the single nucleotide polymorphism (SNP) rs2802292 in FOXO3A and human longevity, defined as age at death ≥90 years. This SNP was previously shown to be strongly associated with human longevity in male nonagenarians and centenarians of Japanese ancestry [26]. Studies characterized by participants of different ethnicities (Asian and Caucasians), and with various definitions of extreme longevity were included in the meta-analysis and the results were summarized in Table 6 of the online supplement material of the article [20] (http://biomedgerontology.oxfordjournals.org/content/suppl/2014/08/11/glu166.DC1).
We reanalyzed the data, focusing only on studies of white ethnicity, and among those studies we selected only those in which we could compute the mean age of cases and of controls, and could confirm the data. For example, data reported as “2011_Nebel” in supplement Table 6 were removed because the reference [15] did not report data on FOXO3A. Similarly, several other contributing studies were excluded if the mean age of cases was not reported. The results of the inverse-variance weighting meta-analysis of odds ratios are displayed in the forest plot in Figure 1 and are sorted by the mean age of cases, from the youngest to the oldest. The sorted plot highlights the effect of the definition of the phenotype of longevity on the odds ratios estimated in the different studies and shows that the genetic effect of the SNP decreases as cases become older. Participants in the study of long-lived individuals in [27] were among the youngest, with an average age at death of 95 years in cases, and the odds ratio for extreme longevity in carriers of one copy of the minor allele was 1.57 compared to carriers of no copies of this allele. Participants in the study by Bae et al. [28] were among the oldest, with an average age at death of 104 years and an odds ratio for extreme longevity of 1.13. The meta-analysis estimated an odds ratio for longevity for carriers of one copy of the minor allele of 1.17. While the meta-analysis does not lead to incorrect conclusions – all studies point to an association between this SNP and human longevity – it masks the important observation that the genetic effect of this variant appears to decrease as we look at more and more extreme definitions of longevity.
Figure 1.
Meta-analysis of associations of rs2802292 and human longevity in 6 different studies. SNP rs2802290 was used as a proxy for rs2802292 (r2=1) in [54]. CEPH* denotes the French cohort included in [54], and SNP rs768023 was used as proxy for rs2802292. LLI: long-lived individuals; OR: odds ratio for longevity for carriers of the longevity allele (G) of rs2802292 versus non-carriers. The ages of LLI in Soerensen et al (2010) and Anselmi et al. (2009) are ages at death.
This decreasing effect of FOXO3A alleles on the odds for extreme longevity had already been noted in [29], whose authors conjectured that FOXO3A alleles may be associated with longevity but not extreme longevity, at least in white males. Note also the different sex composition of the various studies, and the fact that larger genetic effects are present in male-only studies, which tend to have a larger number of nonagenarians. The work by Anselmi et al. [27] and Soerensen et al. [29] included results stratified by sex and showed significant association between rs2802292 and human longevity only in males. Only the results significant in males from these two articles were included in the meta-analysis by Broer et al. [20]. The study of Willcox et al. that discovered the association of rs2802292 and human longevity was conducted in male nonagenarians [26], and based on these results it is possible that FOXO3A alleles promote survival to the nonagenarian years more strongly, or even only, in males. Alternatively, it may be possible that health status drives the genetic effect. For example, Soerensen et al. [30] showed that several FOXO3A SNPs are associated with varying risk for aging-related phenotypes, and Willcox et al. provided strong evidence that rs2802292 is associated with a 26% reduced risk for coronary artery disease [31, 32]. Meta-analyses that aggregated results from different studies while ignoring interactions with sex or other factors like health status have not helped to characterize the role of FOXO3A in human longevity. Somehow the important finding that the genetic effect of some FOXO3A alleles may be important only in males, or that FOXO3A alleles may be more relevant to aging than to extreme longevity, has been lost in the recent literature dominated by meta-analyses.
False Negatives and Meta-Analysis: the APOE ε2 Allele and Extreme Longevity
Apolipoprotein E (APOE) has been one of the most studied genes in longevity since Schachter et al. showed that French centenarians are depleted of the ε4 allele, which promotes early mortality due to Alzheimer’s and cardiovascular diseases, and have an increased frequency of the APOE ε2 allele [33]. Three well characterized alleles of APOE are defined by the SNPs rs7412 and rs429358, as shown in Figure 2, and a comprehensive overview of genetic association studies targeting APOE and longevity is provided in the online resource “Human ageing and genomic resources” (http://genomics.senescence.info/longevity/gene.php?id=APOE) supported by the Wellcome Trust.
Figure 2.
Definition of ε2, ε3, and ε4 alleles of APOE based on the alleles of the SNPs rs7412 and rs429358. The ε1 allele is considered lost in the population, so that the pairs of genotype TT for rs7412 and TC or CC for rs429358, or TC for rs7412 and CC for rs429358 should not occur. Assuming that the haplotype TC corresponding to the ε1 allele is lost, then the genotypes TC of rs7412 and TC of rs429358 can only be phased as the haplotype pair ε2ε4.
With the advances of SNP microarray technology, which typically does not include assays for the SNPs rs7412 and rs429358, genetic studies have used other SNPs in the APOE locus to investigate the association between variants of this gene and longevity. Several studies have shown an association, for example, between rs2075650 (a SNP within approximately 16 kb of rs7412) and a variety of aging-related diseases as well as longevity. On 9/11/2016, PubMed listed 89 publications that referred to rs2075650, 6 of which mentioned longevity. One of these articles linked the SNP rs2075650 to the ε4 allele of APOE [34]. While there is no doubt that the ε4 allele of APOE increases risk for Alzheimer’s disease and other aging-related diseases, the evidence about the role of the ε2 allele in longevity has been much less clear. This allele is rare, and collecting sufficiently large samples for adequately powered genetic association studies has been challenging. In addition, the prevalence of the 3 alleles varies by ethnicity [35] and by age group: carriers of the ε4 allele are at high risk for premature mortality, so the frequency of this allele tends to decrease with older age and the frequencies of the ε2 and ε3 alleles in older age groups will be larger than the frequencies in younger age groups. Several underpowered studies have produced results suggesting a negative association between the ε2 allele and longevity in Danish and Finnish centenarians [36, 37], a neutral association in Chinese, Korean and Finnish centenarians [38–40], and a positive association in Japanese, French, Spanish, northern and southern Italian centenarians, and Greek nonagenarians [33, 41–44].
Some of these results were recently pooled in a meta-analysis of odds ratios for longevity comparing homozygosity for the ε2 allele relative to homozygosity for the ε3 allele. This meta-analysis reached the conclusion of no significant association between homozygosity for the ε2 allele and human longevity (odds ratio for longevity in carriers of the ε2ε2 genotype versus ε3ε3 genotype was 1.65, with 95% confidence interval 0.75, 3.64) [45]. We re-analyzed the data from 8 of the 9 studies after removal of the results cited as reference [46], in which we could not identify relevant odds ratios in the cited publication, and we included some studies that had been omitted, to increase the representation of European ethnicities [39, 40, 43]. To strengthen results for Finnish ethnicity, we also included additional control sets [47, 48]. All included studies reported the counts for the genotypes ε2ε2 and ε3ε3 (Table 1), so we used MH meta-analysis, which is more appropriate when some of the studies have small counts. We also separated the data included in Garatachea et al. [41] by ethnicity. Panel A in Figure 3 shows the results of the meta-analysis of the 12 studies that included different racial and ethnic groups and reached a non-significant association: the odds ratio for longevity comparing ε2ε2 and ε3ε3 was 1.37, with a 95% confidence interval 0.78, 2.42. When we examined the association in whites (Panel B, Figure 3), the meta-analysis did not provide evidence against the null hypothesis, although the strength of the association increased (OR=1.53, 95% confidence interval 0.82, 2.96). Since some of the studies were of small sample size, they reported 0 counts for the genotype ε2ε2 in cases and/or controls (see Table 1). It is common practice in this case to add a continuity correction of 0.5 to the 0 cells, but this correction may produce unreliable results, particularly in small studies.
Therefore we removed the studies with 0 counts for the genotype ε2ε2 in cases and/or control, and did not use the continuity correction. Panel C of Figure 3 shows the meta-analysis for all informative studies including Europeans and Asians, while panel D shows the meta-analysis in Europeans. The genetic effect estimated through the MH meta-analysis in panel C increased after removal of non-informative studies but failed to reach statistical significance (OR=1.72, 95% confidence interval 0.79, 3.71). The focused meta-analysis in panel D that included only informative studies of white ethnicity produced a borderline statistically significant result (OR=2.39, 95% confidence interval 0.99, 5.76). Although the results are based on 14 ε2ε2 cases and 16 ε2ε2 controls, the meta-analysis in panel D points to a positive association between ε2ε2 and longevity with effects that become stronger in south Europeans. The variation by ethnicity could indicate gene x environment interaction or gene x gene interaction effects that were not highlighted by the meta-analysis of many studies with a variety of ethnic and racial groups.
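The sensitivity of small studies to the continuity correction can be illustrated with two of the zero-count studies in Table 1. The sketch below is an illustration rather than the analysis behind Figure 3: the function name is hypothetical, and it adds the correction to all four cells of a table that contains a zero, which is one common variant of the correction.

```python
def corrected_odds_ratio(a, b, c, d, correction=0.5):
    """Odds ratio (a*d)/(b*c) after adding `correction` to every cell.

    a, b: e2e2 and e3e3 counts in cases; c, d: the same counts in controls.
    The correction is applied only when at least one cell is zero.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)


# Counts from Table 1.
SCHACHTER_1994 = (4, 216, 0, 110)  # no e2e2 controls
GERDES_2000 = (0, 106, 8, 260)     # no e2e2 cases

for k in (0.5, 0.1, 0.01):
    or_s = corrected_odds_ratio(*SCHACHTER_1994, correction=k)
    or_g = corrected_odds_ratio(*GERDES_2000, correction=k)
    print(f"correction {k}: Schachter OR = {or_s:.2f}, Gerdes OR = {or_g:.3f}")
```

The estimated odds ratio for a zero-count study is driven largely by the arbitrary value of the correction rather than by the data, which is the motivation for restricting the analysis to informative studies.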
Table 1.
Genotype counts of the 12 studies included in the meta-analysis in Figure 3. Age of cases and controls are either mean age at last contact or age range.
| Study | Origin | Age of cases (years) | Age of controls (years) | Cases ε2ε2 | Cases ε3ε3 | Controls ε2ε2 | Controls ε3ε3 |
|---|---|---|---|---|---|---|---|
| Feng (2011) | Chinese | 93.7 | 53 | 2 | 263 | 4 | 328 |
| Choi (2003) | Korean | 102.4 | 50.7 | 0 | 78 | 27 | 4585 |
| Garatachea (2014) | Japanese | 100–116 | 23–59 | 1 | 562 | 0 | 350 |
| Gerdes (2000) | Danish | >99 | 40 | 0 | 106 | 8 | 260 |
| Louhija (1994) | Finnish | >100 | | 1 | 124 | | |
| Ehnholm (1986) | Finnish | | 20–55 | | | 2 | 332 |
| Lehtovirta (1985) | Finnish | | | | | 0 | 103 |
| Castro (1999) | Finnish | 100.8 | 0* | 0 | 118 | 1 | 96 |
| Blanché (2001) | French | 103.1 | 51.2 | 4 | 385 | 2 | 351 |
| Schachter (1994) | French | >100 | 20–70 | 4 | 216 | 0 | 110 |
| Garatachea (2014) | Spanish | 100–111 | 20–85 | 1 | 125 | 4 | 774 |
| Garatachea (2014) | Italian (North) | 100–104 | 27–81 | 2 | 60 | 4 | 460 |
| Panza (1999) | Italian (South) | 100 | 65.8 | 0 | 48 | 0 | 137 |
| Stakias (2006) | Greek | 80–95 | 19–60 | 2 | 70 | 2 | 288 |

\* Controls were newborn.
Figure 3.
A) Meta-analysis of odds ratio (OR) for longevity comparing carriers of the ε2ε2 genotype versus carriers of the ε3ε3 genotype in 12 studies of longevity. Numbers in parentheses denote the numbers of cases and controls. B) Meta-analysis in panel A) is restricted to studies of white ethnicity. C) Meta-analysis in panel A is restricted to informative studies with at least one count of the ε2ε2 genotype in cases and controls, and continuity correction was not used. D) Meta-analysis in panel C) is restricted to studies of white ethnicity. Note that the OR in panels C and D are slightly different because the calculations displayed in panels C and D did not use continuity correction.
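The Mantel-Haenszel estimate described for panel D can be approximated from the counts in Table 1 with the minimal Python sketch below. Two choices here are assumptions rather than details stated in the text: the two additional Finnish control sets are pooled and compared with the Finnish centenarian cases, and the confidence interval uses the Robins-Greenland-Breslow variance of the log odds ratio. Under these assumptions the sketch returns an odds ratio of about 2.39, with an interval close to the one reported for panel D.

```python
import math


def mantel_haenszel(tables, z=1.96):
    """Mantel-Haenszel pooled odds ratio with a Robins-Greenland-Breslow CI.

    Each table is (a, b, c, d): e2e2 and e3e3 counts in cases, then the
    same two counts in controls.
    """
    r_sum = s_sum = pr = ps_qr = qs = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        p, q = (a + d) / n, (b + c) / n
        r, s = a * d / n, b * c / n
        r_sum += r
        s_sum += s
        pr += p * r
        ps_qr += p * s + q * r
        qs += q * s
    or_mh = r_sum / s_sum
    # Robins-Greenland-Breslow variance of log(OR_MH).
    var_log = (pr / (2 * r_sum ** 2)
               + ps_qr / (2 * r_sum * s_sum)
               + qs / (2 * s_sum ** 2))
    half = z * math.sqrt(var_log)
    return or_mh, or_mh * math.exp(-half), or_mh * math.exp(half)


# Table 1 counts for the white studies with no zero e2e2 cell.
# Assumption: the Ehnholm and Lehtovirta control sets are pooled and
# compared with the Louhija cases.
PANEL_D = [
    (1, 124, 2 + 0, 332 + 103),  # Finnish
    (4, 385, 2, 351),            # Blanché (2001), French
    (1, 125, 4, 774),            # Garatachea (2014), Spanish
    (2, 60, 4, 460),             # Garatachea (2014), Italian (North)
    (2, 70, 2, 288),             # Stakias (2006), Greek
]
or_mh, lo, hi = mantel_haenszel(PANEL_D)
print(f"OR = {or_mh:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Adding the one informative Asian study, Feng (2011) with counts (2, 263, 4, 328), to the same list gives a pooled odds ratio of about 1.72, in line with panel C.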
Publication Bias
The analysis of the association between the ε2 allele of APOE and longevity shows indirectly another problem of meta-analysis due to publication bias. Since it is difficult to publish negative results, some genetic effects estimated through meta-analyses may be inflated because the results of non-significant associations are not easily available. This well-known problem of meta-analysis has received substantial attention in the field of clinical trials [49] and genetic association studies [50]. While the current trend of publishing meta-analyses of full genome-wide association studies will reduce the impact of publication bias, some care is needed with interpretation of meta-analyses of candidate-gene association studies.
The Challenge of Controls Selection
The examples of FOXO3A and APOE focused on the definition of cases and the appropriateness of including data from studies with different definitions of longevity and different ethnicities in a meta-analysis. An additional challenge to genetic association studies of longevity that also affects a meta-analysis is the definition of controls. Defining human longevity as living past a certain “threshold age” does not provide an “automatic” definition of controls and, for example, Figure 4 suggests that controls could be individuals who died before reaching the threshold age, or individuals who died before reaching the threshold age decreased by an additional threshold. For example, the studies included in the meta-analysis in Newman et al. [13] used a definition of controls as individuals who survived to an age older than 50 years but not older than 80 years, while the threshold age to define cases was set at 90 years. Increasing the gap between the age threshold used to define cases and controls should intuitively lead to larger odds ratios and increase the power of a study. However, the odds ratios that can be estimated with this approach may represent a genetic effect that differs from what one wishes to estimate.
Figure 4.
Ambiguity in definition of controls for studies of human longevity. Cases are defined based upon survival to an age threshold τ. Should controls be “non-cases”, and hence individuals who died before reaching the age threshold? Or should controls be individuals who died at a much younger age based on an additional threshold?
Using some algebra (see appendix), one can show that if we define a case of extreme longevity based on the threshold model as EL = “S ≥ τ”, where S denotes age at death and τ is the threshold age, and controls are defined as AL = “S < τ − δ”, where δ denotes the quantity subtracted from the threshold age, then the odds ratio that can be estimated from this case-control study is:
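$$\theta = \mathrm{OR}(S \ge \tau - \delta) \times \mathrm{RR}(S \ge \tau \mid S \ge \tau - \delta) \tag{3}$$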
which can be interpreted as the product of the odds ratio for survival past age τ − δ in carriers of the g allele versus non-carriers (this is the quantity OR(S ≥ τ − δ)) and the relative risk of surviving an additional δ years, conditional on having survived to age τ − δ, for carriers of the g allele versus non-carriers (this is the quantity RR(S ≥ τ | S ≥ τ − δ)). If δ > 0, the odds ratio that is estimated through the case-control study may no longer be the parameter of interest, because one can identify alleles that increase the quantity OR(S ≥ τ − δ), for example by protecting carriers from early mortality, or alleles that increase the quantity RR(S ≥ τ | S ≥ τ − δ).
By setting δ=0, the quantity in Equation (3) becomes the odds ratio of survival past the age threshold, comparing carriers of the g allele to non-carriers. In practice, this parameter can be estimated through a case-control study in which controls are a sample from the population of individuals who died before reaching the threshold age. In our case-control studies of extreme human longevity with cases defined as centenarians [18, 24], we have used random samples of controls without imposing an age threshold, arguing that the prevalence of eventual centenarians in our control set had to be small because of the rarity of the phenotype. As the age threshold decreases in more relaxed definitions of longevity, the limit on the age at death of controls should be enforced more strictly to avoid too much overlap between cases and controls, but large gaps between cases and controls should be avoided to ensure that the genetic effects estimated from the data are interpretable.
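The decomposition in Equation (3) can be checked numerically. In the minimal sketch below, all probabilities are hypothetical: the allele g is assumed to protect against death before age τ − δ while having no effect on survival from τ − δ to τ. The case-control odds ratio is nevertheless well above 1, because it is driven entirely by the early-survival term.

```python
import math


def case_control_or(p_g, surv_early, surv_late):
    """Odds ratio from cases S >= tau and controls S < tau - delta.

    surv_early[k] = p(S >= tau - delta | k) and
    surv_late[k]  = p(S >= tau | S >= tau - delta, k),
    for k = "g" (carriers) and k = "not_g" (non-carriers).
    """
    prior = {"g": p_g, "not_g": 1.0 - p_g}
    # Joint probabilities of genotype with case or control status.
    case = {k: prior[k] * surv_early[k] * surv_late[k] for k in prior}
    control = {k: prior[k] * (1.0 - surv_early[k]) for k in prior}
    return (case["g"] / case["not_g"]) / (control["g"] / control["not_g"])


def decomposition(surv_early, surv_late):
    """Return OR(S >= tau - delta) and RR(S >= tau | S >= tau - delta)."""
    odds = {k: v / (1.0 - v) for k, v in surv_early.items()}
    return odds["g"] / odds["not_g"], surv_late["g"] / surv_late["not_g"]


# Hypothetical allele: better survival to tau - delta, identical
# survival from tau - delta to tau (RR = 1).
EARLY = {"g": 0.60, "not_g": 0.50}
LATE = {"g": 0.02, "not_g": 0.02}
theta = case_control_or(0.30, EARLY, LATE)
or_early, rr_late = decomposition(EARLY, LATE)
print(f"theta = {theta:.3f}, OR x RR = {or_early * rr_late:.3f}")
```

With these inputs both expressions equal 1.5, even though the allele has no influence on survival beyond age τ − δ.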
Looking Ahead
Meta-analysis of genetic studies of longevity has become a standard approach to overcome the limitation of small sample size of studies of extreme longevity. However, the goal to increase sample size has been accomplished at the cost of aggregating results from studies with different characteristics of cases and controls that can lead to misleading conclusions. In addition, one must remember the limitations of a meta-analysis which by definition is an “adjusted” analysis that ignores interaction between genes and other factors associated with specific studies, for example ethnicity and gender imbalance.
The examples that we discussed show that inadequately conducted meta-analyses can result in a loss of power, increased false negative rates, and bias. Although we focused our analyses on SNPs in APOE and FOXO3A, several published meta-analyses of additional longevity variants suffer from inconsistent definitions of longevity; see for example [51, 52]. As technology progresses and more genetic data become available, we can foresee even more problematic use of meta-analysis. For example, in the hunt for rare genetic variants associated with human longevity, we foresee meta-analysis as potentially damaging in those situations in which the effect of rare variants in a small group of long-lived individuals could be either undetected or, even worse, incorrectly deemed a false-positive result by a large consortium meta-analysis. Similarly, publication bias may lead to overestimation of some genetic associations.
Our analyses of associations between FOXO3A and APOE alleles and longevity show the danger of aggregating results from different studies and we suggest the following actions:
More attention should be paid to the definition of longevity across studies. Our suggestion to select an age threshold based on percentile survival based upon reference birth cohort tables could lead to more comparable association studies.
More attention should be paid to the effect of sex and ethnicity. Meta-analyses of studies with different ethnicities and sex composition could lead to biased results and an increased false negative rate. While it is common practice to ask for study-specific results that are adjusted by sex and genome-wide principal components [20], it would be useful to develop methods of meta-analysis that take into account the specific ethnic composition of different studies to assess how genetic effects change with different genetic backgrounds.
A final important point is the definition of controls. A random sample of individuals from the population may be a better choice of controls for studies of extreme human longevity than individuals who died before reaching a certain age. Demographers project an increase in the prevalence of centenarians due to a variety of non-genetic reasons [53], and therefore it will be important to define the phenotype of extreme longevity using ages that are reached by a small percentage of the population [5] to limit overlapping with controls randomly selected from the population.
Like any statistical method, meta-analysis can be a great tool when it is applied in the correct context, and hopefully this short manuscript will draw more attention to the need to verify the assumptions that must hold for a meta-analysis to be valid.
Highlights.
Genetic association studies of human longevity have been challenged by the rarity of the phenotype and the need for large sample sizes to achieve adequate statistical power.
Following the principle that “more is always better”, meta-analysis has become a popular method to aggregate results from individual studies and increase sample sizes.
We show that meta-analysis risks overlooking between-study differences and may inflate the false negative rate of genetic associations.
Acknowledgments
This work was supported by the National Institute on Aging (NIA cooperative agreements U01-AG023755, U19-AG023122), and the William Wood Foundation.
Appendix: Derivation of Formula in Equation (3)
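We define cases of extreme longevity based on the threshold model as EL = “S ≥ τ”, where S denotes age at death and τ is the threshold age, and controls as AL = “S < τ − δ”, where δ denotes an additional margin below the threshold age (see Figure 4). Writing $\bar{g}$ for non-carriers of the g allele, the odds ratio that can be estimated from the case-control study is:
$$\mathrm{OR} = \frac{P(g \mid \mathrm{EL}) / P(\bar{g} \mid \mathrm{EL})}{P(g \mid \mathrm{AL}) / P(\bar{g} \mid \mathrm{AL})}$$
By using Bayes’s theorem:
$$\frac{P(g \mid \mathrm{EL})}{P(\bar{g} \mid \mathrm{EL})} = \frac{P(\mathrm{EL} \mid g)\, P(g)}{P(\mathrm{EL} \mid \bar{g})\, P(\bar{g})}$$
And
$$\frac{P(g \mid \mathrm{AL})}{P(\bar{g} \mid \mathrm{AL})} = \frac{P(\mathrm{AL} \mid g)\, P(g)}{P(\mathrm{AL} \mid \bar{g})\, P(\bar{g})}$$
The factor $P(g)/P(\bar{g})$ cancels in the ratio. By using the definition of cases and controls:
$$\mathrm{OR} = \frac{P(S \ge \tau \mid g) / P(S \ge \tau \mid \bar{g})}{P(S < \tau - \delta \mid g) / P(S < \tau - \delta \mid \bar{g})}$$
And the factorization of the survival probability:
$$P(S \ge \tau \mid g) = P(S \ge \tau \mid S \ge \tau - \delta, g)\; P(S \ge \tau - \delta \mid g)$$
together with $P(S < \tau - \delta \mid g) = 1 - P(S \ge \tau - \delta \mid g)$, and the same two identities for $\bar{g}$, we derive
$$\mathrm{OR} = \frac{P(S \ge \tau - \delta \mid g) / \left[1 - P(S \ge \tau - \delta \mid g)\right]}{P(S \ge \tau - \delta \mid \bar{g}) / \left[1 - P(S \ge \tau - \delta \mid \bar{g})\right]} \times \frac{P(S \ge \tau \mid S \ge \tau - \delta, g)}{P(S \ge \tau \mid S \ge \tau - \delta, \bar{g})}$$
This can be interpreted as the product of two terms: the odds ratio for survival past age τ − δ in carriers of the g allele versus non-carriers, and the relative risk of surviving an additional δ years, conditional on having survived to age τ − δ, for carriers of the g allele versus non-carriers.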
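The decomposition described in this Appendix (the case-control odds ratio equals the odds ratio for survival past age τ − δ multiplied by the relative risk of surviving a further δ years) can be checked numerically. The sketch below is only an illustration: the survival probabilities are hypothetical, and the function and argument names are assumptions of this example, not notation from the paper.

```python
import math

def case_control_or(s_tau_g, s_tau_n, s_early_g, s_early_n):
    """Odds ratio comparing cases (S >= tau) with controls (S < tau - delta).

    s_tau_*   : P(S >= tau | genotype)
    s_early_* : P(S >= tau - delta | genotype)
    Suffix _g marks carriers of the g allele, _n marks non-carriers.
    """
    return (s_tau_g / (1.0 - s_early_g)) / (s_tau_n / (1.0 - s_early_n))

def decomposition(s_tau_g, s_tau_n, s_early_g, s_early_n):
    """Return (odds ratio for survival past tau - delta, conditional relative risk)."""
    survival_or = (s_early_g / (1.0 - s_early_g)) / (s_early_n / (1.0 - s_early_n))
    # P(S >= tau | S >= tau - delta, genotype) = P(S >= tau) / P(S >= tau - delta)
    relative_risk = (s_tau_g / s_early_g) / (s_tau_n / s_early_n)
    return survival_or, relative_risk

# Hypothetical survival probabilities (illustration only, not real data)
probs = dict(s_tau_g=0.012, s_tau_n=0.005, s_early_g=0.30, s_early_n=0.25)

direct = case_control_or(**probs)
survival_or, relative_risk = decomposition(**probs)

# The product of the two factors reproduces the case-control odds ratio
assert math.isclose(direct, survival_or * relative_risk)
print("case-control OR:", round(direct, 4))
print("survival OR:", round(survival_or, 4), "relative risk:", round(relative_risk, 4))
```

When δ = 0 the two survival probabilities coincide, the relative risk term equals 1, and the case-control odds ratio reduces to the odds ratio for survival past age τ.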
References
1. Andersen SL, et al. Health span approximates life span among many supercentenarians: compression of morbidity at the approximate limit of life span. J Gerontol A Biol Sci Med Sci. 2012;67(4):395–405. doi: 10.1093/gerona/glr223.
2. Fries JF. Aging, natural death, and the compression of morbidity. N Engl J Med. 1980;303(3):130–5. doi: 10.1056/NEJM198007173030304.
3. Sebastiani P, et al. Families Enriched for Exceptional Longevity also have Increased Health-Span: Findings from the Long Life Family Study. Front Public Health. 2013;1:38. doi: 10.3389/fpubh.2013.00038.
4. Ismail K, et al. Compression of Morbidity Is Observed Across Cohorts with Exceptional Longevity. J Am Geriatr Soc. 2016;64(8):1583–91. doi: 10.1111/jgs.14222.
5. Sebastiani P, et al. Increasing Sibling Relative Risk of Survival to Older and Older Ages and the Importance of Precise Definitions of "Aging," "Life Span," and "Longevity". J Gerontol A Biol Sci Med Sci. 2015. doi: 10.1093/gerona/glv020.
6. Sebastiani P, Perls TT. The genetics of extreme longevity: lessons from the New England Centenarian Study. Front Genet. 2012;3:277. doi: 10.3389/fgene.2012.00277.
7. Erikson GA, et al. Whole-Genome Sequencing of a Healthy Aging Cohort. Cell. 2016;165(4):1002–11. doi: 10.1016/j.cell.2016.03.022.
8. Jewell NP. Statistics for Epidemiology. Boca Raton: CRC/Chapman and Hall; 2003.
9. Evangelou E, Ioannidis JP. Meta-analysis methods for genome-wide association studies and beyond. Nat Rev Genet. 2013;14(6):379–89. doi: 10.1038/nrg3472.
10. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959;22(4):719–48.
11. Borenstein M, et al. A basic introduction to fixed-effect and random-effects models for meta-analysis. Res Synth Methods. 2010;1(2):97–111. doi: 10.1002/jrsm.12.
12. Broer L, van Duijn CM. GWAS and Meta-Analysis in Aging/Longevity. Adv Exp Med Biol. 2015;847:107–25. doi: 10.1007/978-1-4939-2404-2_5.
13. Newman AB, et al. A meta-analysis of four genome-wide association studies of survival to age 90 years or older: the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium. J Gerontol A Biol Sci Med Sci. 2010;65(5):478–87. doi: 10.1093/gerona/glq028.
14. Deelen J, et al. Genome-wide association study identifies a single major locus contributing to survival into old age; the APOE locus revisited. Aging Cell. 2011;10(4):686–98. doi: 10.1111/j.1474-9726.2011.00705.x.
15. Nebel A, et al. A genome-wide association study confirms APOE as the major gene influencing survival in long-lived individuals. Mech Ageing Dev. 2011;132(6–7):324–30. doi: 10.1016/j.mad.2011.06.008.
16. Malovini A, et al. Association study on long-living individuals from Southern Italy identifies rs10491334 in the CAMKIV gene that regulates survival proteins. Rejuvenation Res. 2011;14(3):283–91. doi: 10.1089/rej.2010.1114.
17. Walter S, et al. A genome-wide association study of aging. Neurobiol Aging. 2011;32(11):2109e15–28. doi: 10.1016/j.neurobiolaging.2011.05.026.
18. Sebastiani P, et al. Genetic signatures of exceptional longevity in humans. PLoS ONE. 2012;7(1):e29848. doi: 10.1371/journal.pone.0029848.
19. Deelen J, et al. Genome-wide association meta-analysis of human longevity identifies a novel locus conferring survival beyond 90 years of age. Hum Mol Genet. 2014;23(16):4420–32. doi: 10.1093/hmg/ddu139.
20. Broer L, et al. GWAS of longevity in CHARGE consortium confirms APOE and FOXO3 candidacy. J Gerontol A Biol Sci Med Sci. 2015;70(1):110–8. doi: 10.1093/gerona/glu166.
21. Newman AB, Murabito JM. The Epidemiology of Longevity and Exceptional Survival. Epidemiol Rev. 2013;35(1):181–197. doi: 10.1093/epirev/mxs013.
22. Perls TT, et al. Life-long sustained mortality advantage of siblings of centenarians. Proc Natl Acad Sci U S A. 2002;99(12):8442–7. doi: 10.1073/pnas.122587599.
23. Perls TT, et al. Siblings of centenarians live longer. Lancet. 1998;351(9115):1560. doi: 10.1016/S0140-6736(05)61126-9.
24. Sebastiani P, et al. Meta-analysis of genetic variants associated with human exceptional longevity. Aging (Albany NY). 2013;5(9):653–61. doi: 10.18632/aging.100594.
25. Sebastiani P, et al. Four Genome-Wide Association Studies Identify New Extreme Longevity Variants. J Gerontol A Biol Sci Med Sci. 2016. doi: 10.1093/gerona/glx027.
26. Willcox BJ, et al. FOXO3A genotype is strongly associated with human longevity. Proc Natl Acad Sci U S A. 2008;105(37):13987–92. doi: 10.1073/pnas.0801030105.
27. Anselmi CV, et al. Association of the FOXO3A locus with extreme longevity in a southern Italian centenarian study. Rejuvenation Res. 2009;12(2):95–104. doi: 10.1089/rej.2008.0827.
28. Bae H, et al. Associations of FOXO3A Polymorphisms with Extreme Human Longevity in Four Longevity Studies. J Gerontol A Biol Sci Med Sci. 2016.
29. Soerensen M, et al. Replication of an association of variation in the FOXO3A gene with human longevity using both case-control and longitudinal data. Aging Cell. 2010;9(6):1010–7. doi: 10.1111/j.1474-9726.2010.00627.x.
30. Soerensen M, et al. Association study of FOXO3A SNPs and aging phenotypes in Danish oldest-old individuals. Aging Cell. 2015;14(1):60–6. doi: 10.1111/acel.12295.
31. Willcox BJ, et al. Longevity-Associated FOXO3 Genotype and its Impact on Coronary Artery Disease Mortality in Japanese, Whites, and Blacks: A Prospective Study of Three American Populations. J Gerontol A Biol Sci Med Sci. 2016. doi: 10.1093/gerona/glw196.
32. Willcox BJ, et al. The FoxO3 gene and cause-specific mortality. Aging Cell. 2016;15(4):617–24. doi: 10.1111/acel.12452.
33. Schachter F, et al. Genetic associations with human longevity at the APOE and ACE loci. Nat Genet. 1994;6(1):29–32. doi: 10.1038/ng0194-29.
34. Schupf N, et al. Apolipoprotein E and familial longevity. Neurobiol Aging. 2013;34(4):1287–91. doi: 10.1016/j.neurobiolaging.2012.08.019.
35. Ward A, et al. Prevalence of apolipoprotein E4 genotype and homozygotes (APOE e4/4) among patients diagnosed with Alzheimer’s disease: a systematic review and meta-analysis. Neuroepidemiology. 2012;38(1):1–17. doi: 10.1159/000334607.
36. Gerdes LU, et al. Estimation of apolipoprotein E genotype-specific relative mortality risks from the distribution of genotypes in centenarians and middle-aged men: apolipoprotein E gene is a "frailty gene," not a "longevity gene". Genet Epidemiol. 2000;19(3):202–10. doi: 10.1002/1098-2272(200010)19:3<202::AID-GEPI2>3.0.CO;2-Q.
37. Castro E, et al. Polymorphisms at the Werner locus: I. Newly identified polymorphisms, ethnic variability of 1367Cys/Arg, and its stability in a population of Finnish centenarians. Am J Med Genet. 1999;82(5):399–403.
38. Choi YH, et al. Distributions of ACE and APOE polymorphisms and their relations with dementia status in Korean centenarians. J Gerontol A Biol Sci Med Sci. 2003;58(3):227–31. doi: 10.1093/gerona/58.3.m227.
39. Louhija J, et al. Aging and genetic variation of plasma apolipoproteins. Relative loss of the apolipoprotein E4 phenotype in centenarians. Arterioscler Thromb. 1994;14(7):1084–9. doi: 10.1161/01.atv.14.7.1084.
40. Feng J, et al. Is APOE epsilon3 a favourable factor for the longevity: an association study in Chinese population. J Genet. 2011;90(2):343–7. doi: 10.1007/s12041-011-0075-9.
41. Garatachea N, et al. ApoE gene and exceptional longevity: Insights from three independent cohorts. Exp Gerontol. 2014;53:16–23. doi: 10.1016/j.exger.2014.02.004.
42. Blanche H, et al. A study of French centenarians: are ACE and APOE associated with longevity? C R Acad Sci III. 2001;324(2):129–35. doi: 10.1016/s0764-4469(00)01274-9.
43. Stakias N, et al. Lower prevalence of epsilon 4 allele of apolipoprotein E gene in healthy, longer-lived individuals of Hellenic origin. J Gerontol A Biol Sci Med Sci. 2006;61(12):1228–31. doi: 10.1093/gerona/61.12.1228.
44. Panza F, et al. Decreased frequency of apolipoprotein E epsilon4 allele from Northern to Southern Europe in Alzheimer’s disease patients and centenarians. Neurosci Lett. 1999;277(1):53–6. doi: 10.1016/s0304-3940(99)00860-5.
45. Garatachea N, et al. The ApoE gene is related with exceptional longevity: a systematic review and meta-analysis. Rejuvenation Res. 2015;18(1):3–13. doi: 10.1089/rej.2014.1605.
46. Arai Y, et al. Lipoprotein metabolism in Japanese centenarians: effects of apolipoprotein E polymorphism and nutritional status. J Am Geriatr Soc. 2001;49(11):1434–41. doi: 10.1046/j.1532-5415.2001.4911234.x.
47. Lehtovirta M, et al. Apolipoprotein E polymorphism and Alzheimer’s disease in eastern Finland. Neurosci Lett. 1995;185(1):13–5. doi: 10.1016/0304-3940(94)11213-3.
48. Ehnholm C, et al. Apolipoprotein E polymorphism in the Finnish population: gene frequencies and relation to lipoprotein concentrations. J Lipid Res. 1986;27(3):227–35.
49. Ahmed I, Sutton AJ, Riley RD. Assessment of publication bias, selection bias, and unavailable data in meta-analyses using individual participant data: a database survey. BMJ. 2012;344:d7762. doi: 10.1136/bmj.d7762.
50. Munafo MR, Flint J. Meta-analysis of genetic association studies. Trends Genet. 2004;20(9):439–44. doi: 10.1016/j.tig.2004.06.014.
51. Beekman M, et al. Chromosome 4q25, microsomal transfer protein gene, and human longevity: novel data and a meta-analysis of association studies. J Gerontol A Biol Sci Med Sci. 2006;61(4):355–62. doi: 10.1093/gerona/61.4.355.
52. Pawlikowska L, et al. Association of common genetic variation in the insulin/IGF1 signaling pathway with human longevity. Aging Cell. 2009;8(4):460–72. doi: 10.1111/j.1474-9726.2009.00493.x.
53. Vaupel JW. Biodemography of human ageing. Nature. 2010;464(7288):536–42. doi: 10.1038/nature08984.
54. Flachsbart F, et al. Association of FOXO3A variation with human longevity confirmed in German centenarians. Proc Natl Acad Sci U S A. 2009;106(8):2700–5. doi: 10.1073/pnas.0809594106.




