Abstract
Objectives.
The need for large studies and the types of large-scale data resources (LSDRs) are discussed along with their general scientific utility, role in aging research, and affordability. The diversification of approaches to large-scale data resourcing is described in order to facilitate their use in aging research.
Methods.
The need for LSDRs is discussed in terms of (a) large sample size; (b) longitudinal design; (c) their use as platforms for additional investigator-initiated research projects; and (d) broad-based access to core genetic, biological, and phenotypic data.
Discussion.
It is concluded that a “lite-touch, lo-tech, lo-cost” approach is a viable strategy for the development of LSDRs and would enhance the likelihood of establishing LSDRs dedicated to the wide range of important aging-related issues.
The discovery of how early life individual differences and life-span contextual determinants influence within-person and population-level processes of aging and health-related change is a major international research priority. An integrative life-span approach is required, involving long-term longitudinal follow-up and large sample sizes, to understand the multivariate and potentially highly idiosyncratic (i.e., interactive) processes leading to individual differences in aging and health outcomes (e.g., Hofer & Piccinin, 2010; Richards, Hatch, & Kuh, 2011; Rutter, 2007; Shanahan & Hofer, 2005, 2011). The need for integrative, multidisciplinary, and potentially large-scale frameworks for increasing our understanding of aging and health-related change has recently been championed (e.g., Bachrach & Abeles, 2004; Butz & Torrey, 2006; Hofer & Alwin, 2008; National Research Council, 2000, 2001a, 2001b; Widaman, 2008). How these frameworks are best provided is open to debate.
THE NEED FOR LARGE-SCALE DATA RESOURCES
Although the moderately sized, widely spaced repeated-assessment study design often used in aging research has many strengths, there are also limitations, most notably being underpowered for many secondary analyses and using technology that is inflexible and expensive. In this article, we argue for a strategic approach to aging research in terms of collaborative large-scale data resourcing. The need for large studies and types of large-scale data resources (LSDRs) are discussed along with their general scientific utility, role in aging research, and affordability. It is concluded that a “lite-touch, lo-tech, lo-cost” approach to LSDRs is viable and would enhance the likelihood of establishing LSDRs dedicated to the wide range of important aging-related issues. This is not a paper about the merits of “big” versus “little” science. It is about how to make science that needs large numbers accessible and affordable to a wider range of scientists.
The LSDR may be provisionally defined as a large longitudinal data collection that involves around 100,000 participants and is easily accessible by the wider scientific community. The figure of 100,000 indicates a size sufficient to address a range of research questions definitively and in politically and socially useful timeframes. Ideally, LSDRs will be longitudinal in design, permitting analysis of within-individual change as well as opportunities to adjust for earlier individual states and exposures. Individual LSDRs may be larger or smaller according to purpose and resources.
The primary role of the LSDR is to bring together environmental, genetic, and phenotypic information, providing a platform for a wide range of hypothesis testing. This platform may be used in various ways, from nested case–control or case–cohort analyses (Samani et al., 2007; Tolstrup et al., 2010) to providing a deeply phenotyped and genotyped population for more detailed or more intensive measurement studies. However, the epidemiologic design is secondary to the opportunity to test hypotheses. The issue is not whether a cohort or case–control approach is preferred (Clayton & McKeigue, 2001) but how we can most efficiently do either. We discuss in more detail the need for LSDRs in terms of (a) large sample size; (b) longitudinal design; (c) a collaborative platform for generating new science; and (d) broad-based access to core genetic, biological, and phenotypic data by the scientific community.
The Need for Large Numbers
Studies requiring large numbers are growing in importance due to the nature of the research challenges we are facing (Burton et al., 2009). In part, this is due to advances in biomolecular science; however, it is also due to the complexity of interaction between environmental exposures and phenotypes as well as between phenotypes.
The impact of the built and social environment on health has long been recognized (Hippocrates, c. 500 BC, “On airs, waters and places”; John Graunt, 1662, “Natural and Political Observations Upon The Bills Of Mortality”). With respect to infectious disease, this impact is comparatively well understood, resulting in the Sanitary Movement of the 19th century. In relation to chronic disease, relatively little is known, with longitudinal data collected at an individual level being relatively rare. It is likely, however, that associations between the built and social environment and chronic disease will be complex and contingent on many factors. In relation to aging, although the environment may have a profound effect on health and well-being (Brown, 1995; Michael & Yen, 2009), it is unlikely that any single factor has a big impact. Furthermore, the problem is intrinsically multilevel (Subramanian, Delgado, Jadue, Vega, & Kawachi, 2003). In order to design healthy environments for older people, studies are required that are sufficiently diverse in measurement to capture and model this level of complexity and sufficiently large to detect small effects.
The challenge posed by the genetics of complex disease and other complex nonmedical outcomes (e.g., cognitive decline) is substantial. Most complex disease has its own genetic architecture, which is the result of past filtering by selection (Weiss & Terwilliger, 2000). However, selection pressure has been on phenotype rather than on genotype, and there has been little selective pressure toward genotype–phenotype associations being dominated by one or a small number of alleles. As a result, the mutation of no single gene is either necessary or sufficient to cause complex disease (Altshuler, Daly, & Kruglyak, 2000). For late-onset chronic disease, the situation is exacerbated because, early onset pleiotropic effects apart, selection pressures on phenotype would not have operated prior to the development of the disease. Consequently, until recently, much late-onset complex disease has been effectively selectively neutral. From these two processes, selection by phenotype and very limited selection pressure, late-onset complex disease might be expected to be associated with large numbers of alleles, each conferring a small degree of risk. Arguments that apply to complex disease also apply to complex psychological and social responses (e.g., McClearn, Vogler, & Hofer, 2001). As we strive to understand the genetic basis of individual and social behavior, it is likely that large numbers of low-risk conferring alleles will be involved.
The challenge of complex disease and complex behavior is further complicated by the issue of gene–environment interactions (G × E; e.g., Hunter, 2005; Rutter, 2008; Shanahan & Hofer, 2005, 2011). Genetic susceptibility is only half of the equation. The advent of genome-wide association studies anticipates the completion of gene hunting within the foreseeable future. The next big questions will surround how the environment (i.e., contextual, lifestyle, and social factors across the life span) affects gene expression. As with complex disease alleles, individual G × E effects for complex disease and behavior are likely to be small and sample size requirements large.
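The dependence of sample size on effect size can be illustrated with a standard two-proportion power calculation. The following is a minimal sketch, not a method used in the studies cited: the function name, the carrier frequency of 0.2, and the odds ratios are hypothetical values chosen for illustration, and the formula is the usual normal approximation for comparing carrier frequencies between equal numbers of cases and controls.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cases_needed(odds_ratio, control_freq, alpha=0.05, power=0.80):
    """Approximate number of cases (with an equal number of controls)
    needed to detect a given odds ratio for a binary risk factor.

    All inputs are illustrative assumptions, not values from the article.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)

    p0 = control_freq
    # Carrier frequency among cases implied by the odds ratio.
    odds_cases = odds_ratio * p0 / (1 - p0)
    p1 = odds_cases / (1 + odds_cases)

    p_bar = (p0 + p1) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    for assumed_or in (1.5, 1.2, 1.1):
        print(assumed_or, cases_needed(assumed_or, control_freq=0.2))
```

Under these assumptions, reducing the odds ratio from 1.5 to 1.1 increases the required number of cases by more than an order of magnitude, and a stricter significance threshold (a smaller `alpha`) raises it further.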
A particular challenge is causal inference. Population-based studies are prone to the criticism of reverse causality or residual confounding. Genotype may be used as a surrogate variable for environmental exposures, providing comparatively secure causal inference. For example, the role of the alcohol metabolizing genes alcohol dehydrogenase (ADH) and acetaldehyde dehydrogenase (ALDH) is well established. Variants of these genes (ADH1B*2 and ALDH2*2) lead to reduced ethanol oxidation and increased effects of alcohol consumption. These variants are known to be associated with reduced alcohol consumption, reduced alcohol dependency, and reduced esophageal cancer (Lewis & Smith, 2005; Macgregor et al., 2009; Zuccolo et al., 2009). These associations between genetic variants and alcohol-related outcomes are highly likely to be causal, as differences in alcohol exposure are the result of the random assortment of alleles rather than previous pathology (reverse causality) or unmeasured factors (residual confounding). Using this process for epidemiologic inference has been called Mendelian randomization (Davey-Smith & Ebrahim, 2003, 2005). Mendelian randomization is informative for testing a wide range of behavioral and social hypotheses, provided that a genetically determined pathway is part of the causal model. In order to study causal pathways leading from genetic susceptibility to complex disease and other complex outcomes, studies with substantial statistical power are required. A major component of achieving sufficient power is large sample size.
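The logic of Mendelian randomization can be sketched with a small simulation. This is an illustrative toy model under stated assumptions, not an analysis from the cited studies: the allele frequency, effect sizes, and variable names are all hypothetical. A variant lowers an exposure, an unmeasured confounder raises both exposure and outcome, and the causal effect is estimated with the Wald ratio (the genotype–outcome covariance divided by the genotype–exposure covariance).

```python
import random
from statistics import mean


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simulate(n=20000, true_effect=0.5, seed=1):
    """Return (naive estimate, Wald ratio estimate) of the exposure effect.

    Every parameter here is a hypothetical value chosen for illustration.
    """
    rng = random.Random(seed)
    genotype, exposure, outcome = [], [], []
    for _ in range(n):
        # Count of variant alleles (0, 1, or 2), assigned at random.
        g = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        u = rng.gauss(0, 1)  # unmeasured confounder
        x = -1.0 * g + u + rng.gauss(0, 1)  # variant reduces exposure
        y = true_effect * x + u + rng.gauss(0, 1)
        genotype.append(g)
        exposure.append(x)
        outcome.append(y)

    # Ordinary regression slope of outcome on exposure (confounded).
    naive = covariance(exposure, outcome) / covariance(exposure, exposure)
    # Wald ratio: genotype is independent of the confounder by construction.
    wald = covariance(genotype, outcome) / covariance(genotype, exposure)
    return naive, wald


if __name__ == "__main__":
    naive, wald = simulate()
    print(f"naive slope: {naive:.2f}, Wald ratio: {wald:.2f}")
```

In this toy model the naive slope is inflated by the confounder, whereas the Wald ratio lies close to the simulated effect. The precision of the Wald ratio depends on how strongly genotype predicts exposure, which is one reason large samples are needed.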
Longitudinal in Design
Longitudinal study designs involve decisions regarding the temporal sampling of assessments within individuals, providing opportunities for analyses regarding both between-person differences and within-person change. Longitudinal designs are essential for understanding individual differences assessed at a single time point or at a particular age because individual differences are a complex function of early life individual differences, birth cohort, individual differences in maturation, aging, and/or other processes (e.g., health), population mortality selection, time-specific variation, and error in measurements (Hofer & Sliwinski, 2001, 2006). Analyses of cross-sectional data do not afford an opportunity to disentangle these complex effects in the identification of potential determinants of individual differences in change and aging-related outcomes (Hofer, Flaherty, & Hoffman, 2006; Kraemer, Yesavage, & Kupfer, 2000).
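The distinction between between-person differences and within-person change can be made concrete with a simple decomposition of repeated measurements. The sketch below is illustrative only: the function name and the toy scores are hypothetical, and it shows person-mean centering rather than any specific model used in the cited work.

```python
from statistics import mean


def decompose(scores_by_person):
    """Split repeated scores into between- and within-person parts.

    between[pid]: the person's mean minus the grand mean.
    within[pid]:  each occasion's deviation from that person's own mean.
    """
    grand_mean = mean(
        score for scores in scores_by_person.values() for score in scores
    )
    between, within = {}, {}
    for pid, scores in scores_by_person.items():
        person_mean = mean(scores)
        between[pid] = person_mean - grand_mean
        within[pid] = [score - person_mean for score in scores]
    return between, within


if __name__ == "__main__":
    # Hypothetical cognitive scores at three occasions for two people.
    toy = {"a": [10, 8, 6], "b": [20, 19, 18]}
    between, within = decompose(toy)
    print(between)
    print(within)
```

A single cross-sectional assessment would record only one score per person, so the within-person component could not be separated from the between-person component.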
LSDR as a Collaborative Research Platform
Due to its size and ability to test a wide range of hypotheses, an LSDR is best conceived as a collaborative research platform. A research platform is an infrastructure from which multiple studies may be conducted. This enables new science to be conducted cost effectively, that is, the basic infrastructure costs for each study have already been met. Establishing a platform on a collaborative basis requires widespread consultation with both expert scientists and the public. UK Biobank, a complex disease LSDR, has established high levels of public trust through conducting and responding to extensive public consultation. In UK Biobank, expert consultation has ensured that important areas for study are identified and that the best available measures are used.
Cost-effective science is achieved at several levels. Most immediately, this is by providing a diverse core data set for analyses. However, the main advantage is the availability for further research of a deeply phenotyped and genotyped population sample. This sample can be used either for further large-scale measurement or more intensive measurement of smaller but finely stratified groups. For example, participants with known values for vascular risk factors may be selected for intensive cognitive assessment or participants with known rates of cognitive decline may be selected for imaging studies.
The goal of an LSDR should be to satisfy a diversity of interests, approaches, and opinions in the collection of data (Kaplan, 2006). Although careful management of the platform is required to achieve this goal as well as to avoid participant overload and to maintain participant privacy, by dint of large numbers, LSDRs provide many opportunities for investigator-led studies to be conducted highly cost effectively.
Broad-Based Access to Genetic and Phenotypic Data
The advantages of a collaborative approach are that the LSDR has been “built” by diverse inputs and may be appropriately considered to be a resource for the entire scientific community. This is in contrast to many current studies where the data are considered to be the intellectual property of the research team or funding institution. For example, although a collaborative model facilitates the conduct of investigator-led studies within the LSDR, it also places an obligation on those investigators to feed their data back into the LSDR once the primary analyses have been completed. Secure procedures allowing remote access to data are available and are already in place for a number of larger scale research efforts (e.g., Health & Retirement Study; English Longitudinal Study of Aging; SHARE network). Where genetic data are involved, further safeguards are required to ensure the nonidentifiability of participants. These have been developed, for example, by the British 1958 birth cohort. It is important that an LSDR is able to make data as accessible as possible to scientists.
SCIENTIFIC UTILITY OF LSDRS
LSDRs may not be suitable for answering all types of research question. A crude but helpful distinction is between descriptive and analytic studies. Descriptive studies are those where the characteristics of the population are described, for example, the prevalence and incidence of disease or the distribution of anxiety scores or allele frequencies. Analytic studies are those where the causal mechanisms that are operating in the population are investigated. Classically, epidemiological studies have aspired to an ideal of achieving both public health and etiological goals through the same population sample. Although the classical strategy has much to commend it, it is increasingly difficult to defend and may reflect a poor understanding of each.
Descriptive studies require representative sampling as estimates of prevalence and incidence (along with their quantitative equivalents) are sensitive to incomplete ascertainment. For example, dementia sufferers may be harder to reach than the cognitively healthy and underrepresented in a low response rate sample. The result is a conservatively biased estimate of the actual prevalence. To establish unbiased estimates of the prevalence and incidence, therefore, requires rigorous representative sampling of defined populations. Conventionally, a response rate of 90% is considered as a rule of thumb for achieving representativeness. This level of rigor is extremely difficult to achieve, and the difficulty increases with sample size. Except in the case of rare outcomes, large population samples are not required.
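The effect of differential response on a prevalence estimate follows from simple arithmetic. The sketch below is a hypothetical illustration: the function name, the true prevalence, and the response rates are assumed values, not figures from any study.

```python
def observed_prevalence(true_prevalence, response_rate_cases,
                        response_rate_noncases):
    """Prevalence seen among responders when cases and non-cases
    respond at different rates."""
    responding_cases = true_prevalence * response_rate_cases
    responding_noncases = (1 - true_prevalence) * response_rate_noncases
    return responding_cases / (responding_cases + responding_noncases)


if __name__ == "__main__":
    # Assumed values: 10% true prevalence; cases respond at 30%,
    # non-cases at 60%.
    print(round(observed_prevalence(0.10, 0.30, 0.60), 3))
```

When cases respond less often than non-cases, the observed prevalence falls below the true value. When both groups respond at the same rate, the estimate is unbiased whatever the overall response rate.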
Analytic studies are concerned with identifying generalizable associations from which causal mechanisms can be inferred. Associations reflecting causal mechanisms are not dependent on representative sampling; rather, they are dependent on obtaining a range of exposure that is sufficient for causality to occur, along with unbiased capture of outcomes. In short, a heterogeneous sample with complete follow-up is required. As many etiological hypotheses are concerned with increasingly complex (i.e., interactive) mechanisms, each having a small effect, analytic studies typically require large numbers.
In contrast to the classical epidemiologic model, the requirements of descriptive and analytic hypotheses suggest different roles for these distinctive designs. Descriptive studies may be used to identify health priorities for further study. This is classical descriptive epidemiology. Once priorities have been identified, analytic studies are required to identify the underlying mechanisms. This is etiologic epidemiology. Once mechanisms are identified, descriptive studies may be used to characterize the risk factor profiles of specific communities in order to plan interventions. This is public health epidemiology. These distinctions in research strategy clarify the primary role of LSDRs and hence LSDR design. LSDRs are suited to testing etiologic hypotheses and to generating findings that are generalizable between populations rather than representative or descriptive of specific populations.
Owing to their size, LSDRs also provide opportunities for experiment. The twin scourges of observational data are reverse causality and confounding. For example, an association of anxiety with incident dementia may reflect a causal association or a prodromal effect of dementia on mood (J. Gallacher et al., 2009). Alternatively, the association could be spurious, being an artifact of a causal effect of depression on both anxiety and dementia. Ultimately, these issues can only be resolved by experiment. Natural experiments are particularly appropriate. The Mendelian randomization process described above is essentially nature’s own experiment. For many hypotheses, however, genetic pathways will not be involved and other opportunities for natural experiments are required. From a psychosocial perspective, macroenvironmental changes, such as critical incidents and birth cohort differences (e.g., historical context), provide opportunities for natural experiments (J. Gallacher, Bronstering, Palmer, Fone, & Lyons, 2007). However, not all hypotheses will be testable using natural experiments, and these will require randomized clinical trials. The nesting of trials within an observational environment is an accepted methodology. The availability of very large numbers of already phenotyped individuals provides an enriched sampling pool for both opportunistic natural experiments (J. E. Gallacher, 2007) and what have been called citizens trials (Ioannidis & Adami, 2008).
TYPES OF LSDR
In order to achieve large sample size cost effectively, three basic models of LSDR have been proposed. The first is the pooling of smaller legacy studies and sample collections. The second is the opportunistic storage of clinical samples and accompanying clinical data, sometimes referred to as hospital biobanking. The third is the design of bespoke LSDRs targeting a particular range of scientific problems.
Integrative Data Analysis
The pooling of legacy studies in the context of cohorts and/or for genetic case–control studies can be extremely valuable. In the case of cohorts, the time required for accruing outcomes has already elapsed. In relation to genetic studies, sufficient numbers of cases and controls can be compared quickly and efficiently. An exciting focus of current research is on maximizing knowledge from existing long-term longitudinal studies, utilizing cross-study differences in birth cohort, culture, and context to evaluate additional hypotheses about the influence of early life characteristics and life-span determinants on later life functioning (Kuh, Ben-Shlomo, Lynch, Hallqvist, & Power, 2003; Kuh & NDA Preparatory Network, 2007; McArdle, Grimm, Hamagami, Bowles, & Meredith, 2009; Richards & Hatch, 2011). There are now a number of active interdisciplinary collaborations composed of multiple longitudinal studies of aging and health (for review, see Piccinin & Hofer, 2008). Nevertheless, long-established studies may not have the phenotype data to test emerging hypotheses or hypotheses relevant to the special needs and interests of older people. Furthermore, consent issues frequently restrict the use of these collections for genetic analyses. Pooling of legacy studies to form LSDRs may be best viewed as a strategy for optimizing existing data rather than as a long-term strategy for aging research.
Hospital Biobanking
This approach involves the consenting for, and taking of, a biosample as a routine component of clinical care and accessing routinely collected clinical data such as the results of diagnostic investigations. It is an attractive option from a cost-effectiveness perspective as it utilizes the existing health care infrastructure. Hospital biobanking is widely used in Nordic countries; examples include the Finnish Maternity Cohort and the Swedish Institute for Infectious Disease Biobank (Dillner & Andersson, 2011). However, clinicians are frequently reluctant to engage in additional activity, however small, without dedicated resources. From a scientific perspective, hospital biobanking is an excellent strategy for the investigation of rare clinical syndromes and tissue pathologies, as in the Wales Cancer Bank (http://www.walescancerbank.com). At a wider level, it is more limited, as the opportunity for nonroutine non-biosample–based data collection is restricted and comparison with nonclinical populations is not readily available. This is particularly true for hospitalized older persons, who are frequently frail.
Bespoke LSDRs
Tailored LSDRs overcome the scientific and governance limitations of legacy collections and clinical biobanks; consents can be obtained that reflect current scientific needs, data collection can be flexible to address emerging hypotheses, and wide-ranging population samples can be recruited. Examples of bespoke LSDRs include EPIC (Weikert et al., 2009), UK Biobank (Elliott & Peakman, 2008), and the Kadoorie Study (Chen et al., 2005). However, although bespoke LSDRs provide a substantial opportunity for aging research, they typically require a substantial initial financial investment and long-term strategic commitment. As a result, although the number of bespoke LSDRs is increasing, the total number remains small.
BESPOKE LSDRS IN AGING RESEARCH
Until recently, population-based aging research has been largely viewed as an added value outcome of chronic disease studies rather than a central focus. For the etiology of chronic disease, this approach has merit as the younger recruitment enables better measurement of earlier life exposures as well as reducing the impact of early disease and survivor effects. However, legacy chronic disease cohorts are typically poorly phenotyped for issues that become prominent with age. For example, issues of functional independence, dignity in care and social engagement are unlikely to feature in any detail in chronic disease LSDRs. In order to study, in depth, the broad range of issues that are important to older people, more tightly focused LSDRs are required.
Limitations to aging-related LSDRs include that for older persons, early life exposure data are more difficult to collect and interpret. Although genotype may be used to assess lifetime exposure, this depends on there being a relevant functional variant, and for many hypotheses, this will not be the case. Under these circumstances, birth or chronic disease cohorts are more suitable. A second limitation is that older populations are, by definition, survivor populations, and the causes of early mortality are not available to study. This is not as major a criticism as might be first thought. The largest survival effects are typically found in prevalence and incidence data that LSDRs are not suited to supply. Survivor effects on causal mechanisms are likely to be less acute: although the pool of susceptible or at-risk persons may be reduced in survivor populations, the same mechanisms are likely to be operating. For example, high blood pressure will be associated with increased risk of heart disease regardless of survivor effect. Survivor effects apart, the longitudinal study of older populations remains informative, being able to identify more proximal causes and current effects. A third limitation is the increased likelihood of early and prodromal effects of chronic disease, although their impact will vary according to outcome. However, it should be noted that an aging-related LSDR does not imply that the population of interest is exclusively the oldest old; rather, the hypotheses of interest are those pertaining to older people. An aging-related LSDR can afford to have a wide age range (which increases the ability to detect prodromal effects), although overrepresentation by older persons is likely and not necessarily undesirable. A fourth limitation, and one that is generic to very large studies, is the risk of sacrificing creative innovation and measurement precision for large numbers (Frank, Di, McInnes, Kramer, & Gagnon, 2006).
However, large numbers do not preclude diversity of measurements and emphases. Large platforms lend themselves to diversity providing the ideas are there and the infrastructure is flexible and responsive.
These limitations notwithstanding, substantial benefits to aging research are likely to accrue through aging-related LSDRs. Bespoke aging-related LSDRs are highly likely to increase the volume and breadth of research. They are also likely to increase the rigor of research by having sufficient power to produce definitive associations and by increasing the opportunity for robust causal inference.
Making Bespoke Aging-Dedicated LSDRs Affordable
Aging remains a Cinderella topic. In the United Kingdom, dementia commands 10% of the research funding allocated to cancer, even though its cost to society is double (Luengo-Fernandez, Leal, & Gray, 2010). Bespoke LSDRs are typically expensive. From the perspective of aging research, the expense of bespoke LSDRs using standard methods constitutes a serious obstacle. An example of a chronic disease bespoke LSDR is UK Biobank, a large-scale study using conventional methods (Peakman & Elliott, 2008). This resource comprises 500,000 men and women aged 40–69 years. Participants were contacted by mail, attended a regional clinic for about 2 hr, and provided 18 ml of blood for genetic and biochemical analysis. Blood was processed and stored at −80°C within 24 hr. The recruitment budget was over £60M, of which £30M was recruitment specific, with the remainder being infrastructure for the platform as a whole. This represents a recruitment cost of £60 per participant (£30M across 500,000 participants). Few cohort studies using traditional methods can achieve this level of economy. Nevertheless, in spite of economies of scale, it remains unlikely that a substantial sum will be made available to establish a bespoke aging-dedicated LSDR of similar scale in the foreseeable future.
An alternative approach is to use remote methods, adopting what may be called a “lite-touch, lo-tech, lo-cost” model. The idea is to reduce the cost of LSDRs to the point that niched LSDRs become affordable. Remote methods involve no direct (face-to-face) participant contact for recruitment, assessment, or biosampling and use established biotechnology.
Lite-touch refers to the absence of direct participant contact for recruitment and assessment. Web-based studies are an example of lite-touch methods. Web-based epidemiology is a growth area. The Snart-Gravid study is a Danish pregnancy cohort that has used entirely remote methods to contact, consent, and assess participants (Huybrechts et al., 2010). “Nutrinet” is a French nutritional cohort, which is aiming to recruit 500,000 members by Internet (Hercberg et al., 2010). In the United States, the Millennium Cohort of 150,000 servicemen and veterans offers a web option for assessment (Smith, 2009). Web-based methods are also suitable for use with older people. In a recently completed pilot study in Wales, men and women aged 50+ years were emailed an invitation to participate in a web-based epidemiologic study. Of the 170 who were invited to participate, 1 objection to being contacted was received and 47 (27%) invitees joined the study. In a subsequent pilot study, 632 older people joined an epidemiologic study conducted entirely online (J. Gallacher, Mitchell, Rengifo, & Burton, 2011).
A particularly challenging issue for lite-touch studies is obtaining consent for DNA donation and long-term follow-up of medical records. Pilot work has shown that this is acceptable to the public. In a wide age range focus group study, the prospect of participating in web-based epidemiological studies was found to be widely acceptable. Of the 26 participants, 21 (81%) would be willing in principle to participate in an Internet-based epidemiologic study if invited to do so (Taverner, Longley, & Gallacher, 2010). In a second qualitative study, also using a wide age range sample, a simulated Web site was used to test the acceptability of remote (web-based) consent to a genetic epidemiologic study. Using a structured interview format, 32 of the 42 participants (75%) would be willing to consider participating in a web-based epidemiologic study involving the donation of DNA and long-term electronic follow-up (Wood, Kowalczuk, Eleyn, Mitchell, & Gallacher, 2011). When response is considered by age, older interviewees (50+ years) were the most enthusiastic group, with 13/14 (93%) being willing to participate. Older people are willing to participate in web-based studies.
Lite-touch also means remote assessment. Apart from the need for physical examination, remote assessment offers a much cheaper, more confidential, and more convenient option for participants. Web assessment, in particular, is becoming increasingly sophisticated, enabling efficient questionnaire routing for self-report items, cognitive performance testing, automatic data processing, and, where appropriate, virtual real-time feedback (Denissen, Neumann, & van Zalk, 2010).
Remote objective measurement is an increasingly attractive means of obtaining longitudinal, detailed, and minimally obtrusive assessments (Kaye & Maxwell, 2011; Pavel et al., 2008). The value of objective measurement is the reduced error variance and bias compared with self-report data. For example, a study using a self-report questionnaire to assess exercise was able to detect an interaction with a sample of 18,014 (Andreasen et al., 2008), whereas a study using accelerometers to assess exercise was able to detect an interaction with 704 participants (Rampersaud et al., 2008). Remote cognitive testing is an increasingly viable option. The Cardiff Cognitive Battery (CCB) is a series of short cognitive tests suitable for Internet use in epidemiologic studies. The CCB currently comprises measures of two-choice reaction time, digit span, paired associates learning, fluid intelligence, and attention (Stroop). The complete battery requires 10–15 min to complete. Tests from this battery have been adopted by several large studies, including UK Biobank, Airwave, and CARTaGENE.
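The link between measurement error and required sample size can be illustrated with the classical attenuation relationship, in which random error in an exposure shrinks its observed correlation with an outcome by the square root of the measure's reliability. The sketch below is a hypothetical illustration and does not reproduce the cited studies: the function name, the true correlation of 0.2, and the reliabilities of 0.9 and 0.3 are assumed values, and the sample size uses the standard Fisher z approximation.

```python
from math import atanh, ceil, sqrt
from statistics import NormalDist


def n_to_detect_correlation(true_r, exposure_reliability,
                            alpha=0.05, power=0.80):
    """Approximate sample size to detect a correlation when the exposure
    is measured with random error (classical attenuation)."""
    # Observed correlation is attenuated by sqrt(reliability).
    observed_r = true_r * sqrt(exposure_reliability)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    # Fisher z approximation for the required sample size.
    return ceil(((z_alpha + z_beta) / atanh(observed_r)) ** 2 + 3)


if __name__ == "__main__":
    # Assumed reliabilities for an objective and a self-report measure.
    print(n_to_detect_correlation(0.2, exposure_reliability=0.9))
    print(n_to_detect_correlation(0.2, exposure_reliability=0.3))
```

Under these assumptions, the less reliable measure requires roughly three times as many participants to detect the same underlying association.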
Lo-tech refers to remote biosampling using established rather than cutting-edge technology. Dry biochemistry and genetics are increasingly viable technologies for remote use (Pomelova & Osin, 2007). Remote methods will not be appropriate for cutting-edge biomolecular science but will be informative for investigating the downstream effects of established biomolecular processes. For example, saliva may not be useful for gene expression studies, but it remains an easily gathered source of DNA for genotyping and Mendelian randomization studies. Similarly, dry blood spots may not be suitable for analyzing the genetic basis of lipoprotein subfractions but may be suitable for investigating the effect of cytokines on vascular and inflammatory disease. Apart from the use of less sophisticated biosampling equipment, lightweight and nonhazardous mailing, and reduced storage costs, a lo-tech approach suggests substantial savings in expensive clinician time. The use of lo-tech biosampling provides substantial new opportunities for population-based aging research. In conclusion, evidence is accumulating that remote methods are acceptable to the public, are efficient for scientists, and cost effective for funders.
IMPLICATIONS FOR AGING RESEARCH
The purpose of this article has been to argue the scientific and technological cases for diversifying LSDR methodology as a basis for increasing the quantity and quality of science into aging. Currently, although there are several large aging-dedicated cohorts such as the Health and Retirement Study and the SHARE network, only the Canadian Longitudinal Study on Ageing, with an anticipated recruitment of 50,000, may be described as aspiring to LSDR status (Raina et al., 2009).
However, the funding and conduct of aging research need “climate change.” Unless large studies are approached more creatively, with a willingness to invest in the new technologies that such an approach entails, it is unlikely that expensive bespoke aging-dedicated LSDRs addressing wider questions will be commissioned, except for major pathologies such as dementia and depression. Within the context of a broader and more fundamental argument, however, there is an opportunity for the aging research community to be a catalyst for more general change. By championing the scientific value of niched LSDRs and adopting cost-effective methods, aging researchers can pioneer a new epidemiologic paradigm that will have benefits beyond aging research. That is a legacy worth working toward.
FUNDING
This work was supported by the Welsh Assembly Government under the auspices of the National Epidemiology Wales consortium (Ref: 06/2/213) and NIH/NIA grant AG026453 supporting the Integrative Analysis of Longitudinal Studies of Aging (IALSA) international research network.
References
- Altshuler D, Daly M, Kruglyak L. Guilt by association. Nature Genetics. 2000;26:135–137. doi:10.1038/79839.
- Andreasen CH, Mogensen MS, Borch-Johnsen K, Sandbaek A, Lauritzen T, Almind K, Hanson T. Lack of association between PKLR rs3020781 and NOS1AP rs7538490 and type 2 diabetes, overweight, obesity and related metabolic phenotypes in a Danish large-scale study: Case-control studies and analyses of quantitative traits. BMC Medical Genetics. 2008;9:118. doi:10.1186/1471-2350-9-118.
- Bachrach CA, Abeles RP. Social science and health research: Growth at the National Institutes of Health. American Journal of Public Health. 2004;94:22–28. doi:10.2105/ajph.94.1.22.
- Brown V. The effects of poverty environments on elders’ subjective well-being: A conceptual model. The Gerontologist. 1995;35:541–548. doi:10.1093/geront/35.4.541.
- Burton PR, Hansell AL, Fortier I, Manolio TA, Khoury MJ, Little J, Elliot P. Size matters: just how big is BIG?: Quantifying realistic sample size requirements for human genome epidemiology. International Journal of Epidemiology. 2009;38:263–273. doi:10.1093/ije/dyn147.
- Butz WP, Torrey BB. Some frontiers in social science. Science. 2006;312:1898–1900. doi:10.1126/science.1130121.
- Chen Z, Lee L, Chen J, Collins R, Wu F, Guo Y, Peto R. Cohort profile: The Kadoorie Study of Chronic Disease in China (KSCDC). International Journal of Epidemiology. 2005;34:1243–1249. doi:10.1093/ije/dyi174.
- Clayton D, McKeigue PM. Epidemiological methods for studying genes and environmental factors in complex diseases. Lancet. 2001;358:1356–1360. doi:10.1016/S0140-6736(01)06418-2.
- Davey-Smith G, Ebrahim S. ‘Mendelian randomization’: Can genetic epidemiology contribute to understanding environmental determinants of disease? International Journal of Epidemiology. 2003;32:1–22. doi:10.1093/ije/dyg070.
- Davey-Smith G, Ebrahim S. What can Mendelian randomisation tell us about modifiable behavioural and environmental exposures? British Medical Journal. 2005;330:1076–1079. doi:10.1136/bmj.330.7499.1076.
- Denissen J, Neumann L, van Zalk M. How the internet is changing the implementation of traditional research methods, people's daily lives, and the way in which developmental scientists conduct research. International Journal of Behavioral Development. 2010;34:564–575. doi:10.1177/0165025410383746.
- Dillner J, Andersson K. Biobanks collected for routine healthcare purposes: Build-up and use for epidemiologic research. In: Dillner J, editor. Methods in biobanking. New York, NY: Springer; 2011. pp. 113–125.
- Elliott P, Peakman TC. The UK Biobank sample handling and storage protocol for the collection, processing and archiving of human blood and urine. International Journal of Epidemiology. 2008;37:234–244. doi:10.1093/ije/dym276.
- Frank J, Di RE, McInnes RR, Kramer M, Gagnon F. Large life-course cohorts for characterizing genetic and environmental contributions: The need for more thoughtful designs. Epidemiology. 2006;17:595–598. doi:10.1097/01.ede.0000239725.48908.7d.
- Gallacher JE. The case for large scale fungible cohorts. European Journal of Public Health. 2007;17:548–549. doi:10.1093/eurpub/ckm086.
- Gallacher J, Bayer A, Fish M, Pickering J, Pedro S, Dunstan F, Ben-Shlomo Y. Does anxiety affect risk of dementia? Findings from the Caerphilly Prospective Study. Psychosomatic Medicine. 2009;71:659–666. doi:10.1097/PSY.0b013e3181a6177c.
- Gallacher J, Bronstering K, Palmer S, Fone D, Lyons R. Symptomatology attributable to psychological exposure to a chemical incident: A natural experiment. Journal of Epidemiology and Community Health. 2007;61:506–512. doi:10.1136/jech.2006.046987.
- Gallacher J, Mitchell CP, Rengifo A, Burton P. The good life: From Socrates to Surbiton. Quality in Ageing and Older Adults. 2011;12:17–25. doi:10.5042/qiaoa.2011.0141.
- Hercberg S, Castetbon K, Czernichow S, Malon A, Mejean C, Kesse E, Galan P. The Nutrinet-Sante Study: A web-based prospective study on the relationship between nutrition and health and determinants of dietary patterns and nutritional status. BMC Public Health. 2010. doi:10.1186/1471-2458-10-242.
- Hofer SM, Alwin DF. The future of cognitive aging research: Interdisciplinary perspectives and integrative science. In: Hofer SM, Alwin DF, editors. Handbook on cognitive aging: Interdisciplinary perspectives. Thousand Oaks, CA: Sage Publications; 2008. pp. 662–672.
- Hofer SM, Flaherty BP, Hoffman L. Cross-sectional analysis of time-dependent data: Problems of mean-induced association in age-heterogeneous samples and an alternative method based on sequential narrow age-cohorts. Multivariate Behavioral Research. 2006;41:165–187. doi:10.1207/s15327906mbr4102_4.
- Hofer SM, Piccinin AM. Toward an integrative science of lifespan development and aging. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences. 2010;65:269–278. doi:10.1093/geronb/gbq017.
- Hofer SM, Sliwinski MJ. Understanding ageing: An evaluation of research designs for assessing the interdependence of ageing-related changes. Gerontology. 2001;47:341–352. doi:10.1159/000052825.
- Hofer SM, Sliwinski MJ. Design and analysis of longitudinal studies of aging. In: Birren JE, Schaie KW, editors. Handbook of the psychology of aging. 6th ed. San Diego, CA: Academic Press; 2006. pp. 15–37.
- Hunter DJ. Gene-environment interactions in human diseases. Nature Reviews Genetics. 2005;6:287–298. doi:10.1038/nrg1578.
- Huybrechts KF, Mikkelsen EM, Christensen T, Riis AH, Hatch EE, Wise LA, Rothman KJ. A successful implementation of e-epidemiology: The Danish pregnancy planning study ‘Snart-Gravid’. European Journal of Epidemiology. 2010;25:297–304. doi:10.1007/s10654-010-9431-y.
- Ioannidis JP, Adami HO. Nested randomized trials in large cohorts and biobanks: Studying the health effects of lifestyle factors. Epidemiology. 2008;19:75–82. doi:10.1097/EDE.0b013e31815be01c.
- Kaplan GA. How big is big enough. Epidemiology. 2006;18:18–20. doi:10.1097/01.ede.0000249507.52550.90.
- Kaye J, Maxwell SA. Intelligent systems for assessing aging change: Towards home-based, unobtrusive continuous assessment of aging. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences. 2011. doi:10.1093/geronb/gbq095.
- Kraemer HC, Yesavage JA, Taylor JL, Kupfer D. How can we learn about developmental processes from cross-sectional studies, or can we? American Journal of Psychiatry. 2000;157:163–171. doi:10.1176/appi.ajp.157.2.163.
- Kuh D, Ben-Shlomo Y, Lynch J, Hallqvist J, Power C. Life course epidemiology. Journal of Epidemiology and Community Health. 2003;57:778–783. doi:10.1136/jech.57.10.778.
- Kuh D, NDA Preparatory Network. A lifecourse approach to healthy aging, frailty and capability. The Journals of Gerontology, Series A: Biological Sciences and Medical Sciences. 2007;62:717–721. doi:10.1093/gerona/62.7.717.
- Lewis SJ, Smith GD. Alcohol, ALDH2, and esophageal cancer: A meta-analysis which illustrates the potentials and limitations of a Mendelian randomization approach. Cancer Epidemiology, Biomarkers & Prevention. 2005;14:1967–1971. doi:10.1158/1055-9965.EPI-05-0196.
- Luengo-Fernandez R, Leal J, Gray A. Dementia 2010: The prevalence, economic cost and research funding of dementia compared with other diseases. Cambridge, UK: Alzheimer's Research Trust; 2010.
- Macgregor S, Lind PA, Bucholz KK, Hansell NK, Madden PA, Richter MM, Whitfield JB. Associations of ADH and ALDH2 gene variation with self report alcohol reactions, consumption and dependence: An integrated analysis. Human Molecular Genetics. 2009;18:580–593. doi:10.1093/hmg/ddn372.
- McArdle JJ, Grimm KJ, Hamagami F, Bowles RP, Meredith W. Modeling life-span growth curves of cognition using longitudinal data with multiple samples and changing scales of measurement. Psychological Methods. 2009;14:126–149. doi:10.1037/a0015857.
- McClearn GE, Vogler GM, Hofer SM. Gene-gene and gene-environment interactions. In: Masoro EJ, Austad SN, editors. Handbook of the biology of aging. 5th ed. San Diego, CA: Academic Press; 2001. pp. 423–444.
- Michael YL, Yen IH. Invited commentary: Built environment and obesity among older adults—Can neighborhood-level policy interventions make a difference? American Journal of Epidemiology. 2009;169:409–412. doi:10.1093/aje/kwn394.
- National Research Council. Committee on Future Directions for Cognitive Research on Aging. In: Stern PC, Carstensen LL, editors. The aging mind: Opportunities in cognitive research. Washington, DC: National Academy Press; 2000. Commission on Behavioral and Social Sciences and Education.
- National Research Council. Committee on Future Directions for Behavioral and Social Sciences Research at the National Institutes of Health. In: Singer BH, Ryff CD, editors. New horizons in health: An integrative approach. Washington, DC: National Academy Press; 2001a.
- National Research Council. Preparing for an aging world: The case for cross-national research. Washington, DC: National Academy Press; 2001b. Panel on a Research Agenda and New Data for an Aging World, Committee on Population and Committee on National Statistics, Division of Behavioral and Social Sciences and Education.
- Pavel M, Jimison HB, Hayes TL, Kaye J, Dishman E, Wild K, Williams D. Continuous, unobtrusive monitoring for the assessment of cognitive function. In: Hofer SM, Alwin DF, editors. Handbook of cognitive aging: Interdisciplinary perspectives. Thousand Oaks, CA: Sage Publications; 2008. pp. 524–543.
- Peakman TC, Elliott P. The UK Biobank sample handling and storage validation studies. International Journal of Epidemiology. 2008;37(Suppl. 1):i2–i6. doi:10.1093/ije/dyn019.
- Piccinin AM, Hofer SM. Integrative analysis of longitudinal studies on aging: Collaborative research networks, meta-analysis, and optimizing future studies. In: Hofer SM, Alwin DF, editors. Handbook on cognitive aging: Interdisciplinary perspectives. Thousand Oaks, CA: Sage Publications; 2008. pp. 446–476.
- Pomelova VG, Osin NS. [Prospects of the integration of dry blood spot technology with human health and environmental population studies]. Vestnik Rossiiskoi Akademii Meditsinskikh Nauk. 2007;12:10–16.
- Raina PS, Wolfson C, Kirkland SA, Griffith LE, Oremus M, Patterson C, Brazil K. The Canadian longitudinal study on aging (CLSA). Canadian Journal on Aging. 2009;28:221–229. doi:10.1017/S0714980809990055.
- Rampersaud E, Mitchell BD, Pollin TI, Fu M, Shen H, O’Connell JR, Snitker S. Physical activity and the association of common FTO gene variants with body mass index and obesity. Archives of Internal Medicine. 2008;168:1791–1797. doi:10.1001/archinte.168.16.1791.
- Richards M, Hatch SL. A life course approach to the development of mental skills. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences. 2011. doi:10.1093/geronb/gbr013.
- Rutter M. Gene-environment interdependence. Developmental Science. 2007;10:12–18. doi:10.1111/j.1467-7687.2007.00557.x.
- Rutter M. Biological implications of gene-environment interaction. Journal of Abnormal Child Psychology. 2008;36:969–975. doi:10.1007/s10802-008-9256-2.
- Samani NJ, Erdmann J, Hall AS, Hengstenberg C, Mangino M, Mayer B, WTCCC and the Cardiogenics Consortium. Genomewide association analysis of coronary artery disease. New England Journal of Medicine. 2007;357:443–453. doi:10.1056/NEJMoa072366.
- Shanahan MJ, Hofer SM. Social context in gene-environment interactions: Retrospect and prospect. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences. 2005;60(Spec No 1):65–76. doi:10.1093/geronb/60.Special_Issue_1.65.
- Shanahan MJ, Hofer SM. Molecular genetics, aging, and wellbeing: Sensitive period, accumulation and pathway models. In: Binstock RH, George LK, editors. Handbook on Aging and Social Sciences. 7th ed. New York, NY: Elsevier; 2011. pp. 135–147.
- Smith TC. The US Department of Defense Millennium Cohort Study: Career span and beyond longitudinal follow-up. Journal of Occupational and Environmental Medicine. 2009;51:1193–1201. doi:10.1097/JOM.0b013e3181b73146.
- Subramanian SV, Delgado I, Jadue L, Vega J, Kawachi I. Income inequality and health: multilevel analysis of Chilean communities. Journal of Epidemiology and Community Health. 2003;57:844–848. doi:10.1136/jech.57.11.844.
- Taverner N, Longley M, Gallacher J. Willingness of the public to participate in online epidemiologic studies. Interactive Journal of Medical Internet Research. 2010. http://knol.google.com/k/nicki-taverner/willingness-of-the-public-to/3l9j5hre3ttja/2#.
- Tolstrup JS, Hansen JL, Gronbaek M, Vogel U, Tjonneland A, Joensen AM, Overvad K. Alcohol drinking habits, alcohol dehydrogenase genotypes and risk of acute coronary syndrome. Scandinavian Journal of Public Health. 2010;38:489–494. doi:10.1177/1403494810371248.
- Weikert C, Dietrich T, Boeing H, Bergmann MM, Boutron-Ruault MC, Clavel-Chapelon F, Riboli E. Lifetime and baseline alcohol intake and risk of cancer of the upper aero-digestive tract in the European Prospective Investigation into Cancer and Nutrition (EPIC) study. International Journal of Cancer. 2009;125:406–412. doi:10.1002/ijc.24393.
- Weiss KM, Terwilliger JD. How many diseases does it take to map a gene with SNPs? Nature Genetics. 2000;26:151–157. doi:10.1038/79866.
- Widaman KF. Integrative perspectives on cognitive aging: Measurement and modeling with mixtures of psychological and biological variables. In: Hofer SM, Alwin DF, editors. Handbook on cognitive aging: Interdisciplinary perspectives. Thousand Oaks, CA: Sage Publications; 2008. pp. 50–68.
- Wood F, Kowalczuk J, Eleyn G, Mitchell C, Gallacher J. Achieving online consent to participation in large scale gene-environment studies: A tangible destination. Journal of Medical Ethics. 2011. doi:10.1136/jme.2010.040352.
- Zuccolo L, Fitz-Simon N, Gray R, Ring SM, Sayal K, Smith GD, Lewis SJ. A non-synonymous variant in ADH1B is strongly associated with prenatal alcohol use in a European sample of pregnant women. Human Molecular Genetics. 2009;18:4457–4466. doi:10.1093/hmg/ddp388.