Predictors of Attrition in a Longitudinal Population-Based Study of Aging

Erin Jacobsen; Xinhui Ran; Anran Liu; Chung-Chou H Chang; Mary Ganguli

doi:10.1017/S1041610220000447

Author manuscript; available in PMC: 2022 Feb 1.

Published in final edited form as: Int Psychogeriatr. 2020 Apr 17;33(8):767–778. doi: 10.1017/S1041610220000447

PMCID: PMC7572515 NIHMSID: NIHMS1576520 PMID: 32301414

Abstract

Background

Longitudinal studies predictably experience non-random attrition over time. Among older adults, risk factors for attrition may be similar to risk factors for outcomes such as cognitive decline and dementia, potentially biasing study results.

Objective

To characterize participants lost to follow-up, which can be useful in study design and interpretation of results.

Methods

In a longitudinal aging population study with ten years of annual follow-up, we characterized the attrited participants (77%) compared to those who remained in the study. We used multivariable logistic regression models to identify attrition predictors. We then implemented four machine learning approaches to predict attrition status from one wave to the next and compared the results of all five approaches.

Results

Multivariable logistic regression identified those more likely to drop out as older, male, and not living with another study participant, and as having lower cognitive test scores, higher Clinical Dementia Ratings, lower functional ability, fewer subjective memory complaints, no physical activity, no reported hobbies, no engagement in social activities, worse self-rated health, and less frequent trips out of the house. The four machine learning approaches produced discrimination results, measured by areas under the Receiver Operating Characteristic curves, similar to those of the multivariable logistic regression model.

Conclusions

Attrition was most likely to occur in participants who were older, male, inactive, socially isolated, and cognitively impaired. Ignoring attrition would bias study results especially when the missing data might be related to the outcome (e.g., cognitive impairment or dementia). We discuss possible solutions including oversampling and other statistical modeling approaches.

Keywords: Epidemiology, Loss to follow-up, Least Absolute Shrinkage and Selection Operator–type regression (LASSO), Random Forest (RF), Gradient Boosting Machine (GBM), Artificial Neural Network (ANN)

INTRODUCTION

Attrition, the loss of study participants, is a well-established challenge in longitudinal research. This is particularly concerning in studies on aging and dementia as two of the established risk factors for attrition are increasing age and lower levels of cognition. (Burke et al., 2019; Cacioppo et al., 2018; Chatfield et al., 2005; Matthews et al., 2006; Van Beijsterveldt et al., 2002) Other factors increasing attrition in longitudinal epidemiological aging studies may be the same ones influencing the risk of dementia and cognitive decline, including lower education, (Cacioppo et al., 2018; Kuh et al., 2016; Matthews et al., 2006; Young et al., 2006) lower levels of functioning in activities of daily living, (Burke et al., 2019; Matthews et al., 2006) depressive symptoms, (Burke et al., 2019; Chang et al., 2009) poorer subjective health, (Kuh et al., 2016; Matthews et al., 2006; Salthouse, 2014) social isolation and loneliness, (Cacioppo et al., 2018; Mein et al., 2012) and comorbidities. (Young et al., 2006) Attrition is not random, and those who have remained in a study for many years are systematically different from those who were lost earlier in the study. For example, those who remain are younger, healthier, and more socially engaged than those lost. These features can cause attrition bias by creating, for example, the spurious impression that youth and good health increase risk of dementia. (Weuve et al., 2015)

Knowing the characteristics of those most likely to be lost to follow-up can potentially allow for methodological adjustments to minimize attrition bias. Here, we sought to characterize participants in a longitudinal population-based study who were lost over ten years of follow-up.

Because longitudinal studies collect an abundance of variables over many years, characterizing lost participants and predicting attrition requires navigation of complex interactions and hidden patterns in the data. Various statistical methods can be used for this purpose, including hypothesis-based logistic regression models and atheoretical machine learning models. Machine learning can often outperform traditional statistical methods (Amalakuhan et al., 2012; Hsich et al., 2011; Lo-Ciganic et al., 2019; Thottakkara et al., 2016) by providing higher discrimination ability in prediction. We explored the use of machine learning tools, in addition to a traditional logistic regression method, to determine whether the former, atheoretical approaches would predict attrition more accurately than the latter, hypothesis-based approach. We employed four familiar, commonly used machine learning methods known to yield good prediction results. (Hastie et al., 2008; Chu et al., 2008) Although machine learning approaches often provide higher discrimination ability in prediction, traditional logistic regression methods are easier for most readers to understand, and the relationship between predictors and the outcome is less complex. Therefore, we used both approaches in our analysis and compared their performance once the final models were built.

METHODS

Study setting and participants

The Monongahela-Youghiogheny Healthy Aging Team (MYHAT) is an age-stratified random sample drawn from the publicly available voter registration list for a group of small towns in southwestern Pennsylvania, USA. This population-based cohort was recruited between 2006 and 2008 and is being followed annually for the development of mild cognitive impairment (MCI) and dementia. Inclusion criteria at study entry included 1) being 65 years or older, 2) living in one of the designated towns, 3) not residing in a long-term-care facility, 4) having vision and hearing sufficient to permit neuropsychological testing, and 5) not being decisionally impaired. Eligible participants who consented were briefly assessed using the Mini-Mental State Exam (MMSE). (Folstein et al., 1975) Only participants without substantial cognitive impairment at recruitment (age-education adjusted MMSE score (Mungas et al., 1996) ≥21) were invited to complete the full assessment and were thus eligible for annual follow-up. The University of Pittsburgh Institutional Review Board approved all study procedures, and all participants provided written informed consent. (Ganguli et al., 2009)

Study assessment and predictor variables

At baseline and each annual follow-up, participants underwent detailed assessments including but not limited to demographic information, MMSE, health history, subjective memory complaints, functional ability, depressive symptoms, lifestyle, social support, and a review of current medications. Participants were rated using the Clinical Dementia Rating (CDR®) Dementia Staging Instrument (Morris, 1993) by certified study staff, based on independence in cognitively-driven everyday activities.

Demographic and personal information

Age was used as a continuous measure and other demographic variables were categorical: sex, education (less than or equal to high school, and greater than high school), marital status (currently married/living as married or not), living arrangement (alone or not), living with someone who was also a participant in the MYHAT study or not, and employment status (working or not). Caregiver status was defined as being regularly depended upon by another person for help with activities such as cooking and shopping. Self-reported family history of dementia was captured for first-degree relatives of the participant.

Cognition

MMSE scores were treated as continuous variables ranging from 0–30.

Clinical Dementia Rating

CDR was categorized into three groups (0=normal, 0.5=mild cognitive impairment, ≥1=dementia).

Health history

Participants’ self-rated health was grouped into three categories (poor, fair/good/very good, and excellent). Medical history was captured as self-report in response to the question, “Has a doctor or nurse told you that you have…”. Endorsement of hypertension, myocardial infarction, congestive heart failure, atrial fibrillation, or cardiac arrest was included as “cardiovascular disease”. An indicator variable for “sleep problems” was created from four questions assessing sleep patterns including difficulty falling asleep, difficulty staying asleep, early morning awakening, and uncontrollable daytime sleepiness.

Blood pressure

Systolic blood pressure was grouped as <140 vs. ≥140 mmHg, and diastolic pressure was categorized as <90 and ≥90 mmHg.

Lifestyle and activities based on self-report

Smoking and alcohol consumption were based on use during the preceding year. Physical activity included both exercises done at moderate intensity and also physical activity from everyday activities (walking, housework, etc.). Hobbies and computer/electronic device use were dichotomized as having or not having a hobby and using or not using a computer/electronic device. A social activity indicator variable (any vs. none) was created from four questions assessing if participants left their homes in the past year to 1) attend a family occasion, 2) attend another social occasion, 3) go to work or a volunteer activity, or 4) attend place of worship. Social engagement was defined as belonging to any organizations including, but not limited to churches, lodges, societies, and volunteer groups, and also attending meetings/activities at least some of the time, dichotomized as any vs. no social engagement. Frequency of leaving the home was assessed and grouped into three categories (daily, 2–6 times per week, less than weekly).

Social support

We assessed social support as the number of people (<3 vs. ≥3) in whom the participant reported feeling close enough to confide, and the participant’s satisfaction with the help they receive from others (much less/slightly less help vs. enough/more than enough help).

Depression

Depressive symptoms were assessed using the modified Center for Epidemiologic Studies- Depression scale (CES-D) (Ganguli et al., 1995; Radloff, 1977) and categorized as ≤3 vs. >3 symptoms endorsed as present during the preceding week.

Functional ability

Independence in instrumental activities of daily living (IADL) was examined and categorized as being completely independent vs. needing help in at least 1 activity on the Older Americans Resources and Services (OARS) IADL scale. (Fillenbaum, 1988)

Subjective memory complaints

Subjective memory complaints were assessed using a 21-question assessment (Snitz et al., 2012) and categorized as 0–2 complaints vs. 3–4 vs. ≥5 complaints.

Outcome variables

Attrition

The outcome variable in our analyses was loss to the study at any point during the first ten years of follow-up, regardless of the specific reason for attrition (death, drop out, etc.) and of any other changes in status such as incident MCI or dementia.

Tracking

Between annual assessments, interviewers telephoned participants every 3–6 months to “check in” with them and ask about any major events and changes in their lives. Quarterly study newsletters and personalized birthday cards were sent to all active participants. The phone calls and mailings served both to minimize attrition by building rapport with the participants and also to track people who may have relocated, using a returned mail service. Participants were offered a “skip” option if they were too busy or otherwise unable to complete their assessment in a given year and were contacted the following year for continued participation in the study.

Statistical Analysis

To compare characteristics of participants who remained in the study to those who were lost at or before the end of the current observation period (Study Year 11), we examined categorical and continuous variables using frequencies with percentages and means with standard deviations, respectively. We also conducted between-group comparisons using chi-square or Fisher’s exact test for categorical variables and t-test or Mann-Whitney U test for continuous variables.
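As an illustration of the between-group comparisons described above, the following Python sketch applies a chi-square test to the sex distribution reported in Table 1 (299 women among the 452 who remained, 911 women among the 1,530 who were lost). It is not the authors' R code: the function name and the use of a continuity correction are assumptions made for this example. Under that assumption, the result appears consistent with the tabulated P value of 0.013.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square test (1 df) for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). The Yates continuity correction is an
    assumption of this sketch, not a detail stated in the paper.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    stat = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(diff - 0.5, 0.0)
        stat += diff ** 2 / e
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Counts from the Sex row of Table 1: (female, male) by attrition status.
remaining_f, remaining_m = 299, 452 - 299
lost_f, lost_m = 911, 1530 - 911

stat, p = chi_square_2x2(remaining_f, remaining_m, lost_f, lost_m)
print(f"chi-square = {stat:.2f}, P = {p:.3f}")
```

For sparse cells, such as the CDR≥1 row, an exact test would be the more appropriate choice, as the text notes.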

We used multivariable logistic regression models to assess the association between each predictor at each cycle and attrition at the next cycle, while adjusting for other covariates. Our main model reported here used a robust sandwich estimator of variance to account for correlated measurements within each participant from multiple data collection cycles. In a sensitivity analysis, we used generalized estimating equations (GEE) methods with various working correlation structures including independent, exchangeable, first-order autoregressive, and unstructured for post-hoc comparison. As the results of all GEE methods with different correlation structures were entirely consistent with the regression model, we do not show their results in the main manuscript. Instead, they are available as an online supplement (see Supplemental Tables 1–4 published online).
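For readers unfamiliar with the robust sandwich estimator, its usual cluster-robust form for a logistic model is sketched below. The notation is introduced here for illustration and does not appear in the paper: $i$ indexes participants, $X_i$ is the matrix of that participant's covariate rows across cycles, $y_i$ is the vector of next-cycle attrition indicators, $\hat p_i$ is the vector of fitted probabilities, and $W_i = \mathrm{diag}\{\hat p_{it}(1-\hat p_{it})\}$. The paper does not state whether any finite-sample correction was applied, so none is shown.

```latex
\widehat{\mathrm{Var}}(\hat\beta) = A^{-1} B A^{-1},
\qquad
A = \sum_{i} X_i^{\top} W_i X_i,
\qquad
B = \sum_{i} X_i^{\top} (y_i - \hat p_i)(y_i - \hat p_i)^{\top} X_i
```

The middle term $B$ sums the score contributions within each participant before taking outer products, which is what allows repeated records from the same person to be correlated without specifying that correlation.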

For inclusion in the logistic regression model, we selected variables based on the stepwise procedure with the smallest Akaike Information Criterion (AIC). In the final multivariable logistic regression model, statistical significance of associations between predictors and attrition was determined using a two-sided P value <0.05.
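The criterion used for stepwise selection has the standard form below, where $k$ is the number of estimated parameters and $\hat L$ is the maximized likelihood of a candidate model (symbols introduced here for illustration). Because the smallest AIC is preferred, a predictor that adds one parameter is retained only if it raises the maximized log-likelihood by more than one unit.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```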

To ensure reproducibility and avoid overfitting, we randomly split the data into training and testing sets in a 1:1 ratio. The model was developed on the training set, and prediction results were obtained for the participants in the testing set.

For comparison with the regression model, we also implemented four other commonly used machine learning approaches to predict the attrition status in the next cycle based on covariates at the current cycle: the Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression, Random Forest (RF), Gradient-Boosting Machine (GBM), and Artificial Neural Network (ANN). We selected these four machine learning methods because they are familiar and commonly used and have been shown to provide very good prediction results. (Chu et al., 2008; Hastie et al., 2008) As we did for the logistic regression modeling, we split the data into training and testing sets, and the prediction results were evaluated in the testing set. The discrimination performance of the final multivariable logistic regression model and that of the four machine learning approaches were compared via their AUCs (areas under the Receiver Operating Characteristic curves). The AUCs were obtained by comparing the actual attrition status (yes/no) to the predicted probability of attrition among participants in the testing sets.
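The evaluation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R implementation: the function names, the fixed seed, and the toy labels and scores are all hypothetical. The AUC is computed in its rank form, as the probability that a randomly chosen lost participant receives a higher predicted probability than a randomly chosen retained participant, with ties counted as one half.

```python
import random


def split_half(record_ids, seed=0):
    """Randomly split record identifiers into two halves (1:1 ratio)."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    shuffled = list(record_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def auc(labels, scores):
    """Area under the ROC curve from 0/1 labels and predicted probabilities."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = lost at the next cycle, 0 = retained.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(auc(toy_labels, toy_scores))

set1, set2 = split_half(range(10))
print(sorted(set1), sorted(set2))
```

Training on one half and scoring the other, then swapping the roles of the two halves, gives two AUC values per method. The pairwise loop is quadratic in the number of records and is written for clarity rather than speed.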

All statistical analyses were carried out in R version 3.5.1. (R Core Team, 2014)

RESULTS

Over ten years of follow-up, 77% of our original cohort was lost to follow-up. Of these, 36.9% were due to death, 21.5% were due to drop out/refusal, 20.2% were too sick (physically or cognitively) to participate, 11.4% relocated out of the study area, 9.6% were untraceable or unreachable, and 0.5% were lost for other reasons. For the current analyses, we combined all causes of attrition into a single outcome variable.

Table 1 shows the baseline characteristics of the participants by attrition status after ten years of follow-up (still in the study after ten years vs. lost before ten years). Attrition is significantly associated with being older, being male, having a high school education or less, not residing with another study participant, not being married, living alone, not currently working, leaving home less frequently, having lower MMSE scores, no physical activity, no hobbies, not using a computer, having no social engagement, endorsing poor subjective health, having a history of cardiovascular disease, fewer confidants, more subjective memory complaints, taking more prescription medications, endorsing more depressive symptoms, being dependent in at least one IADL, and having a Clinical Dementia Rating (CDR) >0.

Table 1:

Baseline characteristics of those who had left or remained in the study at ten years.

Characteristic		Entire cohort (N=1982)	Remaining in the study after 10 years (N=452)	Lost to follow-up over 10 years (N=1530)	P value
		N (%)	N (%)	N (%)
Sex, n (%)	Female	1210 (61.0%)	299 (66.2%)	911 (59.5%)	0.013
Education, n (%)	≤ High school	1167 (58.9%)	213 (47.1%)	954 (62.4%)	<0.001
Education, n (%)	> High school	815 (41.1%)	239 (52.9%)	576 (37.6%)	<0.001
Resides with a study participant, n (%)	Yes	278 (14.0%)	82 (18.1%)	196 (12.8%)	0.005
Age in year, mean(SD)		77.6 (7.44)	73.4 (6.11)	78.9 (7.34)	<0.001
Marital status, n (%)	Married or living as married	979 (49.4%)	269 (59.5%)	710 (46.4%)	<0.001
Living alone, n (%)	Yes	777 (39.2%)	144 (31.9%)	633 (41.4%)	<0.001
Currently working, n (%)	Yes	216 (10.9%)	86 (19.0%)	130 (8.50%)	<0.001
Being a caregiver, n (%)	Yes	233 (11.8%)	63 (13.9%)	170 (11.1%)	0.125
Family history of dementia, n (%)	Yes	416 (21.1%)	110 (24.3%)	306 (20.2%)	0.067
CDR, n (%)	0	1413 (71.3%)	372 (82.3%)	1041 (68.0%)	<0.001
	0.5	546 (27.5%)	78 (17.3%)	468 (30.6%)
	≥1	23 (1.16%)	2 (0.44%)	21 (1.37%)
MMSE, mean (SD)		26.9 (2.43)	27.7 (1.91)	26.7 (2.52)	<0.001
IADL score, n (%)	≤3	1639 (82.8%)	430 (95.1%)	1209 (79.2%)	<0.001
IADL score, n (%)	>3	340 (17.2%)	22 (4.87%)	318 (20.8%)	<0.001
Number of subjective memory complaints, n(%)	0–2	1254 (63.8%)	323 (71.8%)	931 (61.4%)	<0.001
	3–4	400 (20.3%)	75 (16.7%)	325 (21.4%)
	≥5	312 (15.9%)	52 (11.6%)	260 (17.2%)
mCES-D score, n(%)	≤3	1808 (91.5%)	427 (94.5%)	1381 (90.7%)	0.014
mCES-D score, n(%)	>3	167 (8.46%)	25 (5.53%)	142 (9.32%)	0.014
Smoked in past year, n (%)	Yes	145 (7.32%)	32 (7.10%)	113 (7.39%)	0.914
Alcohol use in past year, n (%)	Yes	1298 (100%)	337 (100%)	961 (100%)	NA
Physical activity, n (%)	Yes	1667 (84.1%)	418 (92.5%)	1249 (81.6%)	<0.001
Having a hobby, n (%)	Yes	1834 (92.7%)	438 (96.9%)	1396 (91.5%)	<0.001
Computer use, n (%)	Yes	760 (38.4%)	264 (58.4%)	496 (32.5%)	<0.001
Social activity, n (%)	Yes	1954 (98.8%)	451 (99.8%)	1503 (98.6%)	0.061
Social engagement, n (%)	Yes	1613 (81.6%)	397 (88.0%)	1216 (79.7%)	<0.001
Self-rated health, n (%)	Poor	43 (2.17%)	3 (0.66%)	40 (2.62%)	<0.001
	Fair/good/very good	1693 (85.6%)	370 (81.9%)	1323 (86.7%)
	Excellent	242 (12.2%)	79 (17.5%)	163 (10.7%)
Stroke or TIA history, n (%)	Yes	265 (13.4%)	49 (10.8%)	216 (14.2%)	0.079
Diabetes, n (%)	Yes	432 (21.8%)	89 (19.7%)	343 (22.5%)	0.235
Cardiovascular disease history, n (%)	Yes	1501 (76.0%)	312 (69.0%)	1189 (78.0%)	<0.001
Sleep problems, n (%)	Yes	1215 (61.5%)	278 (61.5%)	937 (61.4%)	>0.999
Systolic blood pressure, n (%)	<140 mm Hg	1331 (67.6%)	315 (70.2%)	1016 (66.8%)	0.207
Systolic blood pressure, n (%)	≥140 mm Hg	638 (32.4%)	134 (29.8%)	504 (33.2%)	0.207
Diastolic blood pressure, n (%)	<90 mm Hg	1828 (92.9%)	417 (92.9%)	1411 (93.0%)	>0.999
Diastolic blood pressure, n (%)	≥90 mm Hg	139 (7.07%)	32 (7.13%)	107 (7.05%)	>0.999
Number of prescription medications, n (%)	≤3	881 (44.6%)	251 (55.5%)	630 (41.3%)	<0.001
Number of prescription medications, n (%)	>3	1096 (55.4%)	201 (44.5%)	895 (58.7%)	<0.001
Number of people in whom to confide, n (%)	<3	473 (24.6%)	87 (19.5%)	386 (26.1%)	0.005
Number of people in whom to confide, n (%)	≥3	1451 (75.4%)	359 (80.5%)	1092 (73.9%)	0.005
Frequency of leaving home, n (%)	Daily	1134 (57.4%)	313 (69.2%)	821 (53.9%)	<0.001
	2–6 times/week	713 (36.1%)	131 (29.0%)	582 (38.2%)
	less than weekly	128 (6.48%)	8 (1.77%)	120 (7.88%)
Having enough help, n (%)	Yes	1806 (91.4%)	419 (92.7%)	1387 (91.0%)	0.304

Abbreviations. CDR: Clinical Dementia Rating; MMSE: Mini-Mental State Exam; IADL: Instrumental Activities of Daily Living; mCES-D: modified Center for Epidemiologic Studies Depression Scale; SD: standard deviation; TIA: transient ischemic attack.

P values were obtained from two-sample t-test for continuous variables and chi-squared or Fisher’s exact test for categorical variables.

At baseline, 1.16% of the sample was rated as CDR≥1, i.e., as having at least mild dementia. The proportion of attrition among participants with a baseline CDR≥1 (91.3%) was much higher than among those with a CDR=0.5 (85.7%) or 0 (73.7%); therefore, for all statistical analyses, we included only participants who had a CDR<1 at their baseline assessment (n=1,959).
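The attrition proportions quoted above follow directly from the CDR rows of Table 1. The short Python check below recomputes them and the size of the analytic sample; the dictionary layout and function name are illustrative choices, while the counts are taken from the table.

```python
# Baseline CDR group -> (participants in group, participants lost over 10 years),
# copied from the "Entire cohort" and "Lost to follow-up" columns of Table 1.
TABLE1_CDR = {
    "0": (1413, 1041),
    "0.5": (546, 468),
    ">=1": (23, 21),
}


def attrition_percent(total, lost):
    """Percentage of a baseline group lost to follow-up."""
    return 100.0 * lost / total


rates = {cdr: round(attrition_percent(total, lost), 1)
         for cdr, (total, lost) in TABLE1_CDR.items()}

# Analytic sample: everyone with baseline CDR < 1.
analytic_n = sum(total for cdr, (total, _lost) in TABLE1_CDR.items()
                 if cdr != ">=1")

print(rates)
print(analytic_n)
```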

After combining data points from each participant at each annual visit (i.e., study cycle), Table 2 (1,959 participants with 12,024 records) shows the characteristics of participants who remained in the study vs. those who were lost in the next cycle. The associations between attrition and the variables are similar to those in Table 1, except that attrition is also significantly associated with not consuming alcohol during the preceding year, having no social activities, not being a caregiver, and feeling that one is not receiving enough help from others.

Table 2:

Characteristics by attrition status at the next cycle (combining data points from each participant at each annual visit)

Characteristic		All observations (N=12,024)	Remaining in the next cycle (N=10,648)	Lost to follow-up in the next cycle (N=1,376)	P value
		N (%)	N (%)	N (%)
Sex, n (%)	Female	7627 (63.4%)	6813 (64.0%)	814 (59.2%)	0.001
Education, n (%)	≤ High school	6669 (55.5%)	5817 (54.6%)	852 (61.9%)	<0.001
Education, n (%)	> High school	5355 (44.5%)	4831 (45.4%)	524 (38.1%)	<0.001
Resides with a study participant, n (%)	Yes	1974 (16.4%)	1804 (16.9%)	170 (12.4%)	<0.001
Age in year, mean(SD)		80.1 (7.23)	79.8 (7.12)	82.4 (7.67)	<0.001
Marital status, n (%)	Married or living as married	5479 (45.6%)	4926 (46.3%)	553 (40.2%)	<0.001
Living alone, n (%)	Yes	5017 (41.7%)	4408 (41.4%)	609 (44.3%)	0.046
Currently working, n (%)	Yes	1220 (10.1%)	1134 (10.6%)	86 (6.25%)	<0.001
Being a caregiver, n (%)	Yes	1178 (9.81%)	1068 (10.0%)	110 (8.03%)	0.021
Family history of dementia, n (%)	Yes	775 (6.65%)	694 (6.70%)	81 (6.22%)	0.545
CDR, n (%)	0	8944 (74.4%)	8130 (76.4%)	814 (59.2%)	<0.001
CDR, n (%)	0.5	3080 (25.6%)	2518 (23.6%)	562 (40.8%)	<0.001
MMSE, mean(SD)		27.2 (2.45)	27.3 (2.36)	26.2 (2.88)	<0.001
IADL score, n (%)	≤3	9105 (75.7%)	8287 (77.8%)	818 (59.5%)	<0.001
IADL score, n (%)	>3	2918 (24.3%)	2361 (22.2%)	557 (40.5%)	<0.001
Number of subjective memory complaints, n (%)	0–2	8091 (67.5%)	7279 (68.5%)	812 (59.4%)	<0.001
	3–4	2240 (18.7%)	1971 (18.6%)	269 (19.7%)
	≥5	1658 (13.8%)	1373 (12.9%)	285 (20.9%)
mCES-D score, n (%)	≤3	11273 (93.9%)	10027 (94.3%)	1246 (91.0%)	<0.001
mCES-D score, n (%)	>3	731 (6.09%)	608 (5.72%)	123 (8.98%)	<0.001
Smoked in past year, n (%)	Yes	678 (5.64%)	591 (5.55%)	87 (6.33%)	0.267
Alcohol use in past year, n (%)	Yes	6659 (58.7%)	5996 (59.5%)	663 (52.0%)	<0.001
Physical activity, n (%)	Yes	9645 (80.2%)	8663 (81.4%)	982 (71.4%)	<0.001
Having a hobby, n (%)	Yes	11520 (95.9%)	10275 (96.5%)	1245 (90.8%)	<0.001
Computer use, n (%)	Yes	5336 (44.4%)	4898 (46.0%)	438 (31.9%)	<0.001
Social activity, n (%)	Yes	11871 (98.8%)	10553 (99.1%)	1318 (96.2%)	<0.001
Social engagement, n (%)	Yes	10130 (84.4%)	9094 (85.5%)	1036 (75.6%)	<0.001
Self-rated health, n (%)	Poor	200 (1.66%)	144 (1.35%)	56 (4.08%)	<0.001
	Fair/good/very good	10591 (88.1%)	9360 (88.0%)	1231 (89.7%)
	Excellent	1224 (10.2%)	1138 (10.7%)	86 (6.26%)
Stroke or TIA history, n (%)	Yes	422 (3.51%)	348 (3.27%)	74 (5.40%)	<0.001
Diabetes, n (%)	Yes	2735 (22.8%)	2409 (22.6%)	326 (23.7%)	0.372
Cardiovascular disease history, n (%)	Yes	8811 (73.3%)	7764 (72.9%)	1047 (76.3%)	0.010
Sleep problems, n (%)	Yes	8055 (67.1%)	7143 (67.1%)	912 (66.6%)	0.718
Systolic blood pressure, n (%)	<140 mm Hg	8319 (70.8%)	7367 (70.6%)	952 (71.9%)	0.355
Systolic blood pressure, n (%)	≥140 mm Hg	3435 (29.2%)	3063 (29.4%)	372 (28.1%)	0.355
Diastolic blood pressure, n (%)	<90 mm Hg	11327 (96.4%)	10061 (96.5%)	1266 (95.7%)	0.176
Diastolic blood pressure, n (%)	≥90 mm Hg	425 (3.62%)	368 (3.53%)	57 (4.31%)	0.176
Number of prescription medications, n (%)	≤3	4831 (40.2%)	4372 (41.1%)	459 (33.5%)	<0.001
Number of prescription medications, n (%)	>3	7185 (59.8%)	6272 (58.9%)	913 (66.5%)	<0.001
Number of people in whom to confide, n (%)	<3	2484 (21.0%)	2130 (20.3%)	354 (26.4%)	<0.001
Number of people in whom to confide, n (%)	≥3	9364 (79.0%)	8377 (79.7%)	987 (73.6%)	<0.001
Frequency of leaving home, n (%)	Daily	5662 (47.2%)	5147 (48.4%)	515 (37.7%)	<0.001
	2–6 times/week	5453 (45.5%)	4820 (45.4%)	633 (46.3%)
	Less than weekly	876 (7.31%)	658 (6.19%)	218 (16.0%)
Having enough help, n (%)	Yes	11111 (92.5%)	9870 (92.8%)	1241 (90.6%)	0.005

Each participant can contribute to multiple records.

Table shows mean (sd) or frequency (%).

P values were based on t-test for continuous variables and Chi-square or Fisher exact test for categorical variables.

Table 3 shows the results of the multivariable logistic regression model. After adjusting for covariates, participants had a higher probability of leaving the study if they were older, male, not living with another MYHAT study participant, had no physical activities, no hobbies or interests, no social activities, no social engagement, left home less frequently, had lower MMSE scores, endorsed poor subjective health, had fewer confidants, fewer subjective memory complaints, were dependent in at least one IADL, and had a CDR >0.

Table 3:

Results of multivariable logistic regression model using stepwise variable selection

Variable	Reference group	Odds Ratio (OR)^*	95% CI for OR^*	P value^*
Sex: female	Male	0.763	0.659 – 0.867	<0.001
Education: > High school	≤ High school	0.905	0.78 – 1.03	0.156
Resides with another study participant: Yes	No	0.706	0.575 – 0.836	<0.001
Age (continuous, per year)	Mean age	1.028	1.017 – 1.039	<0.001
MMSE (continuous, per unit score)	Mean MMSE	0.94	0.914 – 0.966	<0.001
Physical activity: Yes	No	0.838	0.707 – 0.969	0.027
Hobby: Yes	No	0.651	0.484 – 0.819	0.001
Social activity: Yes	No	0.608	0.325 – 0.89	0.036
Social engagement: Yes	No	0.767	0.638 – 0.896	0.002
Subjective health rating: Fair/Good/Very good	Poor	0.48	0.296 – 0.664	<0.001
Subjective health rating: Excellent	Poor	0.347	0.19 – 0.505	<0.001
Stroke or TIA history: Yes	No	1.36	0.889 – 1.832	0.082
Number of people in whom to confide: ≥3	<3	0.825	0.7 – 0.949	0.012
Number of subjective memory complaints: 3–4	0–2	0.787	0.632 – 0.941	0.017
Number of subjective memory complaints: ≥5	0–2	0.771	0.593 – 0.949	0.027
IADL score: >0	=0	1.372	1.146 – 1.598	<0.001
CDR: 0.5	=0	1.532	1.214 – 1.851	<0.001
Frequency of leaving house: 2–6 times/week	Daily	0.975	0.829 – 1.12	0.736
Frequency of leaving house: less than weekly	Daily	1.343	1.001 – 1.685	0.023

Abbreviations. MMSE: Mini-Mental State Exam; IADL: Instrumental Activities of Daily Living; TIA: transient ischemic attack; CDR: clinical dementia rating

For the categorical variables (all variables except age and MMSE), the odds ratio is the odds of attrition at the next cycle for the listed group relative to the odds for the reference group. For the continuous variables (age and MMSE), the odds ratio is the odds of attrition at the next cycle for a participant one unit above the mean of that variable relative to the odds for a participant at the mean. 95% confidence intervals and P values are calculated based on the robust standard errors.

The post-hoc sensitivity analyses using four different GEE structures produced results very similar to the multivariable logistic regression model. Data are not shown here but are available as an appendix online (see Supplemental Tables 1–4).

Table 4 shows the areas under the ROC curve (AUC), allowing us to compare the discrimination performance of the final multivariable logistic regression model and the four machine learning approaches (LASSO, RF, GBM, and ANN). All five approaches produced very similar discrimination performance (AUC range: 0.623–0.681) in predicting attrition status in the next cycle. The logistic regression model provided the highest AUC in one split (AUC2) and an AUC within 0.006 of the highest in the other (AUC1); it also provides the associations between each predictor and the outcome and estimates their effect sizes (Table 3). Therefore, we mainly present the final analysis outcome based on the multivariable logistic regression model.

Table 4:

Classification performance via AUC using different machine learning methods

Method	AUC1^*	AUC2^*
Multivariable Logistic Regression	0.661	0.681
Least Absolute Shrinkage and Selection Operator (LASSO) logistic Regression	0.658	0.677
Random Forest (RF)	0.646	0.656
Gradient-Boosting Machine (GBM)	0.666	0.668
Artificial Neural Network (ANN)	0.667	0.623

Abbreviations. AUC: area under the Receiver Operating Characteristic curve. We randomly split records into two sets (set 1 and set 2, with a 1:1 splitting ratio). AUC1 was calculated by training the model on set 1 and applying it to set 2 (the testing data). AUC2 was calculated by training the model on set 2 and applying it to set 1.

DISCUSSION

In this ten-year population-based longitudinal aging study in a group of communities of low socioeconomic status, we assessed participants annually. Although we employed several measures between visits to enhance retention, by the tenth follow-up we had lost 77% of the original cohort. Those who already had dementia at study entry had the highest subsequent attrition rate. Other longitudinal aging studies with a similar length of follow-up have reported lower rates of attrition but with younger cohorts (Cacioppo et al., 2018; Mein et al., 2012); while an attrition rate closer to ours, after eleven years, was reported by Burke and colleagues (Burke et al., 2019) for an older group.

At each visit, likelihood of leaving the study before the next annual visit was increased by several variables as shown in Table 3. Despite these associations having varying degrees of strength, we found several statistically significant predictors of attrition, some of which confirm the findings of previous studies, and some of which appear to be new, as we will highlight below.

In our cohort, every year of increasing age was associated with an OR of 1.028 (2.8% higher odds) of attrition. Every additional point on the MMSE was associated with an OR of 0.94 (6% lower odds) of attrition. Older age and lower cognition have been shown to increase risk of study attrition in most previous studies, including in a systematic review of attrition in longitudinal population-based studies on aging, (Burke et al., 2019; Cacioppo et al., 2018; Chatfield et al., 2005; Matthews et al., 2006; Van Beijsterveldt et al., 2002) and these results are confirmed in our analyses. While we found male gender to be significantly associated with risk of attrition, the effect size was relatively modest (OR for women vs. men 0.763, CI 0.659–0.867, P<0.001). There is some consensus on this association, (Mein et al., 2012) but others have found the opposite. Van Beijsterveldt and colleagues (Van Beijsterveldt et al., 2002) assessed drop-out due to death and refusal separately and found that men were more likely to be lost due to death, while women were more likely to drop out due to refusal. Matthews et al. (Matthews et al., 2006) found that women were more likely to be lost but did not include attrition due to death in their analyses. We combined all forms of attrition for the current analyses because our primary objective was to understand, overall, who dropped out of our study for any reason over the first ten years of follow-up.
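To make the per-unit odds ratios easier to interpret, the Python sketch below converts them to percentage changes in odds and scales them to larger differences. The helper names are hypothetical, and the scaling assumes the log-linear relationship implied by entering age and MMSE as continuous terms in a logistic model. Under that assumption, a ten-year age difference corresponds to an OR of roughly 1.32, and an MMSE score five points lower corresponds to an OR of roughly 1.36.

```python
def percent_change_in_odds(odds_ratio):
    """Express a per-unit odds ratio as a percentage change in the odds."""
    return (odds_ratio - 1.0) * 100.0


def scaled_odds_ratio(odds_ratio, units):
    """Odds ratio for a difference of `units`, assuming log-linearity.

    On the log-odds scale the effect is additive, so the per-unit odds
    ratio is raised to the power of the difference.
    """
    return odds_ratio ** units


# Per-unit estimates reported in Table 3.
OR_AGE_PER_YEAR = 1.028
OR_MMSE_PER_POINT = 0.94

ten_years_older = scaled_odds_ratio(OR_AGE_PER_YEAR, 10)
five_points_lower = scaled_odds_ratio(OR_MMSE_PER_POINT, -5)

print(f"Age, per year: {percent_change_in_odds(OR_AGE_PER_YEAR):+.1f}% odds")
print(f"MMSE, per point: {percent_change_in_odds(OR_MMSE_PER_POINT):+.1f}% odds")
print(f"Ten years older: OR = {ten_years_older:.2f}")
print(f"MMSE five points lower: OR = {five_points_lower:.2f}")
```

These are changes in odds rather than in probability; the two are close only when the outcome is uncommon within a cycle.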

In our analysis, poor subjective health had the strongest significant association with attrition of all the predictors examined (OR 0.347 for excellent vs. poor self-rated health, CI 0.19–0.505, P<0.001). This finding is in line with those of several previous studies, (Kuh et al., 2016; Matthews et al., 2006; Salthouse, 2014; Young et al., 2006) although other groups have found that those with long-term health issues are more likely to remain in a study. (Deeg et al., 2002; Mein et al., 2012) We also found that participants who endorsed fewer subjective memory complaints were more likely to be lost to follow-up, a finding which, to our knowledge, has not been reported previously.

Social isolation and lesser engagement in social activities were also associated with attrition in our study and in those of other groups. (Cacioppo et al., 2018; Mein et al., 2012) Furthermore, a lack of other non-social activities (having a hobby or engaging in physical activity) was found to be associated with study loss. We also found a statistically significant association between exercise and attrition, although the effect size was relatively small (OR 0.838, CI 0.707–0.969, P<0.027). To our knowledge, the association of having hobbies and interests with study attrition has not been previously examined, and very few studies have examined exercise and physical activity as an attrition predictor. One population-based study of a middle-aged Japanese cohort found that those with lower physical activity were more likely to be lost after five years. (Hara et al., 2015) Although we could not determine why participants did not exercise or have hobbies, potential reasons could include apathy or depression, which are both common symptoms of MCI and dementia and may even precede cognitive decline. (Gallagher et al., 2017)

Education did not appear to predict attrition in our study population, whose median educational level was high school graduate. While other studies have demonstrated that lower education is a risk factor for study loss, (Burke et al., 2019; Cacioppo et al., 2018; Kuh et al., 2016; Matthews et al., 2006; Young et al., 2006) the literature is mixed. Chatfield’s review of attrition in longitudinal aging studies failed to find an association between education and attrition, (Chatfield et al., 2005) and in the Whitehall II study, those with higher education were more likely to be lost due to non-response but not withdrawal. (Mein et al., 2012) Young adults with higher levels of education were less likely to continue participation in the Virginia Cognitive Aging Project, while there was no association in older adults who had a mean of 16 years of education. (Salthouse, 2014) Further investigation of education as a predictor for attrition is needed. Education is challenging to compare across studies because not all studies reported a mean or median level of education, and the same grade or duration of education may represent different amounts and quality of education in different eras and generations or in different regions and populations. (Liu et al., 2015)

Residing with another MYHAT study participant appeared to reduce the likelihood of attrition. While this has not previously been examined in other population-based longitudinal aging studies, similar findings have been noted for proxy partners or informants in other types of studies. Burke and colleagues examined attrition predictors in 35 Alzheimer’s Disease Centers across the US and found that a co-participant/informant with “questionable reliability” increased the risk of loss. (Burke et al., 2019) Other, non-aging, studies have also found a similar association. In a cardiovascular prevention study of 2000 middle-aged adults, having a spouse or partner in the study was associated with lower odds of attrition after four years. (Bambs et al., 2013) Babatunde et al. observed that adult participants in a healthy eating and active living randomized controlled trial were more likely to continue participation after a year if they were enrolled with a partner. (Babatunde et al., 2017)

All longitudinal studies, particularly of older adults, will lose participants over time. Attrition is not random and will bias study results if ignored; the challenge is greater when the study outcome is cognitive impairment or dementia, which share common risk factors with mortality and attrition in general. If the goal is to estimate the effect of an exposure variable on cognitive impairment or incident dementia, methods to address potential attrition biases are necessary. Commonly used statistical methods include joint modeling and competing risks modeling, which simultaneously model both the primary outcome and attrition. (Agogo et al., 2018; Ganguli et al., 2013; Henderson, 2000; Li et al., 2018) In such analyses, attrition is not treated as missing completely at random or noninformative; instead, either covariates are used to model attrition and this model is incorporated into the main model of cognitive impairment or incident dementia (joint modeling), or informative dropout is built into the model specification itself (competing risks modeling). Other approaches, such as propensity score modeling (Dorsett, 2010; Ganguli et al., 2015; Wolinsky et al., 2010) and inverse probability weighting, (Daza et al., 2017; Ganguli et al., 2020) can be used to reshape the original dataset via matching or weighting to account for nonresponse or attrition bias, thus allowing results to be generalized back to the original cohort. We caution that these post-hoc methods only help to minimize attrition bias and do not “magically” repair serious biases in the data. We have previously published work classifying attrition as informative (death and illness) and non-informative/random (loss for other reasons, e.g. relocation) and accounted for only informative attrition in the joint models. (Ganguli et al., 2013) However, here we have combined all types of attrition since all types lead to potential bias and a loss of sample size and power, and our current goal is to identify its associated factors to minimize these challenges in the future.
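To make the idea of inverse probability weighting concrete, the following minimal sketch reweights retained participants by the inverse of their estimated probability of remaining in the study. It is not the method used in the cited analyses: the cohort, the stratum labels, and the scores are hypothetical toy data, and retention probabilities are estimated simply as the observed retained fraction per stratum, whereas in practice they would typically come from a fitted model such as a logistic regression on baseline covariates.

```python
# Minimal sketch of inverse probability weighting (IPW) for attrition.
# All data below are HYPOTHETICAL toy values, not study data.
from collections import defaultdict


def make_toy_cohort():
    """Build a toy cohort in which the higher-risk stratum drops out more."""
    cohort = []
    for i in range(10):  # 8 of 10 retained
        cohort.append({"stratum": "lower_risk", "score": 28, "retained": i < 8})
    for i in range(10):  # 4 of 10 retained
        cohort.append({"stratum": "higher_risk", "score": 22, "retained": i < 4})
    return cohort


def retention_probabilities(cohort):
    """Estimate P(retained | stratum) as the retained fraction per stratum."""
    totals = defaultdict(int)
    retained = defaultdict(int)
    for person in cohort:
        totals[person["stratum"]] += 1
        if person["retained"]:
            retained[person["stratum"]] += 1
    return {stratum: retained[stratum] / totals[stratum] for stratum in totals}


def naive_mean(cohort):
    """Mean score among retained participants, ignoring attrition."""
    scores = [p["score"] for p in cohort if p["retained"]]
    return sum(scores) / len(scores)


def ipw_mean(cohort):
    """Mean score among retained participants, weighted by 1 / P(retained)."""
    probabilities = retention_probabilities(cohort)
    weighted_sum = 0.0
    weight_total = 0.0
    for person in cohort:
        if not person["retained"]:
            continue
        weight = 1.0 / probabilities[person["stratum"]]
        weighted_sum += weight * person["score"]
        weight_total += weight
    return weighted_sum / weight_total


if __name__ == "__main__":
    cohort = make_toy_cohort()
    print("Naive mean among retained:", naive_mean(cohort))
    print("IPW mean among retained:  ", ipw_mean(cohort))
```

In this toy cohort the full-cohort mean score is 25; the naive mean among those retained is biased upward because the lower-scoring stratum is under-represented, and weighting recovers the full-cohort value. The correction works here only because dropout depends solely on the stratum used to build the weights, which is the kind of assumption the caution above refers to.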

Identifying the characteristics of those likely to leave the study also provides researchers with the opportunity, at the time of cohort recruitment, to oversample individuals with those characteristics, thus attempting to counter the inevitable subsequent attrition bias and loss of sample size. Recruitment for our study was based on random sampling and gave us no choice in selecting potential participants; however, studies recruiting volunteers might do well to recruit two or more eligible members of the same household to increase the chance of study retention. Additionally, if one of the cohabitants became too physically or cognitively ill to continue in the study, the remaining partner could continue to provide some information about that individual by proxy, as is done in the English Longitudinal Study of Ageing. (Steptoe et al., 2013) While it can be a challenge to oversample people who possess specific characteristics that are not known until after the study assessment is completed (e.g. cognitive status or self-rated health), opportunities can be taken to oversample the oldest-old and male participants.
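As a rough illustration of the oversampling idea, one simple planning calculation is to inflate the recruitment target for each stratum by the inverse of its expected retention. The sketch below is an assumption-laden example rather than a procedure from this study: the stratum labels, the retention proportions, and the target of 100 retained participants are all hypothetical.

```python
# Rough planning arithmetic for oversampling at recruitment.
# Stratum labels and retention proportions are HYPOTHETICAL.
import math

HYPOTHETICAL_RETENTION = {
    "younger-old": 0.50,  # assumed proportion still enrolled at study end
    "oldest-old": 0.25,
}


def recruits_needed(target_retained: int, expected_retention: float) -> int:
    """Number to recruit so that about `target_retained` remain at the end."""
    if not 0.0 < expected_retention <= 1.0:
        raise ValueError("expected_retention must be in (0, 1]")
    # Rounding before ceil guards against floating-point noise such as
    # 100.00000000000001 being pushed up to 101.
    return math.ceil(round(target_retained / expected_retention, 9))


def recruitment_plan(target_retained: int, retention_by_stratum: dict) -> dict:
    """Recruitment target per stratum for a common retained-sample goal."""
    return {
        stratum: recruits_needed(target_retained, retention)
        for stratum, retention in retention_by_stratum.items()
    }


if __name__ == "__main__":
    for stratum, n in recruitment_plan(100, HYPOTHETICAL_RETENTION).items():
        print(f"{stratum}: recruit {n} to retain about 100")
```

With these assumed values, the stratum expected to retain a quarter of its members would need four times the target at enrollment, compared with twice the target for the stratum expected to retain half.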

Our study is novel in that, in addition to our standard regression modeling, we also employed four different machine learning approaches to predict attrition from our study cohort. It is gratifying to note that all produced similar discrimination results, thus internally validating the findings from multivariable logistic regression. Note that the discrimination performance of the multivariable logistic regression and the four machine learning approaches was moderately low (AUCs <0.7 in all 5 methods). Other strengths of our study include its population-based nature, its large sample size at inception, its length of follow-up, as well as the relatively under-studied nature of the community, given its low socioeconomic status. Given these study design features, it was not possible to conduct the kinds of in-depth clinical and laboratory assessments, including neuroimaging, that have become more common in aging studies. For example, Burke et al. found that participants from various Alzheimer’s Disease Centers with lower hippocampal volume measured by MRI were significantly more likely to be lost. (Burke et al., 2019) In another longitudinal study of cognitive aging, greater white matter lesion volume (WMLV) and declines in hippocampal volume were significantly associated with attrition. (Glymour et al., 2012) We did, however, include several variables which others have not, and which we found to be associated with attrition, including no reported physical activity or hobbies and fewer subjective memory complaints, while residing with another study participant was protective against attrition. Finally, our study population is largely European American, reflecting the race and ethnicity of older adults in the targeted community. Our findings should be replicated in other cohorts with larger representation of ethnic minorities and a larger range of educational levels.
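For readers less familiar with the AUC values mentioned above, the area under the ROC curve can be read as the probability that a randomly chosen participant who dropped out received a higher predicted risk than a randomly chosen participant who stayed, with 0.5 corresponding to chance and 1.0 to perfect discrimination. The sketch below computes it directly from that pairwise definition; the predicted scores are hypothetical toy values, not output from any of the models in this study.

```python
# Pairwise (rank-based) computation of the area under the ROC curve.
# The predicted scores below are HYPOTHETICAL toy values.


def auc(scores_dropped, scores_retained):
    """Fraction of (dropped, retained) pairs ranked correctly; ties count half."""
    if not scores_dropped or not scores_retained:
        raise ValueError("both groups must contain at least one score")
    concordant = 0.0
    for dropped_score in scores_dropped:
        for retained_score in scores_retained:
            if dropped_score > retained_score:
                concordant += 1.0
            elif dropped_score == retained_score:
                concordant += 0.5
    return concordant / (len(scores_dropped) * len(scores_retained))


if __name__ == "__main__":
    predicted_for_dropped = [0.9, 0.6, 0.4]
    predicted_for_retained = [0.7, 0.5, 0.3, 0.2]
    print("AUC:", auc(predicted_for_dropped, predicted_for_retained))
```

In the toy data, 9 of the 12 possible pairs are ordered correctly. An AUC below 0.7, as reported for all five methods here, means that fewer than 70% of such pairs would be ordered correctly.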

In conclusion, results from 10-year annual follow-up of an aging population-based cohort in a community of relatively low socioeconomic status revealed a set of factors that predicted attrition from the study. These findings have implications for the design of future studies, including both selection/inclusion criteria to minimize attrition, and appropriate weighting of study results to address potential bias from attrition.

Supplementary Material

NIHMS1576520-supplement-1.docx (25.7 KB, docx)

ACKNOWLEDGEMENTS

The authors thank all MYHAT participants and the MYHAT project staff for their contributions to the study.

CONFLICT OF INTEREST

The work reported here was supported in part by research grant R01 AG023651 from the National Institute on Aging, National Institutes of Health, US DHHS. The National Institute on Aging had no role in the concept, design, methods, subject recruitment, data collection, preparation or approval of the manuscript.

Dr. Ganguli served on the “AD Patient Journey Working Group” for Biogen, Inc., in 2016 and 2017. All other authors have no conflicts to disclose.

REFERENCES

Agogo GO, Ramsey CM, Gnjidic D, Moga DC and Allore H (2018). Longitudinal associations between different dementia diagnoses and medication use jointly accounting for dropout. Int Psychogeriatr, 30, 1477–1487. 10.1017/S1041610218000017.
Amalakuhan B, et al. (2012). A prediction model for COPD readmissions: catching up, catching our breath, and improving a national problem. J Community Hosp Intern Med Perspect, 2. 10.3402/jchimp.v2i1.9915.
Babatunde OA, et al. (2017). Predictors of Retention among African Americans in a Randomized Controlled Trial to Test the Healthy Eating and Active Living in the Spirit (HEALS) Intervention. Ethn Dis, 27, 265–272. 10.18865/ed.27.3.265.
Bambs CE, et al. (2013). Sociodemographic, clinical, and psychological factors associated with attrition in a prospective study of cardiovascular prevention: the Heart Strategies Concentrating on Risk Evaluation study. Ann Epidemiol, 23, 328–333. 10.1016/j.annepidem.2013.02.007.
Burke SL, et al. (2019). Factors influencing attrition in 35 Alzheimer’s Disease Centers across the USA: a longitudinal examination of the National Alzheimer’s Coordinating Center’s Uniform Data Set. Aging Clin Exp Res, 31, 1283–1297. 10.1007/s40520-018-1087-6.
Cacioppo JT and Cacioppo S (2018). The Population-Based Longitudinal Chicago Health, Aging, and Social Relations Study (CHASRS): Study Description and Predictors of Attrition in Older Adults. Arch Sci Psychol, 6, 21–31. 10.1037/arc0000036.
Chang CC, Yang HC, Tang G and Ganguli M (2009). Minimizing attrition bias: a longitudinal study of depressive symptoms in an elderly cohort. Int Psychogeriatr, 21, 869–878. 10.1017/S104161020900876X.
Chatfield MD, Brayne CE and Matthews FE (2005). A systematic literature review of attrition between waves in longitudinal studies in the elderly shows a consistent pattern of dropout between differing studies. J Clin Epidemiol, 58, 13–19. 10.1016/j.jclinepi.2004.05.006.
Chu A, et al. (2008). A decision support system to facilitate management of patients with acute gastrointestinal bleeding. Artif Intell Med, 42, 247–259. 10.1016/j.artmed.2007.10.003.
Daza EJ, Hudgens MG and Herring AH (2017). Estimating inverse-probability weights for longitudinal data with dropout or truncation: The xtrccipw command. Stata J, 17, 253–278.
Deeg DJ, van Tilburg T, Smit JH and de Leeuw ED (2002). Attrition in the Longitudinal Aging Study Amsterdam. The effect of differential inclusion in side studies. J Clin Epidemiol, 55, 319–328. 10.1016/s0895-4356(01)00475-9.
Dorsett R (2010). Adjusting for nonignorable sample attrition using survey substitutes identified by propensity score matching: An empirical investigation using labour market data. J of Offic Stat, 26, 105–125.
Fillenbaum GG (1988). Multidimensional Functional Assessment of Older Adults: The Duke Older Americans Resources and Services Procedures. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Folstein MF, Folstein SE and McHugh PR (1975). “Mini-mental state”. A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res, 12, 189–198. 10.1016/0022-3956(75)90026-6.
Gallagher D, Fischer CE and Iaboni A (2017). Neuropsychiatric Symptoms in Mild Cognitive Impairment. Can J Psychiatry, 62, 161–169. 10.1177/0706743716648296.
Ganguli M, et al. (2020). Aging, diabetes, obesity, and cognitive decline: A population-based study. J Am Geriatr Soc, 68, 991–998. 10.1111/jgs.16321.
Ganguli M, Fu B, Snitz BE, Hughes TF and Chang CC (2013). Mild cognitive impairment: incidence and vascular risk factors in a population-based cohort. Neurology, 80, 2112–2120. 10.1212/WNL.0b013e318295d776.
Ganguli M, Gilby J, Seaberg E and Belle S (1995). Depressive Symptoms and Associated Factors in a Rural Elderly Population: The MoVIES Project. Am J Geriatr Psychiatry, 3, 144–160. 10.1097/00019442-199500320-00006.
Ganguli M, et al. (2015). Who wants a free brain scan? Assessing and correcting for recruitment biases in a population-based sMRI pilot study. Brain Imaging Behav, 9, 204–212. 10.1007/s11682-014-9297-9.
Ganguli M, Snitz B, Vander Bilt J and Chang CC (2009). How much do depressive symptoms affect cognition at the population level? The Monongahela-Youghiogheny Healthy Aging Team (MYHAT) study. Int J Geriatr Psychiatry, 24, 1277–1284. 10.1002/gps.2257.
Glymour MM, Chene G, Tzourio C and Dufouil C (2012). Brain MRI markers and dropout in a longitudinal study of cognitive aging: the Three-City Dijon Study. Neurology, 79, 1340–1348. 10.1212/WNL.0b013e31826cd62a.
Hara M, et al. (2015). Factors associated with non-participation in a face-to-face second survey conducted 5 years after the baseline survey. J Epidemiol, 25, 117–125. 10.2188/jea.JE20140116.
Hastie T, Tibshirani R and Friedman J (2008). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, NY: Springer.
Henderson R, Diggle P and Dobson A (2000). Joint modelling of longitudinal measurements and event time data. Biostatistics, 1, 465–480.
Hsich E, Gorodeski EZ, Blackstone EH, Ishwaran H and Lauer MS (2011). Identifying important risk factors for survival in patient with systolic heart failure using random survival forests. Circ Cardiovasc Qual Outcomes, 4, 39–45. 10.1161/CIRCOUTCOMES.110.939371.
Kuh D, et al. (2016). The MRC National Survey of Health and Development reaches age 70: maintaining participation at older ages in a birth cohort study. Eur J Epidemiol, 31, 1135–1147. 10.1007/s10654-016-0217-8.
Li Q and Su L (2018). Accommodating informative dropout and death: a joint modelling approach for longitudinal and semi-competing risks data. J R Stat Soc Ser C Appl Stat, 67, 145–163. 10.1111/rssc.12210.
Liu SY, Manly JJ, Capistrant BD and Glymour MM (2015). Historical Differences in School Term Length and Measured Blood Pressure: Contributions to Persistent Racial Disparities among US-Born Adults. PLoS One, 10, e0129673. 10.1371/journal.pone.0129673.
Lo-Ciganic WH, et al. (2019). Evaluation of Machine-Learning Algorithms for Predicting Opioid Overdose Risk Among Medicare Beneficiaries With Opioid Prescriptions. JAMA Netw Open, 2, e190968. 10.1001/jamanetworkopen.2019.0968.
Matthews FE, Chatfield M, Brayne C and Medical Research Council Cognitive Function and Ageing Study (2006). An investigation of whether factors associated with short-term attrition change or persist over ten years: data from the Medical Research Council Cognitive Function and Ageing Study (MRC CFAS). BMC Public Health, 6, 185. 10.1186/1471-2458-6-185.
Mein G, et al. (2012). Predictors of two forms of attrition in a longitudinal health study involving ageing participants: an analysis based on the Whitehall II study. BMC Med Res Methodol, 12, 164. 10.1186/1471-2288-12-164.
Morris JC (1993). The Clinical Dementia Rating (CDR): current version and scoring rules. Neurology, 43, 2412–2414. 10.1212/wnl.43.11.2412-a.
Mungas D, Marshall SC, Weldon M, Haan M and Reed BR (1996). Age and education correction of Mini-Mental State Examination for English and Spanish-speaking elderly. Neurology, 46, 700–706. 10.1212/wnl.46.3.700.
R Core Team (2014). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org.
Radloff LS (1977). The CES-D Scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385–401.
Salthouse TA (2014). Selectivity of attrition in longitudinal studies of cognitive functioning. J Gerontol B Psychol Sci Soc Sci, 69, 567–574. 10.1093/geronb/gbt046.
Snitz BE, et al. (2012). Subjective cognitive complaints of older adults at the population level: an item response theory analysis. Alzheimer Dis Assoc Disord, 26, 344–351. 10.1097/WAD.0b013e3182420bdf.
Steptoe A, Breeze E, Banks J and Nazroo J (2013). Cohort profile: the English longitudinal study of ageing. Int J Epidemiol, 42, 1640–1648. 10.1093/ije/dys168.
Thottakkara P, et al. (2016). Application of Machine Learning Techniques to High-Dimensional Clinical Data to Forecast Postoperative Complications. PLoS One, 11, e0155705. 10.1371/journal.pone.0155705.
Van Beijsterveldt CE, et al. (2002). Predictors of attrition in a longitudinal cognitive aging study: the Maastricht Aging Study (MAAS). J Clin Epidemiol, 55, 216–223. 10.1016/s0895-4356(01)00473-5.
Weuve J, et al. (2015). Guidelines for reporting methodological challenges and evaluating potential bias in dementia research. Alzheimers Dement, 11, 1098–1109. 10.1016/j.jalz.2015.06.1885.
Wolinsky FD, et al. (2010). Speed of processing training protects self-rated health in older adults: enduring effects observed in the multi-site ACTIVE randomized controlled trial. Int Psychogeriatr, 22, 470–478. 10.1017/S1041610209991281.
Young AF, Powers JR and Bell SL (2006). Attrition in longitudinal studies: who do you lose? Aust N Z J Public Health, 30, 353–361. 10.1111/j.1467-842x.2006.tb00849.x.

