Abstract
Objectives
The Health and Retirement Study Telephone Interview for Cognitive Status (HRS TICS) score and its associated Langa–Weir cutoffs are widely used as indicators of cognitive status for research purposes in population-based studies. The classification is based on in-person and phone interviews of older individuals. Our purpose was to develop a corresponding classification for web-based self-administered assessments.
Methods
Participants were 925 members of a nationally representative internet panel, all aged 50 and older. We conducted (a) a phone interview consisting of the cognitive items used to construct the HRS TICS score, and (b) a web counterpart with self-administered cognitive items, while also considering (c) other previously administered web-based cognitive tests and instrumental activities of daily living survey questions, all from the same respondents.
Results
The web-administered HRS TICS items have only modest correlations with the same phone items, although neither mode showed universally higher scores than the other. Using latent variable modeling, we created a probability of cognitive impairment score for the web-based battery that achieved good correspondence to the phone Langa–Weir classification.
Discussion
The results permit analyses of predictors, correlates, and consequences of cognitive impairment in web surveys where relevant cognitive test and functional abilities items are available. We discuss challenges and caveats that may affect the findings.
Keywords: Cognitive function, Cognitive testing, Dementia, Internet, Questionnaires and surveys
Self-administered web-based assessment of cognitive function has become increasingly common in population research (e.g., Patel et al., 2017; Perin et al., 2020; Selwood, 2022). Many investigators are interested in using such assessments in order to classify participants as having normal cognition, cognitive impairment short of dementia, or dementia. Even though these categories are not clinical diagnoses, they serve as a classification to enhance the interpretation of cognitive function. Existing classification algorithms, however, were developed with face-to-face or telephone-administered assessments. It cannot be assumed that these extant cutoffs transfer directly to web-based administration.
One of the most widely used classification algorithms was developed by the Health and Retirement Study (HRS), which uses the Langa–Weir criteria to categorize respondents as cognitively normal, cognitively impaired/not dementia (CIND), or dementia (Crimmins et al., 2011; Langa et al., 2017). Cognitive items from the HRS-adapted version of the Telephone Interview for Cognitive Status (TICS; Welsh et al., 1993) were administered to respondents in person or by telephone. The classification cutoffs were constructed by calibrating the TICS items against gold-standard in-person clinical assessments with a subsample of HRS participants through the Aging, Demographics, and Memory Study (ADAMS; Langa et al., 2005). In ADAMS, selected HRS participants received a full clinical evaluation for dementia and CIND. Subsequently, Langa and colleagues (Alzheimer’s Association, 2010, Appendix 15) developed cut points for the HRS TICS items to correspond to the ADAMS diagnoses and reproduce the same population distribution of cognitive categories. The advantage of the TICS score with Langa–Weir cutoffs is that it is available for all HRS respondents, not just the much smaller ADAMS subsample, and it is available in each assessment wave. The algorithm can also be applied to other studies administering the same TICS items via telephone. The main drawback is that it is not a clinical diagnosis and therefore subject to some misclassification. However, the classification is very useful for research purposes, for example, to estimate dementia prevalence rates, and to investigate protective and risk factors (Langa et al., 2017).
Even though there could be many benefits to translating the Langa–Weir criteria to web-based assessments, we cannot simply use the same algorithm and apply the existing cut points to web-based data. While in-person and phone batteries have been found largely equivalent in the HRS (Herzog et al., 1999), McClain et al. (2018) found higher scores when HRS tests—including number series, verbal analogies, word recall/recognition, and serial 7s—were self-administered on the web rather than interviewer-administered by phone or in-person.
Given this context, the purpose of the present study was to develop a classification algorithm for self-administered web interviews, applicable to the U.S. population aged 50 or older. This study was conducted in the Understanding America Study (UAS), a probability-based internet panel. It would be resource-prohibitive to conduct a study within the UAS similar to ADAMS, in which a sample would be visited and evaluated clinically, with their diagnoses then calibrated against their web-based UAS score profile. Therefore, we collected a crosswalk sample within the UAS where participants were administered the HRS cognitive module in a phone interview, and the HRS web adaptation of the cognitive module by web. This design takes the HRS TICS cognitive measures administered by phone as the reference and develops a crosswalk between participants’ phone and web data to create a web-based classification of UAS participants’ cognitive status.
We further considered whether predicting the phone classification could be enhanced by adding additional web-based cognitive tests and functional ability items from the UAS. Finally, we considered the correspondence of the eventual classification to respondents’ self-reports of having been told by a physician that they had dementia or Alzheimer’s disease.
Method
The UAS
The UAS is a probability-based (nationally representative) internet panel. Respondents are drawn from post-office delivery sequence files and provided with internet-enabled tablets if needed. It currently comprises more than 9,600 panel members—nearly 5,000 aged 50 years or older. The UAS collects a large set of background variables on all panel members: 14 core surveys on topics such as well-being, retirement planning, and personality are administered with an annual or biennial frequency alongside repeated assessments of cognitive functioning. Thus, UAS participants become experienced at completing online surveys.
Sample Selection and Randomization to Condition
For this study, we selected a calibration sample of English-speaking panel members using a design similar to comparable studies (Gross et al., 2020). We defined strata based on age (50–64, 65–74, 75+), cognitive test scores collected in prior UAS waves, and difficulties with instrumental activities of daily living (IADLs). We created eight strata: (1) everyone age 75+; (2) those aged <75 with difficulty on one or more IADLs; (3)–(8) among those <75 with no difficulty on any IADL, a cross-classification of age group (50–64; 65–74) and cognition group (lowest quartile; second; upper two combined). We selected all panel members in the oldest age group and in the group with difficulty on one or more IADLs, and randomly sampled equal numbers of respondents in each of the remaining strata, creating a gross sample of 2,000. This design oversampled individuals with a higher likelihood of having cognitive decline, which would increase the power of the analysis. Our target for the final sample was informed by the slightly over 800 participants used for constructing the Langa–Weir classification (Crimmins et al., 2011).
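The eight-stratum design described above can be expressed as a small assignment function. The sketch below is illustrative rather than the UAS sampling code; the stratum numbering and the quartile coding of prior cognitive scores are our assumptions.

```python
def assign_stratum(age, iadl_difficulties, cognition_quartile):
    """Map a panel member to one of the eight sampling strata.

    age: years (50+); iadl_difficulties: count of IADLs with difficulty (0-5);
    cognition_quartile: 1 = lowest, 2 = second, 3 or 4 = upper two
    (based on cognitive scores from prior UAS waves).  The numbering
    follows the order given in the text; exact coding is illustrative.
    """
    if age >= 75:
        return 1                        # stratum 1: everyone age 75+
    if iadl_difficulties >= 1:
        return 2                        # stratum 2: <75 with any IADL difficulty
    # strata 3-8: <75, no IADL difficulty; age group x cognition group
    age_group = 0 if age <= 64 else 1   # 50-64 vs 65-74
    cog_group = min(cognition_quartile, 3) - 1  # 0, 1, 2 (upper two combined)
    return 3 + age_group * 3 + cog_group
```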
We randomly assigned half the gross sample to receive the phone interview first and the other half to receive the web interview first. For the word recall test, which has four versions, we randomized which version was received first and which second, while ensuring that the word list used in the web interview differed from the word list used in the phone interview.
Because of lower than expected initial response, we supplemented the sample with 400 additional participants using the same selection algorithm. This study was approved by the University of Southern California (USC) University Park Campus Institutional Review Board.
Following typical UAS protocol, the 2,400 selected UAS panel members were notified by e-mail that a study was available for them, and directed to the online consent survey for this study. Figure 1 shows the number responding to the consent survey (allocation); the number who successfully answered the questions designed to assure informed consent, described in the Consent survey section below (eligibility); the number agreeing to take part in the study (enrollment); the number who completed both telephone and web surveys (participation); and the number excluded from analyses. The final analytic sample comprised 925 participants.
Figure 1.
Sample flow chart.
Protocol
The TICS items were administered in accordance with the HRS phone protocol. The HRS web adaptation of the TICS was administered on the web to the same UAS sample, in accordance with the HRS web protocol, within 1 month before or after the phone interview. The TICS has shown high test–retest reliability, for example, Desmond et al. (1994) showed 1-month test–retest correlations of r = 0.90; we considered a 1-month interval between tests as sufficiently brief to minimize biases from true changes in cognitive functioning between assessments, while simultaneously minimizing potential practice effects.
The telephone interviews were conducted by Davis Research, with training of interviewers by both USC and Davis staff and ongoing monitoring of data collection and scoring.
Telephone and Web Instrument for This Study
The HRS TICS telephone interview includes serial 7s subtraction, immediate and delayed word recall, and counting backwards from 20. The interview also includes object naming, recall of today’s date, and naming the president and vice president, which are not included in calculating the Langa–Weir cutoff and are not analyzed here.
The web survey included serial 7s and immediate and delayed word recall. Counting backwards was omitted as it is infeasible as a web-based test. (Instead, respondents were asked to click on four boxes as fast as possible and to do a typing speed test, neither analyzed here.) For each serial 7s subtraction, the respondent was given a line on which to type the answer. For immediate and delayed word recall, the respondent was given a large box in which to type as many words as were remembered.
In the serial 7s task, respondents, starting at 100, sequentially subtracted 7 for a total of five trials (scored 0–5 for the number of correct steps). Immediate and delayed word recall were each scored 0–10 based on the number of words correctly recalled. For the telephone survey, the interviewer recorded and scored answers during the interview, with scores subsequently reviewed and adjudicated as needed. For the web survey, misspelling was permitted following the guideline that oral pronunciation of the misspelled word should correspond to the correct answer.
From the telephone instrument, a composite measure was calculated comprising immediate and delayed recall, serial 7s, and counting backwards (scored as 0 = incorrect, 1 = correct on second attempt, or 2 = correct on first attempt), with a score range of 0–27. Following Langa–Weir, scores of 0–6 are categorized as “dementia,” 7–11 as “cognitively impaired/not dementia,” and 12–27 as “normal cognition.”
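The composite and its cut points amount to a simple scoring rule, which can be sketched as follows (the function names are ours; the component ranges and cut points are those stated above):

```python
def tics_composite(immediate, delayed, serial7s, backwards_count):
    """Sum the four components of the HRS TICS composite (range 0-27).

    immediate/delayed recall: 0-10 words each; serial7s: 0-5 correct
    subtractions; backwards_count: 0 = incorrect, 1 = correct on second
    attempt, 2 = correct on first attempt.
    """
    return immediate + delayed + serial7s + backwards_count

def langa_weir_category(score):
    """Apply the Langa-Weir cut points to a 0-27 TICS composite."""
    if score <= 6:
        return "dementia"
    if score <= 11:
        return "CIND"            # cognitively impaired/not dementia
    return "normal cognition"
```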
Other Cognitive Tests in the UAS
The UAS has administered Number Series and Verbal Analogies tests (McArdle et al., 2015); a Picture Vocabulary test (McArdle et al., 2015); the Stop and Go Switching Task (SGST; Lachman et al., 2014) to capture reaction time and executive processes—adapted from telephone for web by UAS (Liu et al., 2022); and the Figure Identification task (Berg, 1980) as a measure of processing speed—adapted from paper and pencil for the web by UAS (Liu et al., 2022). The Number Series, Verbal Analogies, and Picture Vocabulary tests have scaled scores based on item response theory. SGST scores are median response times in seconds on four conditions: baseline, reversing which stimulus corresponds to which response, responses to trials immediately after being asked to switch conditions, and responses after nonswitch trials. The Figure Identification task scores are the number of correctly identified figures out of 30 within a 90-s time limit. Scores on SGST and Figure Identification are only calculated if over 70% of items are correctly answered. A separate variable indicates whether the respondent achieved that threshold or not.
Self-Reported IADLs and Self-Reported Dementia
As in the HRS, members of the UAS report limitations with IADLs every two years. Questions ask for difficulties using a telephone, taking medication, handling money, shopping, and preparing meals (score range 0–5).
Also following the HRS, UAS panelists are asked if they had ever been told by a doctor that they had Alzheimer’s disease. A separate question asks about “dementia, senility, or any other serious memory impairment” (McGrath et al., 2021).
Consent Survey
Although everyone invited to this study was already an enrolled and consented member of the UAS panel, the USC institutional review board required an expanded consent survey, as well as explicitly stating in the consent survey that the person was not eligible to participate if they had been diagnosed by a physician with a progressive cognitive impairment that would interfere with ability to consent to this study. Participants were asked three multiple choice questions—whether it is ok to refuse to participate in the survey, what the person will be asked to do in this study, and whether refusing to participate in this survey affects future eligibility for other UAS surveys—all of which had to be answered correctly before the potential participant was asked if they wished to consent.
Data Analysis
Initial analyses addressed the correspondence of the TICS subtests across phone and web administration modes. We examined correlations and mean differences (using paired-samples t tests) for subtests across the modes. To examine potential order of administration effects, we tested the interaction between assignment order (between-subjects factor) and mode of administration (within-subjects factor) in two-way analysis of variance models.
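These mode and order analyses can be sketched compactly. In a 2 (order, between subjects) × 2 (mode, within subjects) mixed design, the interaction F is equivalent to the squared t from an independent-samples t test on the phone-minus-web difference scores across order groups, so the sketch below (an illustration, not the analysis code) uses that shortcut:

```python
from scipy import stats

def mode_and_order_effects(phone, web, phone_first):
    """Test mode and order-of-administration effects for one subtest.

    phone, web: paired scores per respondent; phone_first: booleans
    indicating assignment order.  The mode effect is a paired t test.
    The order x mode interaction in a 2 x 2 mixed ANOVA is equivalent
    to comparing phone-minus-web difference scores between the two
    order groups, tested here with an independent-samples t test.
    """
    diffs = [p - w for p, w in zip(phone, web)]
    mode_result = stats.ttest_rel(phone, web)
    d_first = [d for d, f in zip(diffs, phone_first) if f]
    d_second = [d for d, f in zip(diffs, phone_first) if not f]
    interaction_result = stats.ttest_ind(d_first, d_second)
    return mode_result, interaction_result
```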
A latent variable modeling (structural equation modeling) strategy was used to create a crosswalk between the phone and web TICS scores. A benefit of this approach over classical equating techniques is that it accounts for unreliability in the observed sum scores due to measurement error and the associated uncertainty about the true cognitive scores. The immediate recall, delayed recall, and serial 7s scores obtained from web administration served as indicators of a latent factor. To place the web cognitive factor on the same metric as the composite (0–27) TICS scores derived from the phone administration, the web factor was correlated with the phone TICS scores in the same measurement model, and the web factor mean and variance were held equal to those of the phone TICS scores, while allowing all loadings and intercepts of the web factor indicators to be freely estimated. To account for order of administration effects, the factor indicators and phone TICS scores were regressed on a dummy variable coded as 0 = “web administration first” and 1 = “phone administration first.” Model fit was evaluated with the chi-square test of global model fit, comparative fit index (CFI; values >0.95 indicate acceptable fit), Tucker–Lewis index (TLI; >0.95 for good fit), root mean square error of approximation (RMSEA; <0.06 for good fit), and standardized root mean square residual (SRMR; <0.08 for acceptable fit; Hu & Bentler, 1999).
Because only two (0.2%) respondents fell below the TICS cut point of 7 to be classified as dementia in this sample, it was not feasible to develop a crosswalk specifically for the dementia category. For this reason, we only developed a crosswalk mapping the (continuous) web-based scores onto the distinction between “normal cognition” and “cognitively impaired or dementia.” To that end, we generated 100 multiple imputations (i.e., “plausible values”) of the latent web factor per respondent (see Asparouhov & Muthén, 2010), and computed the proportion of multiply imputed scores that were above versus below the Langa–Weir cutoff for CIND for each respondent (a value of 11.5 was chosen as cutoff because it is between the integer values of 11 and 12 separating normal cognition from CIND on the TICS). Rather than assigning each person categorically to a cognitive category, this procedure gives a “probability of cognitive impairment” (PCI) score per respondent. We opted for this approach because it preserves the information contained in the continuous factor and accounts for uncertainty in the cognitive classifications from the web assessment. We used receiver operating characteristic (ROC) analysis and examined the area under the ROC curve to quantify the extent to which the PCI scores successfully discriminated the Langa–Weir categories based on the phone TICS scores.
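The PCI calculation and the ROC summary can be sketched in a few lines. This is an illustrative reimplementation (the plausible values themselves come from the Bayesian latent variable model in Mplus); the AUC is computed here via the Mann–Whitney pairwise identity rather than an ROC library.

```python
def pci_score(plausible_values, cutoff=11.5):
    """Probability of cognitive impairment: the proportion of a
    respondent's multiply imputed (plausible) factor scores falling
    below the Langa-Weir CIND cut point (11.5 splits the integer
    scores 11 and 12)."""
    return sum(pv < cutoff for pv in plausible_values) / len(plausible_values)

def auc(pci_impaired, pci_normal):
    """Area under the ROC curve via the Mann-Whitney identity: the
    probability that a randomly chosen impaired respondent has a higher
    PCI score than a randomly chosen normal respondent (ties = 1/2)."""
    wins = 0.0
    for x in pci_impaired:
        for y in pci_normal:
            wins += 1.0 if x > y else (0.5 if x == y else 0.0)
    return wins / (len(pci_impaired) * len(pci_normal))
```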
We also examined whether the linkage (i.e., correlation) between the web cognitive factor and telephone TICS scores, and, correspondingly, the quality of the crosswalk, could be improved by expanding the measurement model of the web assessment. One model added the number of IADLs with difficulty as an indicator of the latent factor. Additional sets of models added tests of executive functioning and processing speed. We used Mplus version 8.7 (Muthén & Muthén, 2017) for the latent variable models, employing maximum likelihood parameter estimation for model testing and Bayesian parameter estimation with default diffuse priors to generate multiply imputed factor scores for calculation of the PCI scores.
Results
Nonresponse
As shown in Supplementary Table S1, the strata composed of those who were younger, less functionally impaired, and with higher prior cognitive scores were overrepresented in the analytic sample compared to the initial sample of 2,400 UAS panel members. Supplementary Table S2 shows that individuals in the final sample on average had higher cognitive scores, and were younger and less likely to be male. All differences, although statistically significant, are relatively modest and are not expected to influence the validity of the main analyses.
Summary Statistics
Table 1 describes the demographic characteristics of the 925 participants. Mean age was 66.1 years (SD = 9.3). Tests of randomization to assignment order identified one statistically significant difference with those randomized to receiving the phone interview first being somewhat older than those randomized to receiving the web survey first (67.1 vs 65.1 years, t = 3.26, p = .0012). There were no assignment order differences by gender, education, income level, or race/ethnicity. Self-assessed computer skills by age group are pictured in Supplementary Figure S1. Compared to younger respondents, the oldest respondents were more likely to call their skills “moderate” than “competent.”
Table 1.
Demographic Characteristics of Analytic Sample (N = 925)
| Characteristic | Category | Frequency | Percent |
|---|---|---|---|
| Age group | 50s | 257 | 27.8 |
| | 60s | 341 | 36.9 |
| | 70s and older | 327 | 35.3 |
| Gender | Men | 367 | 39.7 |
| | Women | 558 | 60.3 |
| Annual household income | Less than $25,000 | 190 | 20.6 |
| | $25,000–$50,000 | 224 | 24.3 |
| | $50,000–$75,000 | 173 | 18.8 |
| | $75,000–$100,000 | 132 | 14.3 |
| | Over $100,000 | 203 | 22.0 |
| Race/ethnicity | Non-Hispanic white | 735 | 79.6 |
| | Black | 76 | 8.2 |
| | Hispanic | 62 | 6.7 |
| | Native American | 7 | 0.8 |
| | Asian and Pacific Islander | 14 | 1.5 |
| | Mixed race | 29 | 3.1 |
| Education | High school graduate or less | 170 | 18.4 |
| | More than high school/less than college graduate | 378 | 40.9 |
| | College graduate | 377 | 40.8 |
Note: Some frequencies sum to less than 925 due to missing data.
Table 2 compares scores on the TICS cognitive items across administration modes. Correlations between scores on the same test administered by different modes ranged between 0.32 and 0.40. There were significant mode effects on mean scores for all subtests: word list recall scores were significantly higher for phone than for web administration (Cohen’s d = 0.40 for immediate recall and d = 0.38 for delayed recall), whereas serial 7s scores were significantly higher for web than for phone administration (d = 0.28). Part of the mode effects was due to some participants on the web skipping over the box where they were asked to type the words that they remembered. Mean differences were smaller but remained significant when those participants (n = 15 immediate recall; n = 76 delayed recall) were excluded. Further, there was a statistically significant interaction between assignment order and mode of administration for immediate (F(1,908) = 8.32, p = .0040) and delayed word list recall (F(1,847) = 14.13, p = .0002), whether or not participants who skipped the response box on the web were excluded. Specifically, as also seen in Supplementary Figure S2, those who were administered the phone interview first scored higher on the web test than those who took the web test first. Although different lists of words were used at each administration, when the test was administered first by phone and second by web, the web scores showed a significant practice effect.
Table 2.
Means, Distributions, and Pairwise Correlations of the Cognitive Tests in the Phone Interview and Web Survey (N = 925)
| Cognitive test | Phone mean (SD) | Web mean (SD) | Difference | Paired-sample t | p | Correlation across administration modes |
|---|---|---|---|---|---|---|
| Immediate word recall | 6.1 (1.6) | 5.4 (1.8) | 0.68 | 10.49 | <.0001 | 0.34 |
| Delayed word recall | 5.1 (1.9) | 4.3 (2.1) | 0.75 | 10.22 | <.0001 | 0.40 |
| Serial 7s | 4.1 (1.2) | 4.4 (1.0) | −0.31 | −7.17 | <.0001 | 0.32 |
When the Langa–Weir criteria were applied to the telephone interview, 2 (0.2%) of the 925 participants who completed both modes were categorized as having dementia, and 58 (6.3%) were categorized as cognitively impaired/not dementia. The remaining 93.5% were classified as cognitively normal.
Calibration of Crosswalk Between Phone- and Web-Based Cognitive Test Scores
The latent variable models are shown in Supplementary Figure S3 and the parameter estimates of the models are in Supplementary Tables S3a and S3b. The primary model with web-administered immediate recall, delayed recall, and serial 7s as indicators of a latent web cognitive factor fit the data well (χ²(1) = 0.88, p = .35; CFI = 1.0; TLI = 1.0; RMSEA = 0.00 [90% CI = 0.00–0.09]; SRMR = 0.003; note that the model included a residual correlation between immediate and delayed recall). The web factor was correlated with the telephone TICS scores at r = 0.67.
Figure 2 shows the distribution of the PCI scores calculated for the respondents from this model, plotted against the Langa–Weir categories. The PCI scores increased with decreasing TICS scores as expected, but showed a wider range for lower compared to higher TICS scores. The median PCI score among respondents classified as cognitively normal was 0.00 (mean = 0.04, interquartile range = 0.00–0.04), and the median score for those classified as CIND or dementia was 0.36 (mean = 0.40, interquartile range = 0.17–0.57). The effect size for the mean difference in PCI scores between these cognitive categories was d = 3.15 (p < .001). In ROC analysis, the PCI scores distinguished the Langa–Weir categories with area under curve (AUC) = 0.947 (see Supplementary Figure S4 for ROC curve). Table 3 shows PCI scores corresponding to sum scores calculated from web-administered immediate recall, delayed recall, and serial 7s. The table may be used as a scoring table to convert the sum scores into PCI scores.
Figure 2.
Scatterplot of probability of cognitive impairment (PCI) scores from web administration against TICS scores from phone administration (left) and box-and-whisker plots of PCI scores by cognitive category (right). Notes: CIND = cognitively impaired, not dementia; TICS = Telephone Interview for Cognitive Status. In the box-and-whisker plots, black filled circles represent the mean and black vertical lines represent the median score.
Table 3.
Crosswalk for Probability of Cognitive Impairment (PCI) Scores Corresponding to the Sum Score of Web-Administered Immediate Recall, Delayed Recall, and Serial 7s
| Sum score | Expected PCI score |
|---|---|
| 0 | 0.96 |
| 1 | 0.86 |
| 2 | 0.75 |
| 3 | 0.64 |
| 4 | 0.52 |
| 5 | 0.41 |
| 6 | 0.33 |
| 7 | 0.27 |
| 8 | 0.22 |
| 9 | 0.18 |
| 10 | 0.13 |
| 11 | 0.09 |
| 12 | 0.06 |
| 13 | 0.04 |
| 14 | 0.02 |
| 15 | 0.01 |
| 16 | 0.005 |
| 17 | 0.004 |
| 18 | 0.003 |
| 19 | 0.002 |
| 20 | 0.001 |
| 21 | 0.001 |
| 22 | 0 |
| 23 | 0 |
| 24 | 0 |
| 25 | 0 |
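For use in analysis code, Table 3 can be transcribed directly as a lookup. This sketch is illustrative; the dictionary reproduces only the rounded precision of the published table.

```python
# Expected PCI by web sum score (immediate recall + delayed recall +
# serial 7s, range 0-25), transcribed from Table 3.
PCI_CROSSWALK = {
    0: 0.96, 1: 0.86, 2: 0.75, 3: 0.64, 4: 0.52, 5: 0.41,
    6: 0.33, 7: 0.27, 8: 0.22, 9: 0.18, 10: 0.13, 11: 0.09,
    12: 0.06, 13: 0.04, 14: 0.02, 15: 0.01, 16: 0.005, 17: 0.004,
    18: 0.003, 19: 0.002, 20: 0.001, 21: 0.001,
    22: 0.0, 23: 0.0, 24: 0.0, 25: 0.0,
}

def sum_to_pci(sum_score):
    """Convert a 0-25 web sum score to its expected PCI score."""
    return PCI_CROSSWALK[sum_score]
```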
Expanded Crosswalk Models
We tested several model expansions, starting with the addition of IADLs. On average, IADL questions were answered 7.5 months (median = 7.23 months) before the web cognitive tests. Two panelists had missing values because they did not complete an IADL assessment, and IADL scores were excluded for 16 panelists who had completed the assessment more than 1.5 years before the web cognitive tests. Adding IADLs as an indicator of the web latent factor yielded a model with acceptable fit (χ²(5) = 24.49, p < .001; CFI = 0.98; TLI = 0.95; RMSEA = 0.06 [90% CI = 0.04–0.09]; SRMR = 0.029). The factor correlated with the telephone TICS scores at r = 0.76. For a conversion of sum scores of immediate recall, delayed recall, serial 7s, and IADLs into PCI scores, see Supplementary Table S4.
When scores from the four SGST conditions and the Figure Identification test were added as indicators of the latent factor, a model with acceptable fit was obtained only after fitting a bifactor model in which these five cognitive tests loaded on the web cognitive factor and also formed their own common factor (see Supplementary Figure S2); the fit indices were χ²(25) = 128.99, p < .001; CFI = 0.96; TLI = 0.92; RMSEA = 0.07 (90% CI = 0.06–0.08); SRMR = 0.031. The web cognitive factor and telephone TICS scores were correlated r = 0.69 in this model.
When the SGST, Figure Identification, and IADL scores were all added as indicators, acceptable model fit was obtained when using a bifactor model as above, χ²(34) = 155.50, p < .001; CFI = 0.95; TLI = 0.92; RMSEA = 0.06 (90% CI = 0.05–0.07); SRMR = 0.033. The web cognitive factor and telephone TICS scores were correlated r = 0.74.
The PCI scores derived from each of these expanded models distinguished the phone-based Langa–Weir categories with AUC = 0.981 (for the model expansion by IADLs), AUC = 0.960 (expansion by SGST and Figure Identification), and AUC = 0.962 (SGST, Figure Identification, and IADLs).
Self-Reported Alzheimer’s Disease and Dementia
In the analytic sample of 925, 908 individuals answered items about self-reported dementia in at least one prior survey. Of those who had answered, 1.9% reported that they had been told that they had Alzheimer’s disease or another form of dementia or serious memory impairment. Neither of the two individuals meeting the Langa–Weir cut point for dementia responded affirmatively to any self-reported dementia item. However, the scores on the composite TICS telephone measure were significantly lower for those self-reporting dementia, mean = 14.8 (SD = 5.3), than for those not self-reporting dementia, mean = 17.3 (SD = 3.6); t(906) = 2.81, p = .005. A corresponding pattern was evident for the PCI scores, which were significantly higher for those self-reporting dementia, mean = 0.21 (SD = 0.14) compared to those not self-reporting dementia, mean = 0.06 (SD = 0.26); t(906) = 4.32, p < .001, when using PCI scores derived from the primary latent variable model (results were similar for the expanded models).
Consent Survey
A total of 1,910 individuals completed the questions on the consent survey. Supplementary Table S5 shows the percent who gave an incorrect answer to each question. As seen in Figure 1, 365 respondents (19.1%) answered at least one of the three questions incorrectly. Supplementary Table S6 shows the demographic predictors of missing one or more questions on the consent survey. Those most likely to answer incorrectly, and therefore to be screened out from continuing in the study, were members of an ethnic/racial group other than non-Hispanic White, had an educational attainment of high school or less, had lower household income, and were in their 50s rather than 60 or older.
Supplementary Table S7 shows the scores on the UAS cognitive tests at the wave just prior to the consent survey. Across all tests except the SGST, those who had missed one or more questions on the consent survey scored significantly lower than those who successfully answered all three consent questions. On the SGST as well as on Figure Identification, those who had missed one or more questions on the consent survey were significantly more likely to fail to get at least 70% of the trials correct. Although mean scores on the tests differed significantly, the magnitude of the difference between groups was less than 1 SD.
Discussion
We set out to create a seemingly simple crosswalk between telephone and web administration of the TICS items used by the Langa–Weir categorization of cognitive status. Consistent with prior research (McClain et al., 2018), we found mean differences across the two modes with small to medium effect sizes for each of the TICS items. However, developing a crosswalk turned out to be more challenging than anticipated. As we already knew, asking participants to count backwards was not feasible on the web, reducing the number of items and precluding an exact replication of the Langa–Weir categorization. The word list recall and serial 7s scores showed only modest correlations between the modes of administration, in part due to differential practice effects in 1-month repeated testing, which renders it difficult to establish a precise alignment between the scores. Further, despite our efforts to oversample lower functioning individuals, there were too few participants whose cognitive deficits were suggestive of dementia to develop a crosswalk for dementia categorization.
Despite these challenges, we were able to create a PCI score for web administration that indicates a person’s PCI corresponding to the Langa–Weir cut point for “CIND or dementia” on telephone scores. The PCI score is consistent with the idea of a latent dementia index, the benefits of which have been increasingly recognized (Gavett et al., 2015; Peh et al., 2017; Royall et al., 2012). Despite being a continuous score, the PCI can be used by researchers similarly to a categorical score to estimate group differences or change in the PCI. The PCI score can be calculated based on only word list recall and serial 7s scores; our results suggest good correspondence with the Langa–Weir categorization, even though the resulting PCI scores will have modest reliability and considerable measurement error. A more robust probability score results when functional abilities or a combination of functional abilities, executive function, and processing speed are included—consistent with the experiences of other HRS efforts reported in Crimmins et al. (2011). Overall, based on fit, complexity, and ability to apply to other studies, in our assessment, the model including functional abilities but without the additional cognitive tests is the preferred PCI score. The PCI score is usable for research entailing analyses of cognitive impairment in web-based administrations of the UAS TICS, and, pending replication in other studies, should be transferrable to other web surveys with the same cognitive and functional ability items.
The number of panelists who failed to get past the consent survey is troubling. Comparing the characteristics of those who successfully completed the consent items with those who did not, we speculate that loss of participants to the consent process may have reduced the diversity of the sample and perhaps inadvertently eliminated participants with cognitive impairment who were nonetheless competent to consent. On average, those who did not pass all three consent items had scores on prior cognitive tests that, albeit significantly lower, were within the range that would be considered normal cognition. Because poorer performance on the consent questions occurred more often in younger than older respondents, and because failure to answer the consent questions correctly was related to inattentiveness on the Stop and Go Switching Task and Figure Identification, the consent questions may have functioned more as an instructional manipulation check than as an assessment of capacity to consent. In brief, the consent process truncated the sample without evidence that it accurately detected those who were not capable of making an informed decision about participation in the project.
Potential UAS panel members go through informed consent before agreeing to join the online survey panel. UAS members do not provide additional consent to do each online survey. Those who are recruited for projects with any additional elements, such as this study, are provided information about the study in a consent survey specific to that project. The underlying assumption for panel participants is that those who are able to consent to panel participation, read and respond to recruitment e-mails, and navigate the process of maintaining their UAS account are capable of consenting to projects that involve survey procedures. We suggest that consent to a project that involves more than just an online survey might include features that reduce the possibility of inattention. For example, after each component of the consent survey, the participant might be asked to check a box indicating if they understood the information.
The small number of those likely to have dementia probably reflects, accurately, the inability of people with a substantial degree of cognitive impairment to participate in self-administered internet surveys. However, dementia represents a continuum, and even being told that one has Alzheimer’s disease need not signal inability to sign into a web-based survey and choose to participate in research. A limitation of self-administered web panels such as the UAS is that reports from proxy respondents are not available. In HRS, most people are classified as cognitively impaired based on reports of functioning problems by proxies rather than cognitive testing (Crimmins et al., 2011). Follow-up with nonresponders as part of attrition analyses will be informative about reasons for discontinuing UAS participation.
We note additional limitations and potentials of web-based cognitive screening. Environmental distractions cannot be controlled, although we recommend that participants set aside uninterrupted time (Liu et al., 2022). Confidence in one’s computer skills varies among individuals, including by age, although the tasks demand little computer expertise. Unlike in an interviewer-administered session, it is not possible to prompt the participant to answer the questions, potentially leading to more missing data. Future versions might add a pop-up prompt, for example, asking those who elect to skip word recall whether they would like to try to remember some words. Taking advantage of features available with a computer, such as including timed tasks in the crosswalk models, might also be considered.
In conclusion, we report a PCI score that allows researchers to use the UAS panel to study correlates of cognitive impairment and offers an approach that can be adopted by other similar internet panels.
Supplementary Material
Acknowledgments
The project described in this paper relies on data from surveys administered by the Understanding America Study (UAS), which is maintained by the Center for Economic and Social Research (CESR) in the Dornsife College of Letters, Arts and Sciences at the University of Southern California. There was no preregistration. For access to UAS data, go to https://uasdata.usc.edu/.
Contributor Information
Margaret Gatz, Center for Economic and Social Research, University of Southern California, Los Angeles, California, USA.
Stefan Schneider, Center for Economic and Social Research, University of Southern California, Los Angeles, California, USA.
Erik Meijer, Center for Economic and Social Research, University of Southern California, Los Angeles, California, USA.
Jill E Darling, Center for Economic and Social Research, University of Southern California, Los Angeles, California, USA.
Bart Orriens, Center for Economic and Social Research, University of Southern California, Los Angeles, California, USA.
Ying Liu, Center for Economic and Social Research, University of Southern California, Los Angeles, California, USA.
Arie Kapteyn, Center for Economic and Social Research, University of Southern California, Los Angeles, California, USA.
Funding
This work was supported by the National Institutes of Health (U01 AG054580 and R01 AG068190) and the Social Security Administration.
Conflict of Interest
M. Gatz reports being a Medical and Scientific Advisory Committee member (no financial relationship) of Alzheimer’s Los Angeles. No other conflicts of interest reported.
References
- Alzheimer’s Association. (2010). 2010 Alzheimer’s disease facts and figures. Alzheimer’s & Dementia, 6, 158–194. doi: 10.1016/j.jalz.2010.01.009
- Asparouhov, T., & Muthén, B. (2010). Plausible values for latent variables using Mplus. Technical report. https://www.statmodel.com/download/Plausible.pdf
- Berg, S. (1980). Psychological functioning in 70- and 75-year-old people: A study in an industrialized city. Acta Psychiatrica Scandinavica, 62(Suppl. 288), 5–47. PMID: 6935934.
- Crimmins, E. M., Kim, J. K., Langa, K. M., & Weir, D. R. (2011). Assessment of cognition using surveys and neuropsychological assessment: The Health and Retirement Study and the Aging, Demographics, and Memory Study. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 66(Suppl_1), i162–i171. doi: 10.1093/geronb/gbr048
- Desmond, D. W., Tatemichi, T. K., & Hanzawa, L. (1994). The Telephone Interview for Cognitive Status (TICS): Reliability and validity in a stroke sample. International Journal of Geriatric Psychiatry, 9(10), 803–807. doi: 10.1002/gps.930091006
- Gavett, B. E., Vudy, V., Jeffrey, M., John, S. E., Gurnani, A. S., & Adams, J. W. (2015). The δ latent dementia phenotype in the uniform data set: Cross-validation and extension. Neuropsychology, 29, 344–352. doi: 10.1037/neu0000128
- Gross, A. L., Khobragade, P. Y., Meijer, E., & Saxton, J. A. (2020). Measurement and structure of cognition in the Longitudinal Aging Study in India–Diagnostic Assessment of Dementia (LASI-DAD). Journal of the American Geriatrics Society, 68(Suppl. 3), S11–S19. doi: 10.1111/jgs.16738
- Herzog, A. R., Rodgers, W. L., Schwarz, N., Park, D. C., Knauper, B., & Sudman, S. (1999). Cognitive performance measures in survey research on older adults. In Schwarz N., Park D., Knauper B., & Sudman S. (Eds.), Cognition, aging, and self-reports (pp. 327–340). Psychology Press.
- Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6(1), 1–55. doi: 10.1080/10705519909540118
- Lachman, M. E., Agrigoroaei, S., Tun, P. A., & Weaver, S. L. (2014). Monitoring cognitive functioning: Psychometric properties of the Brief Test of Adult Cognition by Telephone. Assessment, 21(4), 404–417. doi: 10.1177/1073191113508807
- Langa, K. M., Larson, E. B., Crimmins, E. M., Faul, J. D., Levine, D. A., Kabeto, M. U., & Weir, D. R. (2017). A comparison of the prevalence of dementia in the United States in 2000 and 2012. JAMA Internal Medicine, 177(1), 51–58. doi: 10.1001/jamainternmed.2016.6807
- Langa, K. M., Plassman, B. L., Wallace, R. B., Herzog, A. R., Heeringa, S. G., Ofstedal, M. B., Burke, J. R., Fisher, G. G., Fultz, N. H., Hurd, M. D., Potter, G. G., Rodgers, W. L., Steffens, D. C., Weir, D. R., & Willis, R. J. (2005). The Aging, Demographics, and Memory Study: Study design and methods. Neuroepidemiology, 25(4), 181–191. doi: 10.1159/000087448
- Liu, Y., Schneider, S., Orriens, B., Meijer, E., Darling, J. E., Gutsche, T., & Gatz, M. (2022). Self-administered web-based tests of executive functioning and perceptual speed: Measurement development study with a large probability-based survey panel. Journal of Medical Internet Research, 24(5), e34347. doi: 10.2196/34347
- McArdle, J., Rodgers, W., & Willis, R. (2015). Cognition and Aging in the USA (CogUSA) 2007–2009. Inter-university Consortium for Political and Social Research [distributor]. doi: 10.3886/ICPSR36053.v1
- McClain, C., Ofstedal, M., & Couper, M. (2018). Measuring cognition in a multi-mode context. Survey Research Center, Institute for Social Research, University of Michigan. https://hrs.isr.umich.edu/publications/biblio/9606
- McGrath, R., Robinson-Lane, S. G., Clark, B. C., Suhr, J. A., Giordan, B. J., & Vincent, B. M. (2021). Self-reported dementia-related diagnosis underestimates the prevalence of older Americans living with possible dementia. Journal of Alzheimer’s Disease, 82(1), 373–380. doi: 10.3233/JAD-201212
- Muthén, L. K., & Muthén, B. O. (2017). Mplus: Statistical analysis with latent variables: User’s guide (version 8). Muthén & Muthén.
- Patel, S. K., Meier, A. M., Fernandez, N., Lo, T., Moore, C., & Delgado, N. (2017). Convergent and criterion validity of the CogState computerized brief battery cognitive assessment in women with and without breast cancer. The Clinical Neuropsychologist, 31(8), 1375–1386. doi: 10.1080/13854046.2016.1275819
- Peh, C. X., Abdin, E., Vaingankar, J. A., Verma, S., Chua, B. Y., Sagayadevan, V., Seow, E., Zhang, Y., Shahwan, S., Ng, L. L., Prince, M., Chong, S. A., & Subramaniam, M. (2017). Validation of a latent construct for dementia in a population-wide dataset from Singapore. Journal of Alzheimer’s Disease, 55, 823–833. doi: 10.3233/JAD-160575
- Perin, S., Buckley, R. F., Pase, M. P., Yassi, N., Lavale, A., Wilson, P. H., Schembri, A., Maruff, P., & Lim, Y. Y. (2020). Unsupervised assessment of cognition in the Healthy Brain Project: Implications for web-based registries of individuals at risk for Alzheimer’s disease. Alzheimer’s & Dementia, 6(1), e12043. doi: 10.1002/trc2.12043
- Royall, D. R., Palmer, R. F., & O’Bryant, S. E. (2012). Validation of a latent variable representing the dementing process. Journal of Alzheimer’s Disease, 30, 639–649. doi: 10.3233/JAD-2012-120055
- Selwood, A. (2022, March). Barriers to administering, retrieving and interpreting online cognitive testing in an older adult cohort. Paper presented at the Current Innovations in Probability-based Household Internet Panel Research (CIPHER) Conference [virtual]. https://cesr.usc.edu/cipher_2022_presentations
- Welsh, K. A., Breitner, J. C., & Magruder-Habib, K. M. (1993). Detection of dementia in the elderly using telephone screening of cognitive status. Neuropsychiatry, Neuropsychology, & Behavioral Neurology, 6, 103–110.