Author manuscript; available in PMC: 2026 Feb 20.
Published in final edited form as: J Clin Exp Neuropsychol. 2025 Feb 20;47(1-2):67–89. doi: 10.1080/13803395.2025.2464633

Usability of the Mayo Test Drive remote self-administered web-based cognitive screening battery in adults ages 35 to 100 with and without cognitive impairment

Jay S Patel 1, Teresa J Christianson 3, Logan T Monahan 1, Ryan D Frank 3, Winnie Z Fan 3, John L Stricker 2, Walter K Kremers 3, Aimee J Karstens 1, Mary M Machulda 1, Julie A Fields 1, Jason Hassenstab 4, Clifford R Jack Jr 5, Hugo Botha 6, Jonathan Graff-Radford 6, Ronald C Petersen 6, Nikki H Stricker 1
PMCID: PMC12129114  NIHMSID: NIHMS2056696  PMID: 39976252

Abstract

Background.

Mayo Test Drive (MTD; Mayo Test Development through Rapid Iteration, Validation and Expansion) is a web-based remote cognitive assessment platform for self-administered neuropsychological measures with previously demonstrated validity and reliability. We examined the usability of MTD and hypothesized that completion rates would be greater than 90%. We explored whether completion and participation rates differed by cognitive status and demographic factors.

Methods.

1,950 Mayo Clinic Study of Aging and Mayo Alzheimer’s Disease Research Center participants (97% White, 99% Non-Hispanic) were invited to participate in this ancillary, uncompensated remote study. Most invitees were cognitively unimpaired (CU; n=1,769; 90.7%) and 9.3% were cognitively impaired (CI; n=181). Usability was objectively defined as the percentage of participants who completed a session after initiating a session for a given timepoint (i.e., completion rates).

Results.

Baseline session completion rates were 98.5% (n=1199/1217 participants, mean age 71, SD=12, range 35–100) and were comparable between CU (98.7%) and CI (95.0%) groups (p=.23). Completion rates did not significantly differ by age group (p>.10) and remained high in individuals 80+ (n=251, 97.3%). Participation rates were higher in the CU (n=1142, 65.4%) versus CI (n=57, 33.1%) group (p<.001); participants were younger and had more years of education than non-participants (p’s < .001). The adherence (i.e., retention) rate for a 7.5-month follow-up session was 89%. Average session duration was 16 minutes. Most participants used a personal computer (62.7%), followed by a smartphone (22.2%) or tablet (14.8%). Comments entered by 36.4% of participants reflected several themes including acceptability, face validity, usability, and comments informative for session context.

Conclusions.

MTD demonstrated high usability as defined by completion rates in this research sample that includes a broad age range, though participation rates are lower in individuals with cognitive impairment. Results support good adherence at follow-up, feasibility through mean session duration, and acceptability based on qualitative analysis of participant comments.

Keywords: digital health, neuropsychology, smartphone, aging, computerized neuropsychological assessment, feasibility

Introduction

Self-administered digital cognitive measures have the potential to increase access to cognitive testing relative to traditional person-administered cognitive screening and assessment (Sabbagh et al., 2020). The degree to which digital cognitive tests are usable, feasible, and acceptable is important for successful implementation and scalability. For example, many digital tools are designed for a broad target population but ultimately require supervision and have high technological demands, and therefore demand administration resources similar to those of paper-and-pencil tests.

Amirpour et al. (2022) tailored definitions of usability, feasibility, and acceptability for digital cognitive assessment technologies that we adopt for this study. Usability addresses the extent to which self-administered cognitive measures can be used by the target population and was characterized in this study through completion rates and participant feedback. Feasibility explores factors that may influence the successful implementation of the tool for the intended purpose in the target population, including user experience factors like duration, test session context (e.g., location at time of test, any noise or interruptions, and other non-cognitive factors that may influence performance), and device used when taking the test. Acceptability reflects factors that may influence willingness of the target population to use a digital, self-administered cognitive assessment (e.g., participation rates, adherence/retention rates, and user feedback). Characterizing these factors for both person-administered and self-administered computerized measures is critical for identifying potential implementation and scalability barriers. For example, Hackett et al. (2018) reported that the Picture Sequence Memory and List Sorting Working Memory subtests from the NIH Cognition Toolbox were too challenging for participants with cognitive impairment (i.e., low completion rates) and were ultimately removed from their study that used in-person, interactive cognitive testing. They also reported 21% of participants had missing data on remaining subtests from the NIH Cognition Toolbox. Thus, the utility of digital tools, regardless of a paradigm’s reliability and validity, can be limited by whether the target population is able and willing to engage with the testing platform and complete the test(s) as intended.

During the COVID-19 pandemic, Jacobs et al. (2021) surveyed cognitively unimpaired (CU) participants and the informant/study partner of participants with cognitive impairment enrolled in an Alzheimer’s Disease Research Center and found that 84% of CU individuals and 74% of study partners responding on behalf of participants with cognitive impairment (but without dementia) were willing to engage in interactive, video-based remote cognitive assessment (e.g., telehealth). Most (85%) reported device access. In contrast, only 39% of study partners of participants with dementia reported willingness of the participant to engage in a single remote cognitive assessment, and 52% reported device access. Slightly fewer participants/study partners expressed willingness to engage in multiple sessions of very brief cognitive testing (i.e., “burst” model) on a smartphone (66% CU, 59% cognitive impairment, 39% dementia). These survey results provide insights about attitudes toward remote cognitive testing agnostic of any specific platform but are limited to a theoretical “what if” survey response by or on behalf of participants (mean age 77, SD=8; mean education = 16, SD=3; 58% female; 14% Latino) rather than observed behavioral outcomes at the time of an actual invitation to engage in remote assessment. Further, the survey was completed either online or via phone, and it is possible that study partners responding on behalf of participants may have over-estimated the likelihood of engagement in remote assessment. Study partners were also less likely to complete the survey (78% and 69% completing on behalf of individuals with cognitive impairment and dementia, respectively) than CU participants (89%).

Several studies have directly explored usability, feasibility, and acceptability in older-adult samples, with most demonstrating satisfactory usability rates (Edgar et al., 2021; Kochan et al., 2022; Ohman et al., 2022; Perin et al., 2020; Skirrow et al., 2022; Thompson et al., 2022). However, adherence rates, defined as continued participation over longitudinal follow-up, decrease over time, limiting the utility of these remote assessments for longitudinal monitoring (Berron et al., 2022; Ohman et al., 2022; Thompson et al., 2022; Walter et al., 2020). Cognitive status, demographic variables, and subjective memory concerns influence elements of usability and acceptability, including engagement, adherence, and retention rates (Berron et al., 2022; Hischa et al., 2022; Skirrow et al., 2022).

Mayo Test Drive (MTD; Mayo Test Development through Rapid Iteration, Validation and Expansion) was intentionally designed for high usability and ease of access. Design needs of older adults and individuals with cognitive impairment prioritized in platform development included low technology demands [e.g., no app download or log-in requirements, no swiping/dragging responses that are associated with decreased acceptability and usability (Hackett et al., 2018; Young et al., 2023), no app navigation, no speaker/microphone use], multi-device compatibility, easy-to-follow instructions, visual optimization (e.g., high contrast displays, large font size), and consideration of the high base rate of hearing impairment in older adults (Marinelli et al., 2022). MTD is a web-based platform that is compatible with a variety of devices, including smartphones, tablets, and personal computers. Users receive a unique, one-click URL with an embedded study name and participant ID. MTD has a simple user interface that requires only single touch (or click) responses. The platform can be used without personally identifiable information to decrease privacy concerns.

The MTD screening battery includes (1) a computer-adaptive word-list memory test, the Stricker Learning Span (SLS) (J. L. Stricker et al., 2023; Stricker et al., 2024), that shows similar ability to differentiate Alzheimer’s disease biomarker-defined groups as the in-person administered Rey’s Auditory Verbal Learning Test (Stricker et al., 2024) and (2) a measure of processing speed/executive functioning that also requires visual discrimination (Symbols Test) (Boots et al., 2024; N. H. Stricker, J. L. Stricker, et al., 2022). The MTD screening battery composite shows robust associations with an in-person administered Mayo Preclinical Alzheimer’s disease Cognitive Composite (Mayo-PACC; N. H. Stricker, E. L. Twohy, et al., 2022) and with core Alzheimer’s disease positron emission tomography (PET) imaging biomarkers and hippocampal volume (Boots et al., 2024), as well as large effect sizes for differentiating individuals with and without cognitive impairment (Boots et al., 2024). Preliminary support for the feasibility, validity, and reliability of MTD was previously reported in an initial all-remote pilot study in 96 women aged 55–79 without dementia; 98% of participants completed a test session when a session was initiated (N. H. Stricker, J. L. Stricker, et al., 2022). Normative data based on remote, unsupervised sessions are available and show sensitivity to mild cognitive impairment (MCI) and dementia (Stricker, Frank et al., 2025).

The primary aim of the current study was to examine the usability of MTD in a large sample of adults and older adults with and without cognitive impairment. Usability was objectively defined as the percentage of participants who completed a session after initiating a session for a given timepoint (i.e., completion rates). We hypothesized that completion rates would be greater than 90%. We also explored whether completion rates differed by cognitive status, age and device type. We included additional descriptive aims. First, we report participation rates to describe the demographic and clinical characteristics of individuals willing vs. not willing to initiate participation in an ancillary, uncompensated, online remote cognitive assessment study. Second, we report adherence (i.e., retention) rates for those due for a follow-up MTD session. Third, we characterize factors that inform feasibility to implement MTD in research and clinical settings such as session and subtest durations (i.e., efficiency), frequency of interruptions and noise during test sessions, and device type use across demographic and clinical characteristics for completed sessions. Finally, we qualitatively analyzed themes relating to acceptability, usability and feasibility through voluntarily free-text user comments provided at the end of completed test sessions.

Methods

Participants and Recruitment Procedures

The Mayo Clinic Study of Aging (MCSA) is the primary source of participants for this study. The MCSA is a population-based study of individuals aged 30 years and older living in Olmsted County, MN, who are randomly sampled to meet sex- and age-stratification goals using the resources of the Rochester Epidemiology Project medical records-linkage system (St Sauver et al., 2012). Exclusion criteria are terminal illness or hospice care. Over 60% of residents contacted enroll in the MCSA, and follow-up retention is 80%. Study visits include a neurological examination with medical history review and administration of the Short Test of Mental Status (STMS) (Kokmen et al., 1991) by a study physician, a clinical interview and completion of the Clinical Dementia Rating® (CDR) scale by a study coordinator (Morris, 1993), and in-person neuropsychological testing (Roberts et al., 2008). After each study visit, a diagnosis of CU, MCI (Petersen, 2004), or dementia (American Psychiatric Association, 1994) is established after the examining physician, interviewing study coordinator, and neuropsychologist make independent diagnostic determinations and then reach consensus agreement (Roberts et al., 2008). Prior visit data and MTD data are not considered for diagnosis. CU individuals aged 50 or older and participants with MCI or dementia complete in-person study visits every 15 months; CU individuals younger than 50 complete in-person study visits every 30 months. Because of the population-based sampling design, most participants in the MCSA are CU. To enrich the sample for individuals with cognitive impairment, additional participants were recruited from the Mayo Clinic Alzheimer’s Disease Research Center (ADRC; Rochester, MN).

The MTD study in the MCSA and ADRC began with a pilot phase (5/25/21–9/3/21). A limited number of participants were initially invited to ensure study tasks could be completed by select groups that included newly enrolled (first visit) MCSA participants, individuals 70+ previously participating in a Cogstate Brief Battery (CBB) home-based option (for details, see Stricker et al., 2020), individuals aged 80+ not previously participating in a CBB home-based option, and individuals with cognitive impairment. During this phase, the study coordinator attempted to make phone calls to all participants who did not complete MTD after reminder emails. From 9/4/21–10/3/21 we invited all new MCSA enrollees, but no phone follow-up was provided due to limited study coordinator resources. An initial phase of large-scale recruitment occurred 10/4/21–5/9/22, wherein all participants with MCSA visits (new and return) were invited to participate; limited phone follow-up and support was provided. As of 5/10/22 study coordinator support was increased so that we could continue to offer some phone follow-up support and a small number of in-clinic visits (upon request).

The primary recruitment method in the MCSA was via email. Participants were provided an MTD information sheet at the time of the in-person study visit that provided the study name and study contact information and alerted them that they may receive an email or phone call inviting them to participate in this study. An email invitation that explained the study and contained oral consent elements was sent the following week. Each participant received two reminder emails about one week apart, and participants were placed on a “to call” list if they did not respond to the final reminder email (when study coordinator resources allowed for phone call reminders). For any individuals requiring a legally authorized representative (LAR), an interactive consent conversation with the LAR and participant was completed.

ADRC recruitment started 6/10/21 and focused on individuals with MCI and dementia due to AD to increase representation of cognitively impaired participants because the MCSA includes predominantly CU participants. Because of this recruitment focus in the ADRC, the primary recruitment method for that parent study was an oral consent conversation at the time of the in-person visit. If consented, then emails with instructions and the test link were sent the following week. Of note, relatively few participants were recruited from the ADRC parent study due to prioritization of another NIH-funded study targeting similar participants during this time frame (Nosheny et al., 2023).

For already accrued participants, the pattern of email contacts for initial and follow-up MTD sessions was the same as for recruitment emails in both parent studies (first email, two reminder emails if needed, then placement on the phone follow-up list). Our ability to complete phone call reminders once participants were placed on the phone follow-up list varied depending on study coordinator resources, as noted above. Qualitatively, we also observed that many participants do not answer or respond to phone calls, and this seemed most prominent among younger participants.

Ethical Approvals

This study was approved by the Mayo Clinic Institutional Review Board (IRB). Parent studies were approved by the Mayo Clinic IRB and the MCSA was additionally approved by the Olmsted Medical Center IRB. Written consent was obtained for participation in the parent study protocols (MCSA or ADRC) and oral consent (that includes review of consent elements via email) was obtained for participation in the ancillary MTD remote study protocol. This study was conducted in accordance with the Declaration of Helsinki. No remuneration was provided for the ancillary MTD study.

MTD Procedures

MTD emails contained links that provided direct access to the assessment without any log-in requirements. Emails also included a QR code link; if clicked, users are taken to a website with a QR code that can be used if the participant is reading their email on a personal computer and would prefer to take the test on a mobile device (smartphone or tablet). Emails provided general instructions, including that participants should take the tests when they have 15–20 minutes in a quiet environment; that they can use a smartphone, tablet, or personal computer to take the tests; to complete the tests in one sitting; to avoid closing their web browser; and that they can receive help getting to the testing website or with any technological questions but that they should take the tests alone without any interruptions (e.g., no television, radio, or conversation). Participants are told not to share the email or the links with others or to allow someone else to take the test using their link. They are informed they will be invited to provide comments when they reach the end of testing.

Participants received additional instructions within the MTD test session after a brief welcome screen. Participants are again instructed to complete the tests by themselves in a quiet area where they will not be distracted for 15–20 minutes and are again asked to complete the tests without direct assistance from another person. Participants report their location on the next screen (multiple-choice format). The test is considered “initiated” when participants make a response on this location-selection screen. For the current study, participants are considered to have completed the MTD session for a given timepoint if, after initiating a session, a session is subsequently fully completed.
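
For concreteness, the brief sketch below illustrates how these operational definitions map onto the participation, completion, and adherence rates reported in the Results; the file and column names are illustrative only and are not those of the study database.

```python
# Minimal sketch of the session-level rate definitions; file and column names
# ("mtd_baseline.csv", "initiated", "completed", ...) are hypothetical.
import pandas as pd

baseline = pd.read_csv("mtd_baseline.csv")  # one row per invited participant

# "Initiated": the participant responded on the location-selection screen.
# "Completed": after initiating, the full battery was finished for that timepoint.
participation_rate = baseline["initiated"].mean()                          # initiated / invited
completion_rate = baseline.loc[baseline["initiated"], "completed"].mean()  # completed / initiated

# Adherence (retention): initiating a session among those due for a follow-up session.
followup = pd.read_csv("mtd_followup.csv")  # one row per participant due for follow-up
adherence_rate = followup["initiated"].mean()
```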

The MTD screening battery consists of two subtests: the Stricker Learning Span (SLS) and the Symbols test (see Figure 1 for example screen shots). The SLS is a computer adaptive list learning task, previously described in detail (Boots et al., 2024; J. L. Stricker et al., 2023; Stricker et al., 2024; N. H. Stricker, J. L. Stricker, et al., 2022). Briefly, single words are visually presented sequentially across five learning trials (range of words presented is 8–23 following adaptive rules) (Stricker et al., 2024). After each list presentation, memory for each word presented is tested with four-choice recognition. The delay trial of the SLS occurs following completion of the Symbols Test; all words presented during any of the 5 learning trials are tested at delay (words presented at delay range from 8–23 following adaptive rules). The Symbols Test is an open-source measure of processing speed/executive functioning with previously demonstrated validity and reliability (Boots et al., 2024; Nicosia et al., 2022; Sliwinski et al., 2016). For each item, participants identify which of two symbol pairs on the bottom of the screen matches one of three symbol pairs presented at the top of the screen. There are four 12-item trials administered sequentially in MTD. Although MTD does not currently provide participants with normative feedback, participants receive basic feedback about test performance immediately after each subtest trial. For example, at the end of each SLS trial participants are told how many words were identified correctly in that trial (they are also told how many words will be presented prior to each trial). Similarly, at the end of each Symbols trial participants are told how many seconds it took them to complete the trial.

Figure 1.

Example Mayo Test Drive screening battery screen shots.

Note. Mayo Test Drive subtest screen shots are depicted here on a smartphone. A. Stricker Learning Span (SLS) is a computer adaptive word list memory test. Practice item SLS stimuli are displayed. Copyright © 2020 Mayo Foundation for Medical Education and Research. Used with permission from Mayo Foundation for Medical Education and Research, all rights reserved. B. Symbols Test is a measure of processing speed/executive functioning that also requires visuospatial processing. Copyright © 2017 Washington University in St. Louis. Used with permission from J. Hassenstab. Figure from: Stricker et al. A novel computer adaptive word list memory test optimized for remote assessment: Psychometric properties and associations with neurodegenerative biomarkers in older women without dementia. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring, 2022; used with permission from Mayo Foundation for Medical Education and Research, all rights reserved.

Both the SLS and Symbols start with a practice (called a “warm-up”). A single item is presented (single word to remember followed by 4-choice recognition or single Symbols item). If correct the first time, participants continue to the full subtest and are assigned a warm-up score of 4/4. If incorrect, another practice item is presented (score of 3/4 if passed on second attempt or 2/4 if passed on third attempt). If a participant cannot pass the warm-up within three attempts, the test is discontinued and there is one additional task to determine ability to follow basic instructions (1/4 if correct; 0/4 if incorrect).
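
A minimal sketch of these warm-up scoring rules follows; the function and variable names are ours and do not reflect the platform’s internal implementation.

```python
def score_warm_up(attempts, passed_basic_task=False):
    """Score a subtest warm-up following the rules described above (a sketch).

    attempts: results of up to three practice attempts, in order (True = correct).
    passed_basic_task: result of the additional basic-instruction task, used
        only if all three practice attempts were failed.
    Returns (warm-up score out of 4, whether the full subtest is administered).
    """
    for attempt_number, correct in enumerate(attempts[:3], start=1):
        if correct:
            # 4/4 if passed on the first attempt, 3/4 on the second, 2/4 on the third
            return 5 - attempt_number, True
    # Warm-up failed within three attempts: the subtest is discontinued and one
    # additional task checks the ability to follow basic instructions.
    return (1 if passed_basic_task else 0), False

# Example: passing on the second attempt yields a warm-up score of 3/4,
# and the full subtest is administered.
assert score_warm_up([False, True]) == (3, True)
```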

Each subtest is followed by a question asking if anything interfered with performance during that test (SLS 1–5, Symbols, SLS Delay). If endorsed, a follow-up item presents several response options to select from. After all subtests are completed, participants are asked to self-report their device type, method of response input (e.g., touch, mouse, etc.) and whether it was noisy when they completed the tests. If noise is endorsed, a follow-up item presents several response options to select from.

Qualitative analyses of baseline sessions examined voluntary free-text data from comment boxes embedded in MTD at the end of completed sessions. All comments were first reviewed (NHS, JSP, LTM) and potential coding categories with example responses were generated and subsequently used to assign ratings. Comments were coded into the following categories: acceptability, face validity (a subset of acceptability), usability, and behavioral observations proxy. After initial training and double rating to ensure fidelity, LTM served as the primary rater. JSP and NHS reviewed ratings for accuracy. For comments that were hard to categorize or for which there was disagreement, consensus meetings were held to review rating options and reach group consensus (LTM, JSP, NHS, AJK).

Inclusion Criteria

This study included MCSA and ADRC participants invited to complete MTD between 5/25/21 and 10/4/22. Participants had to be actively enrolled in the MCSA or ADRC parent studies to be eligible for this ancillary MTD study. Because we allowed individuals to request to complete MTD in clinic, device ownership and internet access were not specific inclusion requirements. Exclusionary criteria for the MTD ancillary study were: 1) unable to read and speak English, and 2) unable to complete study activities. We did not screen for vision loss, but if an individual was too visually impaired, per the study physician, to attempt the cube copy on the STMS in the parent study, we did not send a recruitment email (i.e., considered unable to complete study activities). Available linked parent study data as of 11/13/22 were included, allowing a 6-week window to complete MTD following the first invitation. Nearly all participants completed MTD remotely, except for 9 individuals who requested to come to clinic to complete MTD in person for their baseline visit and 1 who requested to come to clinic for the follow-up (second) MTD session. Individuals completing MTD in clinic were assisted in initiating the test session and then were left alone in a quiet room to complete the test session but could request help if needed. All other participants self-administered MTD remotely. Participants were able to call or email a study coordinator with questions or for assistance, as needed.

Statistical Analyses

Data were descriptively summarized using means and standard deviations for continuous variables and counts and percentages for categorical variables. Comparisons of data distributions across participation and completion status were performed using chi-square/Fisher exact tests for categorical variables (where appropriate), and linear regression models for continuous variables. P values adjusted for the effects of age, sex, and education were calculated from logistic regression models for dichotomous outcomes (clinical diagnosis) and multinomial logistic regression for categorical variables with 3 levels (e.g., categorized age). When categorized age was the outcome, we only adjusted for sex and education. All P values are 2-sided; all statistics were performed using SAS version 9.4 (SAS Institute Inc., Cary, NC).
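
For illustration only, the sketch below approximates this analysis plan in Python (scipy/statsmodels) rather than SAS 9.4; the data frame and column names are hypothetical, and group indicators are assumed to be coded 0/1.

```python
# Hypothetical analysis file with one row per invitee; columns such as
# "completed", "impaired", "age", "sex", and "education" are illustrative.
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

df = pd.read_csv("mtd_analysis.csv")

# Unadjusted comparison of a categorical variable (e.g., sex) across completion
# status: chi-square test (Fisher's exact test would be used for sparse cells).
chi2, p_unadjusted, dof, _ = stats.chi2_contingency(pd.crosstab(df["completed"], df["sex"]))

# Unadjusted comparison of a continuous variable (e.g., age) across completion
# status via a linear regression model (equivalent to a one-way ANOVA).
lm = smf.ols("age ~ C(completed)", data=df).fit()

# Adjusted p-value for a dichotomous outcome (clinical diagnosis, coded 0/1),
# from a logistic regression controlling for age, sex, and education.
logit = smf.logit("impaired ~ completed + age + C(sex) + education", data=df).fit()
p_adjusted = logit.pvalues["completed"]

# Categorical outcomes with three levels (e.g., categorized age) were compared
# with multinomial logistic regression, adjusting for sex and education only.
```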

Results

Participant characteristics

A total of 1,950 individuals were invited to participate in this study. Most invitees were from the Mayo Clinic Study of Aging (MCSA; 99%), and a subsample were from the Mayo Clinic Alzheimer’s Disease Research Center (ADRC; 1%). Mean age of invitees was 73 years, 50% were female, average education was 15 years, and invitees were predominantly White (97%) and non-Hispanic/Latinx/e/o/a (99%; see Table 1). Most invitees were CU (n=1,769; 90.7%). Those diagnostically categorized as cognitively impaired (n=181; 9.3%) consisted of participants with mild cognitive impairment (MCI; n=144) and dementia (n=36).

Table 1.

Sociodemographics and cognitive status by group. Mean (SD) presented except where otherwise indicated.

Invitees Non-Participants Participants P Non-Completers Completers P
Sample size N=1950 N=733 N=1217 N=18 N=1199
Age (years) 72.7 (12.4) 75.0 (13.5) 71.3 (11.5) <.001 80.1 (8.4) 71.2 (11.5) <.001
 Range 34.4 – 101.1 34.4 – 101.1 35.8 – 100.3 62.76 – 89.7 35.8 – 100.3
Sex, n (%) .462 .147
 Female 968 (49.6) 356 (48.6) 612 (50.3) 6 (33.3) 606 (50.5)
 Male 982 (50.4) 377 (51.4) 605 (49.7) 12 (66.7) 593 (49.5)
Education (years) 15.3 (2.4) 14.9 (2.5) 15.6 (2.4) <.001 15.6 (2.6) 15.6 (2.4) .931
 Range 6 – 20 6 – 20 6 – 20 12–20 6 – 20
Race, n (%) .396 1.00
 White 1883 (97.0) 701 (96.3) 1182 (97.4) 18 (100.0) 1164 (97.3)
 Black 18 (0.9) 10 (1.4) 8 (0.7) 0 (0) 8 (0.7)
 Asian 25 (1.3) 12 (1.6) 13 (1.1) 0 (0) 13 (1.1)
 Native American/Alaskan 2 (0.1) 1 (0.1) 1 (0.1) 0 (0) 1 (0.1)
 Native Hawaiian/Pacific Islander 1 (0.1) 0 (0) 1 (0.1) 0 (0) 1 (0.1)
 More than one 13 (0.7) 4 (0.5) 9 (0.7) 0 (0) 9 (0.8)
 Unknown/not reported 8 (0.4) 5 (0.7) 3 (0.2) 0 (0) 3 (0.3)
Ethnicity, n (%) .031 1.00
 Non-Hispanic/Latine 1935 (99.2) 728 (99.3) 1207 (99.2) 18 (100.0) 1189 (99.2)
 Hispanic or Latine 7 (0.4) 0 (0) 7 (0.6) 0 (0) 7 (0.6)
 Unknown 8 (0.4) 5 (0.7) 3 (0.2) 0 (0) 3 (0.3)
Kokmen Short Test of Mental Status 34.9 (3.3) 34.0 (4.1) 35.5 (2.6) <.001 33.9 (4.6) 35.5 (2.5) .010
 Missing 45 20 25 0 25
 Range 1.0 – 38.0 1.0 – 38.0 12.0 – 38.0 19.0 – 38.0 12.0 – 38.0
Estimated Mini Mental Status Exam a 28.1 (2.1) 27.6 (2.6) 28.5 (1.5) <.001 27.4 (2.9) 28.5 (1.5) .004
 Missing 45 20 25 0 25
 Range 1.0 – 30.0 1.0 – 30.0 12.0 – 30.0 18.0 – 30.0 12.0 – 30.0
CDR Global score <.001 .242
 Missing 11 6 5 1 4
 0 (n, %) 1766 (91.1) 619 (85.1) 1147 (94.6) 15 (88.2) 1132 (94.7)
 0.5 (n, %) 154 (7.9) 91 (12.5) 63 (5.2) 2 (11.8) 61 (5.1)
 1 (n, %) 11 (0.6) 10 (1.4) 1 (0.1) 0 (0) 1 (0.1)
 2 (n, %) 8 (0.4) 7 (1.0) 1 (0.1) 0 (0) 1 (0.1)
a

Mini Mental Status Exam (MMSE) score is estimated based on the Kokmen Short Test of Mental Status.

Note: CDR = Clinical Dementia Rating ® scale. Table used with permission of Mayo Foundation for Medical Education and Research; all rights reserved.

Participation

Of all participants invited (n=1,950), 62.4% initiated an MTD session (n=1,217; i.e., participated). Most non-participants (67%) did not respond and 32.8% declined (also see Supplemental Table 1). Participation rates were higher in the CU group (65.4%) compared to the cognitively impaired group (33.1%, p < 0.001). MTD participants (n=1217) were slightly younger (mean age 71.3 vs 75.0, p < .001) with slightly more years of education (15.6 vs 14.9, p < .001) compared to those not participating (n=733; Table 1). Sex and race were not significantly different across participation groups (p’s > .40), though we note that all invited (n=7) Hispanic/Latinx/e/o/a participants did participate, resulting in a statistical difference across ethnicity groups (p=0.03); race/ethnicity analyses are limited by sample size, as we were not adequately powered to detect reliable differences by race/ethnicity group. Participants scored slightly higher on the STMS global cognitive screening measure (35.5 vs 34.0, p < .001; estimated MMSE 28.5 vs 27.6, p < .001) (Tang-Wai et al., 2003) and were more likely to have a CDR global score of zero compared to non-participants (94.6% vs 85.1%, p < 0.001). Baseline MTD sessions were completed an average of 0.54 months (SD=0.38) after the most recent in-person parent study visit, when diagnostic status and in-person assessments (including the STMS and CDR) were completed.

Participation rates varied by age in the CU group (adjusted p <.001). Individuals between ages 34–64 and 65–79 demonstrated similar participation rates (68.9% and 70.2%, respectively), whereas the participation rate for those over 80 years was 53.8% (Supplemental Table 2). Participation rates were not significantly different across MCI and dementia subgroups (34.9% vs 25.7%, adjusted p =0.30, Supplemental Table 3), though the small number of dementia participants limits power to detect any differences by cognitive impairment subgroup. Less than 1% of participants requested to come to the clinic to complete MTD (n=9); 99.3% of participants engaged in MTD remotely.

Usability

Usability was examined through baseline completion rates. Most participants who initiated an MTD baseline test session completed a session (“completers”), with 98.5% overall completion rates. Completion rates were slightly higher in the CU (1142/1157=98.7%) than cognitively impaired (57/60=95.0%) groups, but this difference did not reach significance (adjusted p = 0.23; Figure 2, Supplemental Table 1). In comparison to non-completers (n = 18), completers (n = 1199) were younger (mean 71.2 vs 80.1, p = .001). There were no significant differences in other sociodemographic characteristics across completion groups (all p’s > 0.15). Completers scored slightly higher on a cognitive screening measure compared to non-completers (STMS: 35.5 vs 33.9, p = .01; estimated MMSE: 28.5 vs 27.4, p = .004). However, the CDR global score was not significantly different across completers and non-completers (p = .24).

Figure 2.

Usability: participant completion rates.

Note. Figure used with permission of Mayo Foundation for Medical Education and Research, all rights reserved.

There were no significant differences in completion rates across age groups among CU participants (99.7%, 98.8%, and 97.3% in ages 34–64, 65–79, and 80+ respectively; adjusted p=0.10, Supplemental Table 2). Completion rates were not significantly different across MCI and dementia subgroups (96.1% vs 88.9%, p = .36, Supplemental Table 3).

Because our primary device type variable is based on a self-reported selection of device type at the end of the test session, comparison of completion rates by device type could not be conducted. However, for incomplete sessions, we examined information about device type that can be inferred from user agent data to allow a qualitative description. Based on this information, of the 18 participants who initiated but did not complete a baseline session, 15 used a PC (11 Windows, 4 Mac), one used an iPad, one used an iPhone, and one used an unknown device type (suspected Chromebook, which can be classified as either a tablet or a PC depending on how it is used).

We also examined the number of discontinued sessions present for individuals who met the objective definition of a completed session (completed a session after initiating a session for a given timepoint); in other words, more than one session attempt was made en route to completion. Of individuals with a completed baseline session, 3.3% (n=38) also had a discontinued session. The frequency of discontinued sessions did not differ by cognitive status group (CU vs. cognitively impaired, p = 0.63; MCI vs dementia, p > 0.99; Supplemental Tables 1 and 3). However, there were higher rates of discontinued sessions in older age groups among CU participants (p = 0.008), with 0.9% for those ages 34–64, 3.3% for those ages 65–79, and 5.6% for ages 80+ (Supplemental Table 2). In CU participants, individuals completing MTD on a tablet were more likely to have a discontinued session prior to completion (6%) relative to those using a computer (2.5%) or smartphone (3.1%; adjusted p = 0.02).

Adherence

A subsample of 583 participants was asked to complete a second session of MTD approximately 7–8 months after the baseline MTD session. All participants enrolled in the MTD ancillary study who remain active participants in the associated parent study will be invited to complete follow-up MTD sessions; the subset reported here represents those who were due for a follow-up session by the time of data analysis. Adherence (i.e., retention), defined as initiating a test session among those due for a follow-up session, was 89.0% (519/583) at follow-up. Completion rates at the follow-up session were high (99.0%; 514/519 of those initiating a follow-up session completed a session). See Supplemental Table 4 for additional details.

Feasibility

Mean session duration was 16.4 minutes (all completers, baseline session). The CU group had a shorter total session duration (mean=16.2 minutes) than the cognitively impaired group (mean=19.3 minutes, adjusted p < .001). SLS duration times were comparable across CU and cognitively impaired groups (SLS Trials 1–5 mean: 8.7 minutes vs 8.7 minutes, adjusted p =0.61; SLS Delay mean: 1.4 minutes vs 1.5 minutes, adjusted p =0.76), despite significantly lower performance in the cognitively impaired group relative to the CU group on all SLS performance variables (adjusted p’s<.001; Table 2). Symbols duration time was slower in the cognitively impaired group (mean=5.3 minutes) relative to the CU group (mean=3.5 minutes, adjusted p < .001; see Table 2); this slower Symbols duration is also reflected in significantly slower speed on a primary Symbols test performance variable, average response time on correct items, in the cognitively impaired group (mean=5.5 seconds) relative to the CU group (mean=3.5 seconds, adjusted p < .001). Among completers, the cognitively impaired group had a higher frequency of CDR Global score > 0 and significantly lower performances on the STMS, estimated MMSE, MTD composite score, and all MTD subtest scores compared to the CU group (adjusted p’s<.001; Table 2). In addition, dementia participants had a higher frequency of CDR Global score > 0 and significantly lower performances on the STMS, estimated MMSE, MTD Composite, and all SLS scores compared to MCI participants (adjusted p’s<.05; Supplemental Table 5). MCI (n=49) and dementia participants (n=8) showed comparable MTD duration times (total session and subtest durations) and comparable performance on Symbols test performance variables (adjusted p’s>.05), though these analyses are limited by sample size, as we were not adequately powered to detect reliable differences by cognitive impairment subgroup. Among CU participants, older age was associated with longer duration times even after adjustment for sex and education (adjusted p < .001 for all; see Table 3).

Table 2.

Session characteristics for all participants who completed a baseline MTD session by diagnostic subgroup. Mean (SD) presented except where otherwise indicated.

Total Completers Cognitively Unimpaired Cognitively Impaired Unadjusted Adjusted
N=1199 N=1142 N=57 P a P b
Age at MTD (years) 71.2 (11.5) 70.8 (11.4) 78.1 (10.1) < 0.001 <.001
 Range 35.8 – 100.3 35.8 – 100.3 51.7 – 93.8
Male Sex, n (%) 593 (49.5) 563 (49.3) 30 (52.6) 0.623 0.34
Education (years) 15.6 (2.4) 15.6 (2.3) 14.4 (2.6) < 0.001 < 0.001
 Range 6 – 20 6 – 20 11 – 20
Kokmen STMS 35.5 (2.5) 35.8 (2.0) 29.5 (4.1) < 0.001 < 0.001
Estimated MMSE c 28.5 (1.5) 28.7 (1.2) 24.9 (2.8) < 0.001 < 0.001
CDR Global score < 0.001 < 0.001
 Missing 4 1 3
 0 (n, %) 1132 (94.7) 1111 (97.4) 21 (38.9)
 0.5 (n, %) 61 (5.1) 30 (2.6) 31 (57.4)
 1 (n, %) 1 (0.1) 0 (0.0) 1 (1.9)
 2 (n, %) 1 (0.1) 0 (0.0) 1 (1.9)
Mayo Test Drive (MTD) scores
 MTD Composite d 104.6 (22.6) 106.4 (20.9) 67.5 (23.3) < 0.001 < 0.001
 SLS Trials 1–5 Correct 59.3 (14.5) 60.4 (13.6) 37.8 (14.7) < 0.001 < 0.001
 SLS Delay Correct 14.6 (4.7) 15 (4.5) 8.2 (4.3) < 0.001 < 0.001
 SLS Sum of Trials e 73.9 (18.7) 75.3 (17.6) 46.0 (18.3) < 0.001 < 0.001
 SYM Average Response Time Correct Items, seconds f 3.6 (1.3) 3.5 (1.1) 5.5 (2.3) < 0.001 < 0.001
 SYM Accuracy-Weighted Score g 30.6 (7.1) 31.1 (6.6) 21.5 (9.7) < 0.001 < 0.001
Device Type 0.964 0.649
 Desktop computer or laptop, n (%) 751 (62.7) 715 (62.7) 36 (63.2)
 Smartphone, n (%)  266 (22.2) 254 (22.3) 12 (21.1)
 Tablet, n (%) 177 (14.8) 168 (14.7) 9 (15.8)
 Other/not sure, n (%) 4 (0.3) 4 (0.3) 0 (0.0)
 Missing, n (%) 1 (0.1) 1 (0.1) 0 (0.0)
Input Source 0.310 0.108
 Mouse (n, %) 636 (53.1) 602 (52.8) 34 (59.6)
 Touch (n, %) 476 (39.7) 453 (39.7) 23 (40.4)
 Touchpad / trackpad (n, %) h 58 (4.8) 58 (5.1) 0 (0.0)
 Stylus / digital pen (n, %) 20 (1.7) 20 (1.8) 0 (0.0)
 Other / Not Sure (n, %) 8 (0.7) 8 (0.7) 0 (0.0)
 Missing, n (%) 1 (0.0) 1 (0.1) 0 (0.0)
Session duration, minutes 16.4 (3.8) 16.2 (3.6) 19.3 (4.9) < 0.001 < 0.001
 Range 9.0 – 40.2 9.0 – 40.2 11.6 – 31.4
SLS Warm-Up duration, minutes 0.3 (0.3) 0.3 (0.3) 0.4 (0.3) 0.010 0.06
 Range 0.1 – 8.1 0.1 – 8.1 0.1 – 2.2
SLS Trials 1–5 duration, minutes 8.7 (2.1) 8.7 (2.1) 8.7 (2.5) 0.828 0.61
 Range 3.7 – 23.6 3.7 – 23.6 4.5 – 15.4
SLS Delay duration, minutes 1.4 (0.6) 1.4 (0.6) 1.5 (0.6) 0.828 0.76
 Range 0.5 – 10.0 0.5 – 10.0 0.6 – 3.7
Symbols Warm-Up duration, minutes 0.5 (0.3) 0.4 (0.3) 0.7 (0.5) < 0.001 < 0.001
 Range 0.1 – 3.6 0.1 – 3.6 0.2 – 3.1
Symbols Test duration, minutes 3.5 (1.1) 3.5 (1.0) 5.3 (1.9) < 0.001 < 0.001
 Range 1.5 – 11.5 1.5 – 11.5 2.9 – 11.3
a

Continuous variable p-values from linear model ANOVAs, categorical p-values from Pearson’s Chi-Squared test.

b

Adjusted p-values are adjusted for the effects of age, sex, and education using multivariable logistic regression models.

c

Mini Mental Status Exam (MMSE) score is estimated based on the Kokmen Short Test of Mental Status.

d

MTD Composite = SLS Sum of Trials + SYM Accuracy-Weighted Score; see Boots et al., 2024 for details. This raw score composite functions like a total score.

e

SLS Sum of Trials (0–108) = SLS Trials 1–5 Total Correct (0–85) + SLS Delay Correct (0–23)

f

Symbols average response time in seconds for correct items across all 4 Symbols trials (lower value indicates faster performance)

g

SYM Accuracy-Weighted Score = Symbols average response time in seconds for correct items, inverted and weighted by accuracy (see Boots et al., 2024 for computation details); higher value indicates better performance

h

Touchpad/trackpad was added as a response option partway through data collection, on 3/20/2022; thus, this is the minimum number of individuals using a touchpad/trackpad, and prior to this implementation date participants may have selected touch or other as an alternative.

Note. STMS = Kokmen Short Test of Mental Status. MTD = Mayo Test Drive, SLS = Stricker Learning Span, SYM = Symbols. See supplemental online materials for duration definitions. Table used with permission of Mayo Foundation for Medical Education and Research; all rights reserved.
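
Combining footnotes d and e, the MTD Composite is a raw-score sum; a minimal sketch is given below, with the Symbols accuracy-weighted score taken as already computed per Boots et al. (2024).

```python
def mtd_composite(sls_trials_1_5_correct, sls_delay_correct, sym_accuracy_weighted):
    """Raw-score composite as defined in footnotes d and e (a sketch).

    SLS Sum of Trials (0-108) = SLS Trials 1-5 correct (0-85) + SLS Delay correct (0-23).
    MTD Composite = SLS Sum of Trials + SYM Accuracy-Weighted Score, where the
    Symbols score (inverse of mean correct-item response time, weighted by
    accuracy) is assumed to be precomputed as described in Boots et al., 2024.
    """
    sls_sum_of_trials = sls_trials_1_5_correct + sls_delay_correct
    return sls_sum_of_trials + sym_accuracy_weighted

# Check against the cognitively unimpaired group means in Table 2:
# 60.4 + 15.0 + 31.1 = 106.5, close to the reported composite mean of 106.4
# (the small difference reflects rounding of the group means).
```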

Table 3.

Session characteristics for Cognitively Unimpaired (CU) participants who completed a baseline MTD session by age subgroups. Mean (SD) presented except where otherwise indicated.

CU 35–64 CU 65–79 CU 80+ Unadjusted Adjusted
N=320 N=571 N=251 P a P b

Age at MTD (years) 56.7 (6.9) 72.3 (4.2) 85.6 (3.9) < 0.001
 Range 35.8 – 65.0 65.0 – 80.0 80.0 – 100.3
Sex, n (%) 0.821 0.55
 Female 167 (52.2) 286 (50.1) 126 (50.2)
 Male 153 (47.8) 285 (49.9) 125 (49.8)
Education (years) 15.9 (2.2) 15.5 (2.3) 15.3 (2.6) 0.011 < 0.001
 Range 6 – 20 12 – 20 8 – 20
Device Type, n (%) < 0.001 < 0.001
 Desktop computer or laptop 189 (59.1) 347 (60.8) 179 (71.6)
 Smartphone 103 (32.2) 122 (21.4) 29 (11.6)
 Tablet 27 (8.4) 100 (17.5) 41 (16.4)
 Other / not sure 1 (0.3) 2 (0.4) 1 (0.4)
 Missing 0 (0.0) 0 (0.0) 1 (0.4)
Input Source, n (%) 0.003 0.002
 Mouse 158 (49.4) 287 (50.3) 157 (62.8)
 Touch 137 (42.8) 237 (41.5) 79 (31.6)
 Touchpad / trackpad c 21 (6.6) 30 (5.3) 7 (2.8)
 Stylus / digital pen 3 (0.9) 10 (1.8) 7 (2.8)
 Other / Not Sure 1 (0.3) 7 (1.2) 0 (0.0)
 Missing 0 (0.0) 0 (0.0) 1 (0.4)
Session duration, minutes 14.8 (2.8) 16.2 (3.5) 18.0 (4.1) < 0.001 < 0.001
 Range 9.0 – 28.3 9.6 – 40.2 9.9 – 35.4
SLS Warm-Up duration, minutes 0.2 (0.2) 0.3 (0.4) 0.3 (0.3) < 0.001 < 0.001
 Range 0.1 – 2.2 0.1 – 8.1 0.1 – 2.4
SLS Trials 1–5 duration, minutes 8.3 (1.6) 8.7 (2.1) 9.1 (2.5) < 0.001 < 0.001
 Range 4.6 – 16.6 4.3 – 23.6 3.7 – 21.5
SLS Delay duration, minutes 1.3 (0.4) 1.4 (0.6) 1.6 (0.6) < 0.001 < 0.001
 Range 0.6 – 4.5 0.6 – 10.0 0.5 – 5.4
Symbols Warm-Up duration, minutes 0.4 (0.3) 0.4 (0.3) 0.5 (0.3) < 0.001 < 0.001
 Range 0.1 – 3.6 0.1 – 2.6 0.1 – 2.2
Symbols Test duration, minutes 2.9 (0.6) 3.4 (0.9) 4.3 (1.2) < 0.001 < 0.001
 Range 1.5 – 5.7 1.6 – 11.5 2.3 – 8.9
a

Continuous variable p-values from linear model ANOVAs, categorical p-values from Pearson’s Chi-Squared test.

b

Adjusted p-values are adjusted for the effects of sex and education using multinomial logistic regression models.

Note. MTD = Mayo Test Drive, SLS = Stricker Learning Span. See supplemental online materials for duration definitions. Table used with permission of Mayo Foundation for Medical Education and Research; all rights reserved.

Table 4 reports several session characteristics relevant for understanding the test environment for remote sessions. Most participants completed testing at home (93%), 6% completed testing at work, and <1% completed testing in a clinic or public space. Noise was endorsed in 4% of sessions, approximately half of which was noise that did not distract the participant. Subtests with longer durations had a higher percentage of participants reporting interference during that subtest (8.8% SLS learning trials, 5.7% Symbols, 1.8% SLS delay; see Table 4 for response categories endorsed after a yes response about potential interference). All participants passed the SLS warm-up on the first (98%) or second (2%) attempt, suggesting participants were able to understand and follow instructions. While most also passed the Symbols warm-up on the first (92%) or second (6%) attempt, a few required a third attempt (0.5%) and one individual failed the warm-up and thus was not administered the full subtest (met the discontinue rule).

Table 4.

Location of testing and frequency of noise and subtest interference during initiated baseline sessions (N=1217).

Variable N (%)

Context of session
 Location reported (N=1217)
  At home 1126 (92.5%)
  At work 71 (5.8%)
  In a clinic (medical or research center) 11 (0.9%) a
  In a public space (park, library) 9 (0.7%)
 Noise in testing environment (N=1198) 50 (4.2%)
  There was some noise in the background, but it did not distract me 27 (2.3%)
  There was some noise in the background and it was distracting 17 (1.4%)
  People were talking to me while I tried to take the test 2 (0.2%)
  People were talking in the background 4 (0.3%)
 Validity/screen size concerns b 4 (0.3%)
 SLS practice “Warm-Up” pass rates (N=1217)
  SLS warm-up passed on first try (4/4) 1196 (98.3%)
  SLS warm-up passed on second try (3/4) 21 (1.7%)
  SLS warm-up passed on third try (2/4) 0 (0.0%)
  SLS warm-up failed (0 or 1 / 4) 0 (0.0%)
 SYM practice “Warm-up” pass rates (N=1201)
  SYM warm-up passed on first try (4/4) 1123 (92.3%)
  SYM warm-up passed on second try (3/4) 71 (5.8%)
  SYM warm-up passed on third try (2/4) 6 (0.5%)
  SYM warm-up failed (0 or 1 / 4) 1 (0.1%)
Interference endorsed during subtests
 SLS Trials 1–5 interference endorsed (N=1200) 105 (8.8%)
  I am not comfortable using technology 2 (0.2%)
  I had technical problems 5 (0.4%)
  I was confused about the instructions 1 (0.1%)
  I was interrupted during this test 56 (4.6%)
  Sometimes my selection did not register 5 (0.4%)
  The words were hard for me to see 1 (0.1%)
  Other (there will be a comments box at the end of the session) 35 (2.9%)
 SLS Delay interference endorsed (N=1198) 22 (1.8%)
  I am not comfortable using technology 1 (0.1%)
  I had technical problems 1 (0.1%)
  I was confused about the instructions 0 (0.0%)
  I was interrupted during this test 2 (0.2%)
  Sometimes my selection did not register 4 (0.3%)
  The words were hard for me to see 1 (0.1%)
  Other (there will be a comments box at the end of the session) 13 (1.1%)
 Symbols Test Interference endorsed (N=1198) 69 (5.7%)
  I am not comfortable using technology 1 (0.1%)
  I had technical problems 5 (0.4%)
  I was confused about the instructions 3 (0.3%)
  I was interrupted during this test 15 (1.3%)
  Sometimes my selection did not register 13 (1.1%)
  The symbols were hard for me to see 1 (0.1%)
  Other (there will be a comments box at the end of the session) 30 (2.5%)
  Missing 1 (0.1%)
a

Nine participants elected to come into the clinic for a scheduled visit to do their MTD session. One of those individuals did not complete the session after they accidentally opened another window during SLS warm-up and then discontinued. We assume that the additional individuals who endorsed their location as in clinic may work in a clinic or research center.

b

Validity concerns are defined as present if there are no responses to items in the fourth position on any SLS learning trials (% responses in 4th position across all 5 trials = 0). This suggests that the 4th word displayed may not have been viewed. The response order for all warm-up trials was subsequently changed on 11/13/22 to help ensure the 4th word is always viewed by moving the correct response for all warm-up trials to the 4th position, making it impossible to pass the warm-up without selecting the response from the 4th position.

Note. SLS = Stricker Learning Span; SYM = Symbols Test. Table used with permission of Mayo Foundation for Medical Education and Research, all rights reserved.
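
The validity flag described in footnote b is a simple positional check; a sketch is shown below under the assumption that the position (1–4) of each selected response is recorded for every SLS learning-trial item.

```python
def flag_validity_concern(selected_positions):
    """Flag a possible screen-size/validity concern (sketch of footnote b).

    selected_positions: position (1-4) of the response option chosen on each
        recognition item across all five SLS learning trials.
    The flag is raised when no response was ever made in the 4th position,
    suggesting the 4th word may not have been visible on the screen.
    """
    return all(position != 4 for position in selected_positions)
```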

Most participants who completed an MTD session used a personal computer (n=751, 62.7%), followed by a smartphone (n=266, 22.2%) or tablet (n=177, 14.8%). Device use rates did not differ across cognitively impaired and unimpaired groups (Table 2; see Supplemental Table 5 for MCI and dementia subgroups). However, there were differences in device use rates across age groups among CU participants (p < 0.001), with higher PC use and lower smartphone use in older age groups (Figure 3, Table 3). Among CU participants, noise in the environment was more frequently endorsed by individuals using a smartphone (8.7%) or tablet (6.5%) relative to those using a computer (2.1%; adjusted p < 0.001); individuals reporting device type as “Other/Not Sure” were excluded from these analyses (n=4). There were no significant differences across device types in the frequency of subtest interference endorsed, the frequency of a validity flag, or the total session duration (adjusted p’s > .05; see Supplemental Table 7).

Figure 3.

Device use for all participants who completed a session (large circle) and for Cognitively Unimpaired (CU) participants by age group (small circles).

Note. Figure used with permission of Mayo Foundation for Medical Education and Research, all rights reserved.

Several session characteristics for the MTD follow-up session are presented in Supplemental Table 6 and show a similar pattern of results.

Qualitative Analysis of Participant Comments

Comments were entered voluntarily by 36.4% of participants (437/1199 participants completing a baseline session). Most comments were coded under one theme. Some longer comments contained multiple components that were coded into more than one theme, resulting in a total of 481 coded comment components from 437 comments. See Figure 4 for an overview of themes with examples, and Table 5 for additional details and the number of comments within theme subcategories.

Figure 4.

Comment rating categories overview and examples.

Note. Figure used with permission of Mayo Foundation for Medical Education and Research, all rights reserved.

Table 5.

Quantitative summary of qualitative review and groupings for optional comments provided during baseline session.

Completed Baseline Session, N 1199
 Total number of comments 437 (36.4%)
  Number of comments coded into 1 theme 396
  Number of comments coded into 2 themes 39
  Number of comments coded into 3 themes 1
  Number of comments coded into 4 themes 1
 Total number of comment elements coded into themes 481
  Comments coded as relating to behavioral observations proxy (theme 1) 86 (17.9%)
  Comments coded as relating to usability (theme 2) 67 (13.9%)
  Comments coded as relating to face validity (theme 3) 35 (7.3%)
  Comments coded as relating to acceptability (theme 4) 293 (60.1%)
Comments coded as relating to Behavioral observations proxy (theme 1) 86
 Behavioral 1:
  Comment that provided useful information about the environment, context of the session, relevant internal or external distractions, etc. (e.g., there was a loud truck outside; my phone rang, usage due to health problems)
61
 Behavioral 2:
  Explained an incorrect selection (sometimes people endorsed interference when there was none); e.g., information that helps us understand their remote data better, such as a participant stating they hit “yes” instead of “no” for interference
7
 Behavioral 3:
  Discussed their strategy / approach to the testing
4
 Behavioral 4:
  Comment that explained a response selection made, such as more specifics about response input used or how their choice of device/response input may have impacted performance
14
Comments coded as relating to usability (theme 2) 67
 Usability 1:
  Indication that instructions were easy to understand
5
 Usability 2:
  Indication that instructions were hard to understand
0
 Usability 3:
  Indication that they were slow to understand the task, were confused about instructions or expectations
16
 Usability 4:
  Easy to use comments
12
 Usability 5:
  Technical problems described
20
 Usability 6:
   Sensory or motor (health condition) impacts use of MTD
3
 Usability 7:
   Helpful constructive feedback
20
 Usability 8:
   Comments about length of testing
1
Comments coded as relating to face validity (theme 3) 35
 Face Validity 1:
   Indication that the test appeared to assess memory
25
 Face Validity 2:
   Complaint that the test was not assessing memory well
1
 Face Validity 3:
  Indication that Symbols test assesses speed of thinking, attention/working memory, or visual matching
7
 Face Validity 4:
  Requests for brain training exercises or other resources after completing the test, or indication that this test would be good for brain training exercise
3
 Face Validity 5:
   Other endorsement of face validity not falling into a category above
3
Comments coded as relating to acceptability (theme 4) 293
Comments coded as relating to acceptability (positive valence) 268
 Acceptability 1:
  Interesting
37
 Acceptability 2:
  Fun, enjoyed
63
 Acceptability 3:
  Good test (or similar)
21
 Acceptability 4:
  Reference to test being engaging or useful
1
 Acceptability 5:
  Hard, tough, challenging
66
 Acceptability 6:
  Thank you
47
 Acceptability 7:
  Positive comment about digital or remote format (nice to do at home; like doing on smartphone)
22
 Acceptability 8:
  Other positive response
17
 Acceptability 15:
  Comment about feedback provided being helpful
3
 Acceptability 17:
  Preference for digital self-administered test compared to in-person testing
2
 Acceptability 18:
  Preference for the new tests (MTD) or statement that they perceive the new tests as better compared to the “old tests” (cards, Cogstate)
24
Comments coded as relating to acceptability (negative valence) 12
 Acceptability 9:
  Comment suggesting test made them anxious
4
 Acceptability 11:
  Frustrating, discouraging
2
 Acceptability 12:
  Negative comment about digital or remote format (don’t like doing it at home, etc.)
0
 Acceptability 13:
  Other negative response
3
 Acceptability 19:
  Preference for “old tests” (cards, Cogstate)
3
Comments coded as relating to acceptability with ambiguous significance 25
 Acceptability 14:
  Neutral response (or subjective judgement required and unclear how to rate, like with sarcasm)
10
 Acceptability 16:
  Requests for normative feedback; wondering how they did compared to others
2
 Acceptability 21:
  Ambiguous
9
 Acceptability 22:
  Other, comment not related to the test session, does not fit with any other rating options
4

Note. There are no acceptability 10 or 20 categories; those numbers were avoided to facilitate coding. Table used with permission of Mayo Foundation for Medical Education and Research, all rights reserved.

Most comments (61%) were related to acceptability; these were further categorized into perceived valence categories. A majority were positive (73% of acceptability comments, e.g., interesting, fun, challenging, enjoying digital or remote format, etc.), some were negative (12%; e.g., frustrating, feelings of anxiousness, etc.), and remaining comments were neutral (8.5%; e.g., requests for normative feedback, ambiguous comment, etc.).

About 18% of comments were categorized under the behavioral observations proxy theme. This theme represented comment categorizations relating to participants explaining relevant factors that may have influenced their performance during testing, their testing approach, and explanations regarding response selection.

Approximately 14% of comment categorizations represented usability. Some participants spontaneously stated instructions were easy to understand (n=5). Several described technical problems (n=20; 4.2% of all comment categorizations or 1.7% of all baseline sessions completed). Several provided helpful constructive feedback (n=20), some of which was used to make platform enhancements (see the lessons learned section of the discussion).

Finally, several comments related to face validity (n=35, 7.3% of comments) and predominantly supported the face validity of the subtests.

Discussion

This study supports the usability, feasibility, and acceptability of the remotely administered MTD screening battery. We demonstrate high usability of MTD, with 98.5% completion rates and no significant differences in completion rates in those with and without cognitive impairment or across age groups. Follow-up adherence is also high (89%). MTD self-administered sessions are completed efficiently (mean 16 minutes) using multiple device types. Voluntary participant comments provide additional support for MTD acceptability and face validity.

Objective usability results support our primary aim, with 98.5% of participants who initiated a session completing a session. This exceeded our hypothesized goal of 90% completion rates. Our high completion rates are similar to those reported for a younger research cohort using the Cogstate Brief Battery (CBB); Perin et al. (2020) reported 98% completion rates in a sample of participants aged 40–65 (mean=56, SD 7). Our completion rates are also comparable to a self-administered computerized screening measure administered unsupervised in clinic waiting rooms; specifically, the Cleveland Clinic Cognitive Battery (C3B) showed >92% completion rates in a primary care clinic (Rao et al., 2023). Ashford, Aaronson, et al. (2024) found that 80.8% of Brain Health Registry (BHR) participants who started the Paired Associates Learning Task from the Cambridge Neuropsychological Test Automated Battery® (CANTAB PAL) completed the task in a sample with a broad age range (18–99, mean=66, SD=11). The most common reasons for starting but not completing the PAL were technical difficulties and lack of device support (e.g., test was not able to be completed in the BHR on smartphones or tablets at that time).

MTD completion rates were comparable across cognitively unimpaired and cognitively impaired groups, supporting the high usability of the platform. This finding is notable because completion rates for some examiner-administered computerized tasks completed in clinic have shown usability limitations in cognitively impaired participants (Hackett et al., 2018). Results also showed generally comparable completion rates across age groups, ranging from 97.3% in individuals aged 80+ to 99.7% in individuals aged 34–64. Several MTD design characteristics likely helped maintain these high completion rates in older adults, including multi-device compatibility and no use of audio. Hischa et al. (2022) highlighted that use of headphones for a tablet-based, self-administered battery of working memory tests frequently interfered with hearing aids, which may have contributed to the reduced completion rates across age groups in that study (e.g., 32% of oldest-old participants completed the working memory tests vs. 97% of young adults). Measures developed for primary care settings that use headphones to facilitate administration in waiting rooms (Rao et al., 2023) could encounter similar unexpected difficulties in individuals with hearing aids. While there are some benefits to presenting auditory instructions, we chose to avoid use of audio for the MTD screening battery to enhance usability because hearing loss is common in older adults. Marinelli et al. (2022) reported that among 1200 CU individuals in the Mayo Clinic Study of Aging (MCSA; mean age 76, SD=9) who volunteered to undergo formal behavioral audiometric evaluation by an audiologist, only 36% had normal hearing; the remainder had mild (32%), moderate (30%), or severe/profound (1%) hearing loss. Further, Goman and Lin (2016) showed that rates of moderate hearing loss increase substantially in individuals aged 80+ (38% prevalence) relative to younger age groups (16% in 70–79 year-olds, 6% in 60–69 year-olds, and 2% or below in those <60 years). Requiring audio also increases technological complexity and may be a barrier to participation if audio is not set up correctly, and differences in audio output quality across devices and peripherals may add variability. A tradeoff of avoiding audio to increase accessibility for people with hearing impairment is that MTD requires vision. Because participants can choose their preferred device to take MTD, individuals with vision loss responsive to accommodations may elect to use a desktop computer or tablet that allows for a larger screen and text size than a smartphone. In addition, we considered design needs for older adults during test development and prioritized large font (e.g., memory test items) and simple, high-contrast visual displays. Another factor that may have contributed to high completion rates across groups was our decision not to deactivate links once they were used. In this way, if a participant inadvertently closed the web application or was significantly interrupted and needed to discontinue before completion, they could restart a session at their leisure without facing the technology frustrations that can occur with a non-functioning link.
We additionally examined the proportion of participants who discontinued a session prior to completion (3.3% overall); rates did not differ between CU and cognitively impaired groups, but discontinued sessions were more frequent at older ages and with use of a tablet device. In the future, we will consider allowing participants to resume a discontinued session within a pre-specified time frame.

Although results show that the MTD platform is easy to use, our data illustrate a common pattern observed in computerized and remote cognitive assessment studies: individuals with cognitive impairment and those aged 80+ years are less likely to participate in technology-based studies, particularly on their own without assistance. Similar to survey data showing that only 39% of participants with dementia reported willingness to engage in remote testing (Jacobs et al., 2021), our results showed that the percentage of participants willing to attempt participation in this uncompensated remote cognitive assessment ancillary study was lower for those with cognitive impairment (33%) compared to CU individuals (65%). This is consistent with other studies examining observed rates of engagement with voluntary, unsupervised, self-administered neuropsychological measures. For example, Weiner et al. (2018) showed that a self-reported diagnosed memory problem was associated with decreased likelihood of completing self-administered neuropsychological tests in the BHR. Even among individuals registering for an online platform such as the BHR, many do not complete available self-administered neuropsychological tests. In a recent update, Weiner et al. (2023) reported that 56% of BHR participants completed at least one neuropsychological test and 22% completed at least two neuropsychological tests at baseline. Ashford, Aaronson, et al. (2024) reported that 50.6% of BHR participants who were provided an opportunity to complete the CANTAB PAL did not attempt it. In our prior work with the CBB in the MCSA, we administered the CBB during each in-person study visit and allowed participants the option to complete interim sessions in clinic or remotely. Using that study design, only 18% of MCI participants ever completed an at-home Cogstate session and 17% of all MCSA participants never completed a CBB session (in clinic or at home; unpublished data).

Once enrolled, we found that the adherence/retention rate for the 7–8 month MTD follow-up session was 89% for the subsample of 583 participants who were due to complete a second MTD session, with 99% completion rates for initiated follow-up sessions. This suggests individuals are willing to re-engage with the MTD platform, supporting feasibility of longitudinal monitoring. These retention rates are promising given that retention is often lower than desired in remote cognitive assessment studies. For example, in an ADNI pilot study, after initially completing CBB in clinic, 79.4% of CU and MCI participants completed the first follow-up (within 2 weeks) and 37.1% completed a 6-month follow-up. In the Alzheimer Prevention Trials web-study, Walter et al. (2020) similarly reported that a relatively small number of individuals completed follow-up remote computerized testing, with 29.5% completing a second CBB session, and 23% completing a third CBB session. Ashford, Jin, et al. (2024) recently demonstrated that the presence and number of self-reported medical conditions impacts longitudinal completion of questionnaires and the CBB in the BHR. Overall, they found that 75% of individuals 55+ completed a questionnaire at least twice, and 45% completed the CBB at least twice. The number of self-reported medical conditions was negatively associated with likelihood of completing at least two cognitive assessments but was not associated with likelihood of longitudinal questionnaire completion. Those specifically self-reporting a history of Alzheimer’s disease and related disorders were less likely to complete either the questionnaire or the CBB twice.

Results from registry studies suggest that platform flexibility in terms of device compatibility may be a critical factor underlying both willingness to participate and usability (Ashford, Aaronson, et al., 2024; Walter et al., 2020). For example, Walter et al. (2020) reported that 97% of participants consenting to the APT web-study completed a self-report questionnaire of cognitive symptoms (the Cognitive Function Index, CFI) but that only 65% completed initial remote self-administered computerized testing (CBB). The difference in completion rates for self-report versus computerized testing in that study was attributed in part to technical challenges and lack of compatibility of the CBB with smartphones. Our group previously administered the CBB in the MCSA, and study coordinators received many requests for a smartphone-compatible option. Despite this feedback, a majority (63%) of participants in our current study used a personal computer to complete MTD, and 37% used a mobile device (22% smartphone, 15% tablet). Smartphone use for completing MTD was higher in younger relative to older age groups; conversely, personal computer use increased with age. These results suggest that the target demographics of a study or clinic may influence device preference. Device flexibility may be important to ensure that users are able to complete assessments (Ashford, Aaronson, et al., 2024; Walter et al., 2020).

“Bring your own device” (BYOD) approaches may engage the most users, even though there can be some minor psychometric disadvantages to using multiple device types (Nicosia et al., 2023; Stricker et al., 2020). BYOD often implies choosing a device within a given device class, such as choosing either an Android or iOS device in a smartphone-only study. The ability to choose the device participants are most comfortable with, which we suggest represents a “Choose Your Preferred Device” (CYPD) approach, is likely particularly important for individuals over 80 and for those with cognitive impairment. Based on our results, these groups are likely to be more hesitant to participate in self-administered remote cognitive assessment studies; the ability to use a familiar device may therefore increase participation rates. Many studies directly supply devices to participants. Although this helps overcome barriers related to device ownership and use, it may inadvertently introduce differential effects of participants’ ability or willingness to adapt to an unfamiliar device, and it adds burden to study personnel or participants for device delivery. A CYPD approach ensures participants can complete cognitive testing on the device they are most comfortable with. This approach also best aligns with real-world settings, such as clinical practices making use of remote assessment technology or pragmatic clinical trials. Another potential explanation for the higher-than-expected use of personal computers in the current study is that participants may be most likely to use the device they use to check email, since email was the primary method of providing links to the test. Allowing participants without an available device to come to the clinic to complete testing may be one strategy to mitigate potential bias; this strategy was employed in the current study, though very few participants made use of this option when it was available.

Several additional factors beyond device flexibility inform the feasibility of implementing unsupervised remote digital cognitive assessments in research and clinical settings, including session duration and the frequency of interruptions during test sessions. Our prior MTD pilot study showed that an MTD session was typically completed in 15 minutes (median) in older adult females whose mean age was 5 years younger than the current sample. The current results are similar, with a mean session duration of 16 minutes. To aid group comparisons, we reported means instead of medians in the current study, though means will show slightly longer durations than medians. Mean subtest durations were 8.7 minutes for SLS learning trials, 3.5 minutes for Symbols trials, and 1.4 minutes for SLS delay. Durations for the total session and each subtest were significantly longer in older age groups. The cognitively impaired group showed slower completion times for the total session and for Symbols relative to the CU group. However, there was no difference in duration across CU and cognitively impaired groups for SLS learning and delay trials, likely due to the adaptive nature of the SLS, which often results in fewer items presented to cognitively impaired participants. As expected, the cognitively impaired group showed lower performance on all MTD performance variables (e.g., MTD Composite, SLS learning and delay correct, Symbols average response time on correct items) relative to the CU group. Most participants reported completing sessions at home (93%). Noise was infrequently reported (4% of sessions). Participants endorsed interference during subtests in 2–9% of sessions; the frequency varied by subtest duration, with longer subtests having a higher likelihood of interference. Most participants were able to correctly answer the practice item on the first attempt, suggesting they were able to understand task instructions.

Qualitative analysis of voluntary comments left at the end of MTD sessions supports acceptability, with a majority of positive comments. Qualitative analyses also provided evidence for the face validity of subtests. As described by Soobiah et al. (2019), face validity is important for implementation of self-administered measures because participants may be more likely to complete a measure they perceive to have value. Comment analyses showed that 25 participants spontaneously commented that the test appeared to assess memory. Many participants provided comments helpful for individual-level interpretation of their test performance (behavioral observations proxy), such as describing a distraction that occurred or stating they accidentally endorsed an interruption when in fact there was none. A minority of participant comments described technical problems (1.7% of all baseline sessions). Many participants also provided helpful constructive feedback, including some that led to platform enhancements (see below). Thus, in addition to offering an avenue for learning important information about the context of these remote, unsupervised sessions directly from participants, free-text comments provide a method of ongoing participant feedback. While cognitive interviewing approaches during the test development phase can accomplish similar goals (Young et al., 2023), free-text comments provide a low-burden opportunity for all individuals taking the tests, across all use cases, to provide this ongoing, valuable input.

Recommendations and Lessons Learned

We offer several recommendations and lessons learned from our initial pilot (Stricker et al., 2022) and the current work. We include both enhancements we chose to implement iteratively and additional considerations that may be helpful for others or that will be considered during future iterative updates:

  1. User agent data alone is insufficient. We used self-reported device type and input source in the current study for several reasons. In our initial pilot (Stricker et al., 2022), we relied on user agent data alone and found that this was not highly scalable and was not a completely reliable indicator of device type, as it often required manual review when data were ambiguous and some “best guesses” (https://developer.mozilla.org/en-US/docs/Web/HTTP/Browser_detection_using_the_user_agent). The format of the user agent variable and available parsing programs also change over time. We therefore added self-report of device type and response input source prior to beginning data collection for the current study, while additionally capturing the user agent in case it is needed for further reference (see the first sketch after this list). There is currently no way to automatically detect the response input source, which may be a more important potential source of variability than the device type itself. For example, based on participant comments, we realized that some individuals were using a trackpad or touchpad when touch was selected (e.g., “no comments, other than I had a sensitive finger touch pad” and “I was using a touch pad on a laptop, not a touch screen”). We updated our self-report response input selections to include touchpad/trackpad on 3/20/22 because of this participant feedback.

  2. Comments by participants can help identify issues to address. Participant feedback in comments alerted us that some participants had screen size issues that may have prevented them from seeing all 4 word choices on the screen during recognition testing without scrolling (e.g., “My screen needed to be moved up and down to see the total word list”). Following this feedback, we examined the data to determine the percentage of responses that represented a selection of the word in the 4th position. A small number of sessions had 0% of their recognition responses in this 4th position across all 5 learning trials, which we interpreted as suggesting a high likelihood that the participant did not realize there was a 4th-position word. In the current analytic data set, 9 total sessions met this validity flag (4 baseline and 5 follow-up sessions). Most of these sessions (8/9) were completed on a PC, and one was completed on an iPhone. We took several actions in response to the participant feedback and identification of this issue. First, we created a validity variable to flag individuals with 0% of responses in this 4th position (a minimal sketch of this type of check is given after this list). Second, we adjusted the order of the warm-up responses so that the correct item is always placed in the 4th position (implemented on 11/13/22); unlike the SLS test items, which are always randomized, the position order in the warm-up is fixed. Figure 1 shows the original order of one of the memory practice item selection screens, with the target word cat in the second position; in the update, the target word cat is now in the fourth position. Third, we made a minor adjustment to the wording in the warm-up instructions so that participants are asked to select “the word you saw from 4 options” instead of “the word you saw.” We expect these changes to the warm-up will alert participants who may need to adjust their screen size (e.g., if using a partial window on a PC) or scroll down to see all response options; otherwise, participants will fail the warm-up and discontinue the SLS (preventing an invalid test). Fourth, we implemented an update that allows a greater degree of accordion functioning with dynamic screen changes, such that the four word options move closer together vertically when the height of the window shrinks.

  3. Participants sometimes select an incorrect response option by accident. Participants are asked at the end of each subtest whether anything interfered with test performance. If they respond yes, they are presented with multiple options to select from. Numerous participants selected “Other (there will be a comments box at the end of the session)” and then commented that they had selected the incorrect response option and that nothing interfered with test performance; in other words, they hit yes when they meant to hit no. Because of the frequency of this comment type, we added a “Nothing interfered” response option at the top of the selection page in v2 to address this issue. All other non-test selection screens aside from this test interference question have a yes/no confirmation screen to confirm the selection made. Back functionality is disabled because it would interfere with test performance.

  4. Expect connectivity issues and use programming logic to help address them. Early in the study, SLS trial 5 data failed to save for one participant (on 5/27/21) who fully completed the session. We suspect that connectivity issues during the session led to a failure to save the data for that trial only. We subsequently improved a queuing system that addresses intermittent internet disconnections; a simplified sketch of this type of queue-and-retry approach is given after this list. This change largely prevents internet quality from affecting data saving or session completion, as long as connectivity is available before the session window is closed. A prompt on the next-to-last screen states “Please wait while we upload your data,” shows the percent uploaded as it progresses, and prompts, “Make sure you have an active internet connection.” This screen only appears if connectivity is slow or poor.

  5. Not all web browsers function the same, and browsers can be changed at any time without notification. For example, when iOS 15 was released on 9/20/21, a new pull-to-refresh functionality specific to Safari was introduced. With this update, a swipe-down motion would refresh the browser, which would end any MTD session in progress. This issue was not identified through participant feedback; it was eventually discovered during periodic quality assurance testing, and a fix preventing the pull-to-refresh functionality was implemented on 11/13/22 (one common workaround is sketched after this list). Most (93%) of the sessions represented in the current manuscript were completed while the pull-to-refresh problem was present, which may have led to a higher number of discontinued sessions for Safari web browsers in the current study.

  6. Sometimes technological challenges have nothing to do with a specific platform or test. We are aware that one incomplete session represented in this manuscript was due to a widespread outage. Specifically, we were alerted to a broad outage on 8/23/22 that caused numerous websites to go down, including Amazon Web Services, which hosts v1 of MTD. One participant was mid-session when this outage started, resulting in an incomplete session that had nothing to do with participant-, device-, or platform-level factors.
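
To illustrate the approach in item 1, the sketch below (TypeScript; field names and option labels are hypothetical, not the MTD schema) shows how a session record could store the raw user agent string alongside the participant’s self-reported device type and input source:

  // Hypothetical record combining automatic and self-reported device information.
  interface DeviceContext {
    userAgent: string;            // raw string, kept only for later reference
    selfReportedDevice: "personal computer" | "smartphone" | "tablet" | "other";
    selfReportedInput: "mouse" | "touchscreen" | "touchpad/trackpad" | "other";
  }

  function captureDeviceContext(
    selfReportedDevice: DeviceContext["selfReportedDevice"],
    selfReportedInput: DeviceContext["selfReportedInput"],
  ): DeviceContext {
    return {
      // navigator.userAgent is ambiguous on its own (e.g., iPadOS Safari can
      // report a macOS user agent), so it is stored as supplementary information
      // rather than used to infer device type.
      userAgent: navigator.userAgent,
      selfReportedDevice,
      selfReportedInput,
    };
  }

  // Example usage after the participant answers the self-report questions.
  const context = captureDeviceContext("smartphone", "touchscreen");
  console.log(context);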
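
For the validity flag described in item 2, the following minimal sketch (TypeScript; the data structure is hypothetical rather than the actual MTD data model) illustrates how sessions with no recognition responses in the 4th screen position could be flagged:

  interface SlsSession {
    sessionId: string;
    // Screen position (1-4) chosen for each recognition response,
    // pooled across all learning trials (hypothetical field name).
    responsePositions: number[];
  }

  // Flag sessions in which the 4th (bottom) response position was never chosen,
  // which may indicate the participant could not see that option without scrolling.
  function neverChoseFourthPosition(session: SlsSession): boolean {
    if (session.responsePositions.length === 0) return false; // no responses, no flag
    return session.responsePositions.every((pos) => pos !== 4);
  }

  // Toy example
  const sessions: SlsSession[] = [
    { sessionId: "A", responsePositions: [1, 2, 3, 1, 2, 3, 2, 1] }, // flagged
    { sessionId: "B", responsePositions: [4, 2, 3, 1, 4, 3, 2, 1] }, // not flagged
  ];
  sessions.forEach((s) => console.log(s.sessionId, neverChoseFourthPosition(s)));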
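
For the connectivity issue in item 4, the following simplified sketch (TypeScript; the endpoint and payload shape are hypothetical, and this is not the MTD implementation) illustrates a queue-and-retry pattern that preserves data during intermittent disconnections, as long as the session window stays open until the upload succeeds:

  type Payload = Record<string, unknown>;

  const ENDPOINT = "https://example.org/api/save"; // hypothetical endpoint
  const queue: Payload[] = [];
  let flushing = false;

  // Add a payload (e.g., one trial's responses) to the queue and try to send it.
  function enqueue(payload: Payload): void {
    queue.push(payload);
    void flush();
  }

  // Send queued payloads in order; on failure, wait and retry without dropping data.
  async function flush(): Promise<void> {
    if (flushing) return;
    flushing = true;
    while (queue.length > 0) {
      try {
        const res = await fetch(ENDPOINT, {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify(queue[0]),
        });
        if (!res.ok) throw new Error(`HTTP ${res.status}`);
        queue.shift(); // remove only after a confirmed save
      } catch {
        await new Promise((resolve) => setTimeout(resolve, 2000)); // back off, then retry
      }
    }
    flushing = false;
  }

  // Example: queue one trial's data for upload.
  enqueue({ subtest: "SLS", trial: 5, responses: [1, 0, 1, 1] });

Removing an item from the queue only after the server confirms the save is the key design choice: a dropped connection leaves the data waiting in memory rather than silently lost.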
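
For the Safari pull-to-refresh issue in item 5, one common workaround (sketched below in TypeScript; not necessarily the exact fix implemented in MTD) is to cancel the downward overscroll gesture when the page is already scrolled to the top, which is the gesture that triggers the browser-level refresh:

  let touchStartY = 0;

  document.addEventListener(
    "touchstart",
    (e: TouchEvent) => {
      touchStartY = e.touches[0].clientY;
    },
    { passive: true },
  );

  document.addEventListener(
    "touchmove",
    (e: TouchEvent) => {
      const pullingDown = e.touches[0].clientY > touchStartY;
      const atTop = window.scrollY <= 0;
      // Cancelling the gesture only when the page is already at the top blocks
      // the browser-level refresh without interfering with normal scrolling.
      if (pullingDown && atTop) e.preventDefault();
    },
    { passive: false }, // must be non-passive so preventDefault takes effect
  );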

Additional Limitations

The current study has some limitations to consider. First, the sample is predominantly White, Non-Hispanic, and college-educated; this is a significant limitation because we do not know whether results will generalize to a more sociodemographically diverse sample. We are currently taking steps to address this limitation by adding recruitment sites focused on increasing representation of African American participants and of participants with lower levels of education and greater area-level socioeconomic disadvantage. We also recently completed a Spanish adaptation of MTD using a community-engaged approach that included focus group feedback from monolingual and bilingual Spanish speakers (Karstens et al., 2024). This Spanish adaptation has been programmed, is currently undergoing quality assurance checks, and will then be ready for initial validation. Second, examining usability, feasibility, and acceptability in research participants asked to complete a session of MTD for a voluntary, uncompensated research study may not generalize to broader contexts, such as individuals seeking medical attention for cognitive concerns. For example, while the participation rates observed among cognitively impaired participants in this study may raise some concerns about the acceptability of any digital cognitive measure being regularly implemented in clinical settings, participation rates in this study reflect willingness to voluntarily engage in this ancillary research study and may not reflect willingness to engage with a remote cognitive assessment when it is requested by a clinical provider. Clinically referred remote cognitive assessments may have higher participation rates due to higher motivation to complete a remote cognitive test (e.g., in settings with low base rates of cognitive impairment, such as primary care clinics). Conversely, lower clinical participation and completion rates could be seen because of higher base rates of cognitive impairment in some settings (e.g., dementia clinics). Future studies are needed to examine usability, feasibility, and acceptability in clinical care settings. Based on cognitive interview data, Young et al. (2023) found that clinicians, administrators, and healthy older adults preferred remote, pre-visit cognitive screening because it saved time relative to cognitive screening measures that require in-person administration and scoring. However, this was based on theoretical feedback and opinions, not on objective data from inviting patients to complete pre-visit remote cognitive assessments. It is also possible that participants with cognitive impairment may be less likely to check their email, which was the primary method of communication in this remote study. Variability in the degree of study coordinator support may also have impacted participation and adherence rates, since not all participants routinely benefitted from reminder calls. Future studies are also needed to examine whether usability and participation rates vary by digital literacy; individuals with low digital literacy may be less likely to participate in research that requires use of technology. While we tried to circumvent the need for device ownership by allowing individuals without devices, or who desired assistance, to come into the clinic to complete the MTD session, very few participants (0.7%) elected to make use of this option.
Finally, while participants are told not to share the email/link with others or allow someone else to take MTD using their link, there is no way to guarantee these instructions were followed.

Summary

Overall, the MTD platform demonstrates high usability in this research sample that includes representation of a broad age range, with good representation of individuals over 80 and inclusion of some individuals with cognitive impairment. Though likelihood of participation in remote cognitive assessment studies appears to vary by cognitive status and age, further research is needed to understand generalizability in clinical settings. The current results add to our prior work establishing the concurrent and criterion validity of MTD (Boots et al., 2024; J. L. Stricker et al., 2023; Stricker et al., 2024; N. H. Stricker, J. L. Stricker, et al., 2022) and show support for the feasibility, acceptability, and face validity of the MTD platform as well as the Stricker Learning Span and Symbols subtests.

Supplementary Material


Funding and Acknowledgments

This research was supported by the National Institute on Aging of the National Institutes of Health (R01 AG081955, R21 AG073967, P30 AG062677, U01 AG006786, RF1 AG069052), the Kevin Merszei Career Development Award in Neurodegenerative Diseases Research IHO Janet Vittone, MD, the Rochester Epidemiology Project (NIH R01 AG034676), the GHR Foundation, and the Mayo Foundation for Education and Research. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or other funding sources. The authors wish to thank the participants and staff at the Mayo Clinic Study of Aging and Mayo Alzheimer’s Disease Research Center.

Footnotes

Conflict of Interest

Dr. Patel has nothing to disclose.

Ms. Christianson reports grants from NIH during the conduct of the study.

Mr. Monahan has nothing to disclose.

Mr. Frank reports grants from National Institutes of Health (NIH) during the conduct of the study.

Ms. Fan reports grants from NIH during the conduct of the study.

Dr. John Stricker reports grants from NIH during the conduct of the study; and a Mayo Clinic invention disclosure has been submitted for the Stricker Learning Span and the Mayo Test Drive platform.

Dr. Kremers reports grants from NIH during the conduct of the study.

Dr. Karstens reports grants from NIH during the conduct of the study.

Dr. Machulda reports grants from National Institutes of Health during the conduct of the study.

Dr. Fields reports grants from NIH and grants from the Mangurian Foundation outside the submitted work.

Dr. Hassenstab reports grants from NIH during the conduct of the study; personal fees from Parabon Nanolabs, personal fees from Roche, personal fees from AlzPath, personal fees from Prothena, personal fees and other (serves on Data Safety Monitoring Board/Advisory Board) from Caring Bridge (National Institute on Aging sponsored), personal fees and other (serves on Data Safety Monitoring Board/Advisory Board) from Wall-E (National Institute on Aging sponsored) outside the submitted work.

Dr. Jack reports grants from NIH and grants from GHR Foundation during the conduct of the study; and Dr. Jack receives research support from the Alexander Family Alzheimer’s Disease Research Professorship of the Mayo Clinic.

Dr. Botha reports grants from NIH outside the submitted work.

Dr. Graff-Radford reports grants from NIH outside the submitted work; and serves as the site-PI for a clinical trial co-sponsored by Eisai, cognition therapeutics and NIH, and serves on the Data Safety and Monitoring Board for StrokeNET.

Dr. Petersen reports grants from NIH during the conduct of the study; personal fees from Oxford University Press, personal fees from UpToDate, personal fees from Roche, Inc., personal fees from Genentech, Inc., personal fees from Eli Lilly and Co., and personal fees from Nestle, Inc., outside the submitted work.

Dr. Nikki Stricker reports grants from NIH during the conduct of the study; and a Mayo Clinic invention disclosure has been submitted for the Stricker Learning Span and the Mayo Test Drive platform. She receives no personal compensation from any commercial entity.

Data Availability

The data supporting the findings of this study are available upon reasonable request to the corresponding author and with approval from Mayo Clinic Study of Aging investigators.

References

  1. American Psychiatric Association. (1994). Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) (4th ed.). American Psychiatric Association. [Google Scholar]
  2. Amirpour A, Bergman L, Liander K, Eriksson LI, Eckerblad J, & Nilsson U. (2022). Is the analogue cognitive test from the ISPOCD equivalent to the digital cognitive test Mindmore? A protocol for a randomised cross-over study including qualitative interviews with self-reported healthy seniors. BMJ Open, 12(9), e062007. 10.1136/bmjopen-2022-062007 [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Ashford MT, Aaronson A, Kwang W, Eichenbaum J, Gummadi S, Jin C, Cashdollar N, Thorp E, Wragg E, Zavitz KH, Cormack F, Banh T, Neuhaus JM, Ulbricht A, Camacho MR, Fockler J, Flenniken D, Truran D, Mackin RS,…Nosheny RL (2024). Unsupervised Online Paired Associates Learning Task from the Cambridge Neuropsychological Test Automated Battery (CANTAB(R)) in the Brain Health Registry. J Prev Alzheimers Dis, 11(2), 514–524. 10.14283/jpad.2023.117 [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Ashford MT, Jin C, Neuhaus J, Diaz A, Aaronson A, Tank R, Eichenbaum J, Camacho MR, Fockler J, Ulbricht A, Flenniken D, Truran D, Mackin RS, Weiner MW, Mindt MR, & Nosheny RL (2024). Participant completion of longitudinal assessments in an online cognitive aging registry: The role of medical conditions. Alzheimers Dement (N Y), 10(1), e12438. 10.1002/trc2.12438 [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Berron D, Ziegler G, Vieweg P, Billette O, Gusten J, Grande X, Heneka MT, Schneider A, Teipel S, Jessen F, Wagner M, & Duzel E. (2022). Feasibility of Digital Memory Assessments in an Unsupervised and Remote Study Setting. Front Digit Health, 4, 892997. 10.3389/fdgth.2022.892997 [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Boots EA, Frank RD, Fan WZ, Christianson TJ, Kremers WK, Stricker JL, Machulda MM, Fields JA, Hassenstab J, Graff-Radford J, Vemuri P, Jack CR, Knopman D, Petersen RC, & Stricker NH (2024). Continuous Associations Between Remote Self-Administered Cognitive Measures and Imaging Biomarkers of Alzheimer’s Disease. J Prev Alzheimers Dis. 10.14283/jpad.2024.99 [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Edgar CJ, Siemers E, Maruff P, Petersen RC, Aisen PS, Weiner MW, Albala B, & Alzheimer’s Disease Neuroimaging I. (2021). Pilot Evaluation of the Unsupervised, At-Home Cogstate Brief Battery in ADNI-2. Journal of Alzheimer’s Disease, 83(2), 915–925. 10.3233/JAD-210201 [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Goman AM, & Lin FR (2016). Prevalence of Hearing Loss by Severity in the United States. American Journal of Public Health, 106(10), 1820–1822. 10.2105/AJPH.2016.303299 [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Hackett K, Krikorian R, Giovannetti T, Melendez-Cabrero J, Rahman A, Caesar EE, Chen JL, Hristov H, Seifan A, Mosconi L, & Isaacson RS (2018). Utility of the NIH Toolbox for assessment of prodromal Alzheimer’s disease and dementia. Alzheimers Dement (Amst), 10, 764–772. 10.1016/j.dadm.2018.10.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Hischa V, Oesterlen E, & Seitz-Stein K. (2022). Feasibility of EI-MAG, a Working Memory App, in Younger and Older Adults. Gero Psych, 35(4), 202–210. 10.1024/1662-9647/a000282 [DOI] [Google Scholar]
  11. MDN Web Docs. Browser detection using the user agent. Retrieved October 22, 2024, from https://developer.mozilla.org/en-US/docs/Web/HTTP/Browser_detection_using_the_user_agent
  12. Jacobs DM, Peavy GM, Banks SJ, Gigliotti C, Little EA, & Salmon DP (2021). A survey of smartphone and interactive video technology use by participants in Alzheimer’s disease research: Implications for remote cognitive assessment. Alzheimers Dement (Amst), 13(1), e12188. 10.1002/dad2.12188 [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Karstens AJ, Aduen P, Luna NA, Diaz Galven P, Barber M, Woolston D, Patel P, Lucas J, Lachner C, & Stricker NH (2024). Adaptation of Mayo Test Drive for Monolingual and Bilingual Spanish Speakers Using a Community Engaged Approach. [Poster Presentation]. International Neuropsychological Society 52nd Annual Meeting, New York, NY. [Google Scholar]
  14. Kochan NA, Heffernan M, Valenzuela M, Sachdev PS, Lam BCP, Fiatarone Singh M, Anstey KJ, Chau T, & Brodaty H. (2022). Reliability, Validity, and User-Experience of Remote Unsupervised Computerized Neuropsychological Assessments in Community-Living 55- to 75-Year-Olds. Journal of Alzheimer’s Disease, 90(4), 1629–1645. 10.3233/JAD-220665 [DOI] [PubMed] [Google Scholar]
  15. Kokmen E, Smith GE, Petersen RC, Tangalos E, & Ivnik RC (1991). The short test of mental status: Correlations with standardized psychometric testing. Archives of Neurology, 48(7), 725–728. 10.1001/archneur.1991.00530190071018 [DOI] [PubMed] [Google Scholar]
  16. Marinelli JP, Lohse CM, Fussell WL, Petersen RC, Reed NS, Machulda MM, Vassilaki M, & Carlson ML (2022). Association between hearing loss and development of dementia using formal behavioural audiometric testing within the Mayo Clinic Study of Aging (MCSA): a prospective population-based study. Lancet Healthy Longev, 3(12), e817–e824. 10.1016/S2666-7568(22)00241-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Morris JC (1993). The Clinical Dementia Rating (CDR): Current version and scoring rules. Neurology, 43(11), 2412–2414. 10.1212/WNL.43.11.2412-a [DOI] [PubMed] [Google Scholar]
  18. Nicosia J, Aschenbrenner AJ, Balota D, Sliwinski M, Tahan M, Adams S, Stout SH, Wilks H, Gordon B, Benzinger T, Fagan A, Xiong C, Bateman R, Morris J, & J H. (2022). Unsupervised High-frequency Smartphone-based Cognitive Assessments Are Reliable, Valid, and Feasible in Older Adults at Risk for Alzheimer Disease. PsyArXiv. 10.31234/osf.io/wtsyn. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Nicosia J, Wang B, Aschenbrenner AJ, Sliwinski MJ, Yabiku ST, Roque NA, Germine LT, Bateman RJ, Morris JC, & Hassenstab J. (2023). To BYOD or not: Are device latencies important for bring-your-own-device (BYOD) smartphone cognitive testing? Behavior Research Methods, 55(6), 2800–2812. 10.3758/s13428-022-01925-1 [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Nosheny RL, Yen D, Howell T, Camacho M, Moulder K, Gummadi S, Bui C, Kannan S, Ashford MT, Knight K, Mayo C, McMillan M, Petersen RC, Stricker NH, Roberson ED, Chambless C, Gersteneker A, Martin R, Kennedy R,…Li Y. (2023). Evaluation of the Electronic Clinical Dementia Rating for Dementia Screening. JAMA Netw Open, 6(9), e2333786. 10.1001/jamanetworkopen.2023.33786 [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Ohman F, Berron D, Papp KV, Kern S, Skoog J, Hadarsson Bodin T, Zettergren A, Skoog I, & Scholl M. (2022). Unsupervised mobile app-based cognitive testing in a population-based study of older adults born 1944. Front Digit Health, 4, 933265. 10.3389/fdgth.2022.933265 [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Perin S, Buckley RF, Pase MP, Yassi N, Lavale A, Wilson PH, Schembri A, Maruff P, & Lim YY (2020). Unsupervised assessment of cognition in the Healthy Brain Project: Implications for web-based registries of individuals at risk for Alzheimer’s disease. Alzheimers Dement (N Y), 6(1), e12043. 10.1002/trc2.12043 [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Petersen RC (2004). Mild cognitive impairment as a diagnostic entity. Journal of Internal Medicine, 256(3), 183–194. 10.1111/j.1365-2796.2004.01388.x. [DOI] [PubMed] [Google Scholar]
  24. Rao SM, Galioto R, Sokolowski M, Pierce M, Penn L, Sturtevant A, Skugor B, Anstead B, Leverenz JB, Schindler D, Blum D, Alberts JL, & Posk L. (2023). Cleveland Clinic Cognitive Battery (C3B): Normative, Reliability, and Validation Studies of a Self-Administered Computerized Tool for Screening Cognitive Dysfunction in Primary Care. Journal of Alzheimer’s Disease, 92(3), 1051–1066. 10.3233/JAD-220929 [DOI] [PMC free article] [PubMed] [Google Scholar]
  25. Roberts RO, Geda YE, Knopman DS, Cha RH, Pankratz VS, Boeve BF, Ivnik RJ, Tangalos EG, Petersen RC, & Rocca WA (2008). The Mayo Clinic Study of Aging: Design and sampling, participation, baseline measures and sample characteristics. Neuroepidemiology, 30(1), 58–69. 10.1159/000115751 [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Sabbagh M, Boada M, Borson P, Doraiswamy PM, Dubois B, Ingram J, Iwata A, Porsteinsson AP, Possin KL, Rabinovici GD, Vellas B, Chao S, Vergallo A, & Hampel H. (2020). Early Detection of Mild Cognitive Impairment MCI in an At Home Setting. J Prev Alzheimers Dis. 10.14283/jpad.2020.21 [DOI] [PubMed] [Google Scholar]
  27. Skirrow C, Meszaros M, Meepegama U, Lenain R, Papp KV, Weston J, & Fristed E. (2022). Validation of a Remote and Fully Automated Story Recall Task to Assess for Early Cognitive Impairment in Older Adults: Longitudinal Case-Control Observational Study. JMIR Aging, 5(3), e37090. 10.2196/37090 [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Sliwinski MJ, Mogle JA, Hyun J, Munoz E, Smyth JM, & Lipton RB (2016). Reliability and Validity of Ambulatory Cognitive Assessments. Assessment, 25(1), 14–30. 10.1177/1073191116643164 [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Soobiah C, Tadrous M, Knowles S, Blondal E, Ashoor HM, Ghassemi M, et al. (2019) Variability in the validity and reliability of outcome measures identified in a systematic review to assess treatment efficacy of cognitive enhancers for Alzheimer’s Dementia. PLoS ONE 14(4): e0215225. 10.1371/journal.pone.0215225 [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. St Sauver JL, Grossardt BR, Yawn BP, Melton LJ 3rd, Pankratz JJ, Brue SM, & Rocca WA (2012). Data resource profile: the Rochester Epidemiology Project (REP) medical records-linkage system. International Journal of Epidemiology, 41(6), 1614–1624. 10.1093/ije/dys195 [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Stricker JL, Corriveau-Lecavalier N, Wiepert DA, Botha H, Jones DT, & Stricker NH (2023). Neural network process simulations support a distributed memory system and aid design of a novel computer adaptive digital memory test for preclinical and prodromal Alzheimer’s disease. Neuropsychology, 37(6), 698–715. 10.1037/neu0000847 [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Stricker NH, Lundt ES, Alden EC, Albertson SM, Machulda MM, Kremers WK, Knopman DS, Petersen RC, & Mielke MM (2020). Longitudinal Comparison of in Clinic and at Home Administration of the Cogstate Brief Battery and Demonstrated Practice Effects in the Mayo Clinic Study of Aging. The Journal of Prevention of Alzheimer’s Disease, 7(1), 21–28. 10.14283/jpad.2019.35 [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Stricker NH, Frank RD, Boots EA, Fan WZ, Christianson TJ, Kremers WK, Stricker JL, Machulda MM, Fields JA, Lucas JA, Hassenstab J, Aduen PA, Day GS, Graff-Radford NR, Jack CR, Graff-Radford J, Petersen RC (2025). Mayo Normative Studies: regression-based normative data for remote self-administration of the Stricker Learning Span, Symbols Test and Mayo Test Drive Screening Battery Composite and validation in individuals with Mild Cognitive Impairment and dementia. The Clinical Neuropsychologist, 39(5), 1–30. 10.1080/13854046.2025.2469340 [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Stricker NH, Stricker JL, Frank RD, Fan WZ, Christianson TJ, Patel JS, Karstens AJ, Kremers WK, Machulda MM, Fields JA, Graff-Radford J, Jack CR Jr., Knopman DS, Mielke MM, & Petersen RC (2024). Stricker Learning Span criterion validity: a remote self-administered multi-device compatible digital word list memory measure shows similar ability to differentiate amyloid and tau PET-defined biomarker groups as in-person Auditory Verbal Learning Test. Journal of the International Neuropsychological Society, 30(2), 138–151. 10.1017/S1355617723000322 [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Stricker NH, Stricker JL, Karstens AJ, Geske JR, Fields JA, Hassenstab J, Schwarz CG, Tosakulwong N, Wiste HJ, Jack CR Jr., Kantarci K, & Mielke MM (2022). A novel computer adaptive word list memory test optimized for remote assessment: Psychometric properties and associations with neurodegenerative biomarkers in older women without dementia. Alzheimers Dement (Amst), 14(1), e12299. 10.1002/dad2.12299 [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Stricker NH, Twohy EL, Albertson SM, Karstens AJ, Kremers WK, Machulda MM, Fields JA, Jack CR Jr., Knopman DS, Mielke MM, & Petersen RC (2022). Mayo-PACC: A parsimonious preclinical Alzheimer’s disease cognitive composite comprised of public-domain measures to facilitate clinical translation. Alzheimers Dement. 10.1002/alz.12895 [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Tang-Wai DF, Knopman DS, Geda YE, Edland SD, Smith GE, Ivnik RJ, Tangalos EG, Boeve BF, & Petersen RC (2003). Comparison of the short test of mental status and the mini-mental state examination in mild cognitive impairment. Archives of Neurology, 60(12), 1777–1781. 10.1001/archneur.60.12.1777 [DOI] [PubMed] [Google Scholar]
  38. Thompson LI, Harrington KD, Roque N, Strenger J, Correia S, Jones RN, Salloway S, & Sliwinski MJ (2022). A highly feasible, reliable, and fully remote protocol for mobile app-based cognitive assessment in cognitively healthy older adults. Alzheimers Dement (Amst), 14(1), e12283. 10.1002/dad2.12283 [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Walter S, Langford OG, Clanton TB, Jimenez-Maggiora GA, Raman R, Rafii MS, Shaffer EJ, Sperling RA, Cummings JL, & Aisen PS (2020). The Trial-Ready Cohort for Preclinical and Prodromal Alzheimer’s Disease (TRC-PAD): Experience from the First 3 Years. J Prev Alzheimers Dis, 7(4), 234–241. 10.14283/jpad.2020.47 [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Weiner MW, Aaronson A, Eichenbaum J, Kwang W, Ashford MT, Gummadi S, Santhakumar J, Camacho MR, Flenniken D, Fockler J, Truran-Sacrey D, Ulbricht A, Mackin RS, & Nosheny RL (2023). Brain health registry updates: An online longitudinal neuroscience platform. Alzheimers Dement, 19(11), 4935–4951. 10.1002/alz.13077 [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Weiner MW, Nosheny R, Camacho M, Truran-Sacrey D, Mackin RS, Flenniken D, Ulbricht A, Insel P, Finley S, Fockler J, & Veitch D. (2018). The Brain Health Registry: An internet-based platform for recruitment, assessment, and longitudinal monitoring of participants for neuroscience studies. Alzheimers Dement, 14(8), 1063–1076. 10.1016/j.jalz.2018.02.021 [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Young SR, Lattie EG, Berry ABL, Bui L, Byrne GJ, Yoshino Benavente JN, Bass M, Gershon RC, Wolf MS, & Nowinski CJ (2023). Remote Cognitive Screening Of Healthy Older Adults for Primary Care With the MyCog Mobile App: Iterative Design and Usability Evaluation. JMIR Form Res, 7, e42416. 10.2196/42416 [DOI] [PMC free article] [PubMed] [Google Scholar]
