Author manuscript; available in PMC: 2022 Jul 1.
Published in final edited form as: Exp Aging Res. 2021 Mar 1;47(4):303–321. doi: 10.1080/0361073X.2021.1894072

Feasibility and Psychometric Integrity of Mobile Phone-Based Intensive Measurement of Cognition in Older Adults

Paul W H Brewster 1, Jonathan Rush 1, Lana Ozen 1, Rebecca Vendittelli 1, Scott M Hofer 1
PMCID: PMC8225552  NIHMSID: NIHMS1679213  PMID: 33648422

Abstract

There is a pressing need for assessment approaches that can be deployed remotely to measure cognitive outcomes in clinical trials and longitudinal aging cohorts. We evaluated the utility of a mobile phone-based intensive measurement study for this purpose. A small cohort of healthy older adults (N=17, mean age=73) completed five assessment “bursts” over 12 months, with each measurement burst involving two assessments daily for five consecutive days. Each assessment included brief tests of visual short-term memory and information processing speed, as well as surveys measuring state factors that can affect cognition. At study endpoint we had 94% retention, 97% compliance, and high participant satisfaction. Mobile cognitive test scores demonstrated good reliability, moderate correlations with in-person baseline neuropsychological testing, and significant associations with participant age and education level. We conclude that mobile phone-based intensive measurement designs represent a promising assessment approach for measuring cognition longitudinally in older adults.

Keywords: Intensive measurement, aging, longitudinal methods, psychometrics, cognition, ecological momentary assessment

Introduction

By 2050, the number of people living with dementia is estimated to triple world-wide (World Health Organization, 2017). With age being the main risk factor for development of dementia (Alary et al., 2017), the World Health Organization has proposed a research strategy to “establish longitudinal cognitive surveillance of healthy individuals to detect earliest changes that distinguish premanifest neurodegenerative disease causing dementia from normal aging” (Shah et al., 2016). The feasibility and impact of such large-scale efforts are limited by traditional cognitive assessment methods, which are expensive, lack sensitivity to subtle change over time, and are vulnerable to state factor influences on cognition (e.g., deleterious effects of stress, fatigue, mood). Mobile and web-based technologies have become ubiquitous over the past two decades and can address some of the limitations of traditional assessment approaches by permitting greater precision of measurement, increased flexibility around assessment scheduling, and self-administration in the participant’s natural environment. Mobile technologies are increasingly accessible to the older population, with smartphone ownership now exceeding 60% among Canadians age 65 and older (Statistics Canada, 2020). Responding to both the measurement limitations of traditional assessment approaches and the now urgent need for remote data collection methods, this manuscript describes a use case of self-administered mobile cognitive assessment that capitalizes upon the flexibility of assessment scheduling to improve statistical power to detect within-person change.

Most traditional longitudinal study designs model cognitive change as a fixed, linear process. This is imperfect because performance on cognitive tests can fluctuate as a function of multiple factors that operate on vastly different timescales (Martin & Hofer, 2004). For example, mood state and sleep quality may predict day-to-day fluctuations in cognition by way of their influence on examinee alertness during testing, shifts in health status or medication compliance may affect an examinee’s cognitive functioning across weeks or months, while normative and pathological aging-related brain changes can be expected to yield progressive changes in cognition over years and decades. Rast and colleagues (2012) demonstrated in a longitudinal aging cohort that as much as half of the variation in cognitive performance over four years was accounted for by short- and medium-term timescales. Conventional longitudinal designs do not permit delineation of changes across these varying timescales, which limits statistical power to detect progressive, within-person change (Rast & Hofer, 2014).

Intensive measurement designs provide one approach for delineating short-term performance variation from longer-term change. As illustrated in Figure 1, these designs distribute “bursts” of closely-spaced assessments over longer intervals, which permits statistical decomposition of performance variation across short (within-burst) and longer-term (across-burst) timescales. These designs have historically been prohibitively resource-intensive, but with mobile technologies now permitting remote, self-administered assessments, they are becoming increasingly feasible for use in clinical trials and longitudinal cohorts.

Figure 1. Intensive Measurement Design

Self-administered ambulatory and home-based cognitive assessments have been successfully implemented in several measurement burst studies, and have been demonstrated feasible and valid for use with general adult populations. Sliwinski and colleagues (2018) examined the reliability and validity of self-administered mobile cognitive testing deployed within a 14-day measurement burst study in an adult sample ranging in age from 25 to 65. Their battery included three digital cognitive tasks assessing perceptual speed and working memory, with participants completing five assessments daily across 14 consecutive days. Aggregate scores from these tasks demonstrated excellent between-person reliability (≥.97) and evidence of construct validity relative to in-person cognitive assessment. They also demonstrated modest within-person reliability across assessment sessions, suggesting that intensive measurement may permit detection of within-person fluctuations in state factors that can affect cognition. Based on the obtained between-person reliability estimates, the authors projected that as few as five assessments – the equivalent of one day’s worth of intensive measurement – would be sufficient for a reliability of .90 or higher.

Rates of compliance with self-administered intensive measurement study designs have not been reported for older adults, but this topic has been examined in general adult samples. Sliwinski and colleagues (2018) reported an 85% completion rate in the aforementioned 14-day measurement burst study. Daniëls and colleagues (2020) reported a 71% completion rate for a mobile app-based intensive measurement study involving eight assessments per day across six consecutive days. The cognitive tasks, though different from those examined by Sliwinski and colleagues, similarly assessed visual working memory and perceptual speed. Their sample ranged in age from 19 to 73, with no reported effect of age on completion rate. Lancaster and colleagues (2020) reported 88% participant retention in a mobile app-based study of adults age 40-59 involving daily assessments across up to 36 days. Their assessment battery included up to three cognitive tasks, with completion rates differing between tasks and ranging from 38% to 75%. User feedback indicated that participants found the app-based assessments to be acceptable (average rating of 8.3/10) and enjoyable (average task enjoyment ratings ranged from 6.44/10 to 7.26/10). Taken together, it seems that compliance rates are high in single-burst intensive measurement designs, but can vary depending on the cognitive tasks included.

Demographic trends indicate that older adults are increasingly comfortable using mobile devices (Pew Research Center, 2017). However, older adults remain the smallest age group of technology users (Statistics Canada, 2017), and age-associated sensory changes could make the smaller user interface of a mobile phone screen less practical for deployment of cognitive assessments with this age group (Muniz-Terrera, Watermeyer, Danso, & Ritchie, 2019). As described above, prior intensive measurement research with cognition as the outcome has examined a general adult population, and has focused on variability within a single measurement burst, rather than examining performance trajectories across multiple bursts. Empirical data over longer periods of time and with an older population are thus needed to help determine the utility of mobile intensive measurement designs for supporting early detection of cognitive change in older adults.

We deployed a 12-month mobile phone-based intensive measurement study in a small sample of cognitively intact older adults. The tasks and mobile software were the same as those used by Sliwinski and colleagues (2018). The primary objective was to examine the feasibility of this measurement approach for use longitudinally with older adults as captured by participant retention, compliance, and satisfaction. We also sought to compare preliminary reliability and validity evidence from this cohort to the single burst data previously reported by Sliwinski and colleagues (Sliwinski et al., 2018).

Methods

Participants

Participants were recruited from the listserv of the Institute on Aging and Lifelong Health (IALH) at the University of Victoria. The listserv is a registry of older adults in the Victoria, British Columbia region who have previously provided IALH with their contact information to be informed of upcoming research opportunities. Eligibility criteria required that participants be age 65+, living in the Victoria, BC area, with no diagnosis of dementia, traumatic brain injury, or stroke, and with no hand tremor or visual impairment that would prohibit the use of a mobile phone app. Because this was primarily a feasibility study, it was of interest to determine how many interested participants would meet eligibility criteria. Of the first 20 participants to self-refer to this study, 17 met eligibility criteria and were included in the study. Mean age at study baseline was 73 (SD = 5.79, range = 65-85). Ten participants were female (58%). All participants had at least a high school diploma, with 41% having completed an undergraduate degree and 24% having completed a graduate degree. All participants self-described as white/Caucasian. Regarding medical history, thirteen participants (77%) reported arthritis, four participants (24%) reported past or current cardiovascular disease, and six participants (35%) reported a history of neurological injury (two strokes, one TIA, four concussions). No participants reported past or current psychiatric illness. The study was reviewed and approved by the University of Victoria Research Ethics Board.

Measures

Participants completed 1) a telephone screening assessment to determine eligibility, 2) an in-person baseline assessment involving neuropsychological testing and a brief inventory of mood and personality questionnaires, 3) a mobile phone-based intensive measurement protocol involving five assessment “bursts,” each spanning five consecutive days, spaced over 12 months, and 4) a user experience survey at study endpoint. The intensive measurement assessment schedule is illustrated in Figure 2.

Figure 2. Intensive Measurement Schedule

Baseline Cognitive Assessment

We used Version 3 of the National Alzheimer’s Coordinating Center (NACC) Uniform Data Set (UDS) neuropsychological battery to capture baseline cognition among study participants. The NACC UDS is described in detail elsewhere (Weintraub et al., 2018), but in brief this battery includes the Montreal Cognitive Assessment (MoCA), a story memory learning and delayed recall task (Craft Story), a complex figure copy and recall task (Benson Figure), forward and backward digit span (Number Span Test), an object naming test (Multilingual Naming Test; MINT), Phonemic Fluency (letters F and L), Category Fluency (animals and vegetables), and the Trail Making test.

Baseline Questionnaires

As part of the baseline assessment participants completed the NACC UDS Subject Health History, the Geriatric Depression Scale (GDS; Yesavage et al., 1983), the UCLA Loneliness Scale (Russell, Peplau, & Cutrona, 1980), and the Lawton Brody Instrumental Activities of Daily Living Scale (Lawton & Brody, 1969). All baseline participant data were stored using REDCap (Harris et al., 2019).

Mobile Assessments

Participants completed two brief, previously validated cognitive tasks as part of each mobile assessment. The mobile assessments were administered on Asus Zenfone 4 Max devices, which are mid-range cellular phones with a 5.5” display (720 x 1280 pixels) and a 60 Hz refresh rate. Participants were provided with these devices with the mobile assessment software pre-installed.

Dot Memory.

The Dot Memory test measures visual short-term memory. Each trial consists of three phases: encoding, distraction, and retrieval. During the encoding phase, the participant is tasked with remembering the locations of three dots on a 5x5 grid. After a three-second study period, the grid is removed and a brief distraction task is presented. After performing the distraction task for six seconds, an empty grid appears on the screen and participants are prompted to recall the locations of the initially presented dots by tapping the appropriate grid locations. Participants completed three trials of this task (encoding, distraction, retrieval), with a 1-second delay between trials. The dependent variable was an error score, with partial credit given based on the deviation of responses from the correct positions on the grid. If all dots were recalled in their correct locations the participant received a score of zero. In the case of one or more retrieval errors, the Euclidean distance from each incorrectly placed dot to its correct grid location was calculated, with higher scores indicating less accurate placement and poorer performance. Figure 3 provides a visual of the task. Sliwinski and colleagues (2018), in their 14-day burst study with a general adult sample, reported that scores on the Dot Memory test had a high loading (−.71) on a latent factor derived from in-person tests of working memory, and showed good discriminant validity. They also reported an intraclass correlation (ICC) of .97 for scores averaged across the burst.

Figure 3. Dot Memory Test

Symbol Search.

The Symbol Search test measures visual attention and perceptual speed. Participants are shown a row of three symbol pairs at the top of the screen and, simultaneously, two symbol pairs at the bottom of the screen. The task is to decide, as quickly as possible, which of the two pairs presented at the bottom of the screen matches one of the pairs at the top of the screen. Stimuli are presented until a response is provided. Participants completed 12 trials of this task. The dependent variable was the mean response time of correct trials. Figure 4 provides a visual of the task. Sliwinski and colleagues (2018) found scores on this test to have a medium loading (.67) on a latent factor derived from in-person tests of perceptual speed, with good discriminant validity. Aggregate scores were reported to have an ICC of .98.

Figure 4. Symbol Search Test

In-App Questionnaires.

The 20-item Positive and Negative Affect Schedule (PANAS; Watson et al., 1988) was included to capture current mood state. Also included were single-item questions assessing sleep quality and duration, physical exercise, subjective fatigue, distractibility, pain, and subjective effort put forth during completion of the cognitive tasks. Responses to all but the physical activity questions were Likert-based. These in-app questionnaires took approximately three minutes to complete.

Research Design and Procedure

Prospective participants were screened for eligibility using an adaptation of the Memory and Aging Telephone Screen (MATS; Rabin et al., 2007). Those who met eligibility criteria were invited to an in-person assessment at IALH, where informed consent was obtained, the baseline assessment was completed, and participants were trained on the mobile phone-based assessments. Participants completed the first mobile phone-based measurement burst at study baseline and completed subsequent bursts remotely at months three, six, and nine, and at the 12-month study endpoint. The mobile assessments were completed on locked-down mobile devices that were provided to participants at the beginning of each assessment burst and retrieved by mail at the end of each assessment burst. Each measurement burst involved completing two mobile assessments daily across five consecutive days. Each mobile assessment included the cognitive tasks and the described battery of in-app questionnaires and took about five minutes to complete. A user experience survey was included at the study endpoint to capture participant satisfaction with the measurement protocol. Germane to the current study were the following questions: 1) Was the assessment schedule manageable in the context of your daily life? (Yes/No); 2) How user-friendly did you find the mobile technology? (0-100); 3) How much did you enjoy using the mobile technology? (0-100).

Analytical Approach

We assessed feasibility 1) by tracking participant retention and compliance with the intensive measurement protocol over the 12-month study period, and 2) based on participant satisfaction, as assessed by the user experience survey completed at the study endpoint. We assessed between-person reliability at the burst level using the ICC2, an extension of the ICC based on the average of multiple measurements (Bliese, 2000). The ICC2 differs from the ICC in that the residual variance (σe²) is divided by the number of measurement occasions. This is the same approach to measuring between-person reliability as was used by Sliwinski and colleagues (2018). Because this approach assumes an equal number of measurements for all participants, we examined ICC2 values both for the full sample and after excluding participants with one or more missed assessments within each burst. We also examined the coefficient of stability of burst-level data for each of the two cognitive tests. We established a value of .80 as our a priori standard for acceptable reliability (Nunnally & Bernstein, 1994). We investigated construct validity first by examining bivariate associations between each of the two mobile cognitive tests (averaged across the first intensive measurement burst) and performance on the baseline administration of the NACC UDS neuropsychological test battery. Those baseline measures that correlated significantly with burst 1 mobile cognitive test performance were then examined as predictors in mixed linear regression analyses that permit decomposition of various sources of performance variation (the analytical approach is described below). We then investigated criterion validity by examining age, education, and state factor influences on the same mobile test parameters. State factor variables were obtained from the in-app questionnaires, and included self-reported negative and positive affect as reported on the PANAS, as well as self-rated pain, fatigue, distractibility, and effort.
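
To make the ICC2 computation concrete, the sketch below estimates the Bliese (2000) reliability of a k-session burst average, ICC2 = σ²b / (σ²b + σ²e/k), from a random-intercept model. This is an illustrative reconstruction rather than the study’s actual code; the use of the lme4 package and the data frame `d` (one row per session, columns `id` and `score`) are assumptions.

```r
# Illustrative sketch (assumed lme4-based; not the study's actual script).
# d: long-format data for one burst, with columns id (participant) and
# score (session-level aggregate score); k: number of sessions per burst.
library(lme4)

icc2_burst <- function(d, k) {
  fit <- lmer(score ~ 1 + (1 | id), data = d)    # random-intercept model
  vc  <- as.data.frame(VarCorr(fit))
  sigma_b2 <- vc$vcov[vc$grp == "id"]            # between-person variance
  sigma_e2 <- vc$vcov[vc$grp == "Residual"]      # residual (within-person) variance
  # Bliese (2000): reliability of the mean of k sessions
  sigma_b2 / (sigma_b2 + sigma_e2 / k)
}

# Example call for a 10-session burst (hypothetical object name):
# icc2_burst(burst1_data, k = 10)
```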

Mobile Data Cleaning

Response time data for the Symbol Search test were trimmed at 250 ms and 6000 ms to eliminate accidental responding and trials spoiled by environmental distraction. For Dot Memory, accuracy was reflected in a distance score. The Euclidean distance from each user-entered dot (i.e., placed during the retrieval phase) to each computer-generated dot (i.e., those seen during the encoding phase) was recorded by the device. The Euclidean distances across all three dots per trial were summed for all possible pairing permutations in the statistical programming software R, and participants were assigned the lowest resulting score per trial.
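
The permutation-based trial score described above can be sketched as follows. This is an illustrative reconstruction, not the study’s actual R script, and the function and object names (dot_memory_error, target, response, trials) are hypothetical.

```r
# Illustrative sketch of the Dot Memory trial error score.
# target, response: 3 x 2 matrices of (row, column) grid coordinates for
# the presented dots and the participant's taps, respectively.
dot_memory_error <- function(target, response) {
  perms <- list(c(1, 2, 3), c(1, 3, 2), c(2, 1, 3),
                c(2, 3, 1), c(3, 1, 2), c(3, 2, 1))
  # Sum of Euclidean distances between paired dots, for each permutation
  dists <- sapply(perms, function(p) {
    sum(sqrt(rowSums((response[p, , drop = FALSE] - target)^2)))
  })
  min(dists)  # lowest possible score per trial; 0 = all dots placed correctly
}

# Symbol Search cleaning: keep correct trials with 250 ms <= RT <= 6000 ms,
# then average (assumed columns rt_ms and correct in data frame trials):
# mean_rt <- with(trials, mean(rt_ms[correct == 1 & rt_ms >= 250 & rt_ms <= 6000]))
```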

Multilevel Modelling

We used 3-level mixed linear regression analysis to simultaneously model short-term (within-burst) and longer-term (across-burst) variation in performance on the mobile cognitive tests. This analytical approach is described in detail elsewhere (Rast et al., 2012). Level 1 of the multilevel models captures within-burst variation (e.g., for each burst, within-person performance variation across five consecutive days). Level 2 captures across-burst variation (e.g., within-person performance variation across the five measurement bursts). The time metric for Level 1 is days (e.g., days 1-5 for each burst), while the time metric for Level 2 is bursts (e.g., bursts 1-5 spaced 3 months apart over 12 months). Level 3 captures individual-level performance variation, which includes between-person variability across all levels. Both linear and quadratic effects were included to capture within-person change at Level 1 and Level 2, as well as accelerations or decelerations in change within and across bursts. Cross-level interactions were included to account for potential across-burst changes in within-burst slope effects (e.g., due to practice effects associated with repeated exposure to the cognitive tasks). Random effects were specified at Level 3 to capture individual differences in Level 1 and Level 2 intercept and slope values.
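
For readers who want a concrete starting point, a rough frequentist analogue of this three-level specification is sketched below. The study itself fit the models in Mplus 8 with Bayesian estimation, so this simplified lme4 formula is only an approximation of the structure described above, and the variable names (score, day, burst, id) are assumptions about the long-format data layout.

```r
# Approximate, simplified lme4 analogue of the three-level model
# (not the study's actual Mplus specification).
library(lme4)

fit <- lmer(
  score ~ day + I(day^2) + burst + I(burst^2) + day:burst +  # fixed within- and across-burst change
    (day + burst | id) +                                     # Level 3: person-level intercept and slopes
    (day | id:burst),                                        # Level 2: burst-specific intercept and day slope
  data = d                                                   # d assumed: one row per assessment session
)
summary(fit)
```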

We used the aforementioned 3-level model to examine between- and within-person predictors of performance variation. Due to the small sample size and exploratory nature of these analyses, we conducted a series of univariate analyses whereby each predictor was examined individually in the 3-level model. The only exceptions were age and education, which were included together in the same model to avoid confounding.

Each between-person predictor (e.g., static variables like baseline demographic characteristics, scores from the baseline assessment) was included by itself at Level 3 of the model (between-person level) along with its interaction with Level 1 and Level 2 slope estimates. This allowed us to determine the extent to which a baseline characteristic influenced overall performance level on the mobile tasks (Level 3), and the extent to which baseline characteristics predicted between-person differences in change in performance on the mobile tasks within bursts (Level 1), and across bursts (Level 2).

Within-person (e.g., time-varying) predictors such as self-rated fatigue or negative affect at the time of each assessment are dynamic over time, and so the analytical approach for these variables differs from the approach for examining between-person (e.g., baseline) predictors. At Level 1, person-level means for each day (e.g., mean level of fatigue or negative affect across the morning and evening session completed that day) were included and centred on the person-level mean for that measurement burst (e.g., mean level of fatigue or negative affect across all five days of the measurement burst). This level of analysis informs whether cognitive performance is lower during days where participants are reporting a relatively higher level of negative affect, fatigue, pain, etc. At Level 2 (across bursts) we included the person-level within-burst mean centred on the person-level grand mean. This level of analysis reveals, independent of the day-to-day fluctuations captured at Level 1, whether cognitive performance level within a measurement burst varies as a function of the relative level of fatigue, negative affect, etc. reported during that burst. This captures mid-term (e.g., week-to-week) performance fluctuations. At Level 3 (between-person) we used the person-level grand mean centred on the sample grand mean for each time-varying predictor. This tells us whether participants who report a high overall level of a certain state-level characteristic perform differently on the cognitive tasks than those who may report a lower level of this characteristic. For example, this level of analysis would inform whether individuals who tend to experience high negative affect overall relative to the rest of the sample also perform more poorly overall on the cognitive tasks. All mixed linear regression analyses were conducted using MPlus version 8 with Bayesian estimation (Muthén & Muthén, 1998-2017). Data and syntax are available upon request.
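
The centring scheme described above can be implemented with straightforward group means. The sketch below is an illustration under assumed column names (id, burst, day, fatigue) and an assumed data frame `d`, not the study’s actual data-preparation code.

```r
# Illustrative centring of a time-varying predictor (assumed column names).
library(dplyr)

person_means <- d %>%                        # one grand mean per participant
  group_by(id) %>%
  summarise(person_mean = mean(fatigue, na.rm = TRUE), .groups = "drop")

d_centred <- d %>%
  left_join(person_means, by = "id") %>%
  group_by(id, burst) %>%
  mutate(burst_mean = mean(fatigue, na.rm = TRUE)) %>%      # mean over the burst's days
  group_by(id, burst, day) %>%
  mutate(day_mean = mean(fatigue, na.rm = TRUE)) %>%        # mean over that day's sessions
  ungroup() %>%
  mutate(
    l1_day_c    = day_mean   - burst_mean,                        # Level 1: day vs. burst mean
    l2_burst_c  = burst_mean - person_mean,                       # Level 2: burst vs. person mean
    l3_person_c = person_mean - mean(person_means$person_mean)    # Level 3: person vs. sample mean
  )
```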

Results

Sample Characteristics

Baseline performance of the study sample on the NACC UDS neuropsychological test battery is reported in Table 1. As reflected in the MoCA, Lawton-Brody, GDS, and UCLA Loneliness Scale scores, this is a functionally independent, non-depressed, cognitively intact sample. Three participants (18%) each had one score falling below the ninth percentile relative to the demographically adjusted NACC UDS normative sample. Note that isolated low scores are common among healthy older adults (Palmer et al., 1998) so we do not interpret this as clinically significant.

Table 1.

Group-Level Performance on In-Person Baseline Assessment Battery

Questionnaire/Test Mean (SD)
Montreal Cognitive Assessment 27.9 (1.85)
Craft Story Recall (Immediate) 20.3 (6.08)
Craft Story Recall (Delayed) 17.7 (4.45)
Benson Figure Copy 15.8 (0.56)
Benson Figure Recall 12.0 (2.69)
Number Span
 Forward (Correct) 10 (1.58)
 Backward (Correct) 8.8 (2.04)
Category Fluency Test
 Animals (total) 24.2 (5.25)
 Vegetables (total) 17.1 (3.68)
Trail Making Part A (secs) 29.8 (8.76)
Trail Making Part B (secs) 76.4 (23.17)
Multilingual Naming Test Total 30.8 (1.85)
Verbal Fluency Test
 F-words (total) 15.8 (3.94)
 L-words (total) 15.6 (4.01)
Lawton-Brody IADL Scale 8.0 (0)
UCLA Loneliness Scale 8.5 (7.58)
Geriatric Depression Inventory 1.0 (1.17)

Note. SD = standard deviation. IADL = Instrumental activities of daily living. UCLA = University of California, Los Angeles.

Feasibility: Participant Retention, Compliance, and Satisfaction

Of the 17 participants enrolled in this study, 16 (94%) completed the 12-month study in its entirety. One participant chose to withdraw from the study at the 9-month point, in March 2020, citing concern regarding COVID-19. Data from the baseline assessment and the four mobile intensive measurement bursts for this participant were retained for all study analyses.

Regarding compliance with the mobile assessment schedule, 97.5% of all scheduled mobile assessments were completed within the allotted window of time (i.e., 818 of 840 possible assessments). Most participants (11; 65%) missed one or more of the 50 mobile assessments scheduled per participant over the five bursts of the study. Among participants who missed at least one assessment, the average number of missed assessments was 1.33 (SD = 1.56), with a range of 1-6.

Regarding user satisfaction, 100% of participants endorsed the intensive measurement protocol as manageable in the context of their daily lives. With regard to user-friendliness, participants rated the mobile component of the study at a mean of 78.63/100 (SD = 19.79), with 100 being highest-possible user-friendliness. Participants rated their enjoyment of the mobile component of the study at a mean of 85.64/100 (SD = 3.79), with 100 being highest-possible enjoyment.

Reliability

Reliability estimates for burst-level performance on the Dot Memory and Symbol Search tests were examined for each burst. Dot Memory scores from bursts 1, 2, 4, and 5 all demonstrated acceptable between-person reliability at or above .80. Reliability for burst 3 fell just below this threshold, with a between-person reliability estimate of .79. Between-person reliability of Symbol Search scores was in the excellent range, at .94 or higher across all bursts. Inclusion of scores from participants with one or more missed sessions within each burst did not meaningfully alter the pattern of results for either test. Burst-level retest reliability between burst 1 and burst 2 was .85 for Dot Memory and .87 for Symbol Search.

Table 2.

Three-Level Model Parameters for Dot Memory

Fixed effect Model 1 Model 2
Estimate SE p (one-tail) Estimate SE p (one-tail)
Intercept 6.665 0.924 <.01 8.803 1.129 <.01
Day 0.756 0.521 0.10
Burst −1.943 0.416 <.01
Day2 −0.135 0.075 0.034
Burst2 0.172 0.084 0.033
Day*Burst 0.055 0.103 0.256
Random effect Estimate SE p (one-tail) Estimate SE p (one-tail)
Person Level
 Intercept 11.736 6.55 <.01 8.873 5.367 <.01
 Burst 0.395 0.713 <.01
 Day 0.197 0.227 <.01
 Day2 0.004 0.006 <.01
 Burst2 0.017 0.03 <.01
 Day*Burst 0.047 0.062 <.01
Burst Level
 Intercept 3.198 0.943 <.01 0.522 0.403 <.01
 Day 0.029 0.025 <.01
 Day2 0.002 0.002 <.01
Day Level
 Residual 10.475 0.808 <.01 9.43 0.764 <.01

Note. SE = standard error. Day2 indicates within-burst quadratic slope effects. Burst2 indicates across-burst quadratic slope effects.

Multilevel Modelling

Results of the multilevel modeling analyses for Dot Memory and Symbol Search are presented in Tables 2 and 3, respectively. Model 1 is an unconditional model in which variance components are estimated for each cognitive task at the within-burst, across-burst, and between-person levels. These are examined in relation to total performance variation using the ICC. For Dot Memory, the proportion of between-person variance was .46, the proportion of across-burst variance was .13, and the proportion of within-burst variance was .41. For Symbol Search, the proportions of between-person, across-burst, and within-burst variance were .71, .14, and .14, respectively. This indicates that the majority of variation in performance on the Symbol Search test is at the between-person level, whereas performance variation on Dot Memory is roughly equal at the between-person and within-burst levels. Variation in performance across bursts (i.e., within-person change) accounts for only a small proportion of variance for both tasks.
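
As a concrete illustration, the Dot Memory proportions follow directly from the Model 1 variance components reported in Table 2 (person-level intercept, burst-level intercept, and day-level residual); a minimal check:

```r
# Model 1 variance components for Dot Memory, taken from Table 2
person_var <- 11.736   # between-person (Level 3)
burst_var  <- 3.198    # across-burst (Level 2)
day_var    <- 10.475   # within-burst residual (Level 1)

total <- person_var + burst_var + day_var
round(c(between_person = person_var,
        across_burst   = burst_var,
        within_burst   = day_var) / total, 2)
# between_person   across_burst   within_burst
#           0.46           0.13           0.41
```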

Table 3.

Three-Level Model Parameters for Symbol Search

Fixed effect Model 1 Model 2
Estimate SE p (one-tail) Estimate SE p (one-tail)
Intercept 2.524 0.147 <.01 2.999 0.176 <.01
Day −0.113 0.05 0.01
Burst −0.237 0.052 <.01
Day2 0.002 0.008 0.395
Burst2 0.035 0.012 0.005
Day*Burst 0.027 0.017 0.095
Random effect Estimate SE p (one-tail) Estimate SE p (one-tail)
Person Level
 Intercept 0.308 0.120 <.01 0.411 0.233 <.01
 Burst 0.017 0.017 <.01
 Day 0.007 0.008 <.01
 Day2 0.000 0.000 <.01
 Burst2 0.001 0.001 <.01
 Day*Burst 0.001 0.001 <.01
Burst Level
 Intercept 0.061 0.012 <.01 0.006 0.005 <.01
 Day 0.003 0.002 <.01
 Day2 0.000 0.000 <.01
Day Level
 Residual 0.062 0.004 <.01 0.048 0.004 <.01

Note. SE = standard error. Day2 indicates within-burst quadratic slope effects. Burst2 indicates across-burst quadratic slope effects.

Model 2 builds upon Model 1 by including linear and quadratic slope terms for capturing within-and across-burst change. Also included are cross-level interactions to capture across-burst changes in within-burst slope effects. For Dot Memory, there was a significant linear slope effect across bursts indicating improved performance over time. There was no linear or quadratic slope effect within bursts, nor was there significant across-burst change in within-burst slope effects. For Symbol Search, there was a significant linear within-burst slope effect suggestive of progressively faster performance across days within each burst. Also significant were across-burst linear and quadratic slope effects, suggesting progressively faster performance across bursts with a gradual slowing of performance gains over time. Slope estimates are plotted in Figures 5 and 6 relative to mean-level trajectories on both tasks.

Figure 5. Mean Trajectory of Dot Memory Performance vs. Plotted Parameter Estimates from Three-Level Mixed Linear Modeling

Figure 6. Mean Trajectory of Symbol Search Performance vs. Plotted Parameter Estimates from Three-Level Mixed Linear Modeling

Construct Validity

Table 4 presents correlations between baseline scores on the NACC UDS neuropsychological test battery and burst 1 aggregate scores for Dot Memory and Symbol Search. Dot Memory scores demonstrated moderate correlations with backward digit span, part B of the Trail Making Test, figure recall, and object naming. Symbol Search response time also correlated with part B of the Trail Making Test, as well as with forward digit span. Building upon Model 2, we included each of these baseline scores as predictors of model slope terms in a series of univariate analyses (see Table 5).

Table 4.

Correlations between Mobile and In-Person Cognitive Test Scores

Baseline Measure  Burst 1 Symbol Search (Mean RT)  Burst 1 Dot Memory Distance Score
Digit Span Forward −.511* −.198
Digit Span Backward −.247 −.536**
TMT A .362 .200
TMT B .666** −.510*
Phonemic Fluency −.047 −.029
Category Fluency −.323 −.261
Story Learning −.287 −.335
Story Recall .055 .029
Figure Copy .124 −.338
Figure Recall −.376 −.592*
Object Naming −.412 .641**

Note. RT = reaction time. TMT = Trailmaking Test.

* indicates p-values less than or equal to 0.05.

** indicates p-values less than or equal to 0.01.

Table 5.

Baseline Neuropsychological Tests Yielding Significant Associations with Mixed Linear Regression Model Parameters

Trailmaking Test part B Figure Recall
Dot Memory Est. SE p (one-tail) Est. SE p (one-tail)
Day −0.019 0.021 0.184 0.334 0.192 0.072
Burst 0.016 0.025 0.263 0.441 0.224 0.02
Day2 0.003 0.003 0.152 −0.067 0.031 0.027
Burst2 −0.001 0.005 0.409 −0.077 0.05 0.041
Day*Burst −0.003 0.004 0.211 −0.006 0.041 0.44
Between-Person 0.101 0.043 0.01 −1.041 0.438 0.011
Trailmaking Test part B Forward Digit Span
Symbol Search Est. SE p (one-tail) Est. SE p (one-tail)
Day −0.005 0.002 0.025 0.058 0.033 0.044
Burst 0.003 0.003 0.155 −0.02 0.041 0.313
Day2 0.001 0.000 0.100 −0.007 0.005 0.100
Burst2 −0.001 0.001 0.095 0.006 0.009 0.270
Day*Burst 0.000 0.001 0.420 −0.003 0.011 0.390
Between-Person 0.019 0.007 .000 −0.245 0.103 0.017

Note. SE = standard error. Day2 indicates within-burst quadratic slope effects. Burst2 indicates across-burst quadratic slope effects.

For Symbol Search, baseline scores for forward digit span and Trail Making Test part B both predicted higher level 3 (between-person) performance level, and lower linear level 1 performance increases within bursts. That is, better performance on these baseline tasks predicted faster overall performance on the Symbol Search task, but less within-burst improvement in response time. There was no significant association with across-burst changes in performance. For Dot Memory, figure recall scores predicted level 1, level 2, and level 3 parameters. Better figure recall was associated with higher overall performance on Dot Memory, quadratic performance increases within bursts, and both linear and quadratic performance increases across bursts. Trail Making Test scores predicted level 3 (between-person) performance on Dot Memory but did not predict any level 1 or level 2 model slope terms. Neither object naming nor backward digit span predicted any model parameters for Dot Memory (results not displayed).

Criterion Validity

We examined criterion validity first by examining the influence of participant age and education level on model parameters. For Symbol Search, education level was associated with a level 2 (across-burst) linear slope effect such that higher education predicted faster performance increases across bursts (Est. = −0.127, SE = .069, p = .036). No age effect was observed and education did not predict any other model parameters. For Dot Memory, level 3 (between-person) effects were significant for education (Est. = −2.790, SE = .984, p = .004), and marginally significant for age (Est. = 0.243, SE = .148, p = .058). Age significantly moderated the level 1 quadratic slope effect (Est. = .024, SE = .009, p <.01), and was also associated with a cross-level interaction between level 1 and level 2 (Est. = −0.042, SE = 0.016, p <0.01). Taken together, these findings suggest that older participants are initially slower to reach asymptote within each measurement burst, but this effect attenuates across bursts. Finally, we examined associations between model slope terms and time-varying predictors derived from the in-app state questionnaires (affect, distractibility, pain, fatigue, subjective effort). No significant associations were observed with either of the mobile cognitive tasks across any of the examined model parameters (ps all >.05).

Discussion

We found very good retention, compliance, and user satisfaction with a year-long mobile phone-based intensive measurement study in a cohort of healthy older adults. We had a 100% retention rate for most of the study, with the loss of only one participant during the final measurement burst, which coincided with the onset of the COVID-19 pandemic. Although most participants missed at least one session, this level of data loss is insubstantial given the 50 total sessions per participant over the course of this 12-month study. There is a lingering stereotype that older adults are uncomfortable using mobile technology despite clear demographic trends indicating the opposite (Pew Research Center, 2017), and indeed participants rated this study protocol as highly user-friendly, enjoyable, and manageable within the context of their daily lives. We interpret these findings together to provide strong support for the utility of longitudinal self-administered mobile phone-based intensive measurement study designs in older adult samples. As a caveat to this, our sample was highly educated and self-referred to participate in a study that was advertised to involve mobile assessments. Thus it can be assumed that a selection bias contributed to the user satisfaction and compliance rates. The current results cannot be assumed to generalize to epidemiological cohorts where a broader sociodemographic range is represented. Future studies will be needed to examine generalizability of our findings to such cohorts.

To our knowledge, this is the first published study to track cognition longitudinally in a sample of older adults using distributed intensive measurement bursts. Performance variation for both tasks operated primarily at the between-person level, with only 13% of variation in Dot Memory scores and 14% of variation in Symbol Search scores operating longitudinally across bursts. Across-burst slope terms revealed that performance on both tasks tended to improve across the 12 months of observation. Note that this is a small cohort of healthy older adults, and so it is not surprising that practice effects would overshadow any age- or disease-associated declines. Future research in larger samples will permit multigroup analyses to investigate individual differences in these parameters. There is also the potential to apply double negative exponential models to permit delineation of practice gains from other influences on longitudinal change (e.g., Munoz, Sliwinski, Scott, & Hofer, 2016). These analyses are beyond the scope of this sample but will be important outcomes for future research.

We found good between-person reliability of the two examined mobile cognitive tests at the aggregate burst level. Reliability estimates for Dot Memory scores were at or just below (burst 3 ICC2 = .79) our threshold of .80 for acceptable reliability, while Symbol Search scores were consistently in the excellent range (all ICC2s ≥ .94). These reliability estimates are slightly lower than those obtained by Sliwinski and colleagues (2018) from a more intensive study conducted with a general adult population. Their protocol involved five sessions daily across 14 consecutive days, which we regarded as potentially burdensome for participants in a longitudinal study. In their manuscript, Sliwinski and colleagues extrapolated from their findings that ten measurement sessions (equivalent to two days of testing in their measurement protocol) would yield reliability estimates of .95 for Symbol Search and .86 for Dot Memory, which is remarkably close to what we report here. We can thus conclude that reliability of these measures is similar between older adults (as reported in this paper) and a general adult cohort (as reported by Sliwinski and colleagues). Retest reliability was very good for both tasks across a three-month interval, which further supports the stability of these burst-level data among healthy older adults. The Dot Memory task included only three trials per session; given the good user satisfaction with the current measurement protocol, it may be worthwhile for future studies to increase the number of trials to further improve reliability.

With regard to validity evidence, we observed significant correlations between aggregate burst 1 Symbol Search and Dot Memory scores and baseline performance on subtests of the NACC UDS neuropsychological battery. The associations made sense conceptually in that Dot Memory (a visual short-term memory task) correlated moderately with measures of complex attention/working memory, visual memory, and semantic memory, whereas Symbol Search (a task primarily assessing attention and information processing speed) correlated moderately with measures of simple attention and controlled sequential processing. These analyses are preliminary in nature but they suggest that mobile self-administered cognitive assessments may capture the same constructs that are measured by in-person performance on neuropsychological tests of attention, controlled sequential processing, and memory.

Inclusion of neuropsychological test scores and baseline demographics as predictors in the mixed linear regression analyses permitted a more detailed analysis of their associations with performance on the mobile cognitive tests. Most associations were observed at the between-person level. However, age and higher performance on two of the examined neuropsychological tests predicted within-burst practice effects, education level predicted linear slope effects for Symbol Search across measurement bursts, and higher performance on figure recall predicted both linear and quadratic slope effects for Dot Memory across measurement bursts. The capability of three-level regression models to capture nuanced individual differences in learning characteristics may be useful for detecting cognitive changes in at-risk individuals at the earliest stages (e.g., Jutten et al., 2020). Deployment of this study design in at-risk samples will help elucidate the most sensitive parameters for the purpose of early detection.

Despite the relatively restricted educational range of the study sample, education level had a moderate effect on both Symbol Search and Dot Memory scores, predicting stronger practice effects on Symbol Search across measurement bursts, and higher overall performance on Dot Memory. The positive effect of education on cognitive performance has been thoroughly described in the literature (e.g., Opdebeeck, Martyr, & Clare, 2016), and is hypothesized to operate through multiple pathways across the life course (Lövdén, Fratiglioni, Glymour, Lindenberger, & Tucker-Drob, 2020). Education is understood to contribute primarily to between-person differences in cognitive ability, with minimal impact on slopes of cognitive decline in old age (Lövdén et al., 2020). It is not surprising, then, that education level predicted between-person differences in performance on the Dot Memory test. This task measures visual short-term memory, which has previously been shown to be robustly associated with education level among older adults (Pliatsikas et al., 2019). The finding of stronger practice effects on Symbol Search reaction time scores among more highly educated participants is novel, as prior longitudinal studies have not observed participant demographic characteristics to influence practice effects (e.g., Gross et al., 2015). Replication of this finding is warranted given the small sample size. If replicated, this association may reflect greater active cognitive reserve mechanisms among participants with higher education levels (e.g., Stern, Barnes, Grady, Jones, & Raz, 2019).

None of the in-app state factor questionnaires predicted cognitive performance at any level of analysis. This is in contrast to previous laboratory-based intensive measurement studies that have reported within-person coupling of daily stressors and cognitive performance (Sliwinski et al., 2010). This discrepancy may reflect design factors specific to the current study, as we included fewer within-day assessments than has been the case in studies where short-term performance fluctuations constitute the outcome of interest. Limitations in power due to the small sample size in the current study (N = 17) could also have contributed to the inability to detect significant associations with state factors. We did not examine within-person reliability of the cognitive assessments within measurement bursts, but Sliwinski and colleagues (2018) reported only modest within-person reliability in their measurement design, which was more intensive and included more within-burst assessment days (14 vs. 5) than the present study. Suboptimal within-person reliability of the cognitive assessments may thus have had a limiting effect on our ability to detect coupled associations between cognitive performance and state factors at the time of testing. There is a trade-off between frequency of assessments and participant satisfaction with the measurement protocol, and the optimal design will differ as a function of the timescales of interest. For studies of aging and dementia where longitudinal change may constitute the main outcome of interest, lower within-person reliability may be less of a design limitation, whereas studies examining the cognitive sequelae of factors operating on a shorter timescale (e.g., sleep, exercise, substance use) may require a higher number of sessions within measurement bursts in order to detect these associations. More research is needed to determine the optimal timescales for each of these use cases.

In summary, we interpret our findings to provide strong evidence of feasibility and between-person reliability of self-administered mobile cognitive assessments for use with older adults. We report preliminary evidence of construct validity relative to in-person neuropsychological assessment, and criterion validity by way of sensitivity to age and education effects in this small sample of relatively restricted demographic range. Our small sample size and the demographic homogeneity of this self-referred convenience sample will limit the generalizability of these findings, and replication is warranted in larger and more diverse samples. Future research is needed to optimize the assessment scheduling within measurement bursts, and to characterize performance characteristics in cohorts at elevated risk of dementia. These represent worthwhile avenues of investigation given the pressing need for remote approaches for cognitive assessment and the methodological advantages of intensive measurement for detecting progressive, within-person change. Deployment of these methods in larger-scale longitudinal cohorts could inform personalized dementia risk profiles based on individual trajectories and permit more powerful, personalized trials where those at the highest predicted risk would be included and those at lower risk excluded.

Acknowledgments

We would like to acknowledge Tomiko Yoneda, MSc, and Kaitlin Blackwood, BA for their contributions to data collection and preparation, as well as neuropsychological assessment and scoring. Research reported in this publication was supported by the National Institute on Aging of the National Institutes of Health under award number 1U2CAG060408 and the Neil and Susan Manning Cognitive Health Initiative. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Footnotes

We have no known conflict of interest to disclose.

References

1. Alary F, Goldberg J, Joanette Y (2017). When the Rising Tide Impacts the World: Addressing the Global Challenge of Dementia. Canadian Journal on Aging, 36(3), 415–418.
2. Bliese PD (2000). Within-group agreement, non-independence, and reliability: Implications for data aggregation and analysis. In Klein KJ & Kozlowski SW (Eds.), Multilevel theory, research, and methods in organizations (pp. 349–381). Jossey-Bass.
3. Daniëls NEM, Bartels SL, Verhagen SJW, Van Knippenberg RJM, De Vugt ME, Delespaul PAEG (2020). Digital assessment of working memory and processing speed in everyday life: Feasibility, validation, and lessons-learned. Internet Interventions, 19. 10.1016/j.invent.2019.100300
4. Gross A, Benitez A, Shih R, Bangen K, Glymour M, Sachs B, … Manly J (2015). Predictors of retest effects in a longitudinal study of cognitive aging in a diverse community-based sample. Journal of the International Neuropsychological Society, 21(7), 506–518. 10.1017/S1355617715000508
5. Harris PA, Taylor R, Minor BL, Elliott V, Fernandez M, O'Neal L, McLeod L, Delacqua G, Delacqua F, Kirby J, Duda SN, & REDCap Consortium (2019). The REDCap consortium: Building an international community of software platform partners. Journal of Biomedical Informatics, 95, 103208. 10.1016/j.jbi.2019.103208
6. Huang Z, Muniz-Terrera G, & Tom B (2017). Power analysis to detect treatment effects in longitudinal clinical trials for Alzheimer's disease. Alzheimer's & Dementia, 3(3), 360–366. 10.1016/j.trci.2017.04.007
7. Jutten RJ, Grandoit E, Foldi NS, Sikkes S, Jones RN, Choi SE, Lamar ML, Louden D, Rich J, Tommet D, Crane PK, & Rabin LA (2020). Lower practice effects as a marker of cognitive performance and dementia risk: A literature review. Alzheimer's & Dementia (Amsterdam, Netherlands), 12(1), e12055. 10.1002/dad2.12055
8. Lancaster C, Koychev I, Blane J, Chinner A, Wolters L, & Hinds C (2020). Evaluating the Feasibility of Frequent Cognitive Assessment Using the Mezurio Smartphone App: Observational and Interview Study in Adults With Elevated Dementia Risk. JMIR mHealth and uHealth, 8(4), e16142. 10.2196/16142
9. Lawton MP, & Brody EM (1969). Assessment of older people: Self-maintaining and instrumental activities of daily living. The Gerontologist, 9(3), 179–186. 10.1093/geront/9.3_Part_1.179
10. Lövdén M, Fratiglioni L, Glymour MM, Lindenberger U, Tucker-Drob EM (2020). Education and cognitive functioning across the life span. Psychological Science in the Public Interest, 21(1), 6–41. 10.1177/1529100620920576
11. Martin M, & Hofer SM (2004). Intraindividual variability, change, and aging: conceptual and analytical issues. Gerontology, 50(1), 7–11. 10.1159/000074382
12. Mielke MM, Machulda MM, Hagen CE, Edwards KK, Roberts RO, Pankratz VS, Knopman DS, Jack CR Jr, & Petersen RC (2015). Performance of the CogState computerized battery in the Mayo Clinic Study on Aging. Alzheimer's & Dementia, 11(11), 1367–1376. 10.1016/j.jalz.2015.01.008
13. Muniz-Terrera G, Watermeyer T, Danso S, & Ritchie C (2019). Mobile cognitive testing: opportunities for aging and neurodegeneration research in low- and middle-income countries. Journal of Global Health, 9(2). 10.7189/jogh.09.020313
14. Munoz E, Sliwinski MJ, Scott SB, & Hofer S (2015). Global perceived stress predicts cognitive change among older adults. Psychology and Aging, 30(3), 487–499. 10.1037/pag0000036
15. Muthén LK & Muthén BO (2017). Mplus User's Guide (8th Edition). Muthén & Muthén.
16. Nunnally JC, & Bernstein IH (1994). The Assessment of Reliability. In Psychometric Theory (3rd Edition, pp. 248–292). McGraw-Hill.
17. Opdebeeck C, Martyr A, Clare L (2016). Cognitive Reserve and Cognitive Function in healthy older people: a meta-analysis. Aging, Neuropsychology, and Cognition, 23(1), 40–60. 10.1080/13825585.2015.1041450
18. Palmer BW, Boone KB, Lesser IM, & Wohl MA (1998). Base rates of "impaired" neuropsychological test performance among healthy older adults. Archives of Clinical Neuropsychology, 13(6), 503–511. 10.1016/S0887-6177(97)00037-1
19. Pliatsikas C, Veríssimo J, Babcock L, Pullman MY, Glei DA, Weinstein M, Goldman N, & Ullman MT (2019). Working memory in older adults declines with age, but is modulated by sex and education. Quarterly Journal of Experimental Psychology, 72(6), 1308–1327. 10.1177/1747021818791994
20. Pew Research Center (2017, May 17). Tech Adoption Climbs Among Older Adults. https://www.pewresearch.org/internet/2017/05/17/tech-adoption-climbs-among-older-adults/
21. Rabin LA, Saykin AJ, Wishart HA, Nutter-Upham KE, Flashman LA, Pare N, & Santulli RB (2007). The Memory and Aging Telephone Screen: development and preliminary validation. Alzheimer's & Dementia, 3(2), 109–121. 10.1016/j.jalz.2007.02.002
22. Rast P, & Hofer SM (2014). Longitudinal design considerations to optimize power to detect variances and covariances among rates of change: Simulation results based on actual longitudinal studies. Psychological Methods, 19, 133–154.
23. Rast P, Macdonald SW, & Hofer SM (2012). Intensive measurement designs for research on aging. GeroPsych: The Journal of Gerontopsychology and Geriatric Psychiatry, 25(2), 45–55. 10.1024/1662-9647/a000054
24. Russell D, Peplau LA, & Cutrona CE (1980). The revised UCLA Loneliness Scale: Concurrent and discriminant validity evidence. Journal of Personality and Social Psychology, 39, 472–480. 10.1037/0022-3514.39.3.472
25. Shah H, Albanese E, Duggan C, Rudan I, Langa KM, Carrillo MC, Chan KT, Joanette Y, Prince M, Rossor M, Saxena S, Snyder HM, Sperling R, Varghese M, Wang H, Wortmann M, Dua T (2016). Research priorities to reduce the global burden of dementia by 2025. The Lancet Neurology, 15(12), 1285–1294. 10.1016/S1474-4422(16)30235-6
26. Sliwinski MJ, Mogle JA, Hyun J, Munoz E, Smyth JM, & Lipton RB (2018). Reliability and validity of ambulatory cognitive assessments. Assessment, 25(1), 14–30. 10.1177/1073191116643164
27. Statistics Canada (2020). Table 22-10-0115-01, Smartphone use and smartphone habits by gender and age group. 10.25318/2210011501-eng
28. Statistics Canada (2017, November). The Internet and Digital Technology. https://www150.statcan.gc.ca/n1/pub/11-627-m/11-627-m2017032-eng.htm
29. Stern Y, Barnes CA, Grady C, Jones RN, & Raz N (2019). Brain reserve, cognitive reserve, compensation, and maintenance: operationalization, validity, and mechanisms of cognitive resilience. Neurobiology of Aging, 83, 124–129. 10.1016/j.neurobiolaging.2019.03.022
30. Tierney MC, Naglie G, Upshur R, Moineddin R, Charles J, & Jaakkimainen RL (2014). Feasibility and validity of the self-administered computerized assessment of mild cognitive impairment with older primary care patients. Alzheimer Disease and Associated Disorders, 28(4), 311–319. 10.1097/WAD.0000000000000036
31. Verhagen S, Daniëls N, Bartels SL, Tans S, Borkelmans K, de Vugt ME, & Delespaul P (2019). Measuring within-day cognitive performance using the experience sampling method: A pilot study in a healthy population. PLoS ONE, 14(12), e0226409. 10.1371/journal.pone.0226409
32. Weintraub S, Besser L, Dodge HH, Teylan M, Ferris S, Goldstein FC, Giordani B, Kramer J, Loewenstein D, Marson D, Mungas D, Salmon D, Welsh-Bohmer K, Zhou XH, Shirk SD, Atri A, Kukull WA, Phelps C, & Morris JC (2018). Version 3 of the Alzheimer Disease Centers' Neuropsychological Test Battery in the Uniform Data Set (UDS). Alzheimer Disease and Associated Disorders, 32(1), 10–17. 10.1097/WAD.0000000000000223
33. World Health Organization (2019, November 27). 10 Facts on Dementia. http://www.who.int/features/factfiles/dementia/en/
34. Yesavage J, Brink TL, Rose TL, Lum O, Huang V, Adey M, & Leirer VO (1983). Development and validation of a geriatric depression screening scale: A preliminary report. Journal of Psychiatric Research, 17(1), 37–49. 10.1016/0022-3956(82)90033-4
