Abstract
Traumatic brain injury (TBI) is a global public health problem that affects the long-term cognitive, physical, and psychological health of patients, while also having a major impact on family and caregivers. In stark contrast to the effective trials that have been conducted in other neurological diseases, nearly 30 studies of interventions employed during acute hospital care for TBI have failed to identify treatments that improve outcome. Many factors may confound the ability to detect true and meaningful treatment effects. One promising area for improving the precision of intervention studies is to optimize the validity of the outcome assessment battery by using well-designed tools and data collection strategies to reduce variability in the outcome data. The Transforming Research and Clinical Knowledge in TBI (TRACK-TBI) study, conducted at 18 sites across the United States, implemented a multi-dimensional outcome assessment battery with 22 measures aimed at characterizing TBI outcome up to one year post-injury. In parallel, through the TBI Endpoints Development (TED) Initiative, federal agencies and investigators have partnered to identify the most valid, reliable, and sensitive outcome assessments for TBI. Here we present lessons learned from the TRACK-TBI and TED initiatives aimed at optimizing the validity of outcome assessment in TBI.
Keywords: traumatic brain injury, longitudinal studies, multicenter study, outcome assessment, neuropsychological tests, self-report, error, validity, common data elements
Introduction
Traumatic brain injury (TBI) is a global health problem impacting 3 to 5 million people in the United States and contributing to life-long physical, cognitive, and psychological consequences.1 Although recent research efforts have increased our understanding of TBI pathophysiology2,3 and chronic sequelae,4–8 acute clinical trials have either failed to improve long-term outcome or demonstrated small effect sizes. The relative futility of these studies may be related to the complex array of injury and non-injury factors that characterize TBI, rather than failure of specific treatment interventions. For example, animal models of TBI typically control for injury location and severity, as well as age, gender, and genotype,9,10 and focus on one construct of recovery, such as memory. Human TBI studies typically enroll highly heterogeneous samples with varied injury, demographic, environmental, and genetic characteristics and assess recovery across a spectrum of domains. A series of opinion and review papers have summarized the broad range of issues that may contribute to failed trials and provided recommendations that may lead to improved outcomes in future studies.11–14 These reports recognize that variability in medical management, crude methods for stratifying patients by injury severity, poor participant compliance, and other factors may contribute to the large underlying between-participant variability that cannot be overcome even with the most effective treatment.
One challenge faced by investigators conducting TBI studies is how to control for sources of error that emanate from the participant, examiner or outcome measure that may mask actual differences between study groups. Standardization of study procedures (in particular, outcome assessment) to minimize potential sources of variance is especially important in multi-site longitudinal studies focused on documenting the natural course of recovery after TBI, predicting functional outcome, and detecting treatment effects.
The purpose of the current review is two-fold: 1) to identify potential areas of inconsistency and heterogeneity in conducting multi-site longitudinal TBI outcome assessment and 2) to describe strategies intended to optimize outcome assessment. Toward this aim, we discuss lessons learned during the protocol design and implementation phases of the Transforming Research and Clinical Knowledge in TBI (TRACK-TBI, https://tracktbi.ucsf.edu/, ClinicalTrials.gov identifier: NCT02119182) study and the TBI Endpoints Development (TED, https://tbiendpoints.ucsf.edu/)15 initiative. TRACK-TBI is an 18-site collaboration that aims to create a comprehensive dataset integrating clinical, imaging, proteomic, and genomic biomarkers, with a multidimensional outcome assessment battery designed for patients with mild to severe TBI across the first year post-injury. TED is a multidisciplinary effort funded by the U.S. Department of Defense, working to develop and harmonize a TBI metadataset composed of data elements drawn largely from completed TBI clinical trials.
TRACK-TBI investigators with expertise in each of the outcome domains convened to develop the Flexible Outcome Assessment Battery (FAB), which is comprised of 22 measures that assess global functioning, cognitive performance, symptoms, social participation, quality of life, and psychological health. The battery is tailored to participants based on their current level of function and most of the measures are included in the TBI Common Data Elements (https://www.commondataelements.ninds.nih.gov). Participants are assessed in-person at two weeks, six months, and twelve months post-injury, and by telephone at three months post-injury. Thus, the FAB enables acquisition of both cross-sectional and longitudinal data across the full spectrum of TBI severity. Through its participation in the International Initiative for TBI Research (InTBIR, https://intbir.nih.gov), a partnership between NIH, the European Commission, the Canadian Institutes of Health Research, OneMind™ Foundation, and the Department of Defense, the TRACK-TBI assessment platform was harmonized with the outcome battery currently employed by the Collaborative European NeuroTrauma Effectiveness Research study (CENTER-TBI, https://www.center-tbi.eu/). In combination, TRACK-TBI and CENTER-TBI will enroll 8000 participants, creating the largest harmonized international TBI dataset assembled to date. In parallel, the TED effort provides an unprecedented opportunity to conduct clinical outcome assessment (COA) validation studies, based on a large data repository, aimed at identifying measures that are best-suited for use in TBI clinical trials, including US Food and Drug Administration- (FDA) sponsored TBI drug and device trials.
The TRACK-TBI outcome assessment battery adheres to FDA principles that guide selection of COAs for use in clinical trials. Key principles include specification of what the COA is measuring (i.e., concept of interest) and the context within which it is being used (i.e., context of use). These principles are built into the workplans for both TRACK-TBI and TED.
Error in outcome assessment
Outcome assessment is influenced by inherent errors that emanate from the participant, examiner, and the outcome measure itself.16 These factors can influence cross-sectional and longitudinal outcomes. Here, we use error to refer to uncontrolled variability in participant pre-injury characteristics and nature and severity of injury that influence outcome after TBI as well as variation in assessment administration, scoring procedures and psychometric properties. Table 1 lists examples of each source of error and describes strategies to minimize exposure to these risks.
Table 1.
Source of Error | Example | Optimization Strategies |
---|---|---|
Participant | non-injury related premorbid patient characteristics/demographics | |
Participant | altered cognitive status | |
Participant | executive deficits | |
Participant | illness/pain | |
Participant | fatigue | |
Participant | motor impairment | |
Participant | language barrier | |
Participant | poor effort | |
Participant | failure to return for follow-up | |
Examiner | administration of incorrect form | |
Examiner | improper adherence to standardized administration guidelines | |
Examiner | poor adherence to time and event anchors | |
Examiner | errors in scoring | |
Examiner | data entry, transcription and conversion errors | |
Measure | lack of clear administration and scoring instructions | |
Measure | multiple forms and versions | |
Measure | alternate forms required to avoid practice effects across assessments | |
Examples of the three sources of error in outcome assessment and suggestions for optimization
Participant-related Sources of Error
Participant characteristics such as age, gender, level of education and socioeconomic status may impact outcome independent of injury characteristics and may be highly variable across individuals. A plan should be developed in advance to monitor participant demographics and adjust enrollment strategies, if necessary, to ensure that baseline sample characteristics are balanced across treatment arms and comparable across sites. This process will enable monitoring of appropriate matching of injured and control groups on critical demographic and other variables known to impact clinical outcome assessments. The effects of differences in these variables can be minimized by adopting analytic strategies, such as using normalized scores that account for age, sex and education, and applying regression methods. The TRACK-TBI initiative employed comprehensive enrollment and follow-up interviews to probe premorbid characteristics and capture changes (such as non-study related post-injury illness and injury) that occur during the course of the study. Specific statistical analyses were planned to account for potential differences in these participant characteristics.
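As a minimal sketch of how baseline balance between two groups might be monitored, the following computes a standardized mean difference for a single demographic variable. The function names and the 0.1 flagging threshold are illustrative assumptions, not part of the TRACK-TBI protocol.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_mean_difference(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = sqrt((stdev(group_a) ** 2 + stdev(group_b) ** 2) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd


def is_imbalanced(group_a, group_b, threshold=0.1):
    """Flag a variable whose absolute standardized difference exceeds
    the threshold (0.1 is an assumed convention, adjust per study)."""
    return abs(standardized_mean_difference(group_a, group_b)) > threshold


# Hypothetical years of education in two study groups.
tbi_education = [12, 14, 16, 12, 13]
control_education = [16, 16, 18, 14, 17]
print(is_imbalanced(tbi_education, control_education))
```

Running such a check at regular intervals during enrollment, per site and per arm, would allow recruitment strategies to be adjusted before imbalances become entrenched.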
Other factors that may affect cognitive test performance and responses to self-report questions that are not related to the TBI include altered cognitive status due to premorbid medical and psychological conditions, intoxication, test-taking attitude, impaired awareness, poor effort and exposure to sedating medications. Illness, pain, or fatigue may prompt refusal, hurried completion of measures, or unreliable responses. Language barriers can be a problem for participants who are multi-lingual or must rely on an interpreter. Sensory and physical impairments can compromise responses on performance-based measures, especially those that are timed. Deficits in executive function, whether related or unrelated to the TBI, may affect comprehension, judgement, and recall.17 Finally, participants may intentionally perform poorly to secure compensation or retain services.
Many of the factors listed above are difficult to ascertain until the assessment is underway. When detected, examiners should record their observations and determine whether the problem can be mitigated using strategies that preserve the standardized test procedures17. The TRACK-TBI FAB was specifically designed to include measures that could be completed in a valid manner by participants at all levels of function (or, in some instances, surrogates), ranging from those who have not regained consciousness (evaluated with the Coma Recovery Scale-Revised18) to those capable of completing self-report measures and standardized neuropsychological tests. Test Completion Codes, modified from a prior TBI clinical trial19, were used to standardize validity ratings for each measure administered (Table 2). Performance was coded as valid (if the measure was completed in full with no threats to validity), attempted but not completed (if the measure was not completed or judged invalid), or not attempted (if no attempt was made to administer the measure). Each code described why performance was judged invalid (e.g., severe neurological/cognitive impairment, poor effort, suspected language barrier) and examiners were instructed to make additional notation as to why the assessment was deemed invalid. Apart from ensuring consistent test administration procedures, completion codes provide an additional source of information regarding the factors that affect outcome measure administration in participants diagnosed with TBI.
Table 2.
Test Attempted and completed | |
1.0 | Test completed in full, in person- results valid |
1.1 | Non-standard administration – a measure normally requiring an oral response, allowed a written response, results valid |
1.2 | Non-standard administration –Other (specify):__________________________________ |
1.3 | Test Completed, valid administration done over the phone |
Test Attempted but NOT completed | |
2.1 | Test attempted but not completed due to cognitive/neurological reason |
2.2 | Test attempted but not completed due to non-neurological/physical reasons |
2.3 | Test attempted but not completed - participant cognitively intact enough to respond but poor effort, random responding, rote response, not cooperative, refusal, intoxication |
2.4 | Test attempted but not completed due to major problems with English language proficiency (and/or Spanish language proficiency if the site can also enroll Spanish speaking participants) |
2.5 | Test attempted but not completed due to test interrupted by illness and test could not be completed later |
2.6 | Test attempted but not completed due to logistical reasons, other reasons – site specific |
Test not attempted | |
3.1 | Test not attempted due to severity of cognitive/neurological deficits |
3.2 | Test not attempted due to non-neurological/physical reasons |
3.3 | Test not attempted - participant can respond appropriately but poor effort, not cooperative, refusal, intoxication |
3.4 | Test not attempted due to major problems with English language proficiency (and/or Spanish language proficiency if the site can also enroll Spanish speaking participants) |
3.5 | Test not attempted due to participant illness and test could not be completed later |
3.6 | Test not attempted due to logistical reasons, other reasons – site specific |
4.0 | Test not attempted, completed or valid due to examiner error |
5.0 | Other (specify:____________________________________________________) |
An example of test completion codes that can be utilized to characterize test validity and evaluate factors contributing to invalid or incomplete assessment. Adapted from Zafonte et al (2012)18.
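The completion codes in Table 2 lend themselves to simple tabulation. The sketch below, which assumes codes are stored as strings such as "2.3", groups each code by its leading digit into the categories shown in the table. The function names and the storage format are hypothetical.

```python
from collections import Counter

# Top-level categories keyed by the leading digit of the code in Table 2.
CATEGORY_BY_PREFIX = {
    "1": "attempted and completed",
    "2": "attempted but not completed",
    "3": "not attempted",
    "4": "examiner error",
    "5": "other",
}


def classify_completion_code(code):
    """Return the Table 2 category for a code such as '2.3'."""
    prefix = code.split(".")[0]
    if prefix not in CATEGORY_BY_PREFIX:
        raise ValueError(f"Unknown completion code: {code}")
    return CATEGORY_BY_PREFIX[prefix]


def summarize_completion(codes):
    """Count how many administrations fall into each category."""
    return Counter(classify_completion_code(code) for code in codes)


# Hypothetical codes recorded for one measure across five participants.
summary = summarize_completion(["1.0", "1.3", "2.1", "3.6", "4.0"])
print(summary["attempted and completed"])  # 2
```

A tabulation of this kind, broken down by measure and by site, is one way to see which measures are most often incomplete and why.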
In longitudinal studies of outcome in TBI, it is critical to ensure that participants followed are representative of those who were originally enrolled in the study. When participants do not return for follow-up, bias may be introduced as there may be systematic differences between those followed and not followed. For example, participants who are lost to follow-up may be more likely to be employed, and, therefore, higher functioning. Alternatively, those who miss follow-up appointments may be from a lower socio-economic bracket and less able to obtain transportation. To mitigate these challenges, study sites should make efforts to schedule follow-up assessments during non-standard business hours, arrange and reimburse transportation, and make other accommodations to encourage participants to return for visits. Additional strategies to maximize participant follow-up have been developed by the Traumatic Brain Injury Model Systems and can be found here: https://www.tbindsc.org/SOP.aspx.
Examiner-related Sources of Error
The validity of an assessment rests on selection of the most appropriate measures for the target population and study aims, and compliance with a predetermined assessment protocol. Standardization of test order, for example, may reduce error variance by minimizing performance fatigue, demand characteristics of measures, and inter-measure effects. A significant source of error arises when test administration and scoring procedures deviate from published guidelines. For example, the TRACK-TBI team discovered different versions of the instructions for administration of the Trail Making Test (TMT) with regard to how the examiner should respond to errors.20,21 These differences are important as they can influence completion time, the key metric for this measure. Another example of examiner bias relates to the Glasgow Outcome Scale-Extended (GOSE),22 the most widely-used measure in TBI clinical trials.23 In some studies, the GOSE22 score is intended to reflect disability that is specifically attributable to the TBI. In others, the GOSE score represents the cumulative effects of all central and peripheral injuries and serves as a measure of global function. These procedural differences can introduce large variations at the group level that overwhelm the effects of the study treatment, further emphasizing the importance of adherence to administration and scoring guidelines.
TRACK-TBI investigators with expertise in TBI outcome assessment vetted multiple versions of each measure in the FAB and achieved consensus on the most appropriate instructions and scoring criteria relative to the study aims. A comprehensive Standard Operating Procedure (SOP) was developed that included a decision-making algorithm designed to align the subject’s level of function with appropriate clinical outcome measures. Testing instructions, timing parameters, and the order of test administration were also explicitly described. All forms were provided via a central electronic database that was regularly monitored and updated. The Clinical Data Interchange Standards Consortium, an organization that develops standards and innovations to streamline medical research, provided guidance on best practices for collecting TBI data (see Therapeutic Area Data Standards User Guide for TBI- www.cdisc.org/sites/default/files/members/standard/ta/traumatic-brain-injury/taug-tbi-v1.pdf).
Patient-reported outcome measures frequently include questions that ask participants to anchor responses to a specific timeframe. For example, the Short Form-12 (SF-12)24 asks participants to rate their health over the past four weeks, the Brief Symptom Inventory-18 (BSI-18)25 rates symptoms that have been distressing over the past two weeks, and the Rivermead Postconcussion Questionnaire (RPQ)26 anchors responses to the past week or 14 days, depending on the version. Two potential sources of variance may result from these inconsistent time anchors. First, additional cognitive burden is placed on the participant when time anchors are not consistent across measures and switching between time anchors is required throughout an assessment battery. The impact of changing response epochs across measures in a single session in this population has not been studied and examiners must be especially vigilant to direct the participant’s attention to the specific time period in question.
Second, the prescribed time anchors may not be appropriate for the study design or aims. For example, in the TRACK-TBI study, the first follow-up assessment occurred approximately two weeks post-injury. The RPQ26 is available in versions that assess a seven- and fourteen-day time period; the latter was selected for TRACK-TBI to ensure the participants’ frame of reference aligned with the two-week follow-up assessment. However, the Posttraumatic Stress Disorder Checklist (PCL-5)27 anchors responses to the prior month. To avoid ratings that reflect general health before and/or after the injury at the initial follow-up assessment, examiners were instructed to replace the anchor instruction with “since your injury.” Altering standardized test instructions should be avoided. In circumstances in which it is necessary to modify the instructions, permission from the publisher may be required. While modification of published and validated assessments should be limited to the extent possible, it is occasionally necessary to accomplish study aims. In this case, changes should be approved by a committee of experts and implemented as early into data collection as possible to avoid variance in responses due solely to modification of instructions, test items, or scoring rules. This break from use of standardized forms has implications for making comparisons across data acquired both within and across studies, and normative data sets based on the original form should not be used.
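The two time-anchor issues described above can be checked mechanically when a battery is being assembled. The sketch below uses the look-back periods as stated in the text (treating "the prior month" as 30 days, which is an assumption) and reports how often the anchor changes across an administration order and which measures look back further than the time elapsed since injury. All names are illustrative.

```python
# Look-back windows in days, as described in the text; 30 days for the
# PCL-5 "prior month" anchor is an assumed approximation.
LOOKBACK_DAYS = {"SF-12": 28, "BSI-18": 14, "RPQ": 14, "PCL-5": 30}


def count_anchor_switches(order):
    """Count adjacent measure pairs whose look-back windows differ."""
    windows = [LOOKBACK_DAYS[measure] for measure in order]
    return sum(1 for prev, cur in zip(windows, windows[1:]) if prev != cur)


def measures_reaching_before_injury(days_since_injury, order):
    """List measures whose window extends to before the injury date."""
    return [m for m in order if LOOKBACK_DAYS[m] > days_since_injury]


battery = ["SF-12", "BSI-18", "RPQ", "PCL-5"]
print(count_anchor_switches(battery))                # 2
print(measures_reaching_before_injury(14, battery))  # ['SF-12', 'PCL-5']
```

At a two-week visit, the measures returned by the second function are the ones for which an instruction change such as "since your injury" would need to be considered and approved.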
A different type of anchor is related to the index event that led to enrollment in the study. Some measures (e.g., RPQ,26 PCL-5,27 Quality of Life after Brain Injury28) are explicit in instructing participants to think about the current injury or a specific prior event when responding to items. Other measures (e.g., Patient Health Questionnaire [PHQ-9],29 SF-12,24 BSI-18, Satisfaction with Life Scale30) of general health or life satisfaction do not reference a specific event. The examiner should ensure that the participant understands how the test is anchored, particularly when different anchors are used within the same test battery. In TRACK-TBI, examiners were trained to ensure that subjects were aware that anchors varied by measure and to highlight changes when they occurred from one measure to the next.
Many clinical outcome assessments utilize visual or auditory stimuli to assess memory, processing speed and executive function. Presentation of stimuli must be standardized within and across study sites to ensure that variability in responses is the result of the participant’s performance and not test administration. While test-developers provide the test stimuli and instructions for their use, we found that several measures required additional guidance to ensure uniformity in administration. For the Confusion Assessment Protocol (CAP)31 and the Rey Auditory Verbal Learning Test (RAVLT),32,33 which utilize drawings and word lists respectively, examiners were provided with precise instructions on how to present the stimuli to ensure that exposure remained constant across sites. TRACK-TBI employed a multi-step data quality assurance process to reduce examiner error in which a core group of outcomes experts vetted each measure, provided extensive training to examiners, “certified” examiners through video-recorded test administration simulations, and conducted frequent teleconferences to address issues and questions.
A final source of extraneous variability may result from errors in data entry, transcription and conversion. Data quality assurance procedures should be established at the local level to minimize the risk of these errors. Data quality monitoring should include procedures designed to identify transcription errors when transferring data from paper to electronic forms, miscalculation of raw scores, errors converting raw scores to standard scores and conflicting responses on two measures assessing the same construct. Each local site within TRACK-TBI developed monitoring plans to oversee data quality and minimize data entry and transcription errors. In addition, all subject data across all sites were entered into an electronic data system (QuesGen Systems, Inc; Burlingame, CA) which generated automatic error reports for select data fields. To move through the data entry fields, examiners had to address all flagged error fields and reconcile all errors. In addition to local monitoring, when feasible, centralized data audits should be conducted by an independent party to address data quality study-wide.
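Two of the checks named above, detecting transcription discrepancies and detecting miscalculated raw scores, can be sketched in a few lines. This is an illustration only and does not describe the QuesGen system; the field names and functions are hypothetical. The example uses the PHQ-9, whose nine items are each scored 0 to 3.

```python
def find_entry_mismatches(paper_record, electronic_record):
    """Return the field names whose values differ between two records."""
    fields = sorted(set(paper_record) | set(electronic_record))
    return [f for f in fields
            if paper_record.get(f) != electronic_record.get(f)]


def check_total(items, recorded_total, item_min, item_max):
    """Return a list of problems: out-of-range items or a wrong sum."""
    problems = []
    for position, value in enumerate(items, start=1):
        if not item_min <= value <= item_max:
            problems.append(f"item {position} out of range: {value}")
    if sum(items) != recorded_total:
        problems.append(
            f"recorded total {recorded_total} != computed {sum(items)}")
    return problems


# Hypothetical paper and electronic versions of the same record.
paper = {"phq9_item1": 2, "phq9_total": 11}
electronic = {"phq9_item1": 3, "phq9_total": 11}
print(find_entry_mismatches(paper, electronic))  # ['phq9_item1']

# Hypothetical PHQ-9 item responses with a recorded total of 11.
print(check_total([1, 2, 0, 3, 1, 0, 2, 1, 1], 11, 0, 3))  # []
```

Checks like these are most useful when run at the point of data entry, so that discrepancies are resolved while the source document is still at hand.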
Measure-related Sources of Error
Sources of error that emanate from the measure are often the result of ambiguous administration and scoring guidelines, unclear wording of questions and test items, and practice effects that result from repeated assessment. For example, responses on the RPQ26 should reflect changes in symptom severity relative to the pre-injury baseline. However, the instructions can be misinterpreted such that only current symptom severity is reported, or the response reflects the change in symptom severity compared with a prior assessment. There are also scoring nuances that are not intuitive and can lead to error (e.g., depending on the scoring algorithm applied, responses of “1” may or may not be converted to “0” for analysis). Thus, extra training is required to ensure examiners understand the conceptual framework for the assessment. Multiple scoring schemas are also available for the BSI-18,25 and the Brief Test of Adult Cognition by Telephone (BTACT).34–36 Given these potential sources of error, examiners must ensure that they and the participants have adequate understanding of the items, response options, and scoring method.
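The RPQ scoring nuance mentioned above can be made explicit in code so that the choice of algorithm is recorded rather than left implicit. The sketch below assumes each item is rated 0 to 4 and exposes the recoding of "1" responses as a named parameter. The function name and parameter are hypothetical, and the appropriate setting depends on the scoring convention adopted by the study.

```python
def rpq_total(responses, recode_ones_to_zero=True):
    """Sum RPQ item ratings, optionally treating ratings of 1 as 0.

    Assumes each rating is an integer from 0 to 4.
    """
    for rating in responses:
        if rating not in (0, 1, 2, 3, 4):
            raise ValueError(f"Rating out of range: {rating}")
    if recode_ones_to_zero:
        return sum(0 if rating == 1 else rating for rating in responses)
    return sum(responses)


# Hypothetical ratings for six items.
ratings = [0, 1, 2, 3, 4, 1]
print(rpq_total(ratings))                             # 9
print(rpq_total(ratings, recode_ones_to_zero=False))  # 11
```

The two totals differ for the same set of responses, which is why the scoring rule needs to be fixed in the SOP and applied identically at every site.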
In multi-site studies, problems can arise when different versions of the same test are available. For example, under the auspices of InTBIR, an effort was made to harmonize the outcome assessment batteries used in the TRACK-TBI and CENTER-TBI studies. The investigators subsequently discovered that the two consortia were using different versions of the SF-12.24 To reconcile this problem, a conversion system was developed to accommodate the variations in scoring between the two studies. In addition, when a repeated measures design is employed, alternate forms are sometimes required to avoid practice effects across assessments. In the TRACK-TBI study, the RAVLT32,33 is administered three times over 12 months. Therefore, two alternate forms of equal difficulty were incorporated into the protocol.37,38 To accommodate Spanish-speaking participants, both forms were translated into Spanish but were not tested for equivalency. A post-hoc analysis will test for differences in the distribution of scores on the forms and inform a determination regarding their comparability. Thorough investigation of all forms for each measure is required to ensure the correct versions are included in the final battery of assessments. For a table of measures used in TRACK-TBI and factors that may introduce error into the assessment, see Table 3.
Table 3.
TRACK-TBI Measure | Time Anchor | Event Anchor | Multiple Forms | Multiple Scoring Schemes | Requires Stimuli | Equivalent Forms Available |
---|---|---|---|---|---|---|
GOSE21 | X | X | X | |||
E-DRS-PI54 | X | X | X | |||
CRS-R17 | X | |||||
CAP30 | X | X | X | |||
RPQ25 | X | X | X | X | ||
PCL-526 | X | X | X | |||
SF-1223 | X | X | X | |||
PHQ-928 | X | ? | ||||
TMT20 | X | X | X | |||
RAVLT32 | X | X | X | X | ||
WAIS-symbol search and coding55 | X | |||||
QOLIBRI27 | X | X | ||||
BSI-1824 | X | X | ||||
PROMIS Pain56 Interference57 and Intensity58 | X | X | ||||
SWLS29 | X | |||||
M2PI59 | X |
Potential sources of error in the global function, performance, and self-report measures comprising the TRACK-TBI Flexible Outcome Assessment. Abbreviations: BSI-18 Brief Symptom Inventory, CAP Confusion Assessment Protocol, CRS-R Coma Recovery Scale-Revised, GOSE Glasgow Outcome Scale Extended, M2PI Mayo Portland Participation Index, PHQ9 Patient Health Questionnaire, PCL-5 Posttraumatic Stress Disorder Checklist, QOLIBRI Quality of Life after Brain Injury, RAVLT Rey Auditory Verbal Learning Test, RPQ Rivermead Postconcussion Questionnaire, SWLS Satisfaction with Life Scale, SF-12 Short Form 12, TMT Trail Making Test, WAIS Wechsler Adult Intelligence Scale
Finally, during the design of the study, investigators are tasked with selecting the most appropriate COAs to assess outcome. This decision should rest in large part on the psychometric robustness of the measures under consideration. A key factor in the decision-making process relates to the adequacy of the COA’s internal construct validity. The investigator should know whether the COA’s items or scores have been verified to reflect the phenomenon that the COA intends to measure, and whether the underlying construct is unidimensional or comprises more than one explanatory factor. Interpretation of the results of the measure may be unreliable unless these and other critical psychometric features have been assessed. Modern approaches to outcome measure development such as Item Response Theory and Rasch analysis provide the means to answer these questions. Apart from these basic psychometric properties, COA stakeholders are increasingly interested in understanding the clinical significance or ecological validity of the results of a particular outcome measure. This can be accomplished by determining the COA’s responsiveness, that is, the extent to which a change in the COA over a specific period of time corresponds to a change in an alternate, independent measure of the same construct.39,40 The minimum clinically important difference (MCID) is a useful measure of external responsiveness that anchors COA performance to some change that has been judged to be meaningful to the patient. With few exceptions (e.g., the GOSE), outcome measures used in TRACK-TBI have been tested using contemporary measurement techniques and have been determined to be psychometrically sound.
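One simple way to examine responsiveness as defined above is to correlate change scores on the COA with change scores on an independent measure of the same construct. The sketch below does this with a Pearson correlation; it is an illustrative approach under that assumption, not the method used in TRACK-TBI or TED, and all names and values are hypothetical.

```python
from math import sqrt


def change_scores(baseline, follow_up):
    """Per-participant change from baseline to follow-up."""
    return [after - before for before, after in zip(baseline, follow_up)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical scores for three participants on the COA under study
# and on an independent reference measure of the same construct.
coa_change = change_scores([10, 12, 14], [12, 16, 20])
reference_change = change_scores([5, 5, 5], [6, 7, 8])
print(round(pearson_r(coa_change, reference_change), 3))
```

A strong positive correlation between the two sets of change scores would be consistent with the COA tracking real change in the construct, although an anchor-based MCID analysis would still be needed to judge whether a given change is meaningful to the patient.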
Importance of the Healthy Control Cohort
Most TBI outcome and natural history studies include a control group for normative comparison against the population of interest.41 Control participants are typically selected from the community to match the target population on important demographic characteristics such as age, gender, education, and race, though other variables (e.g., socioeconomic status) may also be considered. Recruiting from the community is advantageous as there is a large pool of potential participants who are relatively accessible. However, some studies have shown that recruiting control participants from the community is not sufficient as the pre-injury profile of this cohort differs from TBI participants. Thus, pre-injury characteristics may be the predominant cause of differences between participants who have sustained a TBI and healthy individuals.
There is evidence that some of the variance in outcome following mild TBI may be attributable to the general effects of the trauma, rather than specific effects caused by the brain injury42–45 (though for an alternate conclusion see Mathies 201346 and Beauchamp 201747). To address this concern, individuals who have experienced orthopedic injuries have been recruited as participants to control for non-specific effects of traumatic injury, including post-traumatic stress and general inflammation.48–50 This cohort also provides a control for pre-existing risk factors (e.g. substance use, impulsivity, participation in contact sports) that increase the probability of an injury.51–53 Nonetheless, careful screening for TBI is advisable as occult brain injury in orthopedic injury patients could obscure between-group differences.53 “Birds of a feather” controls (i.e., friends and relatives of the patient) permit investigators to control for additional influences on outcome.54 In addition to sharing demographic features, this cohort also controls for exposure to environmental risk factors, personality characteristics and other influences related to acculturation.55 Both orthopedic and friend control groups were recruited for the TRACK-TBI study to parse the effects of the TBI from non-specific demographic or general trauma effects.
Future Research Directions
Optimization of the approach to outcome assessment in multi-center trials is necessary to maximize the probability that variability in the data can be attributed to the effects of the brain injury or of the treatment. Prior to initiating data-collection, sources of error related to the participant, examiner and measure should be considered and management strategies implemented to mitigate their impact. Poor control of participant-, examiner-, and measure-related sources of error may lead to systematic bias in data collection (e.g. selection bias) and interpretation (e.g. attribution of outcomes to the brain injury rather than other influencing factors).57 While each study has unique aims, and employs different outcome assessment measures, there are some overarching principles that can increase fidelity and help maintain the integrity of the data.
Recommendations for Optimizing Clinical Outcome Assessment in Multi-Center Longitudinal TBI Studies
Based on lessons learned through our experience with TRACK-TBI and TED, we have developed a 6-step plan for optimizing outcome assessment in multi-center, longitudinal TBI studies:
1. Convene an Expert Working Group (EWG) composed of subject matter experts in clinical outcome assessment. The role of the Outcomes EWG is to triage and select COAs, oversee ongoing examiner training activities, monitor data quality, and troubleshoot over the course of the study.
2. Develop an SOP manual that details all aspects of the outcome assessment protocol. The SOP should describe the purpose of each measure, the order of administration, and the procedures for administration and scoring. A process for updating the manual and disseminating updates to participating sites should be established. The SOP manual and all assessment measures should be housed in a central, easily accessible repository.
3. Provide in-person and electronic training modules for all study examiners to promote uniformity in administration and scoring of outcome measures. Develop and implement a procedure for certifying examiner competency before authorizing data collection (e.g., require videotaped demonstration of test administration).
4. Hold regularly scheduled teleconferences with outcome data collectors to address questions, adjudicate unusual test administration and scoring circumstances, apprise examiners of SOP amendments, and conduct ongoing training.
5. Establish a schedule for re-training examiners and consider using multi-media training devices (e.g., videotaped simulations, webinars, written manuals, case presentations).
6. Mandate that participating sites have a local plan for data quality assurance and, when possible, conduct on-site audits study-wide.
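One simple component of a local data quality assurance plan is an automated check for missing or impossible scores. The sketch below is a minimal example under stated assumptions: the record layout, the field names, and the names `VALID_RANGES` and `audit_records` are hypothetical and do not describe the TRACK-TBI data system. The ranges are the possible total scores for each instrument.

```python
# Possible total-score ranges for a few measures in the battery.
# The dictionary name and record layout are assumptions for this sketch.
VALID_RANGES = {
    "GOSE": (1, 8),    # Glasgow Outcome Scale-Extended categories
    "PHQ9": (0, 27),   # 9 items, each scored 0-3
    "SWLS": (5, 35),   # 5 items, each scored 1-7
    "RPQ": (0, 64),    # 16 items, each scored 0-4
}

def audit_records(records):
    """Return (site, participant, measure, problem) tuples for flagged scores."""
    issues = []
    for rec in records:
        for measure, (low, high) in VALID_RANGES.items():
            score = rec.get(measure)
            if score is None:
                # The measure was not entered for this participant.
                issues.append((rec["site"], rec["participant"], measure, "missing"))
            elif not low <= score <= high:
                # The score cannot occur on this instrument.
                issues.append((rec["site"], rec["participant"], measure, "out of range"))
    return issues

# Made-up records from two hypothetical sites.
records = [
    {"site": "A", "participant": "001", "GOSE": 7, "PHQ9": 4, "SWLS": 28, "RPQ": 12},
    {"site": "B", "participant": "002", "GOSE": 9, "PHQ9": 30, "SWLS": 20},
]
issues = audit_records(records)
for issue in issues:
    print(issue)
```

A check of this kind catches only entry errors. It does not replace examiner certification or on-site audits, which address errors in administration and scoring.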
Conclusion
Successful multi-site, longitudinal TBI outcomes research relies on precise assessment of the domains of interest. Minimizing the risk of error from multiple sources (i.e., subject, examiner, measure) will increase the likelihood that the outcomes reported reflect the effects of the injury or treatment intervention rather than non-specific factors that may influence outcome. The experience-based recommendations provided here represent reasonable steps that should be considered to help ensure that high-quality outcome data are obtained across participating sites.
Acknowledgments
We wish to thank the examiners at all TRACK-TBI sites who contributed their experience and knowledge to the observations and recommendations described in this manuscript. We are particularly grateful to the patients and families who participated in the TRACK-TBI initiative.
Abbreviations
- BSI-18
Brief Symptom Inventory
- CAP
Confusion Assessment Protocol
- COA
Clinical Outcome Assessment
- CRS-R
Coma Recovery Scale-Revised
- FDA
Food and Drug Administration
- GOSE
Glasgow Outcome Scale-Extended
- M2PI
Mayo Portland Participation Index
- PHQ9
Patient Health Questionnaire
- PCL-5
Posttraumatic Stress Disorder Checklist
- QOLIBRI
Quality of Life after Brain Injury
- RAVLT
Rey Auditory Verbal Learning Test
- RPQ
Rivermead Postconcussion Questionnaire
- SWLS
Satisfaction with Life Scale
- SF-12
Short Form 12
- TBI
Traumatic Brain Injury
- TED
Traumatic Brain Injury Endpoints Development Initiative
- TMT
Trail Making Test
- TRACK-TBI
Transforming Research and Clinical Knowledge in Traumatic Brain Injury
- WAIS
Wechsler Adult Intelligence Scale
Footnotes
Conflicts of Interest:
YB: None
MM: None
SD: None
NT: None
KB: None
JM: None
ST: None
MS: None
HL: None
JK: None
JC: None
TM: None
JW: None
GM: Dr. Manley reports grants from NIH, Department of Defense, and other support from One Mind, Abbott, General Electric, Pfizer, and Johnson & Johnson Family of Companies/DePuy Synthes/Codman Neuro during the conduct of the study.
JG: None
Disclosure of Funding:
This study was supported by the National Institute of Neurological Disorders and Stroke grant number U01 NS086090 and the Department of Defense grant number DoD W81XWH-14-2-0176. The TRACK-TBI and TED initiatives also receive key support from One Mind and over 25 industry and philanthropic partners.
YB: U.S. Department of Defense, James S. McDonnell Foundation
MM: U.S. Department of Defense, National Collegiate Athletic Association, National Football League
SD: None
NT: National Institutes of Health, U.S. Department of Defense
KB: None
JM: None
ST: None
MS: National Institute on Disability, Independent Living, and Rehabilitation Research, National Institutes of Health
HL: CDMRP W8IXWH-13, TBI Endpoints Development, R21 NS086714, VA Merit Review B1320-I, and VA/DoD Chronic Effects of Neurotrauma
JK: None
JC: None
TM: None
JW: None
GM: Department of Defense, NIH, and other support from One Mind, Palantir, and Johnson & Johnson Family of Companies/DePuy Synthes/Codman Neuro
JG: National Institutes of Health, U.S. Department of Defense
References
- 1. Centers for Disease Control and Prevention. Traumatic Brain Injury in the United States: Epidemiology and Rehabilitation. Atlanta, GA: National Center for Injury Prevention and Control; Division of Unintentional Injury Prevention; 2015.
- 2. Bramlett HM, Dietrich WD. Long-term consequences of traumatic brain injury: Current status of potential mechanisms of injury and neurological outcomes. J Neurotrauma. 2015;32(23):1834–1848. doi: 10.1089/neu.2014.3352.
- 3. Smith DH, Hicks R, Povlishock JT. Therapy development for diffuse axonal injury. J Neurotrauma. 2013;30(5):307–323. doi: 10.1089/neu.2012.2825.
- 4. Andelic N, Hammergren N, Bautz-Holter E, Sveen U, Brunborg C, Røe C. Functional outcome and health-related quality of life 10 years after moderate-to-severe traumatic brain injury. Acta Neurol Scand. 2009;120(1):16–23. doi: 10.1111/j.1600-0404.2008.01116.x.
- 5. Dikmen SS, Machamer JE, Powell JM, Temkin NR. Outcome 3 to 5 years after moderate to severe traumatic brain injury. Arch Phys Med Rehabil. 2003;84(10):1449–1457. doi: 10.1016/s0003-9993(03)00287-9.
- 6. Gardner RC, Burke JF, Nettiksimmons J, Kaup A, Barnes DE, Yaffe K. Dementia risk after traumatic brain injury vs nonbrain trauma: The role of age and severity. JAMA Neurol. 2014;71(12):1490. doi: 10.1001/jamaneurol.2014.2668.
- 7. Greenwald BD, Hammond FM, Harrison-Felix C, Nakase-Richardson R, Howe LLS, Kreider S. Mortality following traumatic brain injury among individuals unable to follow commands at the time of rehabilitation admission: A National Institute on Disability and Rehabilitation Research Traumatic Brain Injury Model Systems study. J Neurotrauma. 2015;32(23):1883–1892. doi: 10.1089/neu.2014.3454.
- 8. Moretti L, Cristofori I, Weaver SM, Chau A, Portelli JN, Grafman J. Cognitive decline in older adults with a history of traumatic brain injury. Lancet Neurol. 2012;11(12):1103–1112. doi: 10.1016/S1474-4422(12)70226-0.
- 9. Xiong Y, Mahmood A, Chopp M. Animal models of traumatic brain injury. Nat Rev Neurosci. 2013;14(2):128–142. doi: 10.1038/nrn3407.
- 10. Johnson VE, Meaney DF, Cullen DK, Smith DH. Handbook of Clinical Neurology. Vol. 127. Elsevier; 2015. Animal models of traumatic brain injury; pp. 115–128.
- 11. Maas AIR, Roozenbeek B, Manley GT. Clinical trials in traumatic brain injury: past experience and current developments. Neurother J Am Soc Exp Neurother. 2010;7(1):115–126. doi: 10.1016/j.nurt.2009.10.022.
- 12. Bagiella E, Novack TA, Ansel B, et al. Measuring Outcome in Traumatic Brain Injury Treatment Trials: Recommendations From the Traumatic Brain Injury Clinical Trials Network. J Head Trauma Rehabil. 2010;25(5):375–382. doi: 10.1097/HTR.0b013e3181d27fe3.
- 13. Roozenbeek B, Lingsma HF, Maas AI. New considerations in the design of clinical trials for traumatic brain injury. Clin Investig. 2012;2(2):153–162. doi: 10.4155/cli.11.179.
- 14. Stein DG. Embracing failure: What the Phase III progesterone studies can teach about TBI clinical trials. Brain Inj. 2015;29(11):1259–1272. doi: 10.3109/02699052.2015.1065344.
- 15. Manley GT, MacDonald CL, Markowitz A, et al. The Traumatic Brain Injury Endpoints Development (TED) Initiative: Progress on a public-private regulatory collaboration to accelerate diagnosis and treatment of traumatic brain injury. J Neurotrauma. 2017 Jun; doi: 10.1089/neu.2016.4729.
- 16. Giacino JT, Schnakers C, Rodriguez-Moreno D, Kalmar K, Schiff N, Hirsch J. Progress in Brain Research. Vol. 177. Elsevier; 2009. Behavioral assessment in patients with disorders of consciousness: gold standard or fool’s gold? pp. 33–48.
- 17. Bogner J. Community Participation: Measurement Issues With Persons With Deficits in Executive Functioning. Arch Phys Med Rehabil. 2010;91(9):S66–S71. doi: 10.1016/j.apmr.2009.11.032.
- 18. Giacino JT, Kalmar K, Whyte J. The JFK Coma Recovery Scale-Revised: measurement characteristics and diagnostic utility. Arch Phys Med Rehabil. 2004;85(12):2020–2029. doi: 10.1016/j.apmr.2004.02.033.
- 19. Zafonte RD, Bagiella E, Ansel BM, et al. Effect of citicoline on functional and cognitive status among patients with traumatic brain injury: Citicoline Brain Injury Treatment Trial (COBRIT). JAMA. 2012;308(19):1993–2000. doi: 10.1001/jama.2012.13256.
- 20. Bowie CR, Harvey PD. Administration and interpretation of the Trail Making Test. Nat Protoc. 2006;1(5):2277–2281. doi: 10.1038/nprot.2006.390.
- 21. Reitan RM. Validity of the Trail Making Test as an indicator of organic brain damage. Percept Mot Skills. 1958;8(3):271–276.
- 22. Wilson JT, Pettigrew LE, Teasdale GM. Structured interviews for the Glasgow Outcome Scale and the extended Glasgow Outcome Scale: guidelines for their use. J Neurotrauma. 1998;15(8):573–585. doi: 10.1089/neu.1998.15.573.
- 23. McMillan T, Wilson L, Ponsford J, Levin H, Teasdale G, Bond M. The Glasgow Outcome Scale — 40 years of application and refinement. Nat Rev Neurol. 2016;12(8):477–485. doi: 10.1038/nrneurol.2016.89.
- 24. Ware J, Kosinski M, Keller SD. A 12-Item Short-Form Health Survey: construction of scales and preliminary tests of reliability and validity. Med Care. 1996;34(3):220–233. doi: 10.1097/00005650-199603000-00003.
- 25. Derogatis L. Brief Symptom Inventory 18 (BSI-18): Administration, Scoring, and Procedures Manual. Bloomington, MN: Pearson; 2001.
- 26. King NS, Crawford S, Wenden FJ, Moss NEG, Wade DT. The Rivermead Post Concussion Symptoms Questionnaire: a measure of symptoms commonly experienced after head injury and its reliability. J Neurol. 1995;242(9):587–592. doi: 10.1007/BF00868811.
- 27. Blevins CA, Weathers FW, Davis MT, Witte TK, Domino JL. The Posttraumatic Stress Disorder Checklist for DSM-5 (PCL-5): Development and initial psychometric evaluation. J Trauma Stress. 2015;28(6):489–498. doi: 10.1002/jts.22059.
- 28. von Steinbüchel N, Wilson L, Gibbons H, et al. Quality of Life after Brain Injury (QOLIBRI): Scale development and metric properties. J Neurotrauma. 2010;27(7):1167–1185. doi: 10.1089/neu.2009.1076.
- 29. Kroenke K, Spitzer RL, Williams JBW. The PHQ-9: Validity of a brief depression severity measure. J Gen Intern Med. 2001;16(9):606–613. doi: 10.1046/j.1525-1497.2001.016009606.x.
- 30. Diener E, Emmons RA, Larsen RJ, Griffin S. The Satisfaction With Life Scale. J Pers Assess. 1985;49(1):71–75. doi: 10.1207/s15327752jpa4901_13.
- 31. Sherer M, Nakase-Thompson R, Yablon SA, Gontkovsky ST. Multidimensional assessment of acute confusion after traumatic brain injury. Arch Phys Med Rehabil. 2005;86(5):896–904. doi: 10.1016/j.apmr.2004.09.029.
- 32. Rey A. L’examen psychologique dans les cas d’encéphalopathie traumatique. (Les problèmes.). / The psychological examination in cases of traumatic encephalopathy. Problems. Arch Psychol. 1941;28:215–285.
- 33. Vakil E, Blachstein H. Rey Auditory-Verbal Learning Test: Structure analysis. J Clin Psychol. 1993;49(6):883–890. doi: 10.1002/1097-4679(199311)49:6<883::aid-jclp2270490616>3.0.co;2-6.
- 34. Gavett BE, Crane PK, Dams-O’Connor K. Bi-factor analyses of the Brief Test of Adult Cognition by Telephone. NeuroRehabilitation. 2013;(2):253–265. doi: 10.3233/NRE-130842.
- 35. Lachman ME, Agrigoroaei S, Tun PA, Weaver SL. Monitoring cognitive functioning: psychometric properties of the brief test of adult cognition by telephone. Assessment. 2014;21(4):404–417. doi: 10.1177/1073191113508807.
- 36. Tun PA, Lachman ME. Telephone assessment of cognitive function in adulthood: the Brief Test of Adult Cognition by Telephone. Age Ageing. 2006;35(6):629–632. doi: 10.1093/ageing/afl095.
- 37. Shapiro DM, Harrison DW. Alternate forms of the AVLT: a procedure and test of form equivalency. Arch Clin Neuropsychol Off J Natl Acad Neuropsychol. 1990;5(4):405–410.
- 38. Ryan JJ, Geisser ME. Validity and diagnostic accuracy of an alternate form of the Rey Auditory Verbal Learning Test. Arch Clin Neuropsychol Off J Natl Acad Neuropsychol. 1986;1(3):209–217.
- 39. Jette A, Haley S. Contemporary measurement techniques for rehabilitation outcomes assessment. J Rehabil Med. 2005;37(6):339–345. doi: 10.1080/16501970500302793.
- 40. Grimby G, Tennant A, Tesio L. The use of raw scores from ordinal scales: Time to end malpractice? J Rehabil Med. 2012;44(2):97–98. doi: 10.2340/16501977-0938.
- 41. Levin HS, Shum D, Chan RCK. Understanding Traumatic Brain Injury: Current Research and Future Directions. 2014.
- 42. Larrabee GJ, Binder LM, Rohling ML, Ploetz DM. Meta-analytic methods and the importance of non-TBI factors related to outcome in mild traumatic brain injury: Response to Bigler et al. (2013). Clin Neuropsychol. 2013;27(2):215–237. doi: 10.1080/13854046.2013.769634.
- 43. Larrabee GJ, Rohling ML. Neuropsychological differential diagnosis of mild traumatic brain injury. Behav Sci Law. 2013;31(6):686–701. doi: 10.1002/bsl.2087.
- 44. Ponsford J, Cameron P, Fitzgerald M, Grant M, Mikocka-Walus A, Schönberger M. Predictors of postconcussive symptoms 3 months after mild traumatic brain injury. Neuropsychology. 2012;26(3):304–313. doi: 10.1037/a0027888.
- 45. Dikmen SS, Machamer JE, Winn HR, Temkin NR. Neuropsychological outcome at 1-year post head injury. Neuropsychology. 1995;9(1):80–90.
- 46. Mathias JL, Dennington V, Bowden SC, Bigler ED. Community versus orthopaedic controls in traumatic brain injury research: How comparable are they? Brain Inj. 2013;27(7–8):887–895. doi: 10.3109/02699052.2013.793398.
- 47. Beauchamp MH, Landry-Roy C, Gravel J, Beaudoin C, Bernier A. Should young children with TBI be compared to community or orthopedic control participants? J Neurotrauma. 2017 Apr; doi: 10.1089/neu.2016.4868.
- 48. Bryant RA, O’Donnell ML, Creamer M, McFarlane AC, Clark CR, Silove D. The psychiatric sequelae of traumatic injury. Am J Psychiatry. 2010;167(3):312–320. doi: 10.1176/appi.ajp.2009.09050617.
- 49. McCauley SR, Wilde EA, Barnes A, et al. Patterns of early emotional and neuropsychological sequelae after mild traumatic brain injury. J Neurotrauma. 2014;31(10):914–925. doi: 10.1089/neu.2012.2826.
- 50. Stancin T, Taylor HG, Thompson GH, Wade S, Drotar D, Yeates KO. Acute psychosocial impact of pediatric orthopedic trauma with and without accompanying brain injuries. J Trauma. 1998;45(6):1031–1038. doi: 10.1097/00005373-199812000-00010.
- 51. Shilon Y, Pollak Y, Aran A, Shaked S, Gross-Tsur V. Accidental injuries are more common in children with attention deficit hyperactivity disorder compared with their non-affected siblings: Accidental injuries and ADHD. Child Care Health Dev. 2012;38(3):366–370. doi: 10.1111/j.1365-2214.2011.01278.x.
- 52. Uslu M, Uslu R, Eksioglu F, Ozen NE. Children with fractures show higher levels of impulsive-hyperactive behavior. Clin Orthop. 2007;460:192–195. doi: 10.1097/BLO.0b013e31805002da.
- 53. Wilde EA, Li X, Hunter JV, et al. Loss of consciousness is related to white matter injury in mild traumatic brain injury. J Neurotrauma. 2016;33(22):2000–2010. doi: 10.1089/neu.2015.4212.
- 54. Pagulayan KF, Temkin NR, Machamer J, Dikmen SS. A longitudinal study of health-related quality of life after traumatic brain injury. Arch Phys Med Rehabil. 2006;87(5):611–618. doi: 10.1016/j.apmr.2006.01.018.
- 55. Dikmen S, Temkin N, McLean A, Wyler A, Machamer J. Memory and head injury severity. J Neurol Neurosurg Psychiatry. 1987;50(12):1613–1618. doi: 10.1136/jnnp.50.12.1613.
- 56. Corrigan JD, Harrison-Felix C, Bogner J, Dijkers M, Terrill MS, Whiteneck G. Systematic bias in traumatic brain injury outcome studies because of loss to follow-up. Arch Phys Med Rehabil. 2003;84(2):153–160. doi: 10.1053/apmr.2003.50093.
- 57. Sackett DL. Bias in analytic research. J Chronic Dis. 1979;32(1–2):51–63. doi: 10.1016/0021-9681(79)90012-2.