Abstract
BACKGROUND
The Early Rehabilitation Barthel Index (ERBI) comprises seven items of the Early Rehabilitation Index and ten items of the Barthel Index. The ERBI is usually used to measure functional changes in patients with severe acquired brain injury (sABI), but its measurement properties have yet to be extensively assessed.
AIM
To study the unidimensionality and internal construct validity (ICV) of the ERBI through Confirmatory Factor Analysis (CFA), Mokken Analysis (MA), and Rasch Analysis (RA).
DESIGN
Multicenter prospective study.
SETTING
Inpatients from five intensive rehabilitation centers.
POPULATION
Two hundred and forty-seven subjects with sABI.
METHODS
ERBI was administered on admission and discharge to study its unidimensionality through CFA and MA and its ICV, reliability, and targeting through RA.
RESULTS
The preliminary analyses showed a lack of unidimensionality (RMSEA=0.460 >0.06; SRMR=0.176 >0.06; CFI=1.000 >0.950; TLI=1.000 >0.950). According to CFA, the “Confusional state” and “Behavioral disturbances” items showed low factor loadings (<0.40), whereas these two items composed a separate scale within the MA. Furthermore, the baseline RA showed that three items misfitted (“Mechanical ventilation,” “Confusional state,” “Behavioral disturbances”) and that several ICV requirements were not met. After deletion of the three misfitting items and further non-structural modifications (i.e., creation of testlets to absorb local dependence between items and item misfit), the solution obtained showed adequate ICV and adequate reliability for measurements at the individual level (PSI>0.85), although with a frank floor effect. This final solution was successfully replicated in the total sample. After post-hoc modifications of the score structure of two out of three misfitting items, the subsequent CFA (RMSEA=0.044 <0.06; SRMR=0.056 <0.06; CFI=1.000 >0.950; TLI=1.000 >0.950) and MA showed the resolution of the unidimensionality issues.
CONCLUSIONS
Although the ERBI is a potentially valuable tool for measuring functioning in the coma-to-community continuum, our analyses suggested its lack of ICV, partly due to an incorrect scoring design of some items. A new prospective multicenter study is proposed to validate a modified version of the ERBI that overcomes the problems highlighted in this analysis.
CLINICAL REHABILITATION IMPACT
Our results do not support the use of the original structure of the ERBI in clinical practice and research, as a lack of ICV was highlighted.
Key words: Brain injuries, Health care outcome assessment, Psychometrics, Rehabilitation
Severe acquired brain injuries (sABIs) are associated with a wide heterogeneity of clinical, cognitive, behavioral, and psychosocial impairments, leading to reduced functional independence. Accurate and reliable measurement of functional independence is crucial to plan an appropriate and tailored rehabilitation treatment and to make an accurate functional prognosis.1 The Barthel Index (BI)2 was developed to measure functional independence in patients with different diseases. The BI has been widely used as a measurement instrument in patients with sABI in several kinds of research.3, 4 Moreover, the psychometric properties of the BI were confirmed in several neurological conditions, such as stroke,5 Parkinson’s disease,6 and brain injury.3 As its administration and total score calculation are straightforward, the BI is widely used in clinical practice.
However, some functional changes in such severely disabled patients during the early rehabilitation stay may go undetected using the BI. Indeed, for these patients, the initial rehabilitation goals are often related to medical stabilization, weaning from mechanical ventilation and other medical devices, recovery of swallowing, and prevention of the consequences of hypomobility.7 These crucial rehabilitation outcomes are not captured by the BI. Moreover, several studies8, 9 demonstrated a BI floor effect (the percentage of patients with the lowest possible score) exceeding the maximum recommended threshold of 15%.10 The presence of a floor effect compromises the ability of the BI to discriminate between patients with lower functional independence.11
Consequently, to reduce this floor effect, Schönle et al.12 proposed adding to the BI the Early Rehabilitation Index (ERI), a scale composed of seven highly relevant items investigating variables typical of neurological and neurosurgical patients in the early post-acute rehabilitation phase, such as intensive medical monitoring, tracheotomy tube management and weaning, mechanical ventilation use, confusional state, behavioral disturbances, communication deficit, and dysphagia treatment; thus, the Early Rehabilitation Barthel Index (ERBI) was developed to include these domains. Although the ERBI has been administered in several studies in persons with sABI,13, 14 its psychometric properties have been assessed in detail by only a few studies.7, 15, 16 In particular, Rollnik et al.15 demonstrated high correlations between nurses’ and physicians’ assessments (r=0.849) and moderate correlations with the BI (r=0.438 and r=0.430 if administered by nurses or physicians, respectively) in 273 neurological rehabilitation patients.15 Furthermore, Reis et al.16 showed that a Brazilian Portuguese version of the ERBI had moderate to excellent inter-rater reliability (kappa ranging from 0.54 to 1.00) and that its total score correlated significantly with several other functional indicators in 122 patients admitted to the general intensive care unit.16 Finally, Rollnik et al.15, 17 demonstrated the predictive validity of low ERBI total scores in terms of significantly longer length of stay15, 17 and significant morbidities.15
The available psychometric evidence on the validity and reliability of the ERBI focused on the total score paradigm of Classical Test Theory (CTT). However, specific analysis has yet to be undertaken to demonstrate the unidimensionality of the ERBI items. In other words, so far, there is no evidence that ERI and BI items can be summed to generate a valid total score. As such, all the reliability and validity data accrued so far should be interpreted cautiously, as they were based on total scores, which may be internally invalid as they may be multi-dimensional. On the other hand, other classical and modern psychometric approaches could provide unidimensionality evidence. Amongst modern psychometric models, the Rasch Measurement Theory (RMT) framework is worth mentioning as it can provide evidence of unidimensionality and other requirements for internal construct validity (ICV) and reliability.18 The RMT is unique in that, if the data generated by the scale meet its requirements, the scale total score can be transformed into continuous estimates of ability on an interval scale. Considering that these interval-level estimates also hold the invariance property, it could be said that a scale validated within the RMT can deliver the three fundamental tenets of measurement: invariance, unidimensionality, and interval-level estimates.11
Thus, the aims of this study were: 1) to assess the ERBI unidimensionality according to a variety of classical and modern psychometric methods; 2) to assess in more detail the ERBI ICV and reliability within the RMT; 3) to provide clues on possible strategies to improve the ERBI measurement properties.
Materials and methods
Study design and setting
Data were collected across five Italian intensive neurorehabilitation centers with expertise in diagnosing and caring for adults with sABI within two multicenter prospective observational studies, where the ERBI was employed as an outcome measure. The methods of these studies are described in detail elsewhere.13, 19
The study protocols were approved by the local Ethical Committee (No. 244, 24th October 2017), and the studies were carried out according to the principles outlined in the Declaration of Helsinki. In addition, participants or their respective caregivers signed a written informed consent before any study-related procedures.
Participants
Patients were enrolled through convenience sampling and were included in this study if they met the following inclusion criteria: aged over 18 years at the onset, with a first event of traumatic or non-traumatic (i.e., anoxic or vascular) sABI, with a clinical diagnosis of a disorder of consciousness based on standardized clinical criteria,20 first admission to intensive rehabilitation unit, no longer than three months since sABI. Subjects were excluded if: they presented a mixed etiology or they reported a premorbid history of psychiatric or neurodegenerative diseases.
Outcome measures
The Early Rehabilitation Barthel Index (ERBI)12 comprises seven items from ERI and ten items from BI (Table I). Items from ERI are scored with a 2-point Likert scale (-50 or -25 and 0 points), whereas items from BI are mostly scored with a 3-point Likert scale (0, 5, and 10 points; Table I). All item scores are summed up to generate a total score that ranges from -325 to 100 points (the higher the score, the higher the functional independence).
Table I. —The Early Rehabilitation Barthel Index.
| Early Rehabilitation Index (ERI) item | Score if yes | Score if no |
|---|---|---|
| A1. Intensive medical monitoring | −50 | 0 |
| A2. Tracheostoma requiring special treatment (suctioning) | −50 | 0 |
| A3. Intermittent (or continuous) mechanical ventilation | −50 | 0 |
| A4. Confusional state requiring special supervision | −50 | 0 |
| A5. Behavioral disturbances requiring special care (patient poses a risk to himself or his environment) | −50 | 0 |
| A6. Severe communication deficits | −25 | 0 |
| A7. Swallowing disorders requiring special supervision | −50 | 0 |

| Barthel Index (BI) item | Score if unable | Score if needs help | Score if independent |
|---|---|---|---|
| B1. Feeding | 0 | 5 | 10 |
| B2. Transfers | 0 | 5-10 | 15 |
| B3. Grooming | 0 | 5 | |
| B4. Toilet use | 0 | 5 | 10 |
| B5. Bathing | 0 | 5 | |
| B6. Mobility | 0 | 5-10 | 15 |
| B7. Stairs | 0 | 5 | 10 |
| B8. Dressing | 0 | 5 | 10 |
| B9. Bowels | 0 | 5 | 10 |
| B10. Bladder | 0 | 5 | 10 |
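The scoring rule in Table I can be illustrated with a minimal Python sketch. The dictionary and function names are hypothetical, and reading the "5-10" cells of B2 and B6 as two separate admissible scores (5 or 10) is an assumption of this sketch rather than something the table states.

```python
# Item weights transcribed from Table I; all names are illustrative only.
ERI_PENALTY = {"A1": -50, "A2": -50, "A3": -50, "A4": -50,
               "A5": -50, "A6": -25, "A7": -50}

# Admissible BI scores per item (assumption: "5-10" means 5 or 10).
BI_ALLOWED = {
    "B1": (0, 5, 10), "B2": (0, 5, 10, 15), "B3": (0, 5), "B4": (0, 5, 10),
    "B5": (0, 5), "B6": (0, 5, 10, 15), "B7": (0, 5, 10), "B8": (0, 5, 10),
    "B9": (0, 5, 10), "B10": (0, 5, 10),
}


def erbi_total(eri_present, bi_scores):
    """Sum ERI penalties (condition present = True) and BI points."""
    if set(eri_present) != set(ERI_PENALTY) or set(bi_scores) != set(BI_ALLOWED):
        raise ValueError("all 7 ERI and 10 BI items must be rated")
    total = sum(ERI_PENALTY[item] for item, present in eri_present.items() if present)
    for item, points in bi_scores.items():
        if points not in BI_ALLOWED[item]:
            raise ValueError(f"inadmissible score {points} for {item}")
        total += points
    return total


# Extremes of the score range: every ERI condition present and no BI points,
# versus no ERI condition and full BI independence.
worst = erbi_total({k: True for k in ERI_PENALTY}, {k: 0 for k in BI_ALLOWED})
best = erbi_total({k: False for k in ERI_PENALTY},
                  {k: max(v) for k, v in BI_ALLOWED.items()})
```

Under these assumptions, the two extreme profiles reproduce the total score range reported in the text.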
Procedures
Demographic and clinical characteristics of the patients were collected at admission to each rehabilitation center. Moreover, ERBI was administered both on admission and discharge. The participating centers adopted a shared protocol to collect data to reduce inter-rater variability.
Data manipulation
Considering that the available data included observations collected at two different time points (i.e., on admission and at discharge), we applied the procedure proposed by Mallinson21 to build a sample that contained only one randomly chosen assessment per enrolled subject. Therefore, we generated a unique assessment sample, where each patient had either an admission or a discharge assessment chosen randomly. Finally, we generated a total sample, which contained all available observations. The data manipulation procedure is detailed in Figure 1.
Figure 1. —Generation of the unique assessment sample. We built a unique assessment sample that randomly contained an admission or discharge assessment.
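The random selection of one assessment per patient can be sketched as follows. This is only an illustration of the idea described above, not a reproduction of the cited procedure; the function name, the seed, and the patient identifiers are hypothetical.

```python
import random


def unique_assessment_sample(patient_ids, seed=0):
    """Pick, for each patient, either the admission or the discharge assessment."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return {pid: rng.choice(("admission", "discharge")) for pid in patient_ids}


# Hypothetical identifiers for 247 enrolled patients.
patients = [f"P{i:03d}" for i in range(1, 248)]
selection = unique_assessment_sample(patients)

# The total sample, by contrast, keeps both time points for every patient.
total_sample = [(pid, tp) for pid in patients for tp in ("admission", "discharge")]
```

Each patient contributes exactly one observation to the unique assessment sample, so the independence of observations assumed by the Rasch analysis is respected, whereas the total sample contains repeated observations.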
Planned analyses
The following analyses were undertaken:
descriptive statistics for the total sample, scale, and items;
preliminary assessment of unidimensionality on the total sample:
Classical test theory item analysis (CTT-IA);
Mokken analysis (MA);
confirmatory factor analysis (CFA);
ICV with Rasch analysis (RA) on the unique assessment sample and total sample.
Statistical analysis
Descriptive sample and scale statistics
Descriptive statistics were computed to describe all the collected variables. Mean and standard deviation, median with first and third quartile, and absolute frequency with percentage were calculated for the interval, ordinal, and nominal variables, respectively.
Preliminary assessment of unidimensionality on the total sample
Classical item analysis
We assessed the internal consistency of the total sample by calculating the following statistics:
at the total score level, the Cronbach’s alpha (α),22 where values between 0.70 and 0.95 were considered satisfactory;23
at the item level:
the average inter-item correlations, i.e., the mean of the inter-item correlations (based on Spearman’s correlation coefficient (rs)24) between each pair of items; values ≥0.2 were recommended;24
the Cronbach’s alpha with an item deleted; values below the total α were expected for each item;24
the item-to-total correlations, based on rs between each item and its rest score (i.e., the total score minus the item score); values ≥0.40 were considered acceptable.24
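For illustration, the total-score and item-level statistics listed above can be computed as in the following sketch, which uses only the Python standard library. The function names and the toy data are hypothetical; the analyses in this study were run in SPSS, and the Spearman correlation between each item and its rest score is left to a statistics package.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_deleted(rows, j):
    """Alpha recomputed after dropping item j."""
    return cronbach_alpha([r[:j] + r[j + 1:] for r in rows])


def rest_scores(rows, j):
    """Total score minus the score of item j (used for item-to-total correlations)."""
    return [sum(r) - r[j] for r in rows]


# Toy data: three respondents, three perfectly consistent items.
toy = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]
alpha = cronbach_alpha(toy)
```

An item whose deletion raises α above the value of the full scale, or whose correlation with its rest score is below 0.40, would be flagged under the criteria above.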
Mokken analysis
We computed the Scalability coefficients (H)25 for each item. Particularly, we considered the following rules of thumb for the interpretability of this coefficient:25
<0.30: lack of scalability;
0.30-0.39: low scalability;
0.40-0.49: moderate scalability;
≥0.50: high scalability.
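The rules of thumb above, together with the definition of the item scalability coefficient, can be sketched in Python. Two assumptions apply: a value of exactly 0.50 is treated as high scalability, and the coefficient is computed only for dichotomous (0/1) items as the ratio of observed to maximum attainable covariances, which is a simplification of the polytomous case handled by the R package used in this study. All function names are hypothetical.

```python
def classify_h(h):
    """Map a scalability coefficient to the interpretation bands listed above."""
    if h < 0.30:
        return "lack of scalability"
    if h < 0.40:
        return "low scalability"
    if h < 0.50:
        return "moderate scalability"
    return "high scalability"  # assumption: exactly 0.50 counts as high


def item_h_dichotomous(rows):
    """Item scalability coefficients H_i for 0/1 items (rows = respondents).

    Items endorsed by nobody or by everybody make the denominator zero
    and are not handled in this sketch.
    """
    n, k = len(rows), len(rows[0])
    p = [sum(r[j] for r in rows) / n for j in range(k)]
    coefficients = []
    for i in range(k):
        observed = maximum = 0.0
        for j in range(k):
            if i == j:
                continue
            p_ij = sum(r[i] * r[j] for r in rows) / n
            observed += p_ij - p[i] * p[j]             # covariance of items i and j
            maximum += min(p[i], p[j]) - p[i] * p[j]   # largest covariance given the marginals
        coefficients.append(observed / maximum)
    return coefficients


# A perfect Guttman pattern: every item should reach H = 1.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
h_values = item_h_dichotomous(guttman)
```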
Confirmatory factor analysis (CFA)
A CFA was run to assess the ERBI unidimensional structure. The assessment of model fit was performed using the following indexes:24, 26
the model Chi-square (χ2), an overall indicator of model fit, measures the discrepancy between the covariance matrices of the model and the sample.27 For adequate fit to the model, the χ2 probability value should not be significant;
the Root Mean Square Error of Approximation (RMSEA) measures the discrepancy between the covariance matrix predicted by the model and the population covariance matrix.27 Values≤0.06 are considered sufficient to indicate a good fit to the model;28
the Standardized Root Mean Square Residual (SRMR) is the average value across all residual values derived from the comparison between the predicted and the observed variance-covariance matrix.27 Values≤0.08 indicate an adequate fit to the model;28
the Tucker-Lewis Index (TLI) and the Comparative Fit Index (CFI) measure the proportionate improvement in the model fit by comparing the hypothesized and the null models. Values≥0.95 are considered indicative of a good fit to the model;28
a representative loading of each item on the latent factor>0.4 was considered acceptable.23
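As an illustration of how these cut-offs are applied, the sketch below computes the RMSEA from a model Chi-square using the common point-estimate formula, sqrt(max(χ2 − df, 0) / (df × (N − 1))), and flags each index against the thresholds listed above. The formula choice is an assumption: CFA software may use a slightly different estimator. Reading the baseline χ2df entry of Table IV as χ2=12557.7 on 119 degrees of freedom, with N=494, is likewise an assumption of this sketch.

```python
from math import sqrt


def rmsea(chi2, df, n):
    """Point estimate of the RMSEA from the model Chi-square (assumed formula)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def fit_flags(rmsea_value, srmr, cfi, tli):
    """True where an index meets the cut-off stated in the list above."""
    return {
        "RMSEA": rmsea_value <= 0.06,
        "SRMR": srmr <= 0.08,
        "CFI": cfi >= 0.95,
        "TLI": tli >= 0.95,
    }


# Baseline model, under the reading of Table IV described in the lead-in.
baseline_rmsea = rmsea(12557.7, 119, 494)
baseline_flags = fit_flags(baseline_rmsea, srmr=0.176, cfi=1.000, tli=1.000)
```

Under these assumptions, the computed value rounds to the baseline RMSEA reported in the Results, which supports the reading of the table entry used here.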
Should the initial baseline analysis fail to fit the CFA unidimensional model, we would assess the modification indices (MIs) on item pairs.27 MIs27 indicate the presence of residual covariance between items, as in the case of local dependency between items, where the response to one of the two items within the pair is influenced by the response to the other item.29, 30 In these cases, the model would be re-specified after accounting for local dependency by allowing correlation of the error terms of the items in the pair.24, 27 After accounting for local dependency, we would reassess the fit of the item set to a unidimensional model.
Analysis of internal construct validity with Rasch analysis
A full description of all the RA parameters and relative indexes employed is available in Supplementary Digital Material 1 (Supplementary Table I, Supplementary Text File 1). The RAs were based on the partial credit Rasch model, considering that the items had an unequal number of scoring categories. In brief, we based the interpretation of the analysis findings on the following summary statistics:
Fitness to the Rasch model relates to the stochastically invariant ordering of the items. It was assessed by computing the mean and standard deviation of the item and person fit residuals, which indicated a good fit if ≤1.4.31 Furthermore, we also assessed a summary Chi-square interaction statistic, which should not be significant (i.e., P values above the Bonferroni-adjusted significance level) in case of no deviations from the model’s expectations.32-34
We also assessed the following ICV requirements:
unidimensionality, i.e., all scale items must measure a single underlying construct.33 We tested the unidimensionality with a post-hoc procedure,35 consisting of a paired t-test on separate estimates for each subject (derived from subsets of items identified by principal component analysis of the standardized Rasch residuals). Unidimensionality was considered ‘strict’ when both the proportion of significant t-tests (PST) and the lower bound of the binomial confidence interval for proportions (BCI) were below 5%.36 Otherwise, unidimensionality was considered ‘acceptable’ when only the BCI was <5%;36
monotonicity, i.e., the increase of the underlying latent trait should be associated with an increased probability of endorsing a scoring category indicative of higher functional independence within an item.36, 37 Therefore, this requirement was summarized as the percentage of items with disordered thresholds (T-DT), expecting a value of 0% for adequate monotonicity;38
local independence, i.e., all the variation among responses to an item is accounted for by the person’s ability only. Therefore, there should be no further systematic relationship among responses for the same value of ability.37 Thus, we considered items to be locally dependent if their residual correlation was above a local dependency relative cutoff (LDRC), calculated by adding 0.2 to the average of the residual correlations after having excluded the correlation of each item with itself.33, 39 Then, we summed all the correlation coefficients of the residuals above the LDRC to obtain a total value of LD (T-LD), where 0 indicates the complete absence of LD;38
absence of differential item functioning (DIF), i.e., each item must also be invariant across relevant subgroups (or person factors), such as gender or age.36, 40 A two-way ANOVA tested the presence of DIF for each item, where scores are compared across each level of the person factor and different ability levels, as summarized by the class intervals. DIF was considered present if the ANOVA P values were below the Bonferroni-corrected significance level.41 We summarized the amount of DIF (T-DIF) by obtaining the absolute value of the base-ten logarithm of the sum of all significant P values across all items and all person factors.38 Thus, T-DIF values ranged from zero to infinity, where zero indicated the absence of DIF. We tested the following person factors within the DIF analysis: age (<40 years, between 41 and 59 years, >60 years); gender (male, female); etiology (traumatic brain injury, vascular, other); acute length of stay (<28 days, between 29-48 days, >49 days); time since lesion (<65 days, between 66-123 days, >124 days); and Levels of Cognitive Functioning (LCF) (levels 1-3: disorder of consciousness; levels 4-6: global neuropsychological dysfunction; level ≥7: selective neuropsychological deficits).
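Two of the summary indexes above lend themselves to a short sketch: the unidimensionality decision based on the proportion of significant t-tests, and the local dependency cutoff with its total value. The normal-approximation (Wald) confidence interval is an assumption, as the exact binomial interval used by the Rasch software may differ; function names and example values are hypothetical.

```python
from math import sqrt


def pst_lower_bound(n_significant, n_tests, z=1.96):
    """Lower bound of a 95% CI for the proportion of significant t-tests (Wald, assumed)."""
    p = n_significant / n_tests
    return p - z * sqrt(p * (1 - p) / n_tests)


def unidimensionality(n_significant, n_tests):
    """Apply the 'strict' / 'acceptable' rule described above."""
    p = n_significant / n_tests
    lower = pst_lower_bound(n_significant, n_tests)
    if p < 0.05 and lower < 0.05:
        return "strict"
    if lower < 0.05:
        return "acceptable"
    return "not supported"


def local_dependence(residual_corr):
    """Return (LDRC, T-LD) from a square matrix of residual correlations."""
    k = len(residual_corr)
    # Off-diagonal entries only: the correlation of an item with itself is excluded.
    off_diagonal = [residual_corr[i][j] for i in range(k) for j in range(i + 1, k)]
    ldrc = sum(off_diagonal) / len(off_diagonal) + 0.2
    t_ld = sum(r for r in off_diagonal if r > ldrc)
    return ldrc, t_ld


# Hypothetical residual correlations for three items; only the first pair is dependent.
example = [[1.0, 0.5, -0.1],
           [0.5, 1.0, -0.1],
           [-0.1, -0.1, 1.0]]
ldrc, t_ld = local_dependence(example)
```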
Targeting and reliability were summarized as follows:
targeting (i.e., how well the measurement range of the scale matches the distribution of the calibrating sample)42 was studied by the visual inspection of the targeting graphs and by the presence of ceiling and floor effects. The latter were deemed significant if the highest or the lowest possible score were detected, respectively, for more than 15% of the subjects in the sample;10
separation reliability (i.e., the ability of the instrument to separate persons effectively based on their level of ability)37 was expressed by the Person Separation Index (PSI) and α.43 In this context, we considered PSI or α values ≥0.85 and ≥0.70 but <0.85 as sufficient for individual-level and group-level measurements, respectively.30, 44
Should the data not fit the Rasch model (as is often the case), several subsequent analyses would be undertaken to perform some adjustments to the scale aiming at controlling for the violations of the ICV requirements. This process, which we would undertake iteratively, would be based on post-hoc modifications that could be either:
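The targeting and reliability criteria above reduce to simple threshold checks, sketched below. The default score limits are those of the original ERBI total score and would change after any structural modification of the scale; the function names and example data are hypothetical.

```python
def floor_ceiling(total_scores, min_score=-325, max_score=100, threshold=0.15):
    """Proportion of subjects at the extreme scores and whether each exceeds 15%."""
    n = len(total_scores)
    floor = sum(s == min_score for s in total_scores) / n
    ceiling = sum(s == max_score for s in total_scores) / n
    return {
        "floor": floor,
        "ceiling": ceiling,
        "floor_effect": floor > threshold,
        "ceiling_effect": ceiling > threshold,
    }


def reliability_level(value):
    """Interpret a PSI or alpha value with the cut-offs stated above."""
    if value >= 0.85:
        return "individual-level"
    if value >= 0.70:
        return "group-level"
    return "insufficient"


# Hypothetical sample of 100 total scores: 20 at the minimum, 10 at the maximum.
scores = [-325] * 20 + [0] * 70 + [100] * 10
targeting = floor_ceiling(scores)
```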
structural, where the scale structure is actively modified at the item level, by either rescoring or deleting items (i.e., these modifications affect the total score range). For item rescoring, adjacent response categories of the same item would be collapsed to resolve monotonicity violations. Published guidelines would be followed,45 and the rescoring pattern would be chosen to maximize statistical indexes and clinical meaning.26 Instead, an item would be deleted in case of a severe misfit to the model requirements, also taking into account the findings of the preliminary unidimensionality analysis;
statistical, where items would undergo testlet creation or item splitting. In these cases, the scale structure would be unmodified (i.e., the total score range would be unchanged), as these procedures mainly affect the conversion of the total score into interval-level estimates of ability. According to this approach, testlet creation (i.e., item grouping) would be performed on item pairs to account for LD,46 whereas item splitting would be used to account for violations of the absence of uniform-DIF requirement, as detected by means of ANOVA.40 However, should DIF be revealed, the influence of the item-splitting procedure on the person estimates would be examined by employing the procedure reported by Maritz et al.31 before incorporating the item splitting in the final solution. In particular, after item splitting, we would anchor the split-item solution (i.e., without DIF) to the solution without splitting (i.e., with DIF); subsequently, we would compare the person estimates of the two solutions with a paired t-test, assessing the size of any difference by Cohen’s d. Thus, should Cohen’s d be <0.2, we would conclude that the DIF is negligible and, therefore, we would not adjust for it; instead, in case of Cohen’s d>0.2, we would correct the DIF by splitting the item, and the split-item solution would be the final solution.
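The effect-size check on the person estimates can be sketched as follows. Two assumptions apply: Cohen’s d for paired data is computed as the mean difference divided by the standard deviation of the differences (other definitions exist), and a value of exactly 0.2, which the rule above leaves undefined, is treated as requiring adjustment. Function names are hypothetical.

```python
from statistics import mean, stdev


def paired_cohens_d(estimates_split, estimates_unsplit):
    """Cohen's d for paired person estimates (mean difference / SD of differences)."""
    diffs = [a - b for a, b in zip(estimates_split, estimates_unsplit)]
    return mean(diffs) / stdev(diffs)


def dif_decision(d, cutoff=0.2):
    """Decide whether DIF is negligible or the split-item solution should be kept."""
    # Assumption: |d| exactly equal to the cutoff is treated as non-negligible.
    return "negligible" if abs(d) < cutoff else "adjust"


# Toy person estimates (logits) from the two anchored solutions.
d = paired_cohens_d([2, 4, 6], [1, 2, 3])
```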
Fitness to the Rasch model, ICV requirements, reliability, and targeting were assessed for the original scale (base analysis) and, after each scale modification, to ascertain whether adequate model fit was achieved. This process was repeated cyclically until no further changes were needed and/or possible.
The RA was conducted on the unique assessment sample (i.e., validating sample). Once a solution was found, the same analysis would be replicated on the total sample (i.e., confirmatory sample), which would then be anchored to the validating sample. The solution would be considered stable enough should adequate fit to the Rasch model also be demonstrated for the anchored confirmatory sample.
Statistical notes, software, and sample size issues
Descriptive statistics and internal consistency analyses were computed with SPSS software (v. 21 for Windows; SPSS, Inc., Chicago, IL, USA). MA was run with the R package Mokken v. 2.8.4. CFA was performed using the Mplus software v. 6.0 (Muthen & Muthen, Los Angeles, CA, USA). RA was performed with RUMM2030 software v. 5.1 (Perth, Australia). A significance value of 0.05 was used and corrected for the number of tests by Bonferroni correction.47 Finally, we used the RUMM Logbook™ v. 1.9.5, an ad-hoc Excel 2007™ application developed using Microsoft Visual Basic™ macros to facilitate the interpretation of the results of each RA. A free copy of this application is available from the corresponding author upon request. For CFA, it was estimated that 247 subjects would guarantee a subject-parameter ratio of 9.1:1, which is close to the recommended ratio of 10:1,48 given 27 score points for ERBI. For RA, the same sample size would be sufficient to estimate item difficulty to within ±0.5 logits with an α of 0.01.49
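The arithmetic behind the sample size statement and the Bonferroni correction can be made explicit with a short sketch. Interpreting the 27 score points as the number of thresholds between adjacent scoring categories in Table I (with the "5-10" cells of B2 and B6 read as two categories) is an assumption, as are the function names and the example number of tests.

```python
# Number of scoring categories per item, read from Table I (assumption for B2 and B6).
CATEGORIES = {
    "A1": 2, "A2": 2, "A3": 2, "A4": 2, "A5": 2, "A6": 2, "A7": 2,
    "B1": 3, "B2": 4, "B3": 2, "B4": 3, "B5": 2,
    "B6": 4, "B7": 3, "B8": 3, "B9": 3, "B10": 3,
}


def n_thresholds(categories):
    """Each item with m categories contributes m - 1 thresholds."""
    return sum(m - 1 for m in categories.values())


def subject_parameter_ratio(n_subjects, n_parameters):
    return n_subjects / n_parameters


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


thresholds = n_thresholds(CATEGORIES)
ratio = subject_parameter_ratio(247, thresholds)
# Example only: one test per ERBI item.
per_item_alpha = bonferroni_alpha(0.05, len(CATEGORIES))
```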
Data availability statement
The raw data associated with the article are publicly available for download from www.zenodo.org (under the Creative Commons Attribution 4.0 International license) at the following link: https://doi.org/10.5281/zenodo.8171871.
Results
Participants
Two hundred and forty-seven patients with sABI were included in this study. Demographic and clinical characteristics are presented in Table II.
Table II. —Demographic and clinical characteristics of the sample (N.=247).
| Variable | N. | % | Median | Min-max | Mean | SD |
|---|---|---|---|---|---|---|
| Age (years) | | | 51 | 0-82 | 47.7 | 19.1 |
| Gender | | | | | | |
| Male | 164 | 66.4 | | | | |
| Female | 83 | 33.6 | | | | |
| Etiology | | | | | | |
| TBI | 138 | 55.9 | | | | |
| Vascular | 87 | 35.2 | | | | |
| Anoxic | 20 | 8.1 | | | | |
| Infective | 2 | 0.8 | | | | |
| LOS-A (days) | | | 34 | 10-199 | 41.9 | 26.1 |
| LOS-R (days) | | | 142 | 4-605 | 158.4 | 26.1 |
| Discharge GOS | | | | | | |
| Good recovery | 16 | 6.5 | | | | |
| Moderate disability | 47 | 19.0 | | | | |
| Severe disability | 115 | 46.5 | | | | |
| Vegetative state | 68 | 27.5 | | | | |
| Death | 1 | 0.4 | | | | |
| Center | | | | | | |
| ISA | 144 | 58.3 | | | | |
| CCF | 51 | 20.6 | | | | |
| IMOR | 20 | 8.1 | | | | |
| FDG | 20 | 8.1 | | | | |
| FSL | 12 | 4.9 | | | | |
N: number; Min: minimum; Max: maximum; SD: standard deviation; TBI: traumatic brain injury; LOS-A: acute length of stay; LOS-R: rehabilitation length of stay; GOS: Glasgow Outcome Scale; ISA: Istituto Sant’Anna; CCF: Centro Cardinal Ferrari; IMOR: Istituto di Montecatone Ospedale di Riabilitazione; FDG: Fondazione Don Gnocchi; FSL: Fondazione Santa Lucia.
Given the availability of an admission and a discharge assessment, we built a total sample including both assessments, giving 494 observations. Next, a unique assessment sample was created (N.=247), containing a randomly selected observation for each patient, with no repeated observations (Figure 1). Detailed clinical characteristics for the unique assessment sample are presented in Table III.
Table III. —Detailed clinical characteristics for each sample and assessment.
| Variable | Admission (N.=247) | | | | | | Discharge (N.=247) | | | | | | Unique assessment sample (N.=247) | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | N. | % | Med | Min-max | Mean | SD | N. | % | Med | Min-max | Mean | SD | N. | % | Med | Min-max | Mean | SD |
| TSL (days) | | | 36 | 10-428 | 49.4 | 52.0 | | | 183 | 31-587 | 198.8 | 106.5 | | | 81 | 15-587 | 125.7 | 112.8 |
| Diagnosis | | | | | | | | | | | | | | | | | | |
| VS | 86 | 34.8 | | | | | 35 | 14.2 | | | | | 62 | 25.1 | | | | |
| MCS | 79 | 32.0 | | | | | 47 | 19.0 | | | | | 61 | 24.7 | | | | |
| E-MCS | 82 | 33.2 | | | | | 165 | 66.8 | | | | | 124 | 50.2 | | | | |
| Missing | 0 | 0.0 | | | | | 0 | 0.0 | | | | | 0 | 0.0 | | | | |
| CRS-R | | | 12.5 | 1-23 | 13.5 | 8.0 | | | 23 | 3-23 | 17.8 | 6.8 | | | 17 | 1-23 | 15.6 | 7.6 |
| Missing | 5 | 2.0 | | | | | 25 | 10.1 | | | | | 14 | 5.7 | | | | |
| LCF | | | | | | | | | | | | | | | | | | |
| 1 | 1 | 0.4 | | | | | 0 | 0.0 | | | | | 0 | 0.0 | | | | |
| 2 | 85 | 34.4 | | | | | 36 | 14.6 | | | | | 62 | 25.1 | | | | |
| 3 | 70 | 28.3 | | | | | 47 | 19.0 | | | | | 58 | 23.5 | | | | |
| 4 | 37 | 15.0 | | | | | 5 | 2.0 | | | | | 16 | 6.5 | | | | |
| 5 | 24 | 9.7 | | | | | 26 | 10.5 | | | | | 23 | 9.3 | | | | |
| 6 | 25 | 10.1 | | | | | 48 | 19.4 | | | | | 37 | 15.0 | | | | |
| 7 | 4 | 1.6 | | | | | 69 | 27.9 | | | | | 42 | 17.0 | | | | |
| 8 | 1 | 0.4 | | | | | 15 | 6.1 | | | | | 9 | 3.6 | | | | |
| Missing | 0 | 0.0 | | | | | 1 | 0.4 | | | | | 0 | 0.0 | | | | |
| LCF Groups | | | | | | | | | | | | | | | | | | |
| 1-3 | 156 | 63.2 | | | | | 83 | 33.6 | | | | | 120 | 48.6 | | | | |
| 4-6 | 86 | 34.8 | | | | | 79 | 32.0 | | | | | 76 | 30.8 | | | | |
| 7-8 | 5 | 2.0 | | | | | 84 | 34.0 | | | | | 51 | 20.6 | | | | |
| Missing | 0 | 0.0 | | | | | 1 | 0.4 | | | | | 0 | 0.0 | | | | |
| Tracheostomy | | | | | | | | | | | | | | | | | | |
| Yes | 200 | 81.0 | | | | | 69 | 27.9 | | | | | 130 | 52.6 | | | | |
| No | 47 | 19.0 | | | | | 178 | 72.1 | | | | | 117 | 47.4 | | | | |
| Missing | 0 | 0.0 | | | | | 0 | 0.0 | | | | | 0 | 0.0 | | | | |
| Respiration | | | | | | | | | | | | | | | | | | |
| Spontaneous | 122 | 49.4 | | | | | 214 | 86.6 | | | | | 173 | 70.0 | | | | |
| Spontaneous+O2 | 102 | 41.3 | | | | | 27 | 10.9 | | | | | 58 | 23.5 | | | | |
| Mechanical | 10 | 4.0 | | | | | 6 | 2.4 | | | | | 8 | 3.2 | | | | |
| Missing | 13 | 5.3 | | | | | 0 | 0.0 | | | | | 8 | 3.2 | | | | |
| Feeding | | | | | | | | | | | | | | | | | | |
| TPN | 5 | 2.0 | | | | | 2 | 0.8 | | | | | 4 | 1.6 | | | | |
| NGT | 100 | 40.5 | | | | | 7 | 2.8 | | | | | 49 | 19.8 | | | | |
| PEG | 99 | 40.1 | | | | | 89 | 36.0 | | | | | 95 | 38.5 | | | | |
| Oral | 42 | 17.0 | | | | | 149 | 60.3 | | | | | 99 | 40.1 | | | | |
| Missing | 1 | 0.4 | | | | | 0 | 0.0 | | | | | 0 | 0.0 | | | | |
| Bladder | | | | | | | | | | | | | | | | | | |
| No | 239 | 96.8 | | | | | 45 | 18.2 | | | | | 137 | 55.5 | | | | |
| Yes | 8 | 3.2 | | | | | 202 | 81.8 | | | | | 110 | 44.5 | | | | |
| Missing | 0 | 0.0 | | | | | 0 | 0.0 | | | | | 0 | 0.0 | | | | |
| Pressure sores | | | | | | | | | | | | | | | | | | |
| No | 154 | 62.3 | | | | | 208 | 84.2 | | | | | 180 | 72.9 | | | | |
| Yes | 93 | 37.7 | | | | | 38 | 15.4 | | | | | 67 | 27.1 | | | | |
| Missing | 0 | 0.0 | | | | | 1 | 0.4 | | | | | 0 | 0.0 | | | | |
| Ashworth Scale a | | | | | | | | | | | | | | | | | | |
| No | 210 | 85.0 | | | | | 169 | 68.4 | | | | | 187 | 75.7 | | | | |
| Yes | 37 | 15.0 | | | | | 58 | 23.5 | | | | | 51 | 20.6 | | | | |
| Missing | 0 | 0.0 | | | | | 20 | 8.1 | | | | | 9 | 3.6 | | | | |
| HO | | | | | | | | | | | | | | | | | | |
| No | 4 | 1.6 | | | | | 20 | 8.1 | | | | | 10 | 4.0 | | | | |
| Yes | 243 | 98.4 | | | | | 207 | 83.8 | | | | | 228 | 92.3 | | | | |
| Missing | 0 | 0.0 | | | | | 20 | 8.1 | | | | | 9 | 3.6 | | | | |
| Craniectomy | | | | | | | | | | | | | | | | | | |
| No | 53 | 21.5 | | | | | 26 | 10.5 | | | | | 37 | 15.0 | | | | |
| Yes | 194 | 78.5 | | | | | 221 | 89.5 | | | | | 210 | 85.0 | | | | |
| Missing | 0 | 0.0 | | | | | 0 | 0.0 | | | | | 0 | 0.0 | | | | |
| Hydrocephalus | | | | | | | | | | | | | | | | | | |
| No | 20 | 8.1 | | | | | 23 | 9.3 | | | | | 21 | 8.5 | | | | |
| Yes | 207 | 83.8 | | | | | 204 | 82.6 | | | | | 206 | 83.4 | | | | |
| Missing | 20 | 8.1 | | | | | 20 | 8.1 | | | | | 20 | 8.1 | | | | |
N: number; Med: median; Min: minimum; Max: maximum; SD: standard deviation; TSL: time since lesion; VS: vegetative state; MCS: minimally conscious state; E-MCS: emergence from minimally conscious state; CRS-R: Coma Recovery Scale-Revised; LCF: Levels of Cognitive Functioning; O2: oxygen; TPN: total parenteral nutrition; NGT: nasogastric tube; PEG: percutaneous endoscopic gastrostomy; HO: heterotopic ossification. a Ashworth Scale score ≥3 points in >4 joints.
Analyses on the total sample
Internal consistency
Considering the total sample, at the total score level, α was satisfactory (α=0.964). Similar findings were reported for the average inter-item correlation (0.544 >0.200). At the item level, we observed item-to-total correlation coefficients above 0.400 for all items but three (i.e., A3 [Mechanical ventilation], A4 [Confusional state], and A5 [Behavioral disturbances]). α with an item deleted showed that the deletion of five items (i.e., A1 [Intensive medical monitoring], A2 [Tracheotomy], A3 [Mechanical ventilation], A4 [Confusional state], and A5 [Behavioral disturbances]) increased α (Table IV).
Table IV. —Results of classical item analysis, Mokken analysis and confirmatory factor analysis on original items (N.=494).
| Summary analysis | Statistic | Value | Recommended value |
|---|---|---|---|
| Classical item analysis | aIIC | 0.544 | >0.200 |
| Classical item analysis | α | 0.964 | >0.700 |
| Confirmatory factor analysis - baseline | RMSEA | 0.460* | ≤0.06 |
| Confirmatory factor analysis - baseline | SRMR | 0.176* | ≤0.06 |
| Confirmatory factor analysis - baseline | CFI | 1.000 | ≥0.950 |
| Confirmatory factor analysis - baseline | TLI | 1.000 | ≥0.950 |
| Confirmatory factor analysis - baseline | χ2df | 12557.7119 | n.s |
| Confirmatory factor analysis - baseline | P value | 0.0000* | ≥0.05 |
| Confirmatory factor analysis - final | RMSEA | 0.048 | ≤0.06 |
| Confirmatory factor analysis - final | SRMR | 0.084* | ≤0.06 |
| Confirmatory factor analysis - final | CFI | 1.000 | ≥0.950 |
| Confirmatory factor analysis - final | TLI | 1.000 | ≥0.950 |
| Confirmatory factor analysis - final | χ2df | 174.982 | n.s |
| Confirmatory factor analysis - final | P value | 0.0000* | ≥0.05 |

| Item analysis | ITC | α-iid | H (Mokken scale 1) | H (Mokken scale 2) | Factor loading (CFA baseline) | SE (CFA baseline) | Factor loading (CFA final) | SE (CFA final) |
|---|---|---|---|---|---|---|---|---|
| A1 Intensive medical monitoring | 0.615 | 0.964* | 1.00 | | 1.000 | 0.000 | 1.000 | 0.000 |
| A2 Tracheostoma | 0.676 | 0.964* | 0.97 | | 0.946 | 0.013 | 0.952 | 0.013 |
| A3 Mechanical ventilation | 0.092* | 0.968* | 0.98 | | 0.998 | 0.000 | 0.999 | 0.000 |
| A4 Confusional state | 0.147* | 0.969* | | 0.66 | 0.296* | 0.068 | 0.372* | 0.083 |
| A5 Behavioral disturbances | 0.132* | 0.969* | | 0.66 | 0.292* | 0.078 | 0.205* | 0.080 |
| A6 Severe communication deficits | 0.734 | 0.963 | 0.96 | | 0.931 | 0.013 | 0.928 | 0.014 |
| A7 Swallowing disorders | 0.758 | 0.963 | 0.97 | | 0.999 | 0.000 | 0.966 | 0.010 |
| B1 Feeding | 0.946 | 0.959 | 1.00 | | 0.985 | 0.000 | 0.985 | 0.000 |
| B2 Transfers | 0.961 | 0.959 | 0.98 | | 0.990 | 0.000 | 0.990 | 0.000 |
| B3 Grooming | 0.959 | 0.959 | 0.99 | | 0.988 | 0.000 | 0.986 | 0.000 |
| B4 Toilet use | 0.926 | 0.960 | 0.98 | | 0.988 | 0.000 | 0.987 | 0.000 |
| B5 Bathing | 0.866 | 0.961 | 0.98 | | 0.986 | 0.000 | 0.986 | 0.000 |
| B6 Mobility | 0.947 | 0.959 | 0.98 | | 0.987 | 0.000 | 0.986 | 0.000 |
| B7 Stairs | 0.901 | 0.960 | 0.98 | | 0.988 | 0.000 | 0.986 | 0.000 |
| B8 Dressing | 0.940 | 0.959 | 0.99 | | 0.986 | 0.000 | 0.986 | 0.000 |
| B9 Bowels | 0.938 | 0.960 | 0.97 | | 0.986 | 0.000 | 0.986 | 0.000 |
| B10 Bladder | 0.929 | 0.960 | 0.97 | | 0.986 | 0.000 | 0.986 | 0.000 |
| Recommended values | >0.400 | <0.964 | >0.30 | >0.30 | >0.400 | | >0.400 | |
aIIC: average inter-item correlation; α: Cronbach’s alpha; RMSEA: Root Mean Square Error of Approximation; SRMR: Standardized Root Mean Square Residual; CFI: Comparative Fit Index; TLI: Tucker Lewis Index; χ2df: Chi-square with degrees of freedom; ITC: item-to-total correlation; α-iid: alpha if an item was deleted; H: scalability coefficient; SE: standard error; n.s.: not significant. *Statistics values outside the recommended cut-off.
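As an illustration of how the classical item statistics in Table IV are typically obtained, the following minimal Python sketch computes Cronbach's α, the average inter-item correlation, the item-to-total correlation, and α with an item deleted. It runs on synthetic data, not on the study data, and it is not the software used by the authors. The function names are hypothetical, and the item-to-total correlation is assumed here to be the corrected version (item versus the sum of the remaining items), which the text does not specify.

```python
import random
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns (one list per item)."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_variance = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def average_inter_item_correlation(items):
    """Mean of the Pearson correlations over all distinct item pairs."""
    k = len(items)
    return mean(pearson(items[i], items[j])
                for i in range(k) for j in range(i + 1, k))


def item_analysis(items):
    """Per item: (corrected item-to-total correlation, alpha if item deleted)."""
    results = []
    for i in range(len(items)):
        rest = items[:i] + items[i + 1:]
        rest_total = [sum(person) for person in zip(*rest)]
        results.append((pearson(items[i], rest_total), cronbach_alpha(rest)))
    return results


# Synthetic example (assumption): five items driven by one latent trait
# plus one item that is pure noise and therefore unrelated to the trait.
rng = random.Random(0)
theta = [rng.gauss(0, 1) for _ in range(500)]
items = [[t + 0.5 * rng.gauss(0, 1) for t in theta] for _ in range(5)]
items.append([rng.gauss(0, 1) for _ in theta])

alpha = cronbach_alpha(items)
aiic = average_inter_item_correlation(items)
stats = item_analysis(items)

print(f"alpha={alpha:.3f}, aIIC={aiic:.3f}")
for index, (itc, alpha_deleted) in enumerate(stats, start=1):
    print(f"item {index}: ITC={itc:.3f}, alpha-iid={alpha_deleted:.3f}")
```

In this synthetic example the unrelated item behaves like the pattern flagged in Table IV: a low item-to-total correlation together with an α that increases when the item is removed.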
Mokken analysis
At the scale level, MA on the total sample showed that all items were scalable on one main scale, except for A4 [Confusional state] and A5 [Behavioral disturbances], which were scalable onto a separate scale. At the item level, H coefficients showed high scalability for all items on both scales (H>0.30) (Table IV).
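To make the scalability coefficient more concrete, the sketch below computes Loevinger's H for dichotomous items only, as the ratio between the observed inter-item covariances and their maximum given the item marginals. This is a simplified illustration under stated assumptions: the function name is hypothetical, polytomous items (such as those of the BI) are not covered, and it is not the procedure or software used in this study.

```python
def scalability(data):
    """Loevinger's H for dichotomous (0/1) items.

    `data` is a list of persons, each a list of 0/1 item scores.
    Returns (list of item-level H_i, scale-level H).
    Items endorsed by nobody or by everybody give a zero denominator
    and are not handled in this sketch.
    """
    n = len(data)
    k = len(data[0])
    p = [sum(person[i] for person in data) / n for i in range(k)]
    cov_sums = [0.0] * k
    max_sums = [0.0] * k
    for i in range(k):
        for j in range(k):
            if i == j:
                continue
            p_both = sum(person[i] * person[j] for person in data) / n
            # Observed covariance and its maximum given the two marginals.
            cov_sums[i] += p_both - p[i] * p[j]
            max_sums[i] += min(p[i], p[j]) - p[i] * p[j]
    h_items = [c / m for c, m in zip(cov_sums, max_sums)]
    h_scale = sum(cov_sums) / sum(max_sums)
    return h_items, h_scale


# A perfect Guttman pattern: every item is fully scalable.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
h_items, h_scale = scalability(guttman)

# Adding an item scored in the opposite direction of item 2
# yields an item that does not scale with the others.
with_reversed = [row + [1 - row[1]] for row in guttman]
h_items_rev, h_scale_rev = scalability(with_reversed)

print(h_items, h_scale)
print(h_items_rev, h_scale_rev)
```

An item whose H falls below the 0.30 threshold used in Table IV, like the reversed item in this toy example, would not be retained on the same scale.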
Confirmatory factor analysis
Baseline CFA on the total sample showed a lack of fit to a unidimensional structure (RMSEA=0.460). Furthermore, A4 [Confusional state] and A5 [Behavioral disturbances] again showed substantially lower factor loadings (<0.30) than all other items (>0.93). Finally, high MIs suggested local dependence between several pairs of items (Table IV). In particular, we observed MIs >0.994 amongst several pairs of BI items. Indeed, after accounting for local dependence, the scale did fit a unidimensional model (RMSEA=0.048), although the factor loadings for A4 and A5 remained substantially lower (<0.38) than those of the other items (>0.929) (Table IV).
Rasch analysis
As reported in Table V, the base RA on the unique assessment sample showed that the data failed to meet the requirements of stochastic invariance (χ2df=221.517; P≤0.001), local independence (10 items had residual correlations above the LDRC, here set at 0.168), and absence of DIF (six items presented uniform and non-uniform DIF). Furthermore, 11 items misfitted the model. However, there were no violations of the monotonicity requirement (all items had an ordered threshold structure), and the scale appeared to be strictly unidimensional (PST=1.3%, lower BCI=-0.9%). At the item level, A3 (Mechanical ventilation), A4 (Confusional state), and A5 (Behavioral disturbances) misfitted the model.
Table V. —Summary of the Rasch analyses for each sample.
| Sample | Analysis name | N/CI | Item fit residual: mean | Item fit residual: SD | Person fit residual: mean | Person fit residual: SD | Item-trait interaction: χ2df | Item-trait interaction: P value | Item-trait interaction: cut-off a | Unidimensionality: PST (%) b | Unidimensionality: lower BCI (%) b | T-DT c | T-LD d | T-DIF e | PSI | α |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unique Assessment Sample | Base | 247/2 | -1.462 | 1.821* | -0.462 | 0.468 | 221.517 | 0.0000* | 0.0029 | 1.3% | -0.9% | 0.0% | 3.352* | 135.5* | 0.960 | 0.965 |
| | Deleting A3-A4-A5 | 247/2 | -0.751 | 1.065 | -0.289 | 0.435 | 25.214 | 0.0331 | 0.0036 | 5.3% | 1.5% | 0.0% | 2.320* | 43.5* | 0.951 | 0.980 |
| | Five testlets | 247/2 | -0.633 | 0.919 | -0.324 | 0.402 | 7.25 | 0.2070 | 0.0100 | 3.1% | -0.7% | 0.0% | 0.000 | 21.5* | 0.902 | 0.936 |
| Total Sample | Base | 494/4 | -1.742 | 2.315* | -0.458 | 0.440 | 866.934 | 0.0000* | 0.0029 | 2.6% | 1.5% | 0.0% | 1.576* | 121.8* | 0.953 | 0.964 |
| | Deleting A3-A4-A5 | 494/4 | -1.020 | 1.466* | -0.349 | 0.452 | 491.242 | 0.0000* | 0.0036 | 4.0% | 2.0% | 0.0% | 1.536* | 112.0* | 0.940 | 0.979 |
| | Five testlets | 494/4 | -0.712 | 0.706 | -0.359 | 0.410 | 19.315 | 0.1988 | 0.0100 | 2.4% | 0.4% | 0.0% | 0.000 | 64.1* | 0.909 | 0.937 |
| | Anchored from unique assessment sample | 494/4 | -0.442 | 0.694 | -0.353 | 0.427 | 20.615 | 0.1518 | 0.0100 | 3.6% | 1.6% | 0.0% | 0.000 | 57.3 | 0.894 | 0.937 |
| | Splitting ST4 for time since lesion | 494/4 | -0.770 | 0.657 | -0.359 | 0.411 | 19.117 | 0.3260 | 0.0083 | - | - | 0.0% | 0.200 | 50.1 | 0.909 | - |
| | Splitting ST5 for Levels of Cognitive Function | 494/4 | -0.847 | 0.722 | -0.352 | 0.392 | 21.920 | 0.3526 | 0.0071 | - | - | 0.0% | 0.236 | 40.3 | 0.920 | - |
| Recommended values | | | - | ≤1.4 | - | ≤1.4 | - | <Cut-off | - | <5.0 g | <5.0 | 0% | 0 | 0 | ≥0.85 f | ≥0.85 f |
ICV: internal construct validity; N/CI: the ratio between sample size and class intervals; SD: standard deviation; χ2df: unconditional Chi-square for model fit with degrees of freedom; PST: the proportion of significant t-tests carried out on the estimates that, within a principal component analysis of residuals, loaded positively and negatively (factor loading >0.30) on the first component; BCI: binomial (95%) confidence interval for proportions of significant t-tests; T-DT: percentage of items with disordered thresholds; T-LD: total local dependency load; T-DIF: total DIF load; PSI: Person Separation Index; α: Cronbach’s alpha. a Bonferroni-corrected P value, which varies by analysis, and that is used to interpret the corresponding Chi-square P value; b unidimensionality is considered achieved either when PST is <5% or when the lower bound of its BCI is <5%; c the T-DT statistic is calculated as the percentage of items with disordered thresholds out of the total number of items. The values range from zero to 100%, where zero indicates the absence of items with disordered thresholds; d the T-LD summary statistic is calculated by summing together all the residual correlation values above the local dependency relative cut-off, which is calculated as the mean of all the residual correlations (excluding the correlation of items with themselves) plus 0.2. The values range from zero to infinity, where zero indicates the absence of local dependency; e the T-DIF summary statistic is calculated as the sum of the absolute values of the base-ten logarithms of all P values for uniform and non-uniform DIF, across all items and across all person factors, that are below the Bonferroni-corrected P value. The values range from zero to infinity, where zero indicates no DIF; f a value of ≥0.850 suggests a precision of measurement also at the individual level, whereas a value between 0.700 and 0.849 indicates precision only at the group level. *Values outside the recommended range.
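The summary statistics defined in the footnotes of Table V can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the authors' implementation: the function names are hypothetical, the relative cut-off is assumed to be the mean residual correlation plus an offset of 0.2 (exposed as a parameter), each item pair is counted once in T-LD, and the Bonferroni cut-off is assumed to be 0.05 divided by the number of items or testlets, which reproduces the cut-offs listed in Table V (e.g., 0.05/17 ≈ 0.0029 and 0.05/5 = 0.0100) but is not stated explicitly in the text.

```python
import math


def bonferroni_cutoff(n_tests, alpha=0.05):
    """Bonferroni-corrected significance level for `n_tests` comparisons."""
    return alpha / n_tests


def local_dependency_summary(resid_corr, offset=0.2):
    """Return (relative cut-off, T-LD) from a symmetric residual correlation matrix.

    The cut-off is the mean off-diagonal residual correlation plus `offset`;
    T-LD sums the residual correlations above the cut-off (each pair once).
    """
    k = len(resid_corr)
    pairs = [resid_corr[i][j] for i in range(k) for j in range(i + 1, k)]
    cutoff = sum(pairs) / len(pairs) + offset
    t_ld = sum(r for r in pairs if r > cutoff)
    return cutoff, t_ld


def total_dif_load(p_values, cutoff):
    """Sum of |log10(p)| over the DIF P values below the corrected cut-off.

    P values must be strictly positive (log10 is undefined at zero).
    """
    return sum(abs(math.log10(p)) for p in p_values if p < cutoff)


# Toy residual correlation matrix for four items (made-up numbers).
resid = [
    [1.00, 0.50, -0.20, -0.30],
    [0.50, 1.00, -0.25, -0.15],
    [-0.20, -0.25, 1.00, -0.10],
    [-0.30, -0.15, -0.10, 1.00],
]
ldrc, t_ld = local_dependency_summary(resid)
t_dif = total_dif_load([0.001, 0.2, 0.0001], cutoff=0.01)

print(round(bonferroni_cutoff(17), 4), round(ldrc, 3), t_ld, t_dif)
```

In the toy matrix only the first item pair exceeds the relative cut-off, so T-LD equals that single residual correlation; a solution with no local dependency would give T-LD = 0, as in the five-testlet rows of Table V.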
Given the lack of fit to the Rasch model and the results of the preliminary unidimensionality assessment, we decided to delete A3 (Mechanical ventilation), A4 (Confusional state), and A5 (Behavioral disturbances). The subsequent RA on the reduced item set showed that the data satisfied the requirement of the stochastic ordering of the items (χ2df=25.214; P=0.0331), but failed to meet the requirements of local independence (six items had residual correlation values higher than the LDRC, here set at 0.145) and absence of DIF (two items presented uniform and non-uniform DIF). Moreover, B2 misfitted the model. However, the unidimensionality was acceptable (PST=5.3%, lower BCI=1.5%), and no item showed DT (Table V).
Afterward, according to the item content and the local dependency pattern, we created testlets to absorb local dependency and item misfit. Particularly, we created the following five testlets:
ST1: Intensive medical monitoring (A1) + Tracheostomy (A2);
ST2: Severe communication deficits (A6) + Swallowing disorders (A7);
ST3: Feeding (B1) + Transfers (B2) + Mobility (B6) + Stairs (B7);
ST4: Grooming (B3) + Toilet use (B4) + Bathing (B5);
ST5: Bowels (B9) + Bladder (B10);
This five-testlet solution showed a good fit to the model (χ2df=7.25; P=0.2070) and satisfied the requirements of monotonicity (no items with DT), strict unidimensionality (PST=3.1%, lower BCI=-0.7%), and local independence. There was uniform and non-uniform DIF by etiology for ST1 (A1+A2) (Table V). However, the impact of this DIF on the person estimates appeared to be negligible, as the difference between the paired person estimates of the split versus the unsplit solution yielded a Cohen’s d=0.135. Therefore, DIF was not accounted for. The PSI value (0.902) suggested that the scale had sufficient precision for measurement on single subjects (PSI >0.850). The item hierarchy indicated that ST1 (A1+A2) and ST4 (B3+B4+B5) were, respectively, the easiest and the most difficult testlets (Table VI). The item fit statistics and the scoring model for the final solution are reported in Table VI. The targeting graph of the final solution showed that participants were spread across seventeen logits, with a frank floor effect, as 102 (47.4%) subjects presented the lowest score (Figure 2).
Table VI. —Items’ parameter, fit statistics, scoring model for the final solution in unique assessment sample (N.=247).
| Item description | Location | SE | FR | χ2 | Prob a | Scoring model |
|---|---|---|---|---|---|---|
| ST01 – A1-A2 | -4.727 | 0.209 | 0.089 | 0.316 | 0.574 | 0-2 |
| ST02 – A6-A7-B1 | -1.334 | 0.152 | -0.378 | 0.839 | 0.360 | 0-4 |
| ST05 – B9-B10 | 1.265 | 0.171 | -0.032 | 0.040 | 0.841 | 0-4 |
| ST03 – B2-B6-B7 | 2.211 | 0.145 | -2.192 | 5.820 | 0.016 | 0-6 |
| ST04 – B3-B4-B5 | 2.585 | 0.124 | -0.651 | 0.173 | 0.678 | 0-6 |
SE: standard error; FR: fit residual; χ2: Chi-square; Prob: χ2 probability. The location is expressed in logits. The degrees of freedom for each χ2 were 1 for all items. a Bonferroni-corrected P value was set at 0.01, indicative of statistical significance at the 0.05 level.
Figure 2. —Targeting (person-thresholds distribution) graphs for unique assessment sample. For each graph, persons (N.=247) and item thresholds are displayed, respectively, in the upper and the lower part of the chart, separated by the logit scale. Grouping set to interval length of 0.20, making 85 groups for base and final analyses. Freq: frequency; No: number; SD: standard deviation.
The base RA on the total sample showed that these data also failed to satisfy the requirements of the stochastic ordering of the items (χ2df=866.934; P≤0.001), local independence (five items had residual correlation values higher than the LDRC, here set at 0.166), and absence of DIF (five items presented uniform and non-uniform DIF). Moreover, eleven items misfitted the model. Finally, the unidimensionality was strict (PST=2.6%, lower BCI=1.5%), and no item presented DT (Table V). Following this, we replicated on the total sample the same solution achieved for the unique assessment sample. This solution showed that the data satisfied the model requirements of stochastic invariance (χ2df=19.315; P=0.1988), strict unidimensionality (PST=2.4%, lower BCI=0.4%), and monotonicity (no DT was reported). No testlet misfitted the model, although four testlets presented five uniform DIFs and two non-uniform DIFs (Table V).
Finally, we anchored the item difficulty estimates from the unique assessment sample to the final solution of the total sample. Good fit to the model (χ2df=20.615; P=0.1518) and satisfaction of all the other ICV requirements (including unidimensionality) were confirmed, except for the presence of four uniform DIFs and two non-uniform DIFs. Notably, ST4 presented a uniform DIF for time since lesion; the difference between the paired person estimates of the split versus the unsplit solution yielded a Cohen’s d of 0.906 (large effect size). Consequently, a new Rasch analysis was run to split ST4 for time since lesion (<65 days vs. between 66-123 days and >124 days). The data met the model expectations for stochastic invariance (χ2df=19.117; P=0.3260) and monotonicity, and no testlet misfitted the model. However, three testlets presented uniform and non-uniform DIFs. In particular, ST5 showed a uniform DIF for LCF; the difference between the paired person estimates of the split versus the unsplit solution yielded a Cohen’s d of 0.832, which, again, was a large effect size. Therefore, a new Rasch analysis was run to split ST5 for LCF. The data satisfied the model requirements of stochastic invariance (χ2df=21.920; P=0.3526) and monotonicity. No testlet misfitted the model, and three items presented uniform and non-uniform DIFs. In particular, ST1 presented uniform and non-uniform DIFs for etiology, whereas ST2 presented a uniform DIF for LCF and a DIF for time since lesion. However, the impact of these item biases on the person estimates was small or negligible (Cohen’s d of 0.333, 0.125, and 0.216, respectively). Therefore, DIF for both items was not accounted for. For this solution, too, the PSI (0.920) was compatible with measurement on single subjects (>0.85).
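The decision rule applied to DIF above (split a testlet only when the person estimates change appreciably) can be illustrated with a short Python sketch. It is a hedged example, not the authors' code: the function names are hypothetical, Cohen's d is assumed to be the mean difference divided by the pooled standard deviation of the two sets of estimates (the exact variant used in the study is not reported), and the labels follow Cohen's conventional thresholds of 0.2, 0.5, and 0.8.

```python
from statistics import mean, stdev


def cohens_d(estimates_a, estimates_b):
    """Cohen's d between two sets of person estimates (pooled-SD variant)."""
    pooled_sd = ((stdev(estimates_a) ** 2 + stdev(estimates_b) ** 2) / 2) ** 0.5
    return (mean(estimates_a) - mean(estimates_b)) / pooled_sd


def effect_size_label(d):
    """Conventional verbal label for the magnitude of Cohen's d."""
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"


# Made-up person estimates (logits) from an unsplit and a split solution.
unsplit = [0.0, 1.0, 2.0, 3.0, 4.0]
split = [0.5, 1.5, 2.5, 3.5, 4.5]
d_example = cohens_d(unsplit, split)

# Labels for the effect sizes reported in the text.
for reported in (0.135, 0.906, 0.832, 0.333, 0.125, 0.216):
    print(reported, effect_size_label(reported))
```

Under these conventional thresholds, the values of 0.906 and 0.832 are labeled large, which matches the decision to split ST4 and ST5, whereas the remaining values fall in the negligible or small range.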
Analysis of simulated data
Given the scoring model of A4 (Confusional state) and A5 (Behavioral disturbance), we hypothesized that their misfit could be due to the mismatch between their current scoring structure and the temporal evolution of the related conditions. Indeed, both confusional state and behavioral disturbance (i.e., agitation) are not clinically detectable both in persons with disorders of consciousness and in individuals who have reached a higher level of independence. In other words, for an item to measure these two sub-constructs correctly, its structure should be unfolding and based on a three-level rating scale (0 = confusional state/behavioral disturbance non-detectable because of disorder of consciousness; 1 = clinical condition present; 2 = confusional state/behavioral disturbance resolved). Thus, we hypothesized that fit to a unidimensional model could improve by replicating this unfolding structure at the level of the two items.
To test this hypothesis, we applied the above rating scale to the two items. In particular, the lowest score was assigned to patients with an LCF score ranging from 1 to 3 (i.e., absence of confusional state and behavioral disturbance because of a disorder of consciousness). In contrast, the two other score levels were unchanged. Following this, we performed a CTT-IA, MA, and CFA to test whether the new unfolding structure of these two items could better satisfy the unidimensionality requirement.
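The deterministic rescoring rule described above can be written as a small function. The sketch below is illustrative only: the function name and the boolean encoding of the original dichotomous item (condition present versus absent) are assumptions, since the raw item coding is not reported here.

```python
def rescore_unfolding(lcf, condition_present):
    """Three-level unfolding score for A4/A5 based on the LCF level.

    0 = condition not detectable because of a disorder of consciousness (LCF 1-3)
    1 = clinical condition present
    2 = condition resolved (absent in a patient with LCF above 3)
    """
    if lcf < 1:
        raise ValueError("LCF levels start at 1")
    if lcf <= 3:
        return 0
    return 1 if condition_present else 2


# Made-up patients: (LCF level, condition present on the original item).
patients = [(2, False), (3, True), (5, True), (7, False)]
rescored = [rescore_unfolding(lcf, present) for lcf, present in patients]
print(rescored)
```

With this rule, a patient with a disorder of consciousness no longer receives the same score as a patient in whom the condition has resolved, which is the ordering that the unfolding structure is meant to restore.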
At the total score level, α (0.972) improved compared with that of the original data (0.964), as did the average inter-item correlation (0.660 vs. 0.544) (Table VII). At the item level, all items except A3 [Mechanical ventilation] showed an item-to-total correlation coefficient above 0.400. Notably, the item-to-total correlation coefficients for mA4 [Confusional state] and mA5 [Behavioral disturbances] improved markedly (>0.730) (Table VII). At the scale level, the MA now showed that all items were scalable on a single scale, with H coefficients indicating high scalability for all items (H>0.30) (Table VII). The baseline CFA showed that the modified data still misfitted a unidimensional model (RMSEA=0.400), although the factor loadings for mA4 and mA5 were now both >0.980. After accounting for local dependence, the fit to the unidimensional model improved compared with the original scale (RMSEA=0.044 vs. 0.048) (Table VII).
Table VII. —Results of classical item analysis, Mokken analysis and confirmatory factor analysis on the modified scale (N.=494).
| Summary analysis | Statistic | Value | Recommended value |
|---|---|---|---|
| Classical item analysis | aIIC | 0.660 | >0.200 |
| Classical item analysis | α | 0.972 | >0.700 |
| Confirmatory Factor Analysis - Baseline | RMSEA | 0.400* | ≤0.06 |
| Confirmatory Factor Analysis - Baseline | SRMR | 0.159* | ≤0.06 |
| Confirmatory Factor Analysis - Baseline | CFI | 1.000 | ≥0.950 |
| Confirmatory Factor Analysis - Baseline | TLI | 1.000 | ≥0.950 |
| Confirmatory Factor Analysis - Baseline | χ2df | 9521.2119 | n.s |
| Confirmatory Factor Analysis - Baseline | P value | 0.0000* | ≥0.05 |
| Confirmatory Factor Analysis - Final | RMSEA | 0.044 | ≤0.06 |
| Confirmatory Factor Analysis - Final | SRMR | 0.053 | ≤0.06 |
| Confirmatory Factor Analysis - Final | CFI | 1.000 | ≥0.950 |
| Confirmatory Factor Analysis - Final | TLI | 1.000 | ≥0.950 |
| Confirmatory Factor Analysis - Final | χ2df | 165.784 | n.s |
| Confirmatory Factor Analysis - Final | P value | 0.0000* | ≥0.05 |

| Item analysis | ITC | α-iid | H (Mokken scale 1) | Factor loading (CFA baseline) | SE (CFA baseline) | Factor loading (CFA final) | SE (CFA final) |
|---|---|---|---|---|---|---|---|
| A1 Intensive medical monitoring | 0.641 | 0.972* | 1.00 | 1.000 | 0.000 | 1.000 | 0.000 |
| A2 Tracheostoma | 0.737 | 0.971 | 0.97 | 0.942 | 0.012 | 0.952 | 0.012 |
| A3 Mechanical ventilation | 0.104* | 0.976* | 0.98 | 0.998 | 0.000 | 0.998 | 0.000 |
| mA4 Confusional state | 0.758 | 0.972* | 0.95 | 0.984 | 0.007 | 0.921 | 0.015 |
| mA5 Behavioral disturbances | 0.738 | 0.972* | 0.95 | 0.992 | 0.007 | 0.921 | 0.016 |
| A6 Severe communication deficits | 0.785 | 0.971 | 0.96 | 0.930 | 0.012 | 0.940 | 0.012 |
| A7 Swallowing disorders | 0.807 | 0.971 | 0.97 | 0.999 | 0.000 | 0.971 | 0.009 |
| B1 Feeding | 0.941 | 0.968 | 1.00 | 0.984 | 0.000 | 0.986 | 0.000 |
| B2 Transfers | 0.952 | 0.968 | 0.98 | 0.990 | 0.000 | 0.988 | 0.000 |
| B3 Grooming | 0.940 | 0.968 | 0.99 | 0.987 | 0.000 | 0.986 | 0.000 |
| B4 Toilet use | 0.899 | 0.969 | 0.99 | 0.987 | 0.000 | 0.986 | 0.000 |
| B5 Bathing | 0.839 | 0.970 | 0.98 | 0.986 | 0.000 | 0.986 | 0.000 |
| B6 Mobility | 0.927 | 0.968 | 0.98 | 0.986 | 0.000 | 0.986 | 0.000 |
| B7 Stairs | 0.873 | 0.969 | 0.98 | 0.988 | 0.000 | 0.986 | 0.000 |
| B8 Dressing | 0.915 | 0.969 | 0.99 | 0.986 | 0.000 | 0.984 | 0.000 |
| B9 Bowels | 0.924 | 0.968 | 0.97 | 0.986 | 0.000 | 0.986 | 0.000 |
| B10 Bladder | 0.905 | 0.969 | 0.97 | 0.986 | 0.000 | 0.986 | 0.000 |
| Recommended values | >0.400 | <0.964 | >0.30 | >0.400 | N/A | >0.400 | N/A |
aIIC: average inter-item correlation; α: Cronbach’s alpha; RMSEA: Root Mean Square Error of Approximation; SRMR: Standardized Root Mean Square Residual; CFI: Comparative Fit Index; TLI: Tucker Lewis Index; χ2df: Chi-square with degrees of freedom; ITC: item-to-total correlation; α-iid: alpha if an item was deleted; H: scalability coefficient; SE: standard error; n.s.: not significant. *Statistics values outside the recommended cut-off.
Discussion
To the best of our knowledge, this is the first study to investigate the ERBI ICV in depth through classical (i.e., CTT-IA and CFA) and modern (i.e., MA and RA) psychometric techniques. The preliminary unidimensionality assessment conducted with CTT-IA, MA, and CFA clearly showed that both A4 (Confusional state) and A5 (Behavioral disturbances) contributed less to the measurement of early functional changes than the other items. The RA confirmed these findings and also showed the misfit of A3, A4, and A5 and violations of stochastic invariance, local independence, and measurement invariance (i.e., presence of DIF), but not of monotonicity and unidimensionality. After the deletion of three misfitting items and further non-structural modifications (i.e., testlets creation to absorb LD between items and item misfit), the final solution showed stochastic and measurement invariance, local independence, strict unidimensionality, and adequate reliability for measurements at the individual level, albeit with a frank floor effect. This final solution was successfully replicated in the total sample of the subjects. After post-hoc modifications of the scoring structure of two out of three misfitting items, the subsequent CFA and MA showed the resolution of the unidimensionality issues.
Within this study, we consecutively enrolled patients admitted to early rehabilitation across five different Italian rehabilitation centers. Furthermore, the enrolled patients presented different levels of consciousness, a wide range of duration of ICU stay, and various etiologies (i.e., traumatic, vascular, anoxic, and rare ones such as encephalitis). The characteristics of our sample are therefore similar to those of the sample enrolled in the GISCAR study, an Italian epidemiological study.50 Consequently, our sample could be considered sufficiently representative of the Italian sABI population admitted to early rehabilitation.
Mechanical ventilation (A3) showed a lack of unidimensionality and several fit issues in RA. This could be explained by the fact that few patients in the sample were mechanically ventilated (4.0% on admission and 1.6% on discharge). Furthermore, some of these patients may be difficult to wean, especially if they present a brain stem injury. As score variance under this condition was limited, this could explain why this item did not work as expected when summed up with the other items and, hence, was deleted. However, should this scale be employed in subacute settings (e.g., intensive care unit), the prevalence of mechanically ventilated patients could be higher, making this item more appropriate for the targeted sample.
Within the RA, we deleted A4 (Confusional state) and A5 (Behavioral disturbances) given the lack of unidimensionality and the RA findings. Indeed, the latter showed highly significant Chi-squares indicating a serious misfit to the stochastic invariance requirement of the Rasch model. We hypothesized that this could be the consequence of the violation of the monotonicity requirement due to the incorrect dichotomous scoring structure. Indeed, what is expected is that as the level of independence increases (i.e., the ERBI total score increases), the score of each item increases. However, the dichotomous structure allowed the attribution of the higher score for these two items both to patients with a severe disorder of consciousness and to those with a much higher level of independence. We tested this hypothesis by simulating an unfolding structure for these two items, where the absence of the related behaviors was scored as the lowest in the presence of a disorder of consciousness. This post-hoc scoring modification recreated the same quantitative hierarchy of all the other items, markedly improving the unidimensionality shown by MA and CFA. These findings suggest the need for a refinement of the scoring structure of these two items rather than their deletion, as their content coverage is relevant and valuable to assess the level of independence of patients in the early post-acute neurorehabilitation phase.
After deleting three items and creating five testlets, fit to the Rasch model was achieved, even though the targeting graph showed a frank floor effect. Different results were obtained for the Brazilian version of the ERBI,16 in which the authors found no floor effect. However, this discrepancy could be explained by the different populations and settings of the two studies. Indeed, Silva et al. enrolled patients admitted to the intensive care unit with surgical outcomes or internal and respiratory pathologies. Furthermore, most of them were discharged at the end of their hospitalization in the ICU, with a probably higher level of functioning than that of our patients with sABI. Therefore, we hypothesize that the tendency towards a floor effect of the ERBI for sABI patients admitted to early rehabilitation could be reduced by adding items that are specifically relevant for this population and better able to discriminate at lower levels of ability. For instance, items that explore the level of consciousness, the presence of bedsores, and the presence of other devices (e.g., bladder catheter, nasogastric tube, and devices for the administration of intravenous medications) could help overcome this frank floor effect, as these items would investigate elements of clinical complexity that are more likely to be present on admission to rehabilitation than on discharge. Thus, we propose a revision of the content of the ERBI for patients admitted to early rehabilitation that, together with the revision of the scoring structure of A4 and A5, may lead to a drastic reduction of the floor effect. Of course, this hypothesis about the improvement of the measurement properties of the ERBI should be tested and confirmed within the context of a new prospective multicenter study.
We did not perform a Rasch analysis on the simulated data. The reason is that the scoring structure of the behavioral disturbance and confusional state items was modified deterministically using the LCF as guidance. Indeed, the Rasch model expects some degree of randomness in the data, which cannot be achieved with data simulated deterministically.51 On the other hand, the purpose of the post-hoc analyses on simulated data was to test the hypothesis that unidimensionality could improve after modification of these two items. This hypothesis was successfully tested within this study with less restrictive models (i.e., MA and CFA) and will be further tested using Rasch analysis in future prospective longitudinal studies.
We did not report in this study the recommended conversion table from raw scores to interval-level estimates of ability.36 Although we have achieved a valid scale structure according to the Rasch Model, the modified ERBI presents a higher floor effect than its original counterpart. Although this is expected considering the deletion of three items, we believe that clinicians and researchers should not be encouraged to use this modified version, as they would be if we provided a conversion table. Indeed, our conclusion is that the design of the scale needs to be improved on the basis of the indications provided by this study and that the validity of the new scale should be tested with a proper, ad-hoc, prospective longitudinal study.
Our results show that, although the ERBI meets the ICV requirements of the Rasch model after structural modifications, it shows a frank floor effect that effectively limits its use in clinical practice and research. Our post-hoc analyses demonstrated that modifying the scoring structure of the two deleted misfitting items could improve the unidimensionality of the ERBI; moreover, adding items with content appropriate for the sample examined could reduce the floor effect. Therefore, the ERBI cannot be used in its current form in clinical practice and research to measure functional changes in patients with ABI; consequently, we suggest modifying its structure and verifying its psychometric properties before using it in new scientific studies.
Limitations of the study
This study presents several limitations that deserve to be highlighted. First, although our sample could be considered sufficiently representative of the Italian sABI population, our findings cannot be generalized to patients with different diseases admitted to other settings (e.g., ICU). Second, we performed the rescoring of A4 and A5 based on the LCF score. This post-hoc imputation, based on a deterministic approach, would need confirmation in a further analysis based on newly collected real data. Third, since the data analyzed in this article come from previously published studies, the procedures may not have been the same across the different studies; this issue could introduce a measurement bias, i.e., it may lead to systematic errors or differences in how the data are collected or interpreted. However, the assessors involved in these studies were experienced, and these assessments were part of routine clinical practice. Finally, in this study, we could not exert any control on the ERBI administration procedures, nor could we assess classical inter- and intra-rater reliability, as we performed a secondary analysis built upon data collected from previously published studies.
Conclusions
The results of this study suggest that the ERBI total score cannot be considered a valid measure of the functional changes occurring in patients with sABI admitted to early rehabilitation. However, this study has also suggested the possibility of improving the scoring of some items and of reducing the frank floor effect we found by adding other items whose content might be more appropriate to the construct that the ERBI intends to measure. Further research is needed to refine the instrument and make it a valuable measurement tool for this population.
Supplementary Digital Material 1
Supplementary Table I
Assessment of the measurement quality of an instrument within the Rasch analysis framework.
Footnotes
Conflicts of interest: The authors certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript.
Funding: The publication of this article was supported by the “Ricerca Corrente” funding from the Italian Ministry of Health.
References
- 1. Estraneo A, Loreto V, Masotta O, Pascarella A, Trojano L. Do Medical Complications Impact Long-Term Outcomes in Prolonged Disorders of Consciousness? Arch Phys Med Rehabil 2018;99:2523–2531.e3. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=29807003&dopt=Abstract DOI: 10.1016/j.apmr.2018.04.024
- 2. Mahoney FI, Barthel DW. Functional evaluation: the Barthel Index. Md State Med J 1965;14:61–5. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=14258950&dopt=Abstract
- 3. Houlden H, Edwards M, McNeil J, Greenwood R. Use of the Barthel Index and the Functional Independence Measure during early inpatient rehabilitation after single incident brain injury. Clin Rehabil 2006;20:153–9. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=16541936&dopt=Abstract DOI: 10.1191/0269215506cr917oa
- 4. Laratta S, Lucca LF, Tonin P, Cerasa A. Factors Influencing Burden in Spouse-Caregivers of Patients with Chronic-Acquired Brain Injury. BioMed Res Int 2020;2020:6240298. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=32685509&dopt=Abstract DOI: 10.1155/2020/6240298
- 5. Hsueh IP, Lin JH, Jeng JS, Hsieh CL. Comparison of the psychometric characteristics of the functional independence measure, 5 item Barthel index, and 10 item Barthel index in patients with stroke. J Neurol Neurosurg Psychiatry 2002;73:188–90. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12122181&dopt=Abstract DOI: 10.1136/jnnp.73.2.188
- 6. Tofani M, Massai P, Fabbrini G, Berardi A, Pelosin E, Conte A, et al. Psychometric properties of the Italian version of the Barthel Index in patients with Parkinson’s disease: a reliability and validity study. Funct Neurol 2019;34:145–50. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=32453995&dopt=Abstract
- 7. Rollnik JD, Bertram M, Bucka C, Hartwich M, Jöbges M, Ketter G, et al. Criterion validity and sensitivity to change of the Early Rehabilitation Index (ERI): results from a German multi-center study. BMC Res Notes 2016;9:356. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=27440117&dopt=Abstract DOI: 10.1186/s13104-016-2154-8
- 8. van Bennekom CA, Jelles F, Lankhorst GJ, Bouter LM. Responsiveness of the rehabilitation activities profile and the Barthel index. J Clin Epidemiol 1996;49:39–44. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8598509&dopt=Abstract DOI: 10.1016/0895-4356(95)00559-5
- 9. Shah S, Vanclay F, Cooper B. Improving the sensitivity of the Barthel Index for stroke rehabilitation. J Clin Epidemiol 1989;42:703–9. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=2760661&dopt=Abstract DOI: 10.1016/0895-4356(89)90065-6
- 10. McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res 1995;4:293–307. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7550178&dopt=Abstract DOI: 10.1007/BF01593882
- 11. Hobart J, Cano S. Improving the evaluation of therapeutic interventions in multiple sclerosis: the role of new psychometric methods. Health Technol Assess 2009;13:iii, ix–x, 1–177. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=19216837&dopt=Abstract DOI: 10.3310/hta13120
- 12. Schönle PW. [The Early Rehabilitation Barthel Index—an early rehabilitation-oriented extension of the Barthel Index]. Rehabilitation (Stuttg) 1995;34:69–73. [German]. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7624593&dopt=Abstract
- 13. Lucca LF, De Tanti A, Cava F, Romoli A, Formisano R, Scarponi F, et al. Predicting Outcome of Acquired Brain Injury by the Evolution of Paroxysmal Sympathetic Hyperactivity Signs. J Neurotrauma 2021;38:1988–94. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=33371784&dopt=Abstract DOI: 10.1089/neu.2020.7302
- 14. Estraneo A, Pascarella A, Masotta O, Bartolo M, Pistoia F, Perin C, et al. Multi-center observational study on occurrence and related clinical factors of neurogenic heterotopic ossification in patients with disorders of consciousness. Brain Inj 2021;35:530–5. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=33734911&dopt=Abstract DOI: 10.1080/02699052.2021.1893384
- 15. Rollnik JD. The Early Rehabilitation Barthel Index (ERBI). Rehabilitation (Stuttg) 2011;50:408–11. PubMed: https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=21626475&dopt=Abstract DOI: 10.1055/s-0031-1273728
- 16. Reis NF, Biscaro RR, Figueiredo FC, Lunardelli EC, Silva RM. Early Rehabilitation Index: translation and cross-cultural adaptation to Brazilian Portuguese; and Early Rehabilitation Barthel Index: validation for use in the intensive care unit. Rev Bras Ter Intensiva 2021;33:353–61. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=35107546&dopt=Abstract
- 17. Rollnik JD, Janosch U. Current trends in the length of stay in neurological early rehabilitation. Dtsch Arztebl Int 2010;107:286–92. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=20467554&dopt=Abstract doi:10.3238/arztebl.2010.0286
- 18. Bond TG, Fox CM. Applying the Rasch model: Fundamental measurement in the human sciences. Psychology Press; 2013.
- 19. Lucca LF, Lofaro D, Leto E, Ursino M, Rogano S, Pileggi A, et al. The Impact of Medical Complications in Predicting the Rehabilitation Outcome of Patients With Disorders of Consciousness After Severe Traumatic Brain Injury. Front Hum Neurosci 2020;14:570544. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=33192402&dopt=Abstract doi:10.3389/fnhum.2020.570544
- 20. Giacino JT, Ashwal S, Childs N, Cranford R, Jennett B, Katz DI, et al. The minimally conscious state: definition and diagnostic criteria. Neurology 2002;58:349–53. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11839831&dopt=Abstract doi:10.1212/WNL.58.3.349
- 21. Mallinson T. Rasch analysis of repeated measures. Rasch Meas Trans 2011;25.
- 22. Bland JM, Altman DG. Cronbach’s alpha. BMJ 1997;314:572. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=9055718&dopt=Abstract doi:10.1136/bmj.314.7080.572
- 23. Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007;60:34–42. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=17161752&dopt=Abstract doi:10.1016/j.jclinepi.2006.03.012
- 24. Basagni B, Piscitelli D, De Tanti A, Pellicciari L, Algeri L, Caselli S, et al. The unidimensionality of the five Brain Injury Rehabilitation Trust Personality Questionnaires (BIRT-PQs) may be improved: preliminary evidence from classical psychometrics. Brain Inj 2020;34:673–84. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=32126842&dopt=Abstract doi:10.1080/02699052.2020.1723700
- 25. Mokken RJ. A theory and procedure of scale analysis. De Gruyter Mouton; 2011.
- 26. La Porta F, Franceschini M, Caselli S, Cavallini P, Susassi S, Tennant A. Unified Balance Scale: an activity-based, bed to community, and aetiology-independent measure of balance calibrated with Rasch analysis. J Rehabil Med 2011;43:435–44. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=21394420&dopt=Abstract doi:10.2340/16501977-0797
- 27. Byrne BM. Structural equation modeling with Mplus: Basic concepts, applications, and programming. Routledge; 2013.
- 28. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Modeling 1999;6:1–55. doi:10.1080/10705519909540118
- 29. Lundgren-Nilsson Å, Jonsdottir IH, Ahlborg G Jr, Tennant A. Construct validity of the Psychological General Well Being Index (PGWBI) in a sample of patients undergoing treatment for stress-related exhaustion: a Rasch analysis. Health Qual Life Outcomes 2013;11:2. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=23295151&dopt=Abstract doi:10.1186/1477-7525-11-2
- 30. Revicki DA, Chen WH, Tucker C. Developing item banks for patient-reported health outcomes. In: Handbook of Item Response Theory Modeling. Routledge; 2014. p. 352–81.
- 31. Maritz R, Tennant A, Fellinghauer C, Stucki G, Prodinger B. The Functional Independence Measure 18-item version can be reported as a unidimensional interval-scaled metric: internal construct validity revisited. J Rehabil Med 2019;51:193–200. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=30843597&dopt=Abstract doi:10.2340/16501977-2525
- 32. La Porta F, Caselli S, Susassi S, Cavallini P, Tennant A, Franceschini M. Is the Berg Balance Scale an internally valid and reliable measure of balance across different etiologies in neurorehabilitation? A revisited Rasch analysis study. Arch Phys Med Rehabil 2012;93:1209–16. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=22521926&dopt=Abstract doi:10.1016/j.apmr.2012.02.020
- 33. La Porta F, Giordano A, Caselli S, Foti C, Franchignoni F. Is the Berg Balance Scale an effective tool for the measurement of early postural control impairments in patients with Parkinson’s disease? Evidence from Rasch analysis. Eur J Phys Rehabil Med 2015;51:705–16. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=26334361&dopt=Abstract
- 34. Pellicciari L, Piscitelli D, Caselli S, La Porta F. A Rasch analysis of the Conley Scale in patients admitted to a general hospital. Disabil Rehabil 2019;41:2807–16. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=29912585&dopt=Abstract doi:10.1080/09638288.2018.1478000
- 35. Smith EV Jr. Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. J Appl Meas 2002;3:205–31. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12011501&dopt=Abstract
- 36. Tennant A, Conaghan PG. The Rasch measurement model in rheumatology: what is it and why use it? When should it be applied, and what should one look for in a Rasch paper? Arthritis Rheum 2007;57:1358–62. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=18050173&dopt=Abstract doi:10.1002/art.23108
- 37. Kreiner S, Christensen KB. Person parameter estimation and measurement in Rasch models. In: Christensen KB, Kreiner S, Mesbah M, editors. Rasch models in health. Wiley & Sons; 2012. p. 63–78.
- 38. Pellicciari L, Piscitelli D, Basagni B, De Tanti A, Algeri L, Caselli S, et al. ‘Less is more’: validation with Rasch analysis of five short-forms for the Brain Injury Rehabilitation Trust Personality Questionnaires (BIRT-PQs). Brain Inj 2020;34:1741–55. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=33180650&dopt=Abstract doi:10.1080/02699052.2020.1836402
- 39. Christensen KB, Makransky G, Horton M. Critical Values for Yen’s Q3: Identification of Local Dependence in the Rasch Model Using Residual Correlations. Appl Psychol Meas 2017;41:178–94. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=29881087&dopt=Abstract doi:10.1177/0146621616677520
- 40. Tennant A, Penta M, Tesio L, Grimby G, Thonnard JL, Slade A, et al. Assessing and adjusting for cross-cultural validity of impairment and activity limitation scales through differential item functioning within the framework of the Rasch model: the PRO-ESOR project. Med Care 2004;42(Suppl):I37–48. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=14707754&dopt=Abstract doi:10.1097/01.mlr.0000103529.63132.77
- 41. Tennant A, Pallant J. DIF matters: A practical approach to test if differential item functioning makes a difference. Rasch Meas Trans 2007;20:1082–4.
- 42. Fisher WP. Rating scale instrument quality criteria. Rasch Meas Trans 2007;21:1095.
- 43. Panella L, La Porta F, Caselli S, Marchisio S, Tennant A. Predicting the need for institutional care shortly after admission to rehabilitation: Rasch analysis and predictive validity of the BRASS Index. Eur J Phys Rehabil Med 2012;48:443–54. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=22510676&dopt=Abstract
- 44. Reeve BB, Hays RD, Bjorner JB, Cook KF, Crane PK, Teresi JA, et al.; PROMIS Cooperative Group. Psychometric evaluation and calibration of health-related quality of life item banks: plans for the Patient-Reported Outcomes Measurement Information System (PROMIS). Med Care 2007;45(Suppl 1):S22–31. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=17443115&dopt=Abstract doi:10.1097/01.mlr.0000250483.85507.04
- 45. Linacre JM. Optimizing rating scale category effectiveness. J Appl Meas 2002;3:85–106. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11997586&dopt=Abstract
- 46. Lundgren Nilsson Å, Tennant A. Past and present issues in Rasch analysis: the functional independence measure (FIM™) revisited. J Rehabil Med 2011;43:884–91. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=21947180&dopt=Abstract doi:10.2340/16501977-0871
- 47. Bland JM, Altman DG. Multiple significance tests: the Bonferroni method. BMJ 1995;310:170. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7833759&dopt=Abstract doi:10.1136/bmj.310.6973.170
- 48. Tabachnick B, Fidell L. Using Multivariate Statistics. 4th ed. Boston, MA: Allyn and Bacon; 2001.
- 49. Linacre J. Sample size and item calibration stability. Rasch Meas Trans 1994;7:328.
- 50. Zampolini M, Zaccaria B, Tolli V, Frustaci A, Franceschini M; GISCAR Group. Rehabilitation of traumatic brain injury in Italy: a multi-centred study. Brain Inj 2012;26:27–35. https://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=22149442&dopt=Abstract doi:10.3109/02699052.2011.635358
- 51. Roskam E, Jansen P. Rasch model from consistent stochastic ordering. Rasch Meas Trans 1992;6:232.
Associated Data
Supplementary Materials
Supplementary Table I
Assessment of the measurement quality of an instrument within the Rasch analysis framework.
Data Availability Statement
The raw data associated with the article are publicly available for download from www.zenodo.org (under the Creative Commons Attribution 4.0 International license) at the following link: https://doi.org/10.5281/zenodo.8171871.

