The Journal of Spinal Cord Medicine. 2007;30(Suppl 1):S146–S149.

Rater Agreement on the ISCSCI Motor and Sensory Scores Obtained Before and After Formal Training in Testing Technique

Mary Jane Mulcahey 1, John Gaughan 2, Randal R Betz 1, Lawrence C Vogel 3
PMCID: PMC2031990  PMID: 17874700

Abstract

Background/Objective:

The purpose of this study is to report the results of rater agreement for the International Standards for Neurological Classification of Spinal Cord Injury (ISCSCI) motor and sensory scores before and after training in the testing technique.

Methods:

Six raters performed sequential motor and sensory examinations on 5 adolescents with SCI according to the ISCSCI manual. After completion of the first examinations, all raters were provided with a half-day formal training session on testing techniques, after which the raters repeated the examinations. Intraclass correlation coefficients (ICCs) and 95% confidence intervals (CIs) were calculated, and ICCs were interpreted as follows: >0.90 = high agreement; 0.75 to 0.90 = moderate agreement; <0.75 = poor agreement.

Results:

After training, there was improvement in rater agreement of summed motor scores (MS) from ICC = 0.809 to 0.862 and discrimination scores from ICC = 0.786 to 0.892. There was moderate rater agreement for light touch scores (LTS) before and after training. After training, the 95% CIs improved for all scores except LTS, but for all ICCs, the lower 95% CI value remained less than 0.75.

Conclusions:

Training improved rater agreement on MS and discrimination, but 95% CIs remained unacceptably wide. The positive effect of training in motor and sensory testing techniques is supported by the study data. Unlike previous studies that have suggested the ISCSCI has acceptable reliability for clinical trials, the results of this study do not fully support the use of the ISCSCI for clinical trials without better standardization to establish a lower 95% CI value of at least 0.75.

Keywords: Spinal cord injuries, Tetraplegia, Paraplegia, Muscle testing, Sensory testing, Reliability

INTRODUCTION

The American Spinal Injury Association (ASIA) published a neurologic assessment and classification of spinal cord injury (SCI) in 1982 that involved bilateral strength testing of 10 key muscles (5 upper extremity and 5 lower extremity) and bilateral sensory testing (sharp/dull discrimination and light touch) of 28 dermatomes (1). Revisions to the original standards were published (2,3) and subsequently adopted by the International Medical Society of Paraplegia (IMSOP). Additional revisions were made based on psychometric studies (4–6), and most recently, Marino et al (7) provided clarity on terminology.

In their current form (8), the International Standards for Neurological Classification in Spinal Cord Injury (ISCSCI) provide standardization for motor and sensory testing, the results of which are used to classify the neurologic consequence of the injury. The ISCSCI sensory examination requires testing for discrimination and light touch at 56 dermatomes (28 on each side). A score of 0 to 2 is assigned to each dermatome for each sensation (discrimination and light touch), such that 0 represents cannot feel; 1 represents can feel/discriminate but is not “normal” (reference to cheek); and 2 represents can feel/discriminate and feels exactly as cheek. Total discrimination and light touch scores are calculated by summing the left and right scores. The ISCSCI motor examination requires manual muscle strength testing of 10 upper limb muscles (5 muscles per side) and 10 lower extremity muscles (5 muscles per side). A score of 0 to 5 is assigned to each muscle based on the traditional muscle strength testing scale, where 0 represents no movement, 1 represents a trace of movement, 2 represents partial movement, gravity eliminated, 3 represents full movement against gravity, 4 represents movement against gravity, some resistance, and 5 represents normal strength. A total motor score is calculated by summing the right and left scores.
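The summed scores described above can be illustrated with a short Python sketch. It is not part of the ISCSCI manual: the function names and the example scores are hypothetical, and the sketch assumes every dermatome and muscle received a numeric score (it does not handle items recorded as not testable). With 56 dermatomes scored 0 to 2 and 20 muscles scored 0 to 5, the implied maxima are 112 for each sensory modality and 100 for the motor score.

```python
# Illustrative sketch only; names and example data are hypothetical.

def total_sensory_score(left, right):
    """Sum one sensory modality (discrimination or light touch) over both sides.

    left, right: lists of 28 dermatome scores, each 0, 1, or 2.
    """
    for side in (left, right):
        if len(side) != 28:
            raise ValueError("expected 28 dermatomes per side")
        if any(s not in (0, 1, 2) for s in side):
            raise ValueError("sensory scores must be 0, 1, or 2")
    return sum(left) + sum(right)


def total_motor_score(left, right):
    """Sum motor scores over both sides.

    left, right: lists of 10 key-muscle scores (5 upper, 5 lower), each 0 to 5.
    """
    for side in (left, right):
        if len(side) != 10:
            raise ValueError("expected 10 key muscles per side")
        if any(s not in range(6) for s in side):
            raise ValueError("motor scores must be integers from 0 to 5")
    return sum(left) + sum(right)


# Hypothetical volunteer: preserved upper limb strength, no lower limb movement.
left_motor = [5, 5, 5, 4, 3, 0, 0, 0, 0, 0]
right_motor = [5, 5, 5, 5, 4, 0, 0, 0, 0, 0]
print(total_motor_score(left_motor, right_motor))  # 46
```

Because each total collapses many item scores into one number, two raters can reach the same total while disagreeing at individual dermatomes or muscles, which is why agreement on summed scores does not establish agreement at the item level.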

As with all measurement systems, questions of use, reliability, and validity are important. The agreement between tests and among raters of the ISCSCI motor and sensory examination has been addressed by various investigators. Priebe and Waring (6) reported on interobserver reliability of the 1989 standards, and Cohen et al (5), Cohen and Bartko (9), and Jonsson et al (10) addressed the psychometrics of the 1992 revision. Reliability of the most recent revision of the ISCSCI has been studied by Marino et al (11) and Mulcahey et al (12). The studies conducted by Cohen and Bartko (9) and Marino et al (11) are most comparable to the one described in this work. The work of Cohen and Bartko (9) coincided with the Fidia Pharmaceutical Corporation's clinical trials and showed high agreement among raters on motor (intraclass correlation coefficient [ICC] = 0.98), discrimination (ICC = 0.96), and light touch (ICC = 0.96) scores. Likewise, the work of Marino et al (11), which coincided with the Proneuron Phase II Clinical Trial, showed high agreement for motor (ICC = 0.97), discrimination (ICC = 0.88), and light touch (ICC = 0.96) scores. Neither Cohen and Bartko nor Marino et al reported confidence intervals (CIs) for the ICC, thereby limiting the interpretation of the ICC. These investigators and others (10,12) have explicitly or implicitly suggested the importance of training and/or experience in the motor and sensory testing techniques. Recently, ASIA conducted a survey of members' opinions about training in the ISCSCI, which reflects continued interest in formal training programs as a mechanism to improve the rigor of the neurologic examination and classification. The purpose of this paper is to report the results of among-rater agreement of motor and sensory scores before and after formal training in testing techniques of the ISCSCI sensory and motor examinations. The catalyst for the formal training was a larger study of the ISCSCI in youth.

METHODS

Sample

As part of a larger study on the use of the ISCSCI in youth and children, 6 raters performed sequential motor and sensory examinations on 5 adolescents with SCI according to the ISCSCI (7,8) before and after formal training in the testing techniques. The raters were either research assistants (N = 4) or principal investigators (N = 2) for a study on the use of the ISCSCI and were participating in a training session on testing technique based on the current ISCSCI. Three raters had between 5 and 20 years of experience with the ISCSCI, and the 3 others had less than 5 years. Before training, each rater, with his or her assigned scribe, performed the motor and sensory examinations on 5 adolescents with SCI. After completion of the examinations, all raters participated in a formal training session on the ISCSCI testing methodology for the motor, sensory, and anal examinations. Formal training included a series of lectures on testing techniques, viewing of the testing technique video published by the ASIA as part of the ISCSCI training packet, and hands-on practice with immediate feedback from the instructor. The training session was provided by an expert physical therapist who has conducted formal competency programs on the ISCSCI examination and classification techniques for international clinical trials. The day immediately after training, the raters, with the same scribes, repeated the motor and sensory examinations on the same 5 volunteers.

The volunteers represent a convenience sample and were between 15 and 19 years of age and at least 1 year post-SCI; all were participating in “brush-up” rehabilitation at the time of the training session. Table 1 summarizes the pertinent characteristics of the study volunteers.

Table 1.

Characteristics of Volunteers With SCI

Data Analysis

ICCs and 95% CIs were calculated to determine agreement among raters for summed motor, discrimination, and light touch scores before and after formal training. ICCs were used for analysis to allow for comparison with previous studies (9,11) and because of their underlying assumption that differences in scores are caused by multiple factors. Although Shrout (13) suggested a scale for interpretation of ICC that defines the lower value for moderate agreement as 0.61, interpretation for this study's results used the scale proposed by Portney and Watkins (14) that defines 0.75 as the lower value for moderate agreement and 0.90 as the lower value for high agreement. A 95% CI for the ICC indicates the likely range of values containing the true population value. A wide 95% CI suggests poor precision, and a narrow 95% CI suggests good precision.
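For readers unfamiliar with the statistic, the following Python sketch shows one way an among-rater ICC can be computed from a subjects-by-raters table and read against the Portney and Watkins scale. The paper does not state which ICC form was used, so the two-way random-effects, single-rater form (often written ICC(2,1)) is an assumption here, as are the function names, the example scores, and the treatment of a value of exactly 0.90 as moderate. The sketch gives a point estimate only; it does not compute the 95% CI.

```python
# Illustrative sketch only: assumes the ICC(2,1) form, which the paper does not specify.

def icc_2_1(scores):
    """Two-way random-effects, single-rater ICC.

    scores: list of rows, one row per subject, one column per rater.
    Undefined (division by zero) when every score in the table is identical.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with at least 2 subjects and 2 raters")

    grand = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]
    rater_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_error = ss_total - ss_subjects - ss_raters

    ms_subjects = ss_subjects / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_subjects - ms_error) / (
        ms_subjects + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


def interpret_icc(icc):
    """Portney and Watkins scale as used in this study (boundary handling assumed)."""
    if icc > 0.90:
        return "high"
    if icc >= 0.75:
        return "moderate"
    return "poor"


# Hypothetical summed motor scores: 5 subjects (rows) by 6 raters (columns).
example = [
    [50, 52, 49, 50, 51, 50],
    [24, 26, 22, 25, 24, 27],
    [76, 74, 78, 75, 77, 76],
    [38, 41, 36, 40, 39, 37],
    [62, 60, 65, 61, 63, 62],
]
icc = icc_2_1(example)
print(f"ICC(2,1) = {icc:.3f} ({interpret_icc(icc)} agreement)")
```

With only 5 subjects, the between-subject mean square rests on 4 degrees of freedom, which is one reason a CI around such an estimate can be wide even when the point estimate looks acceptable.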

RESULTS

As shown in Table 2, before training, agreement among raters was moderate for summed motor (ICC = 0.809), discrimination (ICC = 0.786), and light touch (ICC = 0.824) scores. After training, agreement among the raters improved for summed motor (ICC = 0.862) and discrimination (ICC = 0.892) scores and remained moderate for light touch scores (ICC = 0.767). Although training narrowed the 95% CIs for summed motor and discrimination (pin-prick) scores, they remained unacceptably wide (Table 2).

Table 2.

Among-Rater Agreement

Although the numbers are small (N = 3), training seemed to influence the agreement of sensory and motor scores obtained on volunteers with incomplete injuries. As shown in Table 3, after training, agreement for motor and sensory scores improved, and 95% CIs were significantly narrowed. Separate ICCs and CIs were not calculated for scores from those with complete injuries because there were only 2 volunteers.

Table 3.

Among-Rater Agreement for Incomplete SCI (N = 3)

DISCUSSION

The purpose of this study was to evaluate the among-rater agreement of the ISCSCI motor and sensory scores before and after training as a mechanism to further explore the benefits of formal training in testing technique. This study reports both ICCs and 95% CIs as indicators of agreement. With the exception of one (12), previous reliability studies using ICCs have omitted reporting the 95% CIs, resulting in insufficient information for adequate interpretation of the ICCs.

Although agreement on motor and pin-prick scores improved after formal training, the agreement for summed motor and sensory scores was moderate both before and after formal training, as evidenced by ICCs between 0.75 and 0.90. Formal training had an effect on precision, as evidenced by the narrowing of 95% CIs, particularly for motor and discrimination scores (Table 2) and for volunteers with incomplete injuries (Table 3).

Before training, there was higher agreement for light touch than discrimination. After training, there was an improvement in agreement in the test of discrimination but a decline in agreement for light touch. This finding is consistent with the finding by Jonsson et al (10), and similarly, raters in this study had a preconceived assumption about the difficulty in conducting the pin-prick (discrimination) examination. As a direct response to this assumption, significant training was directed toward the pin-prick technique. Attention to training in the pin-prick technique may have inadvertently minimized the training effect on the light touch examination, thereby influencing the agreement results. Nevertheless, the results of this study imply that standardized training of testing technique for both sensory modalities should be established and implemented.

Similar to other reports (3,10), the agreement of motor scores was better than for the sensory scores and also showed more improvement than sensory scores after formal training. However, despite the high ICC before and after training, the 95% CIs were wide before training, with a lower CI value of 0.518. After training, the lower CI value improved only to 0.634 and remained unacceptably low. Although training may be one contributing factor for the lower CI values, the poor precision of the motor examination may also be attributed to the test methodology when applied to SCI, particularly incomplete injuries. Other possible reasons for wide CIs include the small sample size used for the study.

The degrees of agreement for the motor and sensory examination in this study were moderate and consistent with those previously reported for the ISCSCI (8,11). However, with the exception of the study conducted by Mulcahey et al (12), previous studies have not reported the 95% CIs. In this study, 95% CIs for the motor and both of the sensory scores were unacceptably wide both before and after training. As such, unlike previous studies that have suggested the ISCSCI has acceptable reliability for clinical trials (11), the results of this study do not fully support the use of the ISCSCI for clinical trials without better standardization to establish a lower 95% CI value of at least 0.75. Training in testing technique (and perhaps repeated training as opposed to a single session) may have a major impact on improving the lower CI value, but other test-related factors, such as the amount of pressure applied to the pin, the amount of time the rater should wait for a response to sensory input, and the amount of rectal pressure, should also be explored. Also, clearer guidelines in the testing manual may improve rater agreement.
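The criterion proposed here, a lower 95% CI value of at least 0.75, can be expressed as a simple check. The Python sketch below applies it to the summed motor score values reported in this paper (ICCs of 0.809 and 0.862, with lower CI values of 0.518 and 0.634); the function and variable names are hypothetical and used only for illustration.

```python
# Illustrative sketch only; names are hypothetical. Values are those reported
# in this paper for summed motor scores.

LOWER_BOUND_TARGET = 0.75  # proposed minimum for the lower 95% CI value


def meets_precision_target(ci_lower, target=LOWER_BOUND_TARGET):
    """Return True when the lower 95% CI value reaches the target."""
    return ci_lower >= target


motor = {
    "before training": {"icc": 0.809, "ci_lower": 0.518},
    "after training": {"icc": 0.862, "ci_lower": 0.634},
}

for phase, result in motor.items():
    ok = meets_precision_target(result["ci_lower"])
    verdict = "meets" if ok else "falls short of"
    print(f"{phase}: ICC = {result['icc']:.3f}, lower CI = {result['ci_lower']:.3f}, "
          f"{verdict} the {LOWER_BOUND_TARGET:.2f} target")
```

Judging by the lower bound rather than the point estimate is the stricter rule: both motor ICCs fall in the moderate range, yet neither lower CI value reaches 0.75.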

Importantly, although ICC values in this study are similar to the previously reported values, comparison between this study's results and the results of previous studies cannot be fully appreciated because of the lack of 95% CI reporting. Further research is warranted to better explain the wide 95% CI findings in this and previous studies (12). Future psychometric studies on rater agreement of the ISCSCI should report the ICCs and 95% CIs so that accurate interpretation of the study results can be made, and comparisons among studies can be explored.

There are limitations to this study that require consideration when interpreting results. The sample, both raters and volunteers, is small and likely contributes to the wide CIs. Multiple, back-to-back neurologic examinations, as performed in this study, may have contributed to fatigue and/or recall. Although ICCs are designed to address this variability, the methodology, while common for this type of study, remains a limitation. Last, this study only evaluated rater agreement of summed scores; it did not evaluate agreement at individual myotomes or dermatomes, and it did not evaluate agreement on classification of SCI.

CONCLUSION

Results of this study show that there is moderate rater agreement for the ISCSCI motor and sensory scores and that formal training may have a positive effect on agreement for the motor and pin-prick scores. The precision of the motor and sensory scores is poor and supports the effort in formalizing and standardizing a training program in testing techniques. The Standards Committee should report 95% CIs for ICC values in future test manuals and should encourage published studies to do likewise. Until a coordinated and standardized training effort is endorsed, those using the ISCSCI for research may want to conduct their own training using the materials available from the ASIA and establish agreement among their research raters.

Acknowledgments

The authors thank Mary Schmidt, MS, PT, for conducting the formal training.

Footnotes

This study was funded by the Shriners Hospitals for Children Research Advisory Board Grant 8956.

REFERENCES

  1. American Spinal Injury Association. Standards for Classification of Spinal Injured Patients. Chicago, IL: American Spinal Injury Association; 1982.
  2. American Spinal Injury Association. Standards for Classification of Spinal Injured Patients. Chicago, IL: American Spinal Injury Association; 1992.
  3. Ditunno JF, Donovan WH, Maynard FM. Reference Manual for the International Standards for Neurological and Functional Classification of Spinal Cord Injury. Chicago, IL: American Spinal Injury Association; 1994.
  4. Donovan WH, Wilkerson MA, Rossi D, Mechoulam F, Frankowski RF. A test of the ASIA guidelines for classification of spinal cord injuries. J Neurol Rehabil. 1990;4:39–53.
  5. Cohen ME, Ditunno JF, Donovan WH, Maynard FM. A test of the 1992 international standards for neurological and functional classification of spinal cord injury. Spinal Cord. 1998;36:554–560. doi: 10.1038/sj.sc.3100602.
  6. Priebe M, Waring WP. The interobserver reliability of the revised American Spinal Injury Association standards for neurological classification of spinal cord injury patients. Am J Phys Med Rehabil. 1991;70:268–270. doi: 10.1097/00002060-199110000-00007.
  7. Marino RJ, Barros T, Biering-Sorensen F, et al. International standards for neurological classification of spinal cord injury. J Spinal Cord Med. 2003;26(suppl 1):S50–S56. doi: 10.1080/10790268.2003.11754575.
  8. American Spinal Injury Association. Reference Manual for the International Standards for Neurological Classification of Spinal Cord Injury. Chicago, IL: American Spinal Injury Association; 2003.
  9. Cohen ME, Bartko JJ. Reliability of the ISCSCI-92 for neurological classification of spinal cord injury. In: Ditunno JF, Donovan WH, Maynard FM, editors. Reference Manual for the International Standards for Neurological and Functional Classification of Spinal Cord Injury. Chicago, IL: American Spinal Injury Association; 1994.
  10. Jonsson M, Tollback A, Gonzales H, Borg J. Inter-rater reliability of the 1992 international standards for neurological and functional classification of incomplete spinal cord injury. Spinal Cord. 2000;38:675–679. doi: 10.1038/sj.sc.3101067.
  11. Marino RJ, Jones L, Kirshblum S, Tal J. Reliability of the ASIA motor and sensory examination. Abstract P43. J Spinal Cord Med. 2004;27:194. doi: 10.1080/10790268.2008.11760707.
  12. Mulcahey MJ, Gaughan J, Betz RR, Johansen K. The international standards for neurological classification of spinal cord injury: reliability of data when applied to children and youths. Spinal Cord. 2003. Oct 3 (epub ahead of print); doi: 10.1038/sj.sc.3101987.
  13. Shrout PE. Measurement reliability and agreement in psychiatry. Stat Meth Med Res. 1998;7:301–317. doi: 10.1177/096228029800700306.
  14. Portney LG, Watkins MP. Foundations of Clinical Research. Applications to Practice. 2nd ed. Upper Saddle River, NJ: Prentice Hall Health; 2000.

