Abstract
Background/Objective:
Pain is a common secondary complication of spinal cord injury (SCI). However, the literature offers varying estimates of the numbers of persons with SCI who develop pain. The variability in these numbers is caused in part by differences in the classification of pain; there is currently no commonly accepted classification system for pain affecting persons after SCI. This study investigated the interrater reliability of the Bryce/Ragnarsson SCI pain taxonomy (BR-SCI-PT). The hypothesis was that, when used by physicians with minimal training in the BR-SCI-PT, it would have high interrater reliability for the categorization of reported pains.
Methods:
One hundred thirty-five vignettes, each of which described a person with SCI with one or more different etiologic subtypes of pain, were evaluated by 5 groups of up to 10 physicians with SCI subspecialization (39 respondents total). Physician classifications were compared with those made by the investigators.
Results:
Of 179 pain descriptions, 83% were categorized correctly to one of the 15 BR-SCI-PT pain types; 93% were categorized correctly with respect to level (above/at/below neurological level of injury), whereas 90% were categorized correctly as being either nociceptive or neuropathic. Subjects expressed a generally high confidence in the correctness of their classifications.
Conclusions:
Substantial interrater agreement was achieved in determining subtypes of pain within the BR-SCI-PT. The agreement was improved for categorizing within less restrictive categories (ie, with respect to the neurological level of injury and whether the pain was nociceptive or neuropathic).
Keywords: Spinal cord injuries; Reliability; Reproducibility of results; Pain classification; Vignette; Complex Regional Pain Syndrome; Pain, nociceptive, neuropathic, compressive, central, radicular
INTRODUCTION
While there is consensus in the literature that pain after spinal cord injury (SCI) is a widespread and often serious problem, there is disagreement as to specific features of the pain. Reports offer widely varying estimates of the percentage of persons with SCI who develop chronic pain, from 63% to 81% for “pain” and 18% to 41% for “severe pain” (1). With respect to a more limited category, shoulder pain, the estimates range from 30% (2) to 51% (3). In this report, the term “pain after SCI” refers to all types of pain commonly found after SCI, including both pain resulting from the damage to the spinal cord itself and pain resulting from the lifestyle imposed by the neurological damage (4).
Successful treatment of pain after SCI depends on understanding its nature and causes, and a valid and reliable taxonomy is required for progress in this area. Many clinicians and researchers have attempted to develop a way of categorizing pain after SCI, often without reference to previous efforts. This has resulted in a large number of different classification schemes of pain after SCI (5–20). Hicken et al (21) counted no fewer than 29 different categorization schemes. As a result, the literature is replete with (often small) studies reporting on the efficacy of pharmacotherapy and with reports on the epidemiology of pain after SCI; the results are difficult to compare, given that the types of pain distinguished are different in nearly every study.
In 1919, soon after World War I, Holmes (5) described in detail 2 types of pain after cervical SCI caused by gunshot wounds, which he attributed to a central spinal cord origin. One pain type was described as being of a burning, shooting, or stabbing quality; usually of constant duration, although it often increased in intensity with peripheral stimulation. It was found in a diffuse bilateral distribution in the arms, shoulders, neck, and upper back. The other type was of a similar nature, yet found in a distribution below the level of the injury, in distant regions of the body. In 1938, Riddoch (6) classified these 2 types of pain, alluding to the examples described by Holmes, as “local, or segmental, pain” and “remote pain,” respectively. After World War II, when many persons with traumatic SCI began to survive for many years after their acute injuries, the first of 2 main comprehensive approaches to the classification of pain after SCI was reported. This first approach was to classify pain, which was experienced at or below the level of neurologic injury, by its putatively attributed source (ie, the nerve roots, muscle spasms, vasospasms, visceral organs, and the spinal cord) (7,8,22,23). Thirty years later, Burke (12), in his review of pain types, described another specific neuropathic pain type (ie, “end-zone pain,” or pain occurring at the level of the neurologic injury in a dermatomal distribution, but not characteristic of radicular pain).
At about the same time, Michaelis (9) articulated an approach to classification of pain after SCI based on the region where the pain was experienced in relation to the neurologic level of injury—the second main comprehensive approach. This regional approach acknowledged more fully nociceptive sources of pain in persons with SCI. Nociceptive pain occurs when intact peripheral nociceptors in partially or fully innervated areas of the body are activated by local damage to nonneural tissues, such as bone, ligaments, muscle, skin, or other organs. It can be contrasted to neuropathic pain, which occurs as a result of damage to neural tissue either in the peripheral or central nervous system. Later classification schemes added nociceptive pains occurring above and at the neurologic level to the traditional visceral and below/at the neurologic level of injury neuropathic pains, but continued to list the specific types solely by the attributed sources of these pains, albeit more widely accepted ones (14,16–18).
In 1997, Siddall and Ragnarsson simultaneously but independently attempted to merge the 2 main approaches of source attribution and regional localization (ie, above level, at level, and below level) into a single classification system. Both these schemes have since been revised by their authors (1,19,24–28). The Bryce/Ragnarsson SCI pain taxonomy (BR-SCI-PT), one evolution of these schemes, is based on consideration of all existing classification systems of pain after SCI, as well as recent clinical and research insights. It includes all the generally accepted pain types distinguished in previous classification systems, but arranged along 2 axes that bring some order to what heretofore was an area of much confusion and contradiction. A 3-tiered decision tree underlies the BR-SCI-PT schema (Figure 1). In tier 1, the pain is regionally localized relative to the neurologic level of SCI (ie, either above level, at level, or below level); in tier 2, the pain is identified as either nociceptive or neuropathic; and in tier 3, the pain is stratified into subtypes of the regionally localized nociceptive or neuropathic pain types (eg, bicipital tendonitis, muscle spasm, or appendicitis). The neurologic level of SCI is defined as the most caudal segment of the spinal cord with normal sensory and motor function bilaterally (23). In the BR-SCI-PT, above-level pain is localized rostral to the 2 dermatomes above the neurologic level; at-level pain is localized to within the 2 dermatomes above or below the neurologic level; and below-level pain is localized caudal to the 2 dermatomes below the neurologic level. The highest neurologic level should be used if the neurologic level is not the same on the right and left sides of the body (1,27,28).
Figure 1. Bryce/Ragnarsson SCI pain taxonomy.
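The tier 1 rule described above (above level, at level, or below level relative to the neurologic level) can be illustrated with a minimal sketch. The dermatome list, the function names `tier1_region` and `classify`, and the restriction to a single-dermatome pain are assumptions made for this illustration; they are not part of the BR-SCI-PT itself. Tiers 2 and 3 remain clinical judgments and are simply passed through.

```python
# Hypothetical rostral-to-caudal ordering of dermatomes (an assumption for
# this sketch; the taxonomy does not prescribe any data structure).
DERMATOMES = (
    [f"C{i}" for i in range(2, 9)]      # C2..C8
    + [f"T{i}" for i in range(1, 13)]   # T1..T12
    + [f"L{i}" for i in range(1, 6)]    # L1..L5
    + ["S1", "S2", "S3", "S4-5"]
)


def tier1_region(pain_dermatome: str, neurologic_level: str) -> str:
    """Tier 1: locate the pain relative to the neurologic level of injury.

    At-level pain lies within 2 dermatomes above or below the neurologic
    level; anything more rostral is above level, anything more caudal is
    below level.
    """
    offset = DERMATOMES.index(pain_dermatome) - DERMATOMES.index(neurologic_level)
    if offset < -2:
        return "above level"
    if offset > 2:
        return "below level"
    return "at level"


def classify(pain_dermatome, neurologic_level, pain_class, subtype):
    """Combine the three tiers into one label.

    pain_class (tier 2) and subtype (tier 3) are supplied by the rater;
    this sketch only checks that tier 2 is one of the two allowed values.
    """
    if pain_class not in ("nociceptive", "neuropathic"):
        raise ValueError("tier 2 must be 'nociceptive' or 'neuropathic'")
    return (tier1_region(pain_dermatome, neurologic_level), pain_class, subtype)


# Example: neurologic level T4.
print(tier1_region("C5", "T4"))   # above level
print(tier1_region("T6", "T4"))   # at level (2 dermatomes below T4)
print(tier1_region("L1", "T4"))   # below level
```

A pain that spans several dermatomes, or a neurologic level that differs between the right and left sides, would need additional rules that this sketch does not attempt to encode.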
There have been only a few studies that have attempted to determine the interrater reliability of classification systems of pain after SCI (20,29). Richards et al (29) determined the interrater reliability of the Donovan system (14) in 28 persons with traumatic-onset SCI who reported pain at a total of 60 sites. For each pain site, information pertinent to the Donovan taxonomy was provided incrementally (information relevant to 1 of 6 criteria at each step) by means of videotape to 3 evaluators. After each additional piece of information, the evaluators classified the pain into 1 of 5 types. Interrater agreement ranged from 50% to 70%, and interrater agreement did not change as additional information was provided. A 3-month test–retest analysis showed fairly good consistency of evaluators with themselves but poor interrater agreement at both time-points (30). Application of pain taxonomies developed by the International Association for the Study of Pain (25) and Tunks (15) to the same clinical material by the same investigators similarly resulted in poor to moderate agreement among the 3 evaluators, with κ values in the 0.30 to 0.65 range (31).
Cardenas et al (20) determined the interrater reliability of an ad hoc classification system they proposed for chronic pain in persons with SCI, using independent categorization of reported pain problems for 41 persons based on questionnaires alone and for 15 persons based on questionnaire and personal interviews. For the 41 questionnaires categorized independently by 2 evaluators, strength of agreement in categorizing 68 pain problems showed a κ of 0.68. For 15 persons whose pain was categorized in person by 2 evaluators, strength of agreement showed a κ of 0.66. The relatively modest agreement between evaluators shown in these studies may be the result of the taxonomies themselves, the clinical material used, the training of the examiners, or a combination of these; the study reports do not allow assessment of these factors.
The existing lack of consensus on a widely accepted, reliable, comprehensive, and easily used classification system for pain after SCI has contributed to the poor understanding of the problem of pain after SCI, the incomparability of the results of clinical research studies, and the inability to translate basic science advances in pain research into the clinical realm. There is not just a need for a consensus taxonomy, but the design and description of that taxonomy must be such that clinicians and others can use it to classify pains with a high level of reliability, achieving agreement with themselves from one occasion to the next, and with others. When SCI specialists are unable to agree on what type of pain they are treating, because of a lack of consensus on how to classify pain after SCI, clear treatment strategies cannot be developed.
This study investigated the interrater reliability of the BR-SCI-PT. The objective was to determine to what degree SCI specialists who are not familiar with the BR-SCI-PT could learn to apply it and agree on the classification of a variety of pain descriptions presented in vignette format. We tested the following hypotheses with regard to the BR-SCI-PT: (1) most patient pain reports can, with high levels of confidence, be assigned to 1 of the 15 categories enumerated, by physicians who have received minimal training in the taxonomy; (2a) the taxonomy has an interrater reliability of more than 95% for determining whether pain is located above level, at level, or below level; (2b) the taxonomy has an interrater reliability of more than 95% for determining whether pain is nociceptive or neuropathic; and (2c) the taxonomy has an interrater reliability of more than 85% for classifying pain into 1 of the 15 subtypes.
METHODS
In this Institutional Review Board–approved study, 39 physicians with SCI expertise classified a variety of pain descriptions presented in vignette form and reported their confidence in the correctness of their categorization. A total of 135 vignettes were prepared, each describing a person with SCI experiencing one or more pains. Neurological level, completeness of the SCI, and characteristics of the pain (subjective complaint, duration, triggers, etc.) were provided (Figure 2). The vignettes, derived in part from the actual experiences of persons with SCI, were prepared by the first author, who also established the number of pains and pain type(s) involved. The coauthors, independently of one another, also determined the number of pains and the pain type for each vignette. The results were compared, and where needed, the text of the vignette was adjusted to reduce ambiguity.
Figure 2. Sample vignette.
In 93 vignettes, only one type of pain was noted; in 40, there were 2 pain descriptions; and 2 vignettes presented persons having 3 different pain descriptions. (This count is based on the investigators' initial categorization, with which the respondents did not necessarily agree.) The total number of pain descriptions was 179, which were distributed over all the pain types distinguished in the BR-SCI-PT, with a minimum of 8 per type and a maximum of 20. The frequencies of the combinations of pain types found in the vignettes are provided in Table 1. Based on the number of pain types presented, their nature, and other characteristics of the description, the vignettes were classified by the investigators as being of low, intermediate, or high complexity. Because the total number of vignettes was too large to expect any physician-respondent to categorize them all, they were divided into 5 sets (termed booklets) of 27 vignettes each. Each booklet was completed by a different subgroup of respondents; subgroups varied in size from n = 5 to n = 10.
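The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures reported in the text; the variable names are illustrative.

```python
# Vignettes grouped by the number of pain descriptions they contain,
# as reported in the text (investigators' initial categorization).
vignettes_by_pain_count = {1: 93, 2: 40, 3: 2}

total_vignettes = sum(vignettes_by_pain_count.values())
total_pains = sum(n * count for n, count in vignettes_by_pain_count.items())

booklets = 5
vignettes_per_booklet = total_vignettes // booklets
mean_pains_per_booklet = total_pains / booklets

print(total_vignettes)                    # 135
print(total_pains)                        # 179
print(vignettes_per_booklet)              # 27
print(round(mean_pains_per_booklet, 1))   # 35.8
```

The mean of about 36 pain descriptions per booklet matches the "about 36 per physician" figure used later for the κ calculation.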
Table 1.
Combinations of Pain Types Represented in the Vignettes
Respondents were physicians who were certified in the subspecialty of Spinal Cord Injury Medicine by the American Board of Physical Medicine and Rehabilitation and were members of the American Spinal Injury Association (ASIA) or the American Paraplegia Society (APS). Specialists were recruited by sending letters requesting their participation in the project. The letter also included a description of the purpose of the project, information on the Institutional Review Board approval, the compensation offered ($100), which could be donated to an SCI organization, a detailed description of the BR-SCI-PT, instructions, the booklet with vignettes, and a return envelope. We sent out 131 packages and sent follow-up reminder letters between 3 and 5 weeks after the initial letter. A total of 39 completed booklets were received, all of which had usable answers. Because some replies to the follow-up letter indicated that the addressee had never received the initial package, we know that not all packages reached the intended person; therefore, an exact response rate cannot be calculated. However, it is at least 39/131 (approximately 30%).
The subjects were instructed to read each vignette, determine the number of pains present, and establish the precise identity of the pain or pains in terms of the BR-SCI-PT. They had the description of the BR-SCI-PT classification and the specific pain types it distinguishes available when making these judgments. To ensure that, in case of multiple pains, the categorization by the physician could be compared with that of the investigators, the respondents were required to write down the location and the unique characteristics for each pain, if they decided there was more than one. They also expressed their confidence in their decision as to the number of pains they distinguished and in the categorization they made of each pain. For both these purposes, a 5-point Likert–type scale was used, with 1 labeled “nothing more than a guess” and 5 labeled “absolute certainty” (see Figure 2 for the order and phrasing of the questions asked).
The responses provided by the 39 physicians were entered in an Excel spreadsheet, checked several times, and analyzed using SPSS software. Answers as to the number of pains present in a vignette, and the categorization of each one, were compared with the answers provided by the investigators, and coded as either correct or incorrect. However, reconsideration of the “correct” answers in the light of the responses by the participating physicians resulted in the formulation of an extensive list of alternative answers. Alternatives were allowed either because the language used in the vignette or in the BR-SCI-PT guide and instructions was ambiguous, or because, given the state of knowledge in the field and the state of diagnostic procedures available, multiple taxonomy categories are plausible even after extensive workup of real-life persons with pain after SCI. In many instances, 2 categorizations of a pain were considered possible, based on the information presented in the vignette. In addition, for a number of vignettes, additional pains could be (and were) distinguished (eg, left and right arm pains defined as separate pain components, rather than as a single pain). Last, in 2 vignettes, it was not unreasonable to distinguish just one rather than 2 pains. Altogether, variant readings were allowed for 55 of 135 vignettes (41%). About 39% of the errors under the original standards were not considered errors under the modified standards, which included the accepted alternative answers, causing the overall error rate to drop from 29% to 18%.
Most of the tables present the percent of correct classifications under the original and/or modified standards, cross-tabulated with some aspect of the vignette. Statistical testing was not used because basic assumptions of independence of observations were violated. Given the fact that all hypotheses were rejected on the basis of the sample data, development of proper special formulas adjusting for dependence of observations was unnecessary.
κ, a measure of agreement corrected for chance agreement between judges, was calculated for each individual physician, comparing his or her categorizations for the vignette-presented pains (about 36 per physician, depending on booklet) with those of the investigators. The normal range for κ values is from 0.00, not better than chance agreement, to 1.00, complete agreement. It should be noted that various assumptions underlying calculation of κ, including independence of objects classified and independence of evaluators, are violated here; consequently, no statistical testing of the κ values was performed. The κ values presented were calculated from the original standards and not the modified ones, because κ is predicated on there being a single right answer for each object being classified. Under the modified standards, more than one respondent-selected category might be considered as corresponding to the investigator-selected one.
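For readers unfamiliar with the statistic, a minimal sketch of a per-physician κ calculation follows. It assumes the unweighted Cohen form, κ = (p_o − p_e)/(1 − p_e), which the text does not state explicitly; the function name `cohen_kappa` and the toy category labels are hypothetical, and the study's handling of omitted or extra pains is not reproduced.

```python
from collections import Counter


def cohen_kappa(rater_labels, reference_labels):
    """Unweighted Cohen's kappa between one rater and a reference standard.

    p_o is the observed proportion of agreement; p_e is the agreement
    expected by chance from the two marginal category distributions.
    """
    if len(rater_labels) != len(reference_labels) or not rater_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(rater_labels)
    p_o = sum(a == b for a, b in zip(rater_labels, reference_labels)) / n
    rater_freq = Counter(rater_labels)
    reference_freq = Counter(reference_labels)
    p_e = sum(rater_freq[c] * reference_freq[c] for c in reference_freq) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


# Toy example with three made-up categories (not study data).
investigators = ["A", "A", "B", "B", "C", "C"]
physician = ["A", "A", "B", "C", "C", "C"]
print(round(cohen_kappa(physician, investigators), 2))  # 0.75
```

In the toy example the physician agrees on 5 of 6 pains (p_o = 0.83), chance agreement is 1/3, and κ is 0.75.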
RESULTS
Completeness of Information
All respondents completed information for each vignette they were given; no incomplete booklets were returned. Information on the confidence respondents had in the number of pains they distinguished was lacking for 4% of vignettes (43 of 1,053). Only in one instance did a respondent fail to provide a code for a pain he/she determined was present. In all other instances, respondents either failed to recognize the presence of a (second, third) pain at all or they provided a code, whether this code was correct or incorrect. For pains that were recognized, a rating of confidence in the correctness of the taxonomic category selected was omitted in 3% of all instances (42 of 1,470). Thus, except for minor omissions, the average respondent completed the questionnaire in a diligent manner, providing all responses asked for with only minor problems. The omissions in almost all instances involved the ratings of one's confidence in the taxonomic categorizations made and not the categorizations themselves.
Confidence in Task Execution
The 39 respondents made a total of 1,010 ratings as to the level of confidence they had in their report of the number of pains in each vignette. In almost one half of the cases (47%), the respondent stated he/she had certainty in the determination of the number of pains; in another 42%, the person felt fairly sure (Table 2). Only in about 10% of all instances did the person state that he/she took a pure guess or at least was not sure. The degree of confidence corresponded somewhat to the number of pain components the respondent distinguished: the more pains, the less certain he/she was of the count's correctness. The confidence in one's determination of the number of pains did not vary by the investigators' decision as to how many pains were present and also did not vary strongly by the degree of complexity the investigators had assigned to the vignettes.
Table 2.
Confidence in the Number of Pains Described in the Vignette, by Number of Pains Distinguished by the Respondent
There was some difference in mean confidence rating by booklet. The booklets differed minimally in complexity, as indicated by mean global complexity level (ranging from a low average of 1.5 ± 0.6 to a high of 1.7 ± 0.7) or by mean number of pains (ranging from a low of 1.2 ± 0.4 to a high of 1.4 ± 0.5); therefore, it is quite possible that these differences reflect the characteristics of the respondents rather than those of the vignettes. With between 5 and 10 respondents per subgroup, levels of knowledge or confidence in one's abilities possibly were not equalized by random assignment. Only 9 respondents (23%) had an average confidence rating (across 27 vignettes) of less than 4.0, which equals “fairly sure.” For 33% of respondents, the average was between 4.0 and 4.5, and no less than 44% averaged 4.5 or higher, approaching “certainty” for each determination of the number of pains.
There was a total of 1,428 ratings as to the confidence subjects had in the categorization of the 1, 2, or 3 pains they determined to be represented in the vignettes. Again, the level of confidence respondents had in the taxonomy assignments they made was quite high: in 46% of all instances, the respondent stated she/he was “fairly sure,” and “certainty” was claimed in 35% of instances (Table 3). In less than 20% of the cases was the judgment one of “not sure” or less. There was limited variability in the confidence in the categorizations made, whether by the number of pains in the vignette (as determined by the respondent or by the investigators), the assigned complexity of the vignette, or the booklet involved (Table 3). The average confidence rating for each respondent calculated over 27 vignettes ranged from a low of 3.40 (somewhat above “not sure”) to 4.97 (“certainty” on all but 1 vignette). An average confidence rating below 4.0 was given by 33% of respondents; 49% had an average between 4.0 and 4.5, and 18% averaged over 4.5.
Table 3.
Confidence in the Categorization of the Pains Described in the Vignette by Complexity of the Vignette
Thus, the typical respondent was quite confident in the correctness of the number of pains he/she noted and in the correctness of the pain categorizations she/he made.
Correctness of Categorizations of Pains
The results of the assessment of the correctness of the classification of the pain components by the respondents are summarized in Table 4. Applying the original standard, that is, assuming that the initial classification of the pain components by the investigators before receiving feedback from the subjects is the only correct one, only 72% of all pains were categorized correctly by the respondents. This clearly is less than the 85% specified in hypothesis 2c. When these standards are modified, and alternative categorizations are allowed as described previously, the percent of correct classifications rises to 83%, marginally less than the minimum specified. This increase of 11 percentage points is accomplished by permitting alternative readings of selected individual pain components (8%), allowing “splitting” of certain pain components into 2 (3%) and approving the combining of 2 pain components into a single one (<1%). The remaining errors mostly consist of erroneous assignment of a pain component to 1 of the 15 taxonomic categories (12%). Additional errors consist of specifying nonexisting pain components (3%) or omitting existing pain components (2%). Under the original standards, the approved alternatives of the modified standards were considered errors. Consequently, the 28% errors under the original standard consist of 20% wrong categorizations, 6% “invented” extra pains, and 2% omitted pains (Table 4).
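The breakdown in this paragraph can be tallied as follows. The sketch uses the rounded percentages quoted in the text, not the raw counts from Table 4, so it only checks that the reported components add up; the dictionary keys are illustrative labels.

```python
# Rounded percentages as reported in the text (not raw counts).
original = {"correct": 72, "wrong category": 20, "invented pain": 6, "omitted pain": 2}
modified = {"correct": 83, "wrong category": 12, "invented pain": 3, "omitted pain": 2}


def error_total(breakdown):
    """Sum every component except the correctly classified share."""
    return sum(value for key, value in breakdown.items() if key != "correct")


errors_original = error_total(original)
errors_modified = error_total(modified)
gain = modified["correct"] - original["correct"]

print(errors_original, original["correct"] + errors_original)   # 28 100
print(errors_modified, modified["correct"] + errors_modified)   # 17 100
print(gain)                                                     # 11

# Reported sources of the gain: alternative readings (8) and splitting (3);
# combining two pains into one is given only as "<1" and is omitted here.
print(8 + 3)                                                    # 11
```

Because the inputs are rounded, small discrepancies of about 1 percentage point against figures computed from raw counts are to be expected.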
Table 4.
Number and Percent of Pain Components Classified Correctly and Incorrectly Under the Original and Modified Standards
Table 5 gives detailed information about the misclassification of pain components. This reflects the original standards: the investigators' categorization of each of the pains they described is compared with the categorization by the respondents. The number on the main diagonal indicates the percent of times the respondents identified the type of pain for that column the same way the investigators did. The average percent correct is 78%; it varies from a low of 53% (for at-level neuropathic pain: compressive neuropathy) to a high of 93% (for below-level nociceptive pain: visceral pain). The “not categorized” row indicates how often respondents failed to recognize (or at least to write down and categorize) each type of pain. Table 5 also allows one to assess which pain types are frequently mistaken for one another. For instance, above-level nociceptive pain: mechanical/musculoskeletal (column 1) was categorized correctly in 74% of all instances (across vignettes and respondents); it was not noted 1% of the time, and it was categorized in no less than 22% of all instances as “at-level nociceptive pain: mechanical/musculoskeletal.” Such common “mistaken identities” were also found for some of the other pain types (eg, at-level nociceptive pain: visceral). In other cases, there were 2 pain types for which a pain type (as determined by the investigators) is commonly mistaken; this holds for above-level neuropathic pain: other and at-level neuropathic pain: compressive neuropathy.
Table 5.
Categorization of Pains by Respondents Compared With Categorization by Investigators (Percentages)
κ was calculated for each individual physician, comparing his or her categorizations for the vignette-presented pains (about 36 per physician, depending on booklet) with those of the investigators. κ values ranged from 0.55 to 0.91 and averaged 0.70.
Table 6 summarizes information on correctness of categorizations by the respondents for both the original standards applied in Table 5 and the modified standards. The number of “cases” here is somewhat larger than in Table 5, because the respondents “created” more additional pains (with or without the investigators' retroactive agreement) than they omitted. While the percentage correct did not increase for some categories (eg, above-level nociceptive pain: other), for other pain types, the increase was substantial (eg, at-level neuropathic pain: Complex Regional Pain Syndrome [CRPS]).
Table 6.
Percent of Pains Classified Correctly by Original and Modified Standards for Individual Pain Types
The data in Table 7 indicate that the number of errors, whether according to the original or the modified standards, increases with the number of pain components the respondents distinguished. The number of errors also increases with the number of pains the investigators distinguished, although the 3-pain vignettes constituted an exception. Classifying the pain components in those vignettes the investigators designated as being of low complexity was indeed somewhat easier, resulting in higher percentages of correct classification. Last, there was a considerable difference in percentage correct by booklet, although this may also be a function of the capability of the respondents rather than the relative difficulties of the booklets, as was stated previously.
Table 7.
Percent of Pain Components Classified Correctly Under the Original and Modified Standards by Booklet, Complexity of Vignette, Number of Pain Components Distinguished by the Respondent, and Number of Pain Components Distinguished by the Investigators
Whatever standards are used, respondents who stated that they had a high level of confidence in their categorization tended to be correct more often than those who admitted they were guessing (data not shown). However, even those who stated they were hazarding a guess tended to be correct slightly more than half the time (under the strict standards) or over two thirds of the time (under the modified standards). If they were truly just guessing, they would be expected to be correct approximately 7% of the time (randomly selecting 1 of 15 available categories). It should be noted that the guessers needed more help from the relaxation of the standards than those who claimed to be (fairly) sure: the “gain” in the percentage correct is much lower for the latter than for the former.
Correctness of Categorization of the Level of the Pain Relative to the Neurologic Level of Injury
Hypothesis 2a stated that respondents would be correct at least 95% of the time in classifying the pain components presented as above-level, at-level, or below-level. The data in Table 4 indicate that, under the original standards, in just over 84% of all 1,503 instances, the respondent selected the correct level for the pain. In applying the revised standards, 129 categorizations were changed from erroneous to correct, and the percentage improved to 93%—still short of the 95% correct anticipated.
The percentage classified correctly as to level varied by the number of pain types in a vignette the respondent distinguished (Table 7), with the lower percentages correct noted for the vignettes with multiple pain components. However, this trend did not exist when the basis of comparison was the number of pains determined by the investigators. The vignettes' complexity also had no clear relationship to the percentage of correctly classified pains, whether according to original or modified standards. Last, variations by booklet/subgroup of respondents were very small. Differences were clear and consistent by expressed confidence in the correctness of one's categorization of the pain described (data not shown): those who stated that their categorization was a complete guess or somewhat of a guess were correct 89% of the time, under the modified standards, whereas those who expressed certainty exceeded the expected 95% agreement level, being correct 98% of the time.
The percent classified correctly differed considerably by pain type as determined by the investigators. The poorest performance under the original standards was found for above-level nociceptive pain: mechanical/musculoskeletal (76% classified correctly as to level) and the second worst was for at-level nociceptive pain: visceral (78% correct). These 2 pain types saw the largest gains from the easing of the standards, resulting in percentages correct close to 100%, similar to a number of other pain types.
Correctness of Categorization of Pain Type as Neuropathic vs Nociceptive
Hypothesis 2b held that respondents would be correct at least 95% of the time in classifying the pain components presented as to basic pain category: neuropathic vs nociceptive. The data in Table 4 indicate that, under the original standards, in just 86% of all 1,503 instances, the respondents selected the correct category. After modified standards were applied, the percentage improved to 90%—still short of the level anticipated.
Differences by the number of pains present, as determined by the investigators or the subjects, by the complexity of the vignette and by subgroup/booklet, were of the same nature as for the categorization of the level of the pain relative to neurologic level of injury (NLI), but tended to be smaller (Table 7). The same held true for differences by the respondents' confidence in the correctness of the categorization they made—more confident respondents tended to be correct in classification as to nociceptive vs neuropathic nature more often, whether the original or the revised standards were used, but the difference was fairly limited. Those who stated they were making a guess, or somewhat of a guess, were correct in their categorization as to pain type 84% of the time (original standards) or 89% (revised standards). Physicians who expressed certainty in the correctness of their categorization were correct 94% and 97%, respectively.
Differences in the percentage of correct classifications (neuropathic vs nociceptive) were limited among the 15 pain types as assigned by the investigators, with a low of 84% correct and a high of 98%. Only for 2 pain types did the picture change considerably when the results under the original and the modified standards were compared: at-level neuropathic pain: radicular and at-level neuropathic pain: CRPS. Both were classified correctly about 10% more often under the modified standards.
DISCUSSION
This study aimed to determine the interrater reliability of the BR-SCI-PT, which in the eyes of its developers is a comprehensive and easily used classification system for pain after SCI. Eighty-three percent of the pain descriptions were categorized correctly to 1 of the 15 BR-SCI-PT pain types; 93% of the pain descriptions were categorized correctly with respect to level (above/at/below), whereas 90% of the pain descriptions were categorized correctly as being either nociceptive or neuropathic. Judging by κ values, agreement was somewhat better than in previous studies of other SCI pain taxonomies (20,29,31). κ for subcategory classification by individual physicians ranged from 0.55 to 0.91 and averaged 0.70, based on the original standards and not the more correct modified standards. The proportion correctly classified after correction for chance agreement (κ: 0.70) is not much smaller than the proportion agreement without correction for chance agreement (0.72). This is the case because, by design, all 15 pain types were represented in the vignettes about equally often. Because, under the original standards, the reduction in proportion correct from 0.72 to 0.70 is less than 0.02, presumably a κ calculated under the modified standards to parallel it would be only 0.02 to 0.03 lower than the calculated proportion correct of 0.83. This would result in an estimated κ of 0.81 or 0.80—maybe a little less, because of some categories becoming relatively more popular. The physicians expressed a generally high confidence in the correctness of their classifications.
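The relationship between raw agreement and κ described above can be checked with a short calculation. The sketch below assumes a Cohen-style κ in which chance agreement equals 1/15, that is, that both the standard and the respondents' choices were spread evenly over the 15 pain types. The function name `kappa_from_agreement` is hypothetical, and the κ computation actually used in the study may have differed.

```python
def kappa_from_agreement(p_observed: float, n_categories: int) -> float:
    """Chance-corrected agreement, assuming uniform use of all categories.

    Assumption (not from the study): expected chance agreement is
    1 / n_categories, which holds only if both the standard and the
    raters use every category equally often.
    """
    p_chance = 1.0 / n_categories
    return (p_observed - p_chance) / (1.0 - p_chance)


# Original standards: proportion correct 0.72 over 15 pain types.
kappa_original = kappa_from_agreement(0.72, 15)
# Modified standards: proportion correct 0.83 over 15 pain types.
kappa_modified = kappa_from_agreement(0.83, 15)

print(f"original: {kappa_original:.2f}")  # original: 0.70
print(f"modified: {kappa_modified:.2f}")  # modified: 0.82
```

Under this idealized assumption the original-standards value reproduces the reported κ of 0.70. The modified-standards value comes out marginally above the 0.80 to 0.81 estimated in the text, which is consistent with the remark that uneven use of categories raises chance agreement and therefore lowers κ somewhat.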
It should be noted that the goal of this research was not to prove the ultimate “correctness” of the BR-SCI-PT. Taxonomy aims to make distinctions among cases encountered in reality, based on characteristics that are relevant to a particular purpose. The Bryce/Ragnarsson taxonomy differentiates pain after SCI based on presumed etiology and location, because these characteristics are assumed to be relevant to treatment. However, there is as yet limited support for the practical usefulness of the distinctions in the taxonomy; further basic and clinical research needs to provide evidence for the relevance of these particular distinctions to such issues as the natural course of pain, its impact on quality of life, and/or its treatment. In that sense, the “validity” of the taxonomy is to be assessed in future research. The study reported here addressed only the reliability of the taxonomy: the degree to which clinicians can agree with one another on assigning cases (pain descriptions reported by individuals) to the classification's categories.
The use of vignettes to study knowledge, attitudes, judgments, or decision making has a long tradition in social science (32,33), nursing (34), and medicine. In medicine, vignette approaches have been used in studies of (chronic) pain management and treatment, mostly to elucidate clinicians' ability to assess pain and their preferred intervention strategies (35–38). Vignette studies of diagnostic processes used in evaluating pain are very much limited in number (39). The advantage of a vignette approach is that a great many situations (cases) can be presented in a short time, each of which can be constructed so as to present a combination of characteristics (whether essential to the decision to be made or “distractors”) that are of interest to the researcher.
A disadvantage of vignettes is that they may differ from real-world cases (in terms of mode of presentation, complexity, etc.). The decisions respondents make with respect to vignettes may not reflect their real-life decisions, because they want to present themselves as better than they are (attitude studies), have more or fewer resources than they have in reality, etc. These problems would seem to be minimal in situations where knowledge is to be assessed (eg, in studies of the making of a diagnosis). The information clinicians typically require before making a diagnosis can be supplied together with the vignette (including laboratory reports and copies of radiographs) or can be made available “on demand” for studies that want to simulate clinical reasoning processes more realistically.
One limitation of our study is that the vignettes, although based on real life situations, were composed by one of the developers of the BR-SCI-PT (T.B.) and may have included information that would lead the subjects to classify a vignette in a certain way. This potential bias would presumably be absent if participating physicians were to interview and examine persons with pain themselves, without being presented selected information about the pain by the classification developers. There was no opportunity for respondents to collect and evaluate further information (history and physical, laboratory, and other tests) to confirm a diagnosis. In addition, the distribution of pain types was not necessarily realistic—easy-to-classify pain descriptions may have been underrepresented or overrepresented. To obtain the same type and quality of information for all categories in the BR-SCI-PT, all 15 pain types distinguished occurred about equally often in the vignettes, which presumably is different from what a clinician encounters. (One advantage of the “rectangular distribution” of the pain types' frequencies is a reduction in room for inflated interrater agreement because of guessing.)
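The parenthetical point about the “rectangular distribution” can be illustrated numerically. The sketch below assumes a rater who guesses with the same category frequencies as the standard, so that expected chance agreement is the sum of squared category proportions. The skewed distribution is a hypothetical example chosen only for contrast; it is not taken from the study or from clinical data.

```python
def chance_agreement(proportions):
    """Expected agreement by chance when a rater guesses with the same
    category frequencies as the standard: the sum of squared proportions."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return sum(p * p for p in proportions)


n_types = 15

# Rectangular (uniform) distribution: every pain type equally frequent.
uniform = [1.0 / n_types] * n_types

# Hypothetical skewed mix, for illustration only: one pain type makes up
# half of all cases and the other 14 types share the remainder equally.
skewed = [0.5] + [0.5 / (n_types - 1)] * (n_types - 1)

print(f"uniform: {chance_agreement(uniform):.3f}")  # uniform: 0.067
print(f"skewed:  {chance_agreement(skewed):.3f}")   # skewed:  0.268
```

With equally frequent pain types, only about 7% agreement would be expected from guessing alone, whereas a distribution dominated by one common type leaves considerably more room for agreement that reflects guessing rather than skill.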
As pointed out by an anonymous reviewer, another possible limitation is self-selection into the sample of respondents. Given the attrition rate of up to 70%, it is possible that only physicians with a greater knowledge of or interest in pain after SCI submitted responses. Consequently, results might be worse, if not much worse, with use of the BR-SCI-PT by nonknowledgeable SCI specialists.
Feedback from the physicians on the classification task allowed us to evaluate what specific problems there were in the description of the taxonomic classes of the BR-SCI-PT and in the BR-SCI-PT training materials. This was clearly shown by the differences in allowed pains between the original standards developed by the investigators and the modified standards that included alternate classifications that were judged to be correct after review of the respondents' classifications. The newly allowed pains for the modified standards fell into several categories including (a) allowing more than one regional level determination (at-level and above-level or at-level and below level) if the pain distribution encompassed dermatomes within more than one regional level; (b) allowing bilaterally experienced pain of the same etiology to be classified separately for the left and the right sides of the body; (c) allowing the component pains of a pain syndrome to be valid in addition to but not omitting the pain syndrome itself (ie, mechanical musculoskeletal pain in addition to CRPS if both fit the criteria); and (d) allowing different etiologies for the pain if these other etiologies also fit the criteria. This latter category refers to instances when a vignette was not specific enough to allow only one correct answer for a specific subtype.
The differences between the original and modified standards might have been lessened if the instructions for completing the vignette assessment had been specific in detailing how to classify bilaterally experienced pains and how to classify component pains that are present in addition to a pain syndrome. In addition, a more inclusive, albeit accurate, initial dermatomal mapping of the extent of pains, especially those localized over areas of the body that include nonadjacent dermatomes, such as the shoulder region, would also likely have lessened the differences between the results under the original and the modified standards. Although the detailed description of the BR-SCI-PT that was sent to the subjects along with the vignettes included a brief description of the hallmarks of each of the 15 subtypes of pain, exact definitions and inclusion and exclusion criteria for each subtype of pain, such as might be found in the diagnostic manuals for psychiatric conditions, were not available. A more detailed instruction booklet with exact definitions and inclusion and exclusion criteria of each subtype of pain would likely have improved the accuracy in determining subtypes of pain. It would have improved the accuracy in differentiating neuropathic and nociceptive conditions from one another to a lesser degree and probably would have had only minimal impact in improving the accuracy of differentiating regional level determinations.
Another area in which errors were made by subjects was in the classification of unusual types of headaches and other conditions that are not typically encountered in clinical practice by persons who specialize in the treatment of persons with SCI. Because the BR-SCI-PT was designed to be a comprehensive taxonomy, these errors are to be expected.
Nevertheless, it is clear that further detailing of the taxa in the BR-SCI-PT is both necessary and possible. This study provides information to improve the clarity and didactic quality of the detailed description. Further improvement to a degree depends on the results of clinical and basic research, which can offer the materials needed to improve the inclusion and exclusion criteria for each of the 15 pain types, just as was done in the many years of development of the Diagnostic and Statistical Manual of the American Psychiatric Association (40) and other diagnostic systems. Availability of exact definitions and inclusion and exclusion criteria for each of the subtypes might raise the overall interrater reliability of the taxonomy for determining subtypes of pain closer toward a “gold standard” of 0.90 for κ. Development of exact definitions and inclusion and exclusion criteria should be pursued in future studies. The development process should be evidence-based, with potential use of a Delphi process of expert opinions when research evidence is not available. The definitions should be comprehensive and include all the known elements of specific pains, including character, temporal course, location, aggravating and relieving factors, and associated signs and symptoms. In addition, the BR-SCI-PT, or even just the portions of the taxonomy that seem to show the best agreement, such as the nociceptive-neuropathic and above-level/at-level/below-level distinctions or a combination of the two, should be used to classify pains in both epidemiological studies and interventional clinical trials. These studies will help determine the practical usefulness of the categories that make up the BR-SCI-PT.
CONCLUSION
These findings suggest that the BR-SCI-PT, which is a comprehensive taxonomy of pains typical after SCI, can be used by specialist clinicians after limited training in classifying pains presented in vignette format. Further development of the system will largely depend on further developments in basic and clinical science. However, improvement in the description of and the instructions for use of the current taxonomy is possible, based on the results of this study. Ease and reliability of application are necessary conditions for the use of taxonomic systems. It is expected that such improvements will increase the reliability of the taxonomy, thereby improving the communication among clinicians, clinical researchers, and basic scientists with an interest in pain after SCI.
Acknowledgments
The authors appreciate the efforts of their 39 colleagues who gave of their time in exchange for very minor remuneration. We thank Ayana Jones for help with administrative and data processing tasks.
Footnotes
This research was supported by Grant 905 from the American Paraplegia Society and Grant H133N000027 from the National Institute on Disability and Rehabilitation Research, Office of Special Education Services, Department of Education.
REFERENCES
1. Bryce TN, Ragnarsson KT. Epidemiology and classification of pain after spinal cord injury. Top Spinal Cord Inj Rehabil. 2001;7:1–17.
2. Bayley JC, Cochran TP, Sledge CB. The weight-bearing shoulder. The impingement syndrome in paraplegics. J Bone Joint Surg Am. 1987;69:676–678.
3. Nichols PJ, Norman PA, Ennis JR. Wheelchair user's shoulder? Shoulder pain in patients with spinal cord lesions. Scand J Rehabil Med. 1979;11:29–32.
4. Paralyzed Veterans of America. Paralyzed Veterans of America Final Report: The PVA Needs Assessment Survey. Washington DC: Paralyzed Veterans of America; 1988.
5. Holmes G. Pain of central origin. In: Contributions to Medical and Biological Research. Vol 1. New York: PB Hoeber; 1919. pp. 235–246.
6. Riddoch G. The clinical features of central pain. Lancet. 1938;234:1150–1156.
7. Davis L, Martin J. Studies upon spinal cord injuries. II. The nature and treatment of pain. J Neurosurg. 1947;4:483–491. doi: 10.3171/jns.1947.4.6.0483.
8. Pollock LJ, Brown M, Boshes B, et al. Pain below the level of injury of the spinal cord. JAMA. 1951;146:319–322. doi: 10.1001/archneurpsyc.1951.02320030056005.
9. Michaelis LS. The problem of pain in paraplegia and tetraplegia. Bull NY Acad Med. 1970;46:88–96.
10. Kaplan LI, Grynbaum BB, Lloyd KE, Rusk HA. Pain and spasticity in patients with spinal cord dysfunction. Results of a follow-up study. JAMA. 1962;182:918–925. doi: 10.1001/jama.1962.03050480024006.
11. Burke DC. Pain in paraplegia. Paraplegia. 1973;10:297–313. doi: 10.1038/sc.1973.54.
12. Burke D, Woodward J. Pain and phantom sensation in spinal paralysis. In: Vinken P, Bruyn G, editors. Handbook of Clinical Neurology. Amsterdam: North Holland; 1976. pp. 489–499.
13. Maury M. About pain and its treatment in paraplegics. Paraplegia. 1978;15:349–352. doi: 10.1038/sc.1977.53.
14. Donovan WH, Dimitrijevic MR, Dahm L, Dimitrijevic M. Neurophysiological approaches to chronic pain following spinal cord injury. Paraplegia. 1982;20:135–146. doi: 10.1038/sc.1982.27.
15. Tunks E. Pain in spinal cord injured patients. In: Bloch RF, Basbaum M, editors. Management of Spinal Cord Injuries. Baltimore, MD: Williams & Wilkins; 1986. pp. 180–211.
16. Anke AG, Stenehjem AE, Stanghelle JK. Pain and life quality within 2 years of spinal cord injury. Paraplegia. 1995;33:555–559. doi: 10.1038/sc.1995.120.
17. New PW, Lim TC, Hill ST, Brown DJ. A survey of pain during rehabilitation after acute spinal cord injury. Spinal Cord. 1997;35:658–663. doi: 10.1038/sj.sc.3100472.
18. Stormer S, Gerner HJ, Gruninger W, et al. Chronic pain/dysaesthesiae in spinal cord injury patients: results of a multicentre study. Spinal Cord. 1997;35:446–455. doi: 10.1038/sj.sc.3100411.
19. Siddall PJ, Taylor DA, Cousins MJ. Classification of pain following spinal cord injury. Spinal Cord. 1997;35:69–75. doi: 10.1038/sj.sc.3100365.
20. Cardenas DD, Turner JA, Warms CA, Marshall HM. Classification of chronic pain associated with spinal cord injuries. Arch Phys Med Rehabil. 2002;83:1708–1714. doi: 10.1053/apmr.2002.35651.
21. Hicken BL, Putzke JD, Richards JS. Classification of pain following spinal cord injury: literature review and future directions. In: Burchiel KJ, Yezierski RP, editors. Spinal Cord Injury Pain: Assessment, Mechanisms, Management. Vol 23. Seattle, WA: International Association for the Study of Pain Press; 2002. pp. 25–38.
22. Munro D. Two-year end-results in the total rehabilitation of veterans with spinal-cord and cauda-equina injuries. New Engl J Med. 1950;242:1–16. doi: 10.1056/NEJM195001052420101.
23. Kennedy RH. The new viewpoint toward spinal cord injuries. Ann Surg. 1946;124:1057–1065.
24. Siddall PJ, Taylor DA, McClelland JM, Rutkowski SB, Cousins MJ. Pain report and the relationship of pain to physical factors in the first 6 months following spinal cord injury. Pain. 1999;81:187–197. doi: 10.1016/s0304-3959(99)00023-8.
25. Siddall PJ, Yezierski RP, Loeser JD. Pain following spinal cord injury: clinical features, prevalence, and taxonomy. IASP Newsletter. 2000;3:3–7.
26. Ragnarsson KT. Management of pain in persons with spinal cord injury. J Spinal Cord Med. 1997;20:186–199.
27. Bryce TN, Ragnarsson KT. Pain after spinal cord injury. Phys Med Rehabil Clin North Am. 2000;11:157–168.
28. Bryce TN, Ragnarsson KT. Pain management in persons with spinal cord injury. In: Lin VW, Cardenas DD, Cutter NC, et al., editors. Spinal Cord Medicine: Principles and Practice. New York: Demos Medical Publishing; 2003. pp. 441–460.
29. Richards JS, Hicken BL, Putzke JD, Ness T, Kezar L. Reliability characteristics of the Donovan Spinal Cord Injury Pain Classification System. Arch Phys Med Rehabil. 2002;83:1290–1294. doi: 10.1053/apmr.2002.33636.
30. Putzke JD, Richards JS, Ness T, Kezar L. Test-retest reliability of the Donovan Spinal Cord Injury Pain Classification scheme. Spinal Cord. 2003;41:239–241. doi: 10.1038/sj.sc.3101434.
31. Putzke JD, Richards JS, Ness T, Kezar L. Interrater reliability of the International Association for the Study of Pain and Tunks' spinal cord injury pain classification schemes. Am J Phys Med Rehabil. 2003;82:437–440.
32. Rossi PH, Nock SL. Measuring Social Judgments: The Factorial Survey Approach. Beverly Hills, CA: Sage; 1982.
33. Dijkers M. The Social Standing of Individuals and Families: An Empirical Investigation. Detroit, MI: Wayne State University; 1978.
34. Ludwick R, Zeller RA. The factorial survey: an experimental method to replicate real world problems. Nurs Res. 2001;50:129–133. doi: 10.1097/00006199-200103000-00009.
35. Hamers JP, van den Hout MA, Halfens RJ, Abu-Saad HH, Heijltjes AE. Differences in pain assessment and decisions regarding the administration of analgesics between novices, intermediates and experts in pediatric nursing. Int J Nurs Stud. 1997;34:325–334. doi: 10.1016/s0020-7489(97)00024-2.
36. Fanurik D, Koh JL, Schmitz ML, Harrison RD, Roberson PK, Killebrew P. Pain assessment and treatment in children with cognitive impairment: a survey of nurses' and physicians' beliefs. Clin J Pain. 1999;15:304–312. doi: 10.1097/00002508-199912000-00007.
37. Rainville J, Carlson N, Polatin P, Gatchel RJ, Indahl A. Exploration of physicians' recommendations for activities in chronic low back pain. Spine. 2000;25:2210–2220. doi: 10.1097/00007632-200009010-00012.
38. Loveman E, Gale A. Factors influencing nurses' inferences about patient pain. Br J Nurs. 2000;9:334–337. doi: 10.12968/bjon.2000.9.6.6336.
39. Marcus DA, Nash JM, Turk DC. Diagnosing recurring headaches: IHS criteria and beyond. Headache. 1994;34:329–336. doi: 10.1111/j.1526-4610.1994.hed3406329.x.
40. American Psychiatric Association. Diagnostic and Statistical Manual of Mental Disorders. 4th ed. Arlington, VA: American Psychiatric Publishing; 2000.









