European Spine Journal. 2005 Nov 18;15(6):720–730. doi: 10.1007/s00586-005-1029-9

Validity and interobserver agreement of a new radiographic grading system for intervertebral disc degeneration: Part I. Lumbar spine

Hans-Joachim Wilke 1, Friederike Rohlmann 1, Cornelia Neidlinger-Wilke 1, Karin Werner 1, Lutz Claes 1, Annette Kettler 1
PMCID: PMC3489460  PMID: 16328226

Abstract

Many different radiographic grading systems for disc degeneration are described in the literature. However, only a few of them have been tested for interobserver agreement and none for validity. Furthermore, most of them are based on subjective terminology. The aim of this study, therefore, was to combine these systems into a new one in which all subjective terms are replaced by more objective ones and to test this new system for validity and interobserver agreement. Since lumbar and cervical discs need to be graded differently, this study was divided into the present Part I for the lumbar and a Part II for the cervical spine. The new radiographic grading system covers the three variables “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis”. On lateral and postero-anterior radiographs, each of these three variables first has to be graded individually. Then, the “Overall Degree of Degeneration” is assigned on a four-point scale from 0 (no degeneration) to 3 (severe degeneration). For validation, the radiographic degrees of degeneration of 44 lumbar discs were compared to the respective macroscopic ones, which were defined as “real” degrees of degeneration. The agreement between observers with different levels of experience was determined using the radiographs of 84 lumbar discs. Agreement was quantified using quadratic weighted Kappa coefficients (Kappa) with 95% confidence limits (95% CL). The validation of the new radiographic grading system revealed a substantial agreement between the radiographic and the “real” macroscopic overall degree of degeneration (Kappa=0.714, 95% CL: 0.587–0.841). The radiographic grades, however, tended to be slightly lower than the “real” ones. The interobserver agreement was substantial for all three variables and for the overall degree of degeneration (Kappa=0.787, 95% CL: 0.702–0.872). However, the inexperienced observer tended to assign slightly lower degrees of degeneration than the experienced one.
In conclusion, we believe that the new radiographic grading system is an almost objective, valid and reliable tool to quantify the degree of degeneration of individual lumbar intervertebral discs. However, the user should always remember that the “real” degree of degeneration tends to be underestimated and that slight differences between the ratings of observers with different levels of experience have to be expected.

Keywords: Classification, Degeneration, Lumbar intervertebral disc, Reliability, Validity

Introduction

The morphology of intervertebral disc degeneration has often been described in the literature [1, 9–11, 14, 18, 28, 33, 34]. Especially for research purposes, however, these changes need to be quantified. Therefore, many different grading systems have been developed in the past, mainly for the lumbar but also for the cervical spine [21]. Some of these systems, such as macroscopic or histologic grading systems, can only be used in vitro. Others, such as those based on plain radiographs, discography, computed tomography or magnetic resonance imaging, are also applicable in clinical practice. Of these, magnetic resonance imaging has become increasingly popular since the intervertebral disc itself can be visualised and the procedure is not invasive. Nevertheless, grading systems based on plain radiographs still have several advantages. First, in contrast to discograms, they are less invasive. Second, in contrast to magnetic resonance imaging and computed tomography, they only require a standard X-ray machine and are much cheaper. And third, plain radiographs are often taken for diagnostic or follow-up purposes and, thus, are often already available.

A grading system has to fulfil certain requirements to become a valuable tool. First, the ratings should be the same irrespective of the experience of the observer. Second, they should be valid, that is, they should reflect the “real” degree of degeneration. However, out of the nine radiographic grading systems for lumbar or cervical disc degeneration found in the literature [6, 17, 19, 20, 23–25, 31], only three have been tested for interobserver agreement [19, 23, 24] and none for validity. Furthermore, most of them are based on terms such as “mild”, “severe”, “small” or “large” [6, 17, 19, 20, 23]. Since such terms are not well defined and tend to be subjective, the interobserver agreement of a grading system is expected to be worse the more subjective terms it uses.

Therefore, the first aim of this study was to combine the existing radiographic grading systems into a new one, in which all subjective terms were replaced by more objective ones. The second aim was to test this new grading system for validity and agreement between experienced and inexperienced observers. Due to the uncinate processes and the smaller dimensions of the cervical spine, lumbar and cervical discs need to be graded in a different way. In order to prevent confusion, this study was therefore divided into the present Part I for the lumbar and a Part II for the cervical spine.

Materials and methods

The new grading system covers the three main radiographic signs of disc degeneration: “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis” (Table 1). On lateral and postero-anterior radiographs each of these three variables first has to be graded individually on a scale from 0 to 3. Based on the sum of these three scores, the “Overall Degree of Degeneration” is assigned to each disc on a four-point scale from 0 (no degeneration) to 3 (severe degeneration).

Table 1.

New radiographic grading system for lumbar intervertebral disc degeneration modified according to the systems found in literature

Radiographic grading system for lumbar intervertebral disc degeneration (based on lateral and postero-anterior radiographs)

| | Height loss | Osteophyte formation | Diffuse sclerosis | Overall degree of degeneration |
|---|---|---|---|---|
| Assessment | Anterior and posterior height loss with respect to the individual height before degeneration | Sum of points of eight edges. No osteophytes: 0 points; <3 mm: 1 point; ≥3 mm but <6 mm: 2 points; ≥6 mm: 3 points | Sum of points of both adjacent vertebral bodies. No sclerosis: 0 points; 0.25 partially or completely affected: 1 point; 0.5 partially or completely affected: 2 points; >0.5 partially or completely affected: 3 points | Sum of points of “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis” |
| Grading | 0 = 0%; 1 = <33%; 2 = ≥33 but <66%; 3 = ≥66% | 0 = 0 points; 1 = 1–8 points; 2 = 9–16 points; 3 = 17–24 points | 0 = 0 points; 1 = 1–2 points; 2 = 3–4 points; 3 = 5–6 points | 0 points = grade 0 (no degeneration); 1–3 points = grade 1 (mild degeneration); 4–6 points = grade 2 (moderate degeneration); 7–9 points = grade 3 (severe degeneration) |

The three variables “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis” are first graded individually on a scale from 0 to 3. The “Overall Degree of Degeneration” is then assigned according to the sum of these three scores
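The mapping in Table 1 from the three individual grades to the overall grade can be sketched in a few lines of Python. This is an illustration only; the function name `overall_radiographic_grade` and its parameter names are assumptions made for this sketch and do not come from the original study.

```python
def overall_radiographic_grade(height_loss: int, osteophytes: int, sclerosis: int) -> int:
    """Return the overall grade (0-3) from the three variable grades (each 0-3)."""
    grades = {
        "height_loss": height_loss,
        "osteophytes": osteophytes,
        "sclerosis": sclerosis,
    }
    for name, grade in grades.items():
        if grade not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be a grade from 0 to 3, got {grade!r}")

    total = sum(grades.values())  # 0 to 9 points
    if total == 0:
        return 0  # no degeneration
    if total <= 3:
        return 1  # mild degeneration
    if total <= 6:
        return 2  # moderate degeneration
    return 3      # severe degeneration
```

For example, grades of 2 for “Height Loss”, 1 for “Osteophyte Formation” and 0 for “Diffuse Sclerosis” sum to 3 points, which falls into grade 1 (mild degeneration).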

“Height Loss” is defined as the average anterior and posterior (but not central) decrease in disc height relative to the respective height before degeneration. The anterior height before degeneration is estimated based on the normal values reported by Frobin et al. [15] (Fig. 1; Table 2). To account for interindividual differences, the ranges of normal disc height should be considered rather than their mean value. The posterior height before degeneration is estimated as being smaller than or equal to, but not greater than, the respective anterior height [12]. The central disc height was not included in the assessment of “Height Loss” since at this position the height increases in some cases of osteoporosis (fish-vertebra deformity).

Fig. 1.

To assess the degree of height loss, first, the actual disc height has to be determined. For this purpose, the anterior and posterior edges of the adjacent vertebral bodies (small white circles) are defined as those points having the largest distance to the centre of the vertebral body (black points). Then, the distance of each of these four edges to the midplane of the disc (dashed line) is measured. Finally, the sum of the two anterior distances is defined as the actual anterior disc height, and the sum of the two posterior distances is defined as the actual posterior disc height. This procedure is meant to support the estimation of actual disc height, but does not have to be carried out using drawings or digitisation. In a second step, this actual height is compared to the respective height before degeneration, which is estimated based on the normal values reported by Frobin et al. [15] (Table 2)

Table 2.

Normal values of anterior disc height normalised to the antero-posterior diameter of the cranial vertebral body (=100%) (mean of male and female subjects according to Frobin et al. [15])

Normal values of anterior disc height (modified according to Frobin et al. [15])

| Level | Mean (%) | Mean – 2 SD (%) | Mean + 2 SD (%) |
|---|---|---|---|
| T12-L1 | 24 | 18 | 30 |
| L1-2 | 29 | 22 | 36 |
| L2-3 | 33 | 26 | 41 |
| L3-4 | 37 | 29 | 45 |
| L4-5 | 42 | 32 | 51 |
| L5-S1 | 41 | 31 | 51 |

Mean mean value, SD standard deviation
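As a rough illustration of how Table 2 and the height-loss thresholds of Table 1 fit together, the following Python sketch converts the normalised values into an absolute range for a given vertebral body diameter and grades a disc from its anterior and posterior heights. All function and parameter names are hypothetical. Averaging the anterior and posterior percentage losses is one plausible reading of "average anterior and posterior decrease" and is an assumption of this sketch, as is the clipping of negative losses to zero. The height before degeneration remains an estimate that the observer has to supply.

```python
# Normal anterior disc height as a percentage of the antero-posterior diameter
# of the cranial vertebral body: (mean - 2 SD, mean, mean + 2 SD), from Table 2.
NORMAL_ANTERIOR_HEIGHT_PCT = {
    "T12-L1": (18, 24, 30),
    "L1-2": (22, 29, 36),
    "L2-3": (26, 33, 41),
    "L3-4": (29, 37, 45),
    "L4-5": (32, 42, 51),
    "L5-S1": (31, 41, 51),
}


def normal_anterior_height(level: str, cranial_ap_diameter: float) -> tuple:
    """Return (lower, mean, upper) normal anterior height in the unit of the diameter."""
    return tuple(pct * cranial_ap_diameter / 100 for pct in NORMAL_ANTERIOR_HEIGHT_PCT[level])


def _loss_percent(actual: float, before: float) -> float:
    if before <= 0:
        raise ValueError("height before degeneration must be positive")
    # Assumption: a disc higher than the estimate is treated as 0% loss.
    return max(0.0, 100.0 * (before - actual) / before)


def height_loss_grade(anterior_actual: float, anterior_before: float,
                      posterior_actual: float, posterior_before: float) -> int:
    """Grade 'Height Loss' (0-3) from the mean anterior/posterior percentage loss."""
    mean_loss = (_loss_percent(anterior_actual, anterior_before)
                 + _loss_percent(posterior_actual, posterior_before)) / 2
    if mean_loss == 0:
        return 0
    if mean_loss < 33:
        return 1
    if mean_loss < 66:
        return 2
    return 3
```

In practice an exact loss of 0% will rarely be measured, so an observer would apply judgement near the grade 0/1 boundary; the sketch does not model that.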

The variable “Osteophyte Formation” is assessed in terms of the number and length of osteophytes growing at the two anterior, two posterior, two left lateral and two right lateral edges of the adjacent vertebral bodies (Fig. 2).

Fig. 2.

To assess the variable “Osteophyte Formation”, the two anterior (e1, e2), two posterior (e3, e4), two right lateral (e5, e6) and two left lateral edges (e7, e8) of the adjacent vertebral bodies are screened for osteophytes. Their number is counted and their length is measured along their long axis beginning at the former border of the vertebral body and ending at their tips (white lines in the edges e1, e2, e5, e6, e7 and e8)
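The osteophyte scoring described here and in Table 1 can be sketched as follows. The function names are hypothetical, and the sketch assumes one length value per edge (for instance the longest osteophyte found at that edge), with 0 meaning no osteophyte; the source does not spell out how several osteophytes on the same edge would be handled.

```python
from typing import Sequence


def osteophyte_points(length_mm: float) -> int:
    """Points for a single edge according to the osteophyte length in mm."""
    if length_mm < 0:
        raise ValueError("length must not be negative")
    if length_mm == 0:
        return 0  # no osteophyte
    if length_mm < 3:
        return 1
    if length_mm < 6:
        return 2
    return 3


def osteophyte_grade(edge_lengths_mm: Sequence[float]) -> int:
    """Grade 'Osteophyte Formation' (0-3) from the eight edges e1..e8."""
    if len(edge_lengths_mm) != 8:
        raise ValueError("expected one length for each of the eight edges")
    total = sum(osteophyte_points(length) for length in edge_lengths_mm)  # 0 to 24
    if total == 0:
        return 0
    if total <= 8:
        return 1
    if total <= 16:
        return 2
    return 3
```

A disc with a single 2 mm osteophyte at one anterior edge would score 1 point in total and therefore receive grade 1 for this variable.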

The variable “Diffuse Sclerosis” is graded in terms of the number of predefined regions that are affected by sclerosis (Fig. 3). A thickening of the bony endplates should not be counted if it is not diffuse.

Fig. 3.

The variable “Diffuse Sclerosis” is assessed on the lateral radiographs only. The lower half of the upper vertebral body and the upper half of the lower vertebral body are each divided into four regions. Then, the number of regions covered by sclerosis is counted. Note that a partially covered region is counted as if it were completely covered. In this example, the number of affected regions (asterisk) would be three for the upper and three for the lower vertebral body
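The region counting can be turned into a grade as in the sketch below. It assumes that the fractions 0.25 and 0.5 in Table 1 correspond to one and two of the four regions per vertebral body, which is an interpretation based on the figure rather than something the source states explicitly; the function names are hypothetical.

```python
def sclerosis_points(affected_regions: int) -> int:
    """Points for one vertebral body from the number of affected regions (0-4)."""
    if affected_regions not in (0, 1, 2, 3, 4):
        raise ValueError("affected_regions must be between 0 and 4")
    fraction = affected_regions / 4
    if fraction == 0:
        return 0
    if fraction <= 0.25:
        return 1
    if fraction <= 0.5:
        return 2
    return 3  # more than half of the regions affected


def sclerosis_grade(upper_regions: int, lower_regions: int) -> int:
    """Grade 'Diffuse Sclerosis' (0-3) from both adjacent vertebral bodies."""
    total = sclerosis_points(upper_regions) + sclerosis_points(lower_regions)  # 0 to 6
    if total == 0:
        return 0
    if total <= 2:
        return 1
    if total <= 4:
        return 2
    return 3
```

Under this reading, the example in the figure (three affected regions in each vertebral body) gives 3 + 3 = 6 points and therefore grade 3.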

To validate the new radiographic grading system, first, the radiographic degrees of degeneration of 44 intervertebral discs from 16 fresh frozen mono or polysegmental human osteoligamentous lumbar spine specimens were determined. The age of the donors ranged between 16 and 91 years (mean 66 years) and none of them had a known history of trauma or spinal disease. These radiographic degrees of degeneration were then compared to the respective macroscopic ones, which were defined as “real” degrees of degeneration. For this purpose, the specimens were first x-rayed in the lateral and postero-anterior direction (43805 X-Ray System, Faxitron Series, Hewlett Packard, USA; film to source distance 61 cm) using a tube voltage of 45–50 kV and an exposure time of 5 min. Then, still being frozen, they were cut in the mid-sagittal plane. The cutting surfaces were photographed and stored for evaluation. To be able to directly compare the radiographic with the macroscopic degrees of degeneration, the macroscopic grading system also covered the three variables “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis” (Tables 1 and 3). However, macroscopically, the three variables “Nucleus Pulposus”, “Annulus Fibrosus” and “Endplate Cartilage” were added to reflect the “real” degree of degeneration as closely as possible (modified according to Thompson et al. [32]).

Table 3.

Macroscopic grading system for lumbar intervertebral disc degeneration used as the “gold standard” to define the “real” degree of degeneration (modified according to the systems found in literature)

Macroscopic grading system for lumbar intervertebral disc degeneration (based on mid-sagittal sections)

| | Nucleus pulposus | Annulus fibrosus | Endplate cartilage |
|---|---|---|---|
| Assessment | Appearance | Appearance | Sum of points of both cartilaginous endplates. Normal: 0 points; Thickness irregular: 1 point; Focal defect(s): 2 points; (Almost) complete destruction: 3 points |
| Grading | 0 = Bulging gel; 1 = Fibrous tissue, loss of annular-nuclear demarcation; 2 = Focal clefts; 3 = Complete disruption of nucleus or complete transformation into tissue other than nucleus tissue | 0 = Discrete fibrous lamellas; 1 = Mucinous infiltration; 2 = Focal disruptions; 3 = Complete disruption of anterior plus posterior annulus or complete transformation into tissue other than annulus tissue | 0 = 0 points; 1 = 1–2 points; 2 = 3–4 points; 3 = 5–6 points |

| | Height loss | Osteophyte formation | Diffuse sclerosis | Overall degree of degeneration |
|---|---|---|---|---|
| Assessment | Anterior and posterior height loss with respect to the individual height before degeneration | Sum of points of four edges. No osteophytes: 0 points; <3 mm: 1 point; ≥3 mm but <6 mm: 2 points; ≥6 mm: 3 points | Sum of points of both adjacent vertebral bodies. No sclerosis: 0 points; 0.25 partially or completely affected: 1 point; 0.5 partially or completely affected: 2 points; >0.5 partially or completely affected: 3 points | Sum of points of “Nucleus Pulposus”, “Annulus Fibrosus”, “Endplate Cartilage”, “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis” |
| Grading | 0 = 0%; 1 = <33%; 2 = ≥33 but <66%; 3 = ≥66% | 0 = 0 points; 1 = 1–4 points; 2 = 5–8 points; 3 = 9–12 points | 0 = 0 points; 1 = 1–2 points; 2 = 3–4 points; 3 = 5–6 points | 0 points = grade 0 (no degeneration); 1–6 points = grade 1 (mild degeneration); 7–12 points = grade 2 (moderate degeneration); 13–18 points = grade 3 (severe degeneration) |

The six variables “Nucleus Pulposus”, “Annulus Fibrosus”, “Endplate Cartilage”, “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis” are first graded individually on a scale from 0 to 3. The “Overall Degree of Degeneration” is then assigned according to the sum of these six scores. Note that the macroscopic and the radiographic grading system are almost identical for the three variables, namely, “Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis”
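The macroscopic system differs from the radiographic one mainly in the number of variables and the resulting point ranges. A minimal sketch of the six-variable sum from Table 3 is given below; the variable keys and the function name are assumptions chosen for illustration.

```python
MACROSCOPIC_VARIABLES = (
    "nucleus_pulposus",
    "annulus_fibrosus",
    "endplate_cartilage",
    "height_loss",
    "osteophyte_formation",
    "diffuse_sclerosis",
)


def overall_macroscopic_grade(grades: dict) -> int:
    """Return the overall macroscopic grade (0-3) from six variable grades (each 0-3)."""
    missing = [name for name in MACROSCOPIC_VARIABLES if name not in grades]
    if missing:
        raise ValueError(f"missing variables: {missing}")
    for name in MACROSCOPIC_VARIABLES:
        if grades[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} must be a grade from 0 to 3")

    total = sum(grades[name] for name in MACROSCOPIC_VARIABLES)  # 0 to 18 points
    if total == 0:
        return 0  # no degeneration
    if total <= 6:
        return 1  # mild degeneration
    if total <= 12:
        return 2  # moderate degeneration
    return 3      # severe degeneration
```

Because a single point in any of the six variables already yields grade 1, a disc with early nucleus changes only can be grade 1 macroscopically while showing no radiographic sign at all.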

Using this modified macroscopic grading system, the “real” degree of degeneration of the 44 lumbar discs was determined by two observers independently. Both of them were familiar with disc degeneration and had several years of experience in spinal research. The “real” degree of degeneration was then defined as the mean value of the results of both observers. Then, the 44 discs were additionally graded radiographically by one of these two observers. In order to ensure that this observer was not biased by the evaluation of the macroscopic slices carried out a few days before, the radiographs were blinded and put in a randomised order. The postero-anterior radiographs of four discs could not be evaluated due to poor quality. These four discs could therefore only be included for the variables “Height Loss” and “Diffuse Sclerosis”. The remaining 40 discs, however, could be evaluated completely. To statistically assess the agreement between the radiographic and the “real” macroscopic degree of degeneration, weighted Kappa coefficients (quadratic weights) with 95% confidence limits (95% CL) were calculated according to Fleiss and Cohen [13] using the software SAS 8.2 [30]. These calculations were carried out under the assumption that the observations of the individual intervertebral discs were independent.

In order to show whether the grade assigned to a disc depends on the degree of experience of the observer, the agreement between one experienced and one inexperienced observer was determined. Both observers graded the lateral and postero-anterior radiographs of 27 osteoligamentous mono or polysegmental spine specimens with an overall of 84 lumbar intervertebral discs. The age of the donors ranged between 16 and 92 years (mean 67 years) and none of them had a known history of trauma or spinal disease. The experienced observer was the one who also evaluated the macroscopic slices and the radiographs for validation. In contrast, the inexperienced observer, being a mechanical engineer without any medical training, had no experience in reading radiographs and was not familiar with disc degeneration. However, he was trained before grading the discs: the grading system was explained using some training radiographs. Furthermore, the radiographic appearance of the most common spinal diseases such as osteoporosis, osteoporotic fractures, fish-vertebra deformities, spondylolysis, Bechterew’s disease or spinal metastases was demonstrated in a 30 min session. Then, written instructions were handed over, in which the assessment of the three variables was explained again and the normal values of anterior lumbar disc height were listed similar to the Figs. 1, 2, 3 and to Table 2. Besides these instructions, the inexperienced observer did not get any further help during grading.

Statistically, the agreement between the ratings of the experienced and the inexperienced observer was evaluated using the same type of weighted Kappa coefficient as for validation [13]. For both the validation and the assessment of the interobserver agreement, a Kappa of <0.00 was interpreted as poor agreement, 0.00–0.20 as slight agreement, 0.21–0.40 as fair agreement, 0.41–0.60 as moderate agreement, 0.61–0.80 as substantial agreement and 0.81–1.00 as almost perfect agreement [22].
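For readers who want to reproduce this kind of analysis, the following Python sketch computes a quadratic weighted Kappa for two raters who use the same ordered categories and maps the result onto the interpretation scale given above. It is not the SAS procedure used in the study and does not compute the 95% confidence limits; the function names are hypothetical, and raising an error when chance-expected disagreement is zero is a design choice of this sketch.

```python
from typing import Sequence


def quadratic_weighted_kappa(ratings_a: Sequence[int], ratings_b: Sequence[int],
                             categories: Sequence[int] = (0, 1, 2, 3)) -> float:
    """Weighted Kappa with quadratic disagreement weights for two raters."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters need the same, non-zero number of ratings")
    k = len(categories)
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)

    # Observed contingency table of rater A (rows) against rater B (columns).
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row_totals[i] * col_totals[j] / n
    if expected_disagreement == 0:
        raise ValueError("Kappa is undefined when all ratings fall into one category")
    return 1.0 - observed_disagreement / expected_disagreement


def interpret_kappa(kappa: float) -> str:
    """Verbal interpretation of Kappa following the scale quoted in the text."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Quadratic weights penalise a disagreement of two grades four times as heavily as a disagreement of one grade, which suits an ordinal scale such as the 0 to 3 grades used here.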

Results

The agreement between the macroscopic ratings of the two experienced observers was almost perfect (Kappa between 0.874 and 0.920) for the overall degree of degeneration and the variables “Height Loss”, “Nucleus Pulposus”, “Annulus Fibrosus” and “Endplate Cartilage” (Table 4). For the variables “Osteophyte Formation” and “Diffuse Sclerosis” the agreement was somewhat lower, but still substantial (Kappa 0.675 and 0.707, respectively). Given this good agreement, the rating of a single observer could almost have been defined as the “real” degree of degeneration. To further increase objectivity, however, the average ratings of both were used instead.

Table 4.

Agreement between the macroscopic ratings of the two experienced observers (weighted Kappa coefficients with 95% CL)

Interobserver agreement macroscopy (n=44 lumbar intervertebral discs)

| Variable | Kappa | 95% CL lower | 95% CL upper |
|---|---|---|---|
| Height loss | 0.905 | 0.848 | 0.962 |
| Osteophyte formation | 0.675 | 0.537 | 0.814 |
| Diffuse sclerosis | 0.707 | 0.504 | 0.910 |
| Nucleus pulposus | 0.879 | 0.796 | 0.963 |
| Annulus fibrosus | 0.874 | 0.787 | 0.961 |
| Endplate cartilage | 0.913 | 0.852 | 0.973 |
| Overall grade | 0.920 | 0.865 | 0.975 |

CL confidence limits

The validation of the radiographic grading system revealed an almost perfect agreement with the macroscopic, “real” degree of degeneration for the variable “Height Loss” (Kappa 0.862) and a slightly lower but still substantial agreement for “Osteophyte Formation” (Kappa 0.613) (Table 5). For the overall degree of degeneration, the agreement was also substantial. The radiographic grades, however, tended to be lower than the macroscopic ones: in 20 out of 40 discs the “real” overall degree of degeneration was underestimated, but in only three was it overestimated (Fig. 4). For the variable “Diffuse Sclerosis”, Kappa was 0.343, reflecting only fair agreement. In this case, far fewer sclerotic areas were detected radiographically than macroscopically.

Table 5.

Agreement between the radiographic and the macroscopic “real” degrees of degeneration of 40 and 44 lumbar intervertebral discs, respectively (weighted Kappa coefficients with 95% CL)

Agreement between radiography and macroscopy (n=44 lumbar intervertebral discs; values marked a: n=40)

| Variable | Kappa | 95% CL lower | 95% CL upper |
|---|---|---|---|
| Height loss | 0.862 | 0.783 | 0.941 |
| Osteophyte formation | 0.613 a | 0.463 a | 0.763 a |
| Diffuse sclerosis | 0.343 | 0.121 | 0.565 |
| Overall grade b | 0.714 a | 0.587 a | 0.841 a |

CL confidence limits

b Note that the overall degree of degeneration covers only three variables in the radiographic grading system (“Height Loss”, “Osteophyte Formation” and “Diffuse Sclerosis”), but six variables in the macroscopic grading system (additionally “Nucleus Pulposus”, “Annulus Fibrosus” and “Endplate Cartilage”)

Fig. 4.

Agreement between the radiographic and the macroscopic “real” degree of degeneration of 40 and 44 lumbar intervertebral discs, respectively. Each field contains the number of discs rated with 0, 1, 2 or 3 points radiographically (rating of one experienced observer) and with 0, 0.5, 1, 1.5, 2, 2.5 or 3 points macroscopically (mean value of the ratings of two experienced observers)
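The comparison summarised in this figure can be illustrated with a small Python sketch that forms the mean macroscopic grade of two observers and counts how often the radiographic grade lies below or above it. The function names are hypothetical, and the example data in the usage note are invented for illustration. Counting every radiographic grade below the mean macroscopic grade as an underestimation, including differences of half a grade, is an assumption; the source does not state how half grades were treated.

```python
from typing import Sequence


def real_grade(observer_1: int, observer_2: int) -> float:
    """Mean macroscopic grade of two observers (steps of 0.5 between 0 and 3)."""
    return (observer_1 + observer_2) / 2


def estimation_counts(radiographic: Sequence[int], real: Sequence[float]) -> dict:
    """Count discs whose radiographic grade is below, above or equal to the real grade."""
    if len(radiographic) != len(real):
        raise ValueError("both sequences must describe the same discs")
    under = sum(1 for r, m in zip(radiographic, real) if r < m)
    over = sum(1 for r, m in zip(radiographic, real) if r > m)
    return {
        "underestimated": under,
        "overestimated": over,
        "equal": len(radiographic) - under - over,
    }
```

With invented ratings for four discs, radiographic grades `[1, 2, 0, 3]` and macroscopic grades `[2, 2, 1, 3]` and `[2, 3, 0, 3]` from the two observers, the mean macroscopic grades are 2.0, 2.5, 0.5 and 3.0, so three discs count as underestimated and one as equal.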

The agreement between the radiographic ratings of the experienced and the inexperienced observer was substantial (Kappa between 0.681 and 0.798) for all the three variables as well as for the overall degree of degeneration (Table 6). However, the inexperienced observer generally tended to assign lower degrees of degeneration than the experienced one (Fig. 5). For example, concerning the overall degree of degeneration, 15 discs were rated 1° lower by the inexperienced observer but only one disc was rated 1° higher. Nevertheless, most ratings were identical: the same degree of “Height Loss” was assigned by both observers to 65% of all the discs, the same degree of “Osteophyte Formation” to 79%, the same degree of “Diffuse Sclerosis” to 80% and the same overall degree of degeneration to 81% of all the discs. The differences in grade assignment were never higher than 1° except for one disc concerning the variable “Osteophyte Formation” and two discs concerning the variable “Diffuse Sclerosis”, where the difference was 2°. Differences of more than 2° did not occur.

Table 6.

Agreement between the radiographic ratings of one experienced and one inexperienced observer (weighted Kappa coefficients with 95% CL)

Interobserver agreement radiography (n=84 lumbar intervertebral discs)

| Variable | Kappa | 95% CL lower | 95% CL upper |
|---|---|---|---|
| Height loss | 0.798 | 0.713 | 0.884 |
| Osteophyte formation | 0.687 | 0.559 | 0.814 |
| Diffuse sclerosis | 0.681 | 0.490 | 0.872 |
| Overall grade | 0.787 | 0.702 | 0.872 |

CL confidence limits

Fig. 5.

Agreement between the radiographic ratings of one experienced and one inexperienced observer. Each field contains the number of lumbar intervertebral discs rated with the respective scores

Discussion

In this study, the radiographic grading systems for lumbar intervertebral disc degeneration available from the literature were combined into a new one, in which undefined and subjective terms were replaced by better defined and more objective ones. Similar to the grading system of Mimura et al. [25], the height loss of the disc was estimated as the percentage decrease in height relative to the height before degeneration. Osteophytes were assessed in terms of their number and absolute length, and the degree of sclerosis was determined according to the number of predefined areas that were affected.

Despite these attempts to create a more objective grading system than those known from literature, a certain degree of subjectivity still remained. In the assessment of the variable “Height Loss”, for example, the initial disc height still needs to be estimated. In vivo, this estimation becomes even more difficult due to the diurnal changes in disc height [2, 5, 29]. But even in vitro, its assessment is difficult due to the large spread of normal values [15]. Thus, a wide variety of different estimations are possible. Therefore, it is not surprising that the ratings of the two observers were not equal: for the inexperienced observer the height loss of the discs often seemed to be less severe than for the experienced one.

In contrast, the variable “Osteophyte Formation” could be defined much more objectively. Nevertheless, the inexperienced observer tended to see fewer osteophytes. For example, the inexperienced observer tended to define pointed edges as normal, whereas the experienced one tended to define them as osteophytes. Thus, even though the terms used in the new grading system are more objective than those used in the systems known from the literature, interobserver differences still have to be expected. The tendencies seen in this study, however, should not be generalised to all experienced and inexperienced observers, since the ratings of only one experienced and one inexperienced observer were compared with each other and since the quality of radiographs varies. For example, the alignment of the patient during X-raying may be more difficult than the alignment of a spine specimen.

However, despite these tendencies, the differences between the ratings of the two observers were small: for the three variables and the overall degree of degeneration they did not differ by more than 1° in all except for three cases where the difference was 2°. Furthermore, the interobserver agreement was substantial with Kappa coefficients of 0.798 for “Height Loss”, 0.687 for “Osteophyte Formation”, 0.681 for “Diffuse Sclerosis” and 0.787 for the overall degree of degeneration. Even though these values reflect the agreement between two observers with different levels of experience, they were not much lower than those reported by Lane et al. [23] for three observers with similar experience. The Kappa coefficients of Lane et al. were 0.95 for “Narrowing”, 0.91 for “Osteophytes” and 0.93 for the “Summary Grades”. For “Sclerosis”, however, Lane et al. reported a Kappa of only 0.55. Thus, even though the three observers of Lane et al. were all experienced, their interobserver agreement for this variable was significantly lower than the respective agreement of the new system. In contrast to Lane et al., but similar to the present study, Madan et al. [24] reported the agreement between five observers with different levels of experience. Their interobserver Kappa coefficients varied between 0.351 and 0.673 for the overall disc grade and thus were lower than the respective value of the new system. These results indicate that the use of undefined terms may work with experienced observers for the variables “Height Loss” and “Osteophyte Formation”, but does not work for the variable “Sclerosis” or with inexperienced observers.

Similar to the work of Madan et al., the agreement between observers with different degrees of experience was also reported by Pfirrmann et al., who developed a grading system based on magnetic resonance images [27]. The Kappa coefficients reported by this group ranged between 0.74 and 0.81 for the overall disc grade. This range covers the respective value for the new radiographic system (0.787), but is higher than the range reported by Madan et al. (0.351–0.673) [24]. These differences between Pfirrmann et al. and Madan et al. indicate that the assessment of signal intensity and homogeneity on magnetic resonance images may per se be more objective than the assessment of bony structures and densities on radiographs. This would probably also be the case for Modic’s classification of vertebral body marrow changes [26]. According to Modic et al., these changes are associated with degenerative disc disease and, thus, are often used to quantify disc degeneration.

Due to their small number, each of the three variables “Height Loss”, “Diffuse Sclerosis” and “Osteophyte Formation” strongly influences the overall degree of degeneration. Thus, discs of one and the same overall degree of degeneration may have completely different appearances. Depending on the purpose of the study, it might therefore be advantageous to report the three variables individually instead of the overall degree of degeneration only. Another possibility to reduce the weight of each variable would be to include further variables such as “Listhesis” or “Disc Calcification” in the grading system. These variables are assumed to be associated with disc degeneration and can be seen on X-rays [7, 8, 35]. An objective grading, however, is difficult. The degree of listhesis seen on a radiograph, for example, strongly depends on the loading of the spine during X-raying. For instance, the degree of listhesis of one and the same patient may be completely different in a lying position when compared to a standing or sitting position. And whether calcifications can be seen on radiographs or not strongly depends on the quality of the radiograph and the voltage used. Therefore, these two variables were not included in the new grading system.

To validate the radiographic grading system, the macroscopic degree of degeneration was defined as being “real”. This definition was used since macroscopic slices directly reflect the changes within the disc, whereas radiographs only depict the surrounding bony structures. Histologically, disc degeneration shows regional variations within one and the same disc [4]. The macroscopic grading system used here, however, does not account for these differences. This also applies to the radiographic grading system since the disc itself cannot be depicted. Thus, in both grading systems, the macroscopic and the radiographic one, the disc is assessed as “average”.

Compared to the macroscopic “real” degrees of degeneration, the radiographic degrees of degeneration tended to be lower. This underestimation has two reasons. First, radiographically, the loss of intervertebral height, the formation of osteophytes and endplate sclerosis are indirect signs of degeneration, while changes within the disc itself cannot be seen directly. Thus, early degenerative changes, such as a discolouring of the nucleus, cannot be detected on radiographs. Similarly, Frobin et al. showed that signal loss within the intervertebral disc is possible without radiographic loss of height [16]. In such cases, the disc may have grade 0 radiographically but grade 1 macroscopically. Thus, in the detection of early degenerative changes within the disc, magnetic resonance imaging may have certain advantages compared to plain radiography. However, according to Benneker et al., a magnetic resonance imaging score does not necessarily correlate better with morphology than a radiographic score [3]. Second, the variable “Diffuse Sclerosis” is easily underestimated on radiographs. This underestimation may become even more pronounced if radiographs of patients instead of osteoligamentous specimens have to be rated, since on the radiographs of patients much more tissue surrounds the spine and influences the X-ray transparency around the vertebral bodies. The only variable for which radiography revealed a higher degree of degeneration than macroscopy was “Osteophyte Formation”, since macroscopically the assessment of osteophytes was restricted to the mid-sagittal plane.

Despite these discrepancies between the radiographic and the macroscopic “real” degrees of degeneration, however, the agreement for the overall degree of degeneration still was substantial (Fig. 6). Thus, the overall validity of the new radiographic grading system is deemed to be good.

Fig. 6.

Examples of the four degrees of degeneration

Conclusions

In conclusion, we believe that the new radiographic grading system is an almost objective, valid and reliable tool if the degree of degeneration of individual lumbar intervertebral discs has to be quantified. However, the user should always remember that the radiographic degree of degeneration tends to be lower than the “real” macroscopic one and that slight differences between the ratings of observers with different degrees of experience have to be expected.

This study focused on the agreement between one experienced and one inexperienced observer in order to evaluate the objectivity of the new system. Other parameters, such as the intraobserver agreement, the agreement between observers with similar degrees of experience, the agreement between whole institutions or the effect of the quality of the radiographs on the ratings, need to be investigated in future studies.

Footnotes

An erratum to this article can be found at http://dx.doi.org/10.1007/s00586-006-1078-8

References

  • 1. Adams MA, Dolan P, Hutton WC. The stages of disc degeneration as revealed by discograms. J Bone Joint Surg Br. 1986;68(1):36–41. doi: 10.1302/0301-620X.68B1.3941139.
  • 2. Althoff I, Brinckmann P, Frobin W, Sandover J, Burton K. An improved method of stature measurement for quantitative determination of spinal loading. Application to sitting postures and whole body vibration. Spine. 1992;17(6):682–693. doi: 10.1097/00007632-199206000-00008.
  • 3. Benneker LM, Heini PF, Anderson SE, Alini M, Ito K. Correlation of radiographic and MRI parameters to morphological and biochemical assessment of intervertebral disc degeneration. Eur Spine J. 2005;14(1):27–35. doi: 10.1007/s00586-004-0759-4.
  • 4. Boos N, Weissbach S, Rohrbach H, Weiler C, Spratt KF, Nerlich AG. Classification of age-related changes in lumbar intervertebral discs: 2002 Volvo Award in basic science. Spine. 2002;27(23):2631–2644. doi: 10.1097/00007632-200212010-00002.
  • 5. Botsford DJ, Esses SI, Ogilvie-Harris DJ. In vivo diurnal variation in intervertebral disc volume and morphology. Spine. 1994;19(8):935–940. doi: 10.1097/00007632-199404150-00012.
  • 6. Brooker AE, Barter RW. Cervical spondylosis. A clinical study with comparative radiology. Brain. 1965;88(5):925–936. doi: 10.1093/brain/88.5.925.
  • 7. Chanchairujira Radiology. 2004;230:499. doi: 10.1148/radiol.2302011842.
  • 8. Cheng Skeletal Radiol. 1996;25:231. doi: 10.1007/s002560050070.
  • 9. Collins DH (1949) The pathology of articular and spinal diseases. Edward Arnold & Co., CO
  • 10. Coventry MB. The intervertebral disc: its macroscopic anatomy and pathology: Part III. Pathological changes in the intervertebral disc lesion. J Bone Joint Surg (Am) 1945;27:460–473.
  • 11. Coventry MB. The intervertebral disc: its microscopic anatomy and pathology: Part II. Changes in the intervertebral disc concomitant with age. J Bone Joint Surg (Am) 1945;27:233–247.
  • 12. Farfan HF. Mechanical disorders of the low back. Philadelphia: Lea & Febiger; 1973.
  • 13. Fleiss J, Cohen J. The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educ Psychol Meas. 1973;33:613–619.
  • 14. Friberg S, Hirsch C. Anatomical and clinical studies on lumbar disc degeneration. Acta Orthop Scand. 1949;19:222–242. doi: 10.3109/17453674908991095.
  • 15. Frobin W, Brinckmann P, Biggemann M, Tillotson M, Burton K. Precision measurement of disc height, vertebral height and sagittal plane displacement from lateral radiographic views of the lumbar spine. Clin Biomech (Bristol, Avon) 1997;12(Suppl. 1):S1–S63. doi: 10.1016/s0268-0033(96)00067-8.
  • 16. Frobin W, Brinckmann P, Kramer M, Hartwig E. Height of lumbar discs measured from radiographs compared with degeneration and height classified from MR images. Eur Radiol. 2001;11(2):263–269. doi: 10.1007/s003300000556.
  • 17. Gordon SJ, Yang KH, Mayer PJ, Mace AH, Jr, Kish VL, Radin EL. Mechanism of disc rupture. A preliminary report. Spine. 1991;16(4):450–456. doi: 10.1097/00007632-199104000-00011.
  • 18. Hirsch C. Some morphological changes in the cervical spine during ageing. In: Hirsch C, Zotterman Y, editors. Cervical pain. Oxford: Pergamon Press; 1972. pp. 21–32.
  • 19. Kellgren JH, Jeffrey MR, Ball J (1963) In: The epidemiology of chronic rheumatism. Vol. II: Atlas of standard radiographs of arthritis. Blackwell Scientific Publications, Oxford, pp 14–19
  • 20. Kellgren JH, Lawrence JS. Rheumatism in miners. II. X-ray study. Br J Ind Med. 1952;9(3):197–207. doi: 10.1136/oem.9.3.197.
  • 21. Kettler A, Wilke H-J (accepted) Review of existing grading systems for cervical and lumbar disc and facet joint degeneration. Eur Spine J
  • 22. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–174.
  • 23. Lane NE, Nevitt MC, Genant HK, Hochberg MC. Reliability of new indices of radiographic osteoarthritis of the hand and hip and lumbar disc degeneration. J Rheumatol. 1993;20(11):1911–1918.
  • 24. Madan SS, Rai A, Harley JM. Interobserver error in interpretation of the radiographs for degeneration of the lumbar spine. Iowa Orthop J. 2003;23:51–56.
  • 25. Mimura M, Panjabi MM, Oxland TR, Crisco JJ, Yamamoto I, Vasavada A. Disc degeneration affects the multidirectional flexibility of the lumbar spine. Spine. 1994;19(12):1371–1380. doi: 10.1097/00007632-199406000-00011.
  • 26. Modic MT, Steinberg PM, Ross JS, Masaryk TJ, Carter JR. Degenerative disk disease: assessment of changes in vertebral body marrow with MR imaging. Radiology. 1988;166(1 Pt 1):193–199. doi: 10.1148/radiology.166.1.3336678.
  • 27. Pfirrmann CW, Metzdorf A, Zanetti M, Hodler J, Boos N. Magnetic resonance classification of lumbar intervertebral disc degeneration. Spine. 2001;26(17):1873–1878. doi: 10.1097/00007632-200109010-00011.
  • 28. Resnick D. Degenerative diseases of the vertebral column. Radiology. 1985;156(1):3–14. doi: 10.1148/radiology.156.1.3923556.
  • 29. Roberts N, Hogg D, Whitehouse GH, Dangerfield P. Quantitative analysis of diurnal variation in volume and water content of lumbar intervertebral discs. Clin Anat. 1998;11(1):1–8. doi: 10.1002/(SICI)1098-2353(1998)11:1<1::AID-CA1>3.0.CO;2-Z.
  • 30. SAS (1999) SAS Institute Inc., Cary, NC, USA
  • 31. Silberstein CE. The evolution of degenerative changes in the cervical spine and an investigation into the “Joints of Luschka”. Clin Orthop. 1965;40:184–204.
  • 32. Thompson JP, Pearce RH, Schechter MT, Adams ME, Tsang IK, Bishop PB. Preliminary evaluation of a scheme for grading the gross morphology of the human intervertebral disc. Spine. 1990;15(5):411–415. doi: 10.1097/00007632-199005000-00012.
  • 33. Töndury G. The behaviour of the cervical discs during life. In: Hirsch C, Zotterman Y, editors. Cervical pain. Oxford:  ; 1972. pp. 59–66.
  • 34. Vernon-Roberts B, Pirie CJ. Degenerative changes in the intervertebral discs of the lumbar spine and their sequelae. Rheumatol Rehabil. 1977;16(1):13–21. doi: 10.1093/rheumatology/16.1.13.
  • 35. Vogt Spine. 1999;24:2536. doi: 10.1097/00007632-199912010-00016.

Articles from European Spine Journal are provided here courtesy of Springer-Verlag
