Abstract
Objective:
To evaluate the validity and reliability of the cervical vertebral maturation (CVM) method with a longitudinal sample.
Materials and Methods:
Eighty-six cephalograms from 18 subjects (5 males and 13 females) were selected from a longitudinal database. Total mandibular length was measured on each film; its rate of increase served as the gold standard in examining the validity of the CVM method. Eleven orthodontists, after receiving intensive training in the CVM method, evaluated all films twice. Kendall's W and the weighted kappa statistic were used to assess agreement.
Results:
Kendall's W values were higher than 0.8 at both times, indicating strong interobserver reproducibility, but interobserver percent agreement was below 50% at both times. A wide range of intraobserver percent agreement was noted (40.7%–79.1%), and substantial intraobserver reproducibility was indicated by kappa values (0.53–0.86). With regard to validity, moderate agreement was found between the gold standard and observer staging at the initial time (kappa values 0.44–0.61). However, percent agreement seemed to be unacceptable for clinical use, especially in cervical stage 3 (26.8%).
Conclusions:
Even though the validity and reliability of the CVM method proved statistically acceptable, we suggest that many other growth indicators should be taken into consideration in evaluating adolescent skeletal maturation.
Keywords: Growth and development, Cervical vertebral maturation, Reliability and validity
INTRODUCTION
Correction of malocclusion and treatment prognosis are strongly influenced by growth. The most favorable time to address many orthodontic problems with skeletal manifestations is during the period of pubertal growth. Assessment of skeletal maturation is essential in planning individual orthodontic treatment because of marked individual variations in timing, duration, and intensity of pubertal growth.
Lamparski1 noted that maturational changes in the cervical vertebrae on routine lateral cephalograms could be used as indicators to measure biologic age without additional radiographs. He developed the first method to assess skeletal age from the cervical vertebrae. Thereafter, several cervical vertebral maturation (CVM) methods were proposed, with successive modifications and improvements.2–5 However, all these methods depend on the observer's subjective judgment of morphologic changes in cervical vertebral bodies.1 Purely subjective assessment is easy to learn and to carry out in the clinic, but the subjectivity can cause significant errors in clinical use in terms of validity and reliability.
The CVM method proposed by Baccetti et al.5 in 2005 has been used worldwide in orthodontic clinics. Two characteristics set this method apart from others. One is that longitudinal radiographic data were used in its development, whereas other CVM methods used cross-sectional materials. The other is that it was established according to changes in the annual rate of increase in total mandibular length (Co-Gn) instead of by comparison with hand-wrist skeletal age.
The validity of Baccetti's method5 has not been proved with a proper gold standard. Because this method was established on the basis of longitudinal yearly changes in mandibular length, similar series of materials should be available and should serve as the gold standard. In terms of its reliability, some controversy continues.6,7 Therefore, the objective of this study was to evaluate the validity and reliability of the CVM method in skeletal maturation assessment with longitudinal cephalograms.
MATERIALS AND METHODS
Longitudinal samples were obtained from the Research Center on Craniofacial Growth and Development at Peking University. More than 900 orthodontically untreated subjects participated in this longitudinal project from 1990 through 1996. Sequential lateral cephalograms were taken at yearly intervals for consecutive years. Informed consent was obtained from all subjects and their parents. The study protocol was reviewed and approved by the institutional review board of Peking University. General selection criteria for the project were put forth in the previous study.8
In Baccetti's study,5 the maximum annual increase in Co-Gn over 6 consecutive years was defined as the peak of mandibular growth at puberty. The cephalogram before the peak mandibular growth was defined as cervical stage 3 (CS3), and the one after the peak was defined as CS4.5 Therefore, to test its validity, the maximum annual mandibular growth had to be identified in our longitudinal sample, with the rate of increase in total mandibular length serving as the gold standard. Inclusion criteria were as follows: (1) the subject was younger than 10 years when the first cephalogram was taken; (2) at least four cephalograms obtained over consecutive years were available; (3) the two consecutive cephalograms with the maximum annual Co-Gn increase were located in the middle of the subject's longitudinal series; alternatively, if the maximum annual Co-Gn increase was noted during the first year, it had to be greater than 5 mm and exceed the following yearly increment by at least 2 mm5,9–11; and (4) the lateral cephalograms at each time point were of high clarity and good contrast. The final longitudinal sample comprised 18 subjects (5 males and 13 females) with 86 lateral cephalograms. The average age at the initial visit was 9 years 2 months.
Because the same cephalometer was used throughout this study, the magnification for each radiograph was identical, and no corrections for magnification were made in the linear cephalometric measurements. Total mandibular length (Co-Gn) was measured directly on the cephalograms with a micrometer caliper, accurate to within 0.01 mm, to detect changes in mandibular growth velocity. Gold standards of CVM staging were defined for each subject. The cephalogram before the peak mandibular growth was defined as CS3. Films taken before CS3 at yearly intervals were CS2 and CS1; films taken thereafter were CS4, CS5, and CS6, respectively.
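The gold-standard staging rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure as actually implemented in the study: the function name `gold_standard_stages`, the example lengths, and the choice to return `None` for films that would fall outside CS1 to CS6 are all hypothetical, and ties in the annual increment are resolved by taking the first maximum.

```python
def gold_standard_stages(co_gn_mm):
    """Assign gold-standard cervical stages to one subject's yearly films.

    co_gn_mm: total mandibular length (mm) at consecutive yearly visits.
    The film taken just before the largest annual increment is CS3;
    earlier films count down (CS2, CS1) and later films count up (CS4-CS6).
    Films that would fall outside CS1-CS6 get None (an assumption).
    """
    if len(co_gn_mm) < 2:
        raise ValueError("need at least two yearly measurements")
    # Annual increments between consecutive films.
    increments = [later - earlier for earlier, later in zip(co_gn_mm, co_gn_mm[1:])]
    # Index of the interval with peak growth (first one if tied).
    peak = max(range(len(increments)), key=increments.__getitem__)
    stages = []
    for film in range(len(co_gn_mm)):
        stage = 3 + (film - peak)  # film at index `peak` precedes the peak
        stages.append(stage if 1 <= stage <= 6 else None)
    return stages


# Hypothetical series: the largest increment lies between the 3rd and 4th films.
example = [100.0, 102.1, 104.0, 109.2, 111.5, 112.6]
stages = gold_standard_stages(example)
```

In this hypothetical series the third film is labelled CS3 and the fourth CS4, which mirrors the definition taken from Baccetti's study.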
All lateral cephalograms were digitized to uncompressed TIFF images using a professional dental film scanner (VIDAR Dental Film Digitizer, Vidar Systems Corporation, Herndon, Va) at 300-dpi optical resolution and 8-bit depth grayscale to ensure high imaging quality. Eleven orthodontists with an average of 17.2 years (range, 13–21 years) of clinical experience were invited as observers to stage the 86 digital cephalograms. All were from the Department of Orthodontics at Peking University School and Hospital of Stomatology and had participated in neither the design of this study nor the construction of the sample. Each observer was given sufficient training in the CVM method by one investigator (Dr Zhao). The CVM method was explained to observers, and 20 clinical adolescent cephalograms were used for practice. Then a hard copy of the schematic representation of the CVM method and the two samples of each stage in Baccetti's original article5 were distributed to every observer as references. Digital films were sequenced randomly, and the same set was provided to all observers to judge. After a 4-week washout period, observers were asked to stage the resequenced films.
Statistical Analysis
Because cervical staging is an ordinal variable, the Kendall coefficient of concordance (Kendall's W) and the weighted kappa were employed to assess agreement.6
The reliability of the CVM method was assessed by agreement among observers at each time (interobserver agreement) and by agreement between the two times for each observer (intraobserver agreement). Kendall's W, ranging from 0 (no agreement) to 1 (complete agreement), was used to assess interobserver agreement. Intermediate values of W indicate a greater or lesser degree of agreement among the various responses of observers. Intraobserver agreement was calculated through weighted kappa analysis. Kappa values were interpreted as poor (≤0), slight (0.01–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80), or almost perfect (0.81–1.00) agreement.12 The validity of the CVM method was represented by agreement between the gold standard and estimated staging for the initial time, provided that intraobserver agreement was acceptable. This was also calculated using the weighted kappa statistic. A P value less than .05 was considered significant. Statistical analysis was performed using the Statistical Package for the Social Sciences (SPSS), version 16.0 (SPSS Inc, Chicago, Ill) and Stata 9.0 (StataCorp, College Station, Tex).
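For readers who wish to reproduce these two statistics outside SPSS or Stata, the following Python sketch shows one common formulation of each. It rests on assumptions that the text does not settle: Kendall's W is computed with the usual correction for tied ranks (ties are unavoidable with only six stages), and the kappa uses linear weights, although quadratic weights are an equally common choice. The function names are hypothetical, and the software actually used may differ in detail.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kendalls_w(ratings):
    """Tie-corrected Kendall's W; ratings[r][f] is observer r's stage for film f."""
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(row) for row in ratings]
    totals = [sum(row[f] for row in ranked) for f in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    # Tie term: sum of (t^3 - t) over each group of tied values per observer.
    ties = sum(t ** 3 - t for row in ratings for t in Counter(row).values())
    return 12 * s / (m ** 2 * (n ** 3 - n) - m * ties)


def weighted_kappa(a, b, categories=range(1, 7)):
    """Linearly weighted kappa between two stagings of the same films.

    Undefined (division by zero) if expected disagreement is zero,
    for example when both stagings use a single category.
    """
    cats = list(categories)
    n = len(a)
    observed = Counter(zip(a, b))
    row, col = Counter(a), Counter(b)
    disagree_obs = sum(abs(i - j) * observed[(i, j)] for i in cats for j in cats)
    disagree_exp = sum(abs(i - j) * row[i] * col[j] / n for i in cats for j in cats)
    return 1 - disagree_obs / disagree_exp
```

Here `kendalls_w` would take the 11 observers' stagings of the 86 films at one time point, and `weighted_kappa` would take either one observer's two sessions (reliability) or the gold standard and one observer's initial staging (validity).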
In terms of the linear measurement of mandibular length, 20 randomly selected cephalograms were remeasured 2 weeks later by the same investigator. Method errors were assessed using Dahlberg's formula, and systematic errors were ascertained using paired t-tests, similar to the recommendations of Houston.13 Method errors did not exceed 0.2 mm and were negligible. Paired t-tests demonstrated no statistically significant differences in measurements (P > .05).
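Dahlberg's formula estimates the method error from duplicate measurements as the square root of the sum of squared differences divided by twice the number of pairs. A minimal Python sketch follows; the function name and the sample values are hypothetical and are not data from this study.

```python
import math


def dahlberg_error(first, second):
    """Method error from paired repeat measurements: sqrt(sum(d^2) / (2n))."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty measurement lists")
    squared = sum((x - y) ** 2 for x, y in zip(first, second))
    return math.sqrt(squared / (2 * len(first)))


# Hypothetical repeat measurements (mm) of mandibular length on three films.
session_1 = [104.10, 108.35, 111.20]
session_2 = [104.25, 108.30, 111.05]
error_mm = dahlberg_error(session_1, session_2)
```

The formula captures random error only, which is why a paired t-test is reported alongside it to check for systematic differences between sessions.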
RESULTS
Table 1 depicts the agreement of 11 observers on cervical staging at two times with the use of Kendall's W statistic. For 86 adolescent cephalograms, Kendall's W values were 0.831 for the initial time and 0.838 a month later. Both values were greater than 0.8, which showed strong statistical agreement among observers regarding cervical staging. However, the high Kendall's W values did not translate into high percent agreement. A total of 4730 interobserver observations were compared. In Table 2, the percent agreement is shown as 39.3% (1858/4730) for the initial time and 44.9% (2122/4730) 1 month later. Percent disagreement values one stage apart were 42.8% (2026/4730) and 41.7% (1974/4730), respectively; this was similar to the percent agreement. In other words, at either time, two observers were about as likely to assign adjacent stages to the same film as to assign the same stage. Percent disagreement at more than two stages apart was considerably lower.
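The figure of 4730 comparisons is consistent with every pair of the 11 observers being compared on each of the 86 films (55 pairs × 86 films). The Python sketch below shows how such a pairwise tally of stage differences could be made; it is an assumed reconstruction of the counting, and the function name and the small example are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb


def pairwise_stage_gaps(ratings):
    """Count absolute stage differences over all observer pairs and films.

    ratings[r][f] is observer r's stage for film f. Key 0 holds exact
    agreements, key 1 holds one-stage-apart disagreements, and so on.
    """
    gaps = Counter()
    for a, b in combinations(ratings, 2):
        gaps.update(abs(x - y) for x, y in zip(a, b))
    return gaps


# Number of comparisons for 11 observers and 86 films.
total_comparisons = comb(11, 2) * 86

# Hypothetical toy data: 3 observers, 2 films.
toy = pairwise_stage_gaps([[1, 2], [1, 3], [2, 3]])
percent_agreement = 100 * toy[0] / sum(toy.values())
```

Dividing each count by the total gives the percent agreement and the stage-apart percent disagreement reported in Table 2.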
Table 1.
Table 2.
With regard to intraobserver agreement between the two time points, weighted kappa coefficients ranged from 0.53 to 0.86 (Table 3), statistically indicating substantial intraobserver agreement. This suggests that cervical staging by the CVM method can be repeated with similar results and is thus reliable.
Table 3.
Table 4 shows that the intraobserver percent agreement of each observer ranged from 40.7% (35/86) to 79.1% (68/86). The wide range indicates great variation in intraobserver agreement among different orthodontists. Table 5 presents overall intraobserver percent agreement and disagreement when data provided by 11 observers are combined. A total of 946 intraobserver observations were analyzed. The 56.9% (538/946) intraobserver agreement was greater than the 34.4% (326/946) one-stage-apart disagreement.
Table 4.
Table 5.
Because statistically substantial intraobserver agreement existed between evaluations at the two times, indicating no significant difference between them, only the results of staging at the initial time were examined in the evaluation of validity. For the 11 observers, weighted kappa coefficients ranged from 0.44 to 0.61 (Table 6), which indicates moderate agreement. Therefore, the validity of the CVM method is statistically acceptable.
Table 6.
Table 7 shows the percent agreement and stage-apart disagreement of validity at various cervical vertebral stages. A relatively higher percent agreement is seen at cervical stage 6 (54.5%) of the gold standard, and the least agreement is noted at cervical stage 3 (26.8%), followed by cervical stage 2 (34.3%). Larger one-stage-apart disagreement was found at cervical stage 2 (51.7%), stage 3 (49.5%), and stage 5 (51.0%). Moreover, two-stage-apart disagreement at cervical stage 1 (25.8%), stage 3 (22.2%), and stage 4 (20.2%) of the gold standard is a little higher compared with the others.
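A per-stage breakdown of this kind can be obtained by grouping each observer rating by the gold-standard stage of its film and tallying how far the rating lies from that stage. The Python sketch below illustrates the idea; the function name and the example data are hypothetical, and how Table 7 was actually compiled is an assumption.

```python
from collections import Counter, defaultdict


def agreement_by_gold_stage(gold, observed):
    """Percent of ratings at each stage gap, grouped by gold-standard stage.

    gold[i] and observed[i] are the gold-standard and observer stages for
    rating i. Gap 0 is agreement; gap 1 is one-stage-apart disagreement.
    """
    counts = defaultdict(Counter)
    for g, o in zip(gold, observed):
        counts[g][abs(g - o)] += 1
    return {
        stage: {gap: 100 * c / sum(gaps.values()) for gap, c in sorted(gaps.items())}
        for stage, gaps in sorted(counts.items())
    }


# Hypothetical ratings: four films at gold-standard CS3, two at CS4.
breakdown = agreement_by_gold_stage([3, 3, 3, 3, 4, 4], [3, 2, 4, 5, 4, 4])
```

Pooling the initial-time ratings of all observers before calling the function would give one row per gold-standard stage, comparable in layout to the table described above.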
Table 7.
DISCUSSION
Longitudinal research is an essential method in the study of the growth and development of the body and the face.1,14–16 When the first CVM method was developed in 1972, Lamparski1 suggested that a longitudinal sample should be employed to eliminate much of the individual variation that was present in his study. The CVM method examined in this study was proposed through longitudinal data with reference to mandibular growth.5 Therefore, to test the validity of this CVM method, longitudinal samples were used, and the yearly mandibular growth rate served as the gold standard, instead of the judgment of experts as suggested in another study.7
A limitation of this study was that consecutive cephalograms for each subject numbered fewer than six. Because the CVM method consists of six stages, ideally a longer period of longitudinal data should be used. Because of the limited source, only 18 subjects, each with at least four cephalograms, were finally enrolled.
The reliability of the CVM method was tested separately by interobserver and intraobserver agreement. Strong interobserver agreement was found at both times (Kendall's W values >0.8). The result was a little better than the finding in the previous study.6 A reasonable explanation, as suggested by Baccetti et al.,17 was that the observers had received more intensive training and practice. However, interobserver percent agreement was still below 50%; this was similar to the results of the previous study.6 This meant that an observer in this study was more likely to disagree than to agree with the cervical staging assigned by another observer. With respect to intraobserver agreement, the weighted kappa values showed substantial agreement, but the average percent agreement was only 56.9% with a wide range (40.7%–79.1%). This suggested that 1 month later, an observer had nearly a 50% chance of arriving at a different conclusion in cervical vertebral assessment, and great variation was noted among observers.
The validity of this method was proved statistically by the moderate agreement between the gold standard and observers' staging at the initial time in this study, but the percent agreement of various stages seemed unacceptable for clinical use. Our area of greatest concern in skeletal evaluation is whether the adolescent pubertal growth spurt has arrived. Therefore, cervical stage 3, which represents the beginning of pubertal growth, is the most meaningful in CVM assessment. However, much to our disappointment, agreement for cervical stage 3 was only 26.8%, which is the lowest value among the six cervical stages.
The accuracy and reproducibility of the CVM method leave much to be desired. Factors that affect them are listed as follows.
The “measurements” are purely subjective.1 The transition in the shape of cervical vertebral bodies is a continuous and gradual process. It is difficult to define and identify exactly the gradual appearance of the concavity in the inferior border of the vertebral bodies. By definition, a square vertebral body has equal length and height, so an arbitrary threshold for what is considered square must be applied. The distinction between horizontally rectangular, square, and vertically rectangular shapes therefore depends on the researcher's arbitrary decision.18
The shapes of cervical vertebrae show marked variation from subject to subject. Sometimes the shape and inferior borders of C2–C4 cannot fulfill the definition of a cervical stage at the same time. For example, in some cases (eg, subject #17 in this study), the shape of the C3 body has changed to horizontally rectangular or even square, but a distinct concavity of the inferior border of C3 does not appear. Furthermore, we noted in the clinic that the shapes of C3 and C4 bodies were still horizontally rectangular in some adult patients (Figure 1).
Could the mandibular growth rate be taken as a reference to establish the CVM method? When most CVM methods1–3,19,20 were developed, except those proposed by Baccetti et al.,4,5 hand-wrist skeletal ages (stages) were used as the gold standard to measure the general growth rate. Controversy about the relationship between facial (mandibular) growth and general bodily growth continues. Some studies9,21,22 support a high degree of association between them. Other studies23,24 show a low degree of association. Scammon's growth curve25 of different tissues may explain in part the weak association. Craniofacial growth is affected not only by the general bodily curve but also by the neural growth curve. Moreover, Mitani24 found that the amount and timing of mandibular growth seem to be more variable than the other areas he studied. It is well known that evaluation of skeletal maturation for patients before orthodontic treatment is conducted to judge whether tooth movement and skeletal modification will be affected by their growth and development. Even though mandibular growth plays an important role in adolescent orthodontic treatment, the response of other parts of the face to adolescent growth should be taken into consideration. Therefore, we believe there is no positive answer yet to the question of whether the mandibular growth rate can be used as a reference to establish the CVM method. If it cannot be used, the validity of this method will be unreliable.
Although evaluating skeletal maturation with the CVM method can provide some useful information for orthodontic treatment planning, considerable variation in skeletal maturation and timing of craniofacial growth at puberty indicates that this method should be used only in conjunction with other indicators such as overall bodily growth, sexual maturation, and so forth. It must be remembered that errors can occur when one diagnostic test is relied on too heavily.2
CONCLUSIONS
The reliability of the CVM method was proved statistically with strong interobserver agreement and substantial intraobserver agreement. However, the percent interobserver agreement was below 50% at both times. Percent intraobserver agreement varied widely among observers (range, 40.7%–79.1%).
The validity of the CVM method was also supported statistically, with moderate agreement between the gold standard and observer staging at the initial time. However, percent agreement, especially at cervical stage 3, left much to be desired.
The CVM method should be used with other growth indicators in the evaluation of skeletal maturation.
REFERENCES
- 1. Lamparski D. Skeletal Age Assessment Utilizing Cervical Vertebrae [thesis]. Pittsburgh, Pa: University of Pittsburgh; 1972.
- 2. Hassel B, Farman A. G. Skeletal maturation evaluation using cervical vertebrae. Am J Orthod Dentofacial Orthop. 1995;107:58–66. doi: 10.1016/s0889-5406(95)70157-5.
- 3. San Roman P, Palma J. C, Oteo M. D, Nevado E. Skeletal maturation determined by cervical vertebrae development. Eur J Orthod. 2002;24:303–311. doi: 10.1093/ejo/24.3.303.
- 4. Baccetti T, Franchi L, McNamara J. A., Jr An improved version of the cervical vertebral maturation (CVM) method for the assessment of mandibular growth. Angle Orthod. 2002;72:316–323. doi: 10.1043/0003-3219(2002)072<0316:AIVOTC>2.0.CO;2.
- 5. Baccetti T, Franchi L, McNamara J. A., Jr The cervical vertebral maturation (CVM) method for the assessment of optimal treatment timing in dentofacial orthopedics. Semin Orthod. 2005;11:119–129.
- 6. Gabriel D. B, Southard K. A, Qian F, Marshall S. D, Franciscus R. G, Southard T. E. Cervical vertebrae maturation method: poor reproducibility. Am J Orthod Dentofacial Orthop. 2009;136:478.e471–e477; discussion 478–480. doi: 10.1016/j.ajodo.2007.08.028.
- 7. Ballrick J, Fields H, Vig K, Beck F, Germack J, Baccetti T. Reliability and validity of cervical vertebral maturation and hand-wrist radiographs. Proceedings of the 83rd General Session of the IADR/AADR/CADR. 2005:9–12. In. Baltimore, MD. Available at: http://iadr.confex.com/iadr/2005Balt/techprogram/abstract_62129.htm. Accessed August 15, 2011.
- 8. Chen L, Liu J, Xu T, Lin J. Longitudinal study of relative growth rates of the maxilla and the mandible according to quantitative cervical vertebral maturation. Am J Orthod Dentofacial Orthop. 2010;137:736.e731–e738. doi: 10.1016/j.ajodo.2009.12.022.
- 9. Bergersen E. O. The male adolescent facial growth spurt: its prediction and relation to skeletal maturation. Angle Orthod. 1972;42:319–338. doi: 10.1043/0003-3219(1972)042<0319:TMAFGS>2.0.CO;2.
- 10. Gu Y, McNamara J. A. Mandibular growth changes and cervical vertebral maturation: a cephalometric implant study. Angle Orthod. 2007;77:947–953. doi: 10.2319/071006-284.1.
- 11. Franchi L, Baccetti T, McNamara J. A., Jr Mandibular growth as related to cervical vertebral maturation and body height. Am J Orthod Dentofacial Orthop. 2000;118:335–340. doi: 10.1067/mod.2000.107009.
- 12. Blackman N. J. M, Koval J. J. Interval estimation for Cohen's kappa as a measure of agreement. Stat Med. 2000;19:723–741. doi: 10.1002/(sici)1097-0258(20000315)19:5<723::aid-sim379>3.0.co;2-a.
- 13. Houston W. J. The analysis of errors in orthodontic measurements. Am J Orthod. 1983;83:382–390. doi: 10.1016/0002-9416(83)90322-6.
- 14. Scammon R. E. The first seriatim study of human growth. Am J Phys Anthropol. 1927;10:329–336. doi: 10.1002/ajpa.1330540103.
- 15. Moyers R. E. Handbook of Orthodontics. 4th ed. Chicago, Ill: Year Book Medical Publishers; 1988.
- 16. Franchi L, Baccetti T, McNamara J. A., Jr Postpubertal assessment of treatment timing for maxillary expansion and protraction therapy followed by fixed appliances. Am J Orthod Dentofacial Orthop. 2004;126:555–568. doi: 10.1016/j.ajodo.2003.10.036.
- 17. Baccetti T, Franchi L, McNamara J. A., Jr Reproducibility of the CVM method: a reply. Am J Orthod Dentofacial Orthop. 2010;137:446–447. doi: 10.1016/j.ajodo.2010.02.010.
- 18. Fudalej P, Bollen A. M. Effectiveness of the cervical vertebral maturation method to predict postpeak circumpubertal growth of craniofacial structures. Am J Orthod Dentofacial Orthop. 2010;137:59–65. doi: 10.1016/j.ajodo.2008.01.018.
- 19. Chen L. L, Xu T. M, Jiang J. H, Zhang X. Z, Lin J. X. Quantitative cervical vertebral maturation assessment in adolescents with normal occlusion: a mixed longitudinal study. Am J Orthod Dentofacial Orthop. 2008;134:720.e721–e727. doi: 10.1016/j.ajodo.2008.03.014.
- 20. Chatzigianni A, Halazonetis D. J. Geometric morphometric evaluation of cervical vertebrae shape and its relationship to skeletal maturation. Am J Orthod Dentofacial Orthop. 2009;136:481.e481–489; discussion 481–483. doi: 10.1016/j.ajodo.2009.04.017.
- 21. Singh I. J, Savara B. S, Miller P. A. Interrelations of selected measurements of the face and body in pre-adolescent and adolescent girls. Growth. 1967;31:119–131.
- 22. Rose G. J. A cross-sectional study of the relationship of facial areas with several body dimensions. Angle Orthod. 1960;30:6–13.
- 23. Howells W. W. A factorial study of constitutional type. Am J Phys Anthropol. 1952;10:91–118. doi: 10.1002/ajpa.1330100120.
- 24. Mitani H, Sato K. Comparison of mandibular growth with other variables during puberty. Angle Orthod. 1992;62:217–222. doi: 10.1043/0003-3219(1992)062<0217:COMGWO>2.0.CO;2.
- 25. Scammon R. E. The measurement of the body in childhood. In: Harris J, editor. The Measurement of Man. Minneapolis, Minn: University of Minnesota Press; 1930. pp. 171–215.