History
Hip osteoarthritis is one of the most prevalent diseases affecting older adults; it consistently ranks among the most common causes of functional disability and carries an immense socioeconomic burden [20]. Published reports from the World Health Organization indicate that approximately 10% of men and 18% of women older than 60 years of age have symptomatic osteoarthritis [29]. The level of functional disability is highly variable, but it is estimated that approximately 80% of those with osteoarthritis have some limitation in movement and up to one-third of them can be considered “severely disabled.” The burden is expected to grow, because the proportion of people older than 60 years of age is estimated to triple by 2050 [29]. Ackerman et al. [1] reviewed the lifetime risk of total hip arthroplasty (THA) in males and females using registry data from five countries, finding the lifetime risk to be as high as one in seven women in Norway and one in 10 men in Finland by 2013; women consistently had higher lifetime risks of THA in their review. Epidemiologic studies of hip osteoarthritis have demonstrated marked regional differences in prevalence, with Asians and Africans having a prevalence of 1.2% and 2.8%, respectively, and North Americans and Europeans having a prevalence of 7.2% and 20.1%, respectively [12, 14].
Numerous classification systems for hip osteoarthritis have been proposed, including the Tönnis classification, the Croft classification [13], and the Kellgren-Lawrence classification, which was previously reviewed in this section [17].
The Tönnis classification arose from a series of research articles published in 1972 by Professor Dietrich Tönnis and his colleagues in Dortmund, Germany [5, 6]. The aim of these studies was to develop a quantitative method to differentiate between normal and dysplastic juvenile hips. In their article entitled “A New Method for Roentgenologic Evaluation of the Hip Joint—the Hip Factor,” the authors assessed 817 adult hip radiographs (patient age 21-50 years, excluding patients who had undergone hip surgery) to develop a quantitative measurement for assessing hip dysplasia, ie, “the hip factor.” As part of the analysis, the authors grouped the patients into three separate grades of osteoarthritis based on the evaluation of a standard AP pelvis radiograph, which became the foundation of the classification scheme now bearing Tönnis’ name [6].
Purpose
A new classification system for a particular pathology may be created for a number of reasons, such as to improve communication among providers and researchers or to guide management of the pathology. Tönnis and his colleagues initially created the classification for research purposes, as a qualitative grade for the severity of degenerative radiographic changes in adults [6]; Tönnis later used it to correlate femoral and acetabular anteversion with the severity of arthrosis [25]. These articles do not explain why a new system was created when other systems had already been published at that time [16].
As a qualitative analysis based on a plain AP radiograph of the pelvis, the Tönnis classification is easier to apply in a clinic setting than schemes requiring advanced imaging or quantitative measurements. Such characteristics make the scheme appealing for use in daily clinical practice as a potential tool to help subdivide surgical management. This warrants exploration of the reliability of the Tönnis classification and of whether it can guide management.
Description of the Tönnis Classification
The Tönnis classification, as originally described in 1972 by Busse et al. [6], consists of three progressive degrees of degenerative changes to the hip; it was later republished by Tönnis and Heinecke in 1999 [25] with the addition of a Grade 0, indicating a hip without arthrosis. Grade 1 indicates slight narrowing of the joint space, slight lipping at the joint margin, and slight sclerosis of the femoral head or acetabulum; Grade 2 indicates the presence of small bony cysts, further narrowing of the joint space, and moderate loss of femoral head sphericity; Grade 3 is the most severe and indicates large cysts, severe narrowing of the joint space, severe femoral head deformity, and avascular necrosis (Table 1).
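Table 1.
Tönnis grading scale of hip osteoarthritis
Grade 0: No arthrosis
Grade 1: Slight narrowing of the joint space, slight lipping at the joint margin, and slight sclerosis of the femoral head or acetabulum
Grade 2: Small bony cysts, further narrowing of the joint space, and moderate loss of femoral head sphericity
Grade 3: Large cysts, severe narrowing of the joint space, severe femoral head deformity, and avascular necrosis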
In addition to Tönnis’ classification, there have been a number of other well-described classification schemes for osteoarthritis; these include schemes published by Croft [13], Kellgren-Lawrence [16], the International Knee Documentation Committee [30], Fairbank [15], Altman et al. [3], and Ahlbäck [2]. Of the aforementioned classifications, only the Kellgren-Lawrence and Croft classifications are applicable to the hip, with the remainder describing the knee specifically. The Kellgren-Lawrence scale is a 5-point grading scale from 0 to 4; 0 indicates no joint space narrowing or reactive changes (no osteoarthritis [OA]); 1 indicates doubtful joint space narrowing with possible lipping osteophytes (doubtful OA); 2 indicates definite osteophytes with possible joint space narrowing (mild OA); 3 indicates moderate osteophytes with definite joint space narrowing, some sclerosis, and possible bony deformity (moderate OA); and 4 indicates progression to large osteophytes, severe sclerosis, and definite bone end deformity (severe OA). Croft et al. [13] devised a 6-point grading scale from 0 to 5; 0 indicates no radiographic abnormalities; 1 indicates only osteophytosis; 2 indicates joint space narrowing only; 3 indicates the presence of two of the following: osteophytosis, joint space narrowing, presence of cysts, and subchondral sclerosis; 4 indicates three of the aforementioned criteria; and 5 indicates progression to femoral head deformity.
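Because the Croft scale as summarized above is essentially a count of radiographic features, it can be written down as a short rule. The following is only an illustrative sketch of that summary, not Croft's published algorithm: the function name and parameters are hypothetical, treating femoral head deformity as overriding all other findings is an assumption, and so is reading "three of the aforementioned criteria" as "three or more." Combinations the summary does not cover (an isolated cyst or isolated sclerosis) return `None` rather than a guessed grade.

```python
from typing import Optional


def croft_grade(osteophytes: bool,
                joint_space_narrowing: bool,
                cysts: bool,
                sclerosis: bool,
                head_deformity: bool = False) -> Optional[int]:
    """Return a Croft grade (0-5) following the summary in the text.

    Returns None when the summary gives no grade for the combination.
    """
    # Assumption: femoral head deformity means Grade 5 regardless of other findings.
    if head_deformity:
        return 5

    count = sum([osteophytes, joint_space_narrowing, cysts, sclerosis])
    if count == 0:
        return 0  # no radiographic abnormalities
    if count == 1:
        if osteophytes:
            return 1  # osteophytosis only
        if joint_space_narrowing:
            return 2  # joint space narrowing only
        return None  # isolated cysts or sclerosis: not covered by the summary
    if count == 2:
        return 3  # two of the four features
    # Assumption: "three" is read as "three or more" (all four is not described).
    return 4


if __name__ == "__main__":
    # Osteophytes plus joint space narrowing: two features present.
    print(croft_grade(True, True, False, False))
```

Returning `None` for uncovered combinations keeps the sketch honest about what the summary does and does not specify.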
Validation and Reliability
The Tönnis classification is widely utilized by arthroplasty surgeons, arthroscopic surgeons, rheumatologists, radiologists, and physical therapists [4, 9, 10, 24]. Despite its widespread use, the utility of the Tönnis classification has been a point of contention because of conflicting data regarding its reliability [7, 11, 18, 19, 23, 27, 28].
In a 2008 study that included 63 patients who underwent a Bernese periacetabular osteotomy approximately 20 years prior, Steppacher et al. [23] evaluated the reliability of the Tönnis classification using two orthopaedic surgeons to grade 50 pelvic radiographs on two separate occasions. They found substantial interobserver (κ = 0.74) and intraobserver reliability (κ = 0.73-0.76) using the standardized Landis and Koch benchmarks for agreement [18, 23].
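The verbal labels attached to κ values throughout this section ("slight," "fair," "moderate," "substantial") follow the Landis and Koch benchmarks [18]. As a reading aid, a minimal sketch of that mapping is given below; the function name is hypothetical, and the cutoffs are the commonly quoted ones (below 0 poor, 0.00-0.20 slight, 0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect).

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa statistic to its Landis and Koch strength-of-agreement label."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Interobserver kappa reported by Steppacher et al.
    print(landis_koch_label(0.74))
```

Applied to the interobserver value of 0.74 reported by Steppacher et al., the function returns "substantial," matching the label used above.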
Clohisy et al. [11] evaluated reliability using 77 patients with femoroacetabular impingement, developmental dysplasia of the hip, or no hip pain and found slightly lower reliability than that of Steppacher et al. with moderate interobserver reliability (κ = 0.59) and intraobserver reliability (κ = 0.60). Their study used five hip specialists as well as a fellow to interpret the images. The authors attributed the lower agreement in their report compared with that of Steppacher to the inclusion of a control group.
By contrast, in a critical review of the Tönnis classification, Valera et al. [28] used three orthopaedic surgeons to classify the hip radiographs of 61 patients by Tönnis grade, divided into two cohorts (one included candidates for hip preservation surgery, whereas the control group consisted of patients without hip pain). This study found only slight to fair interobserver reliability (κ = 0.173-0.397) and fair intraobserver reliability (κ = 0.364-0.397). The most frequent cause of disagreement in this study involved differentiating Grade 0 from Grade 1 hips [28].
Nepple et al. [19] also focused on younger patients with hip pain in a study evaluating 25 radiologic parameters of dysplasia and OA in 70 patients undergoing hip preservation surgery. Four hip specialists interpreted the radiographs. The average patient age was 31 years, and 55 of the 70 patients had femoroacetabular impingement with the remainder diagnosed with acetabular dysplasia. They found that grading patients by the Tönnis classification had only fair interobserver reliability (κ = 0.22) and moderate intraobserver reliability (κ = 0.53). As in the study by Valera et al. [28], the radiographs in this study were weighted heavily toward early arthritis, with only 4.5% of patients graded as either Grade 2 or 3. In contrast, joint space width was found to have substantial inter- and intraobserver reliability (κ = 0.62 and 0.71, respectively) [19].
Another reliability analysis by Carlisle et al. used five physicians of various disciplines and levels of training to evaluate 45 patients with developmental dysplasia of the hip, femoroacetabular impingement, or normal anatomy. One orthopaedic attending, two physiatry attendings, one orthopaedic fellow, and two orthopaedic residents interpreted the images. They found only slight interobserver agreement for Tönnis grade (κ = 0.17) but moderate intraobserver reproducibility (κ = 0.57) [7].
Troelsen et al. [27] had four observers (a medical student, an orthopaedic resident, an orthopaedic attending, and a radiology attending) review 25 pelvic radiographs and subsequently their associated CT scans and grade them according to Tönnis’ classification. They found poor interobserver reliability, with κ values ranging from -0.02 to 0.33. Interobserver agreement increased when Tönnis grade was dichotomized to Grades 0 to 1 and 2 to 3 (κ values 0.20-0.39), and joint space width < 2 mm on plain radiographs was a more reliable marker of OA (κ values 0.40-0.46) [27].
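To make the effect of dichotomizing grades concrete, the sketch below computes an unweighted Cohen's κ for two raters, κ = (p_o - p_e) / (1 - p_e), before and after collapsing Tönnis grades into 0 to 1 versus 2 to 3. It is illustrative only: the ratings are invented for the example, the function names are hypothetical, and the cited studies may have used other agreement statistics (for example, weighted or multirater κ).

```python
from collections import Counter
from typing import List, Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for two raters grading the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same grade by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per grade.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1.0 - expected)


def dichotomize(grades: Sequence[int]) -> List[int]:
    """Collapse Tönnis grades into 0 (Grades 0 to 1) and 1 (Grades 2 to 3)."""
    return [0 if g <= 1 else 1 for g in grades]


if __name__ == "__main__":
    # Hypothetical Tönnis grades from two observers for the same 10 radiographs.
    observer_1 = [0, 0, 1, 1, 1, 2, 2, 3, 0, 1]
    observer_2 = [1, 0, 0, 1, 2, 2, 2, 3, 1, 1]
    print(round(cohen_kappa(observer_1, observer_2), 2))
    print(round(cohen_kappa(dichotomize(observer_1), dichotomize(observer_2)), 2))
```

With these invented ratings, κ rises from about 0.44 to about 0.78 after dichotomization, because most of the disagreements are between Grade 0 and Grade 1 and disappear once those grades are merged.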
Limitations
The development of the Tönnis classification system was based on studies that specifically focused on the hip. By extension, it most directly applies to a spheroidal (ie, ball-and-socket) synovial joint that is reinforced by the presence of a fibrocartilaginous lip (the acetabular labrum). Because other diarthrodial joints have different anatomy, and different roles in weightbearing and motion, the Tönnis grading system cannot be applied broadly to all joints.
Studies on the classification seem to suggest less validity in its lower grades because studies with higher numbers of low-grade OA demonstrate lower validity for the Tönnis classification [11]. This is especially important because one common application of the Tönnis classification is to help guide the decision of whether a patient may be a good candidate for hip preservation surgery, which is known to be less effective in patients who have even mild arthritis [31]. In addition, because the Tönnis classification relies exclusively on radiographic findings, other variables that may play a role in the success of hip preservation surgery (such as articular cartilage health, three-dimensional femoroacetabular anatomy, and labral integrity, among others) are not considered under the Tönnis rubric, and adequate evaluation of those factors generally calls for advanced imaging.
Ultimately, a major criticism of the Tönnis classification system is that it is subjective. The Tönnis classification has been criticized as being unclear in its terminology as well as for its failure to address overlapping parameters [28]. Five major radiographic findings are used in the classification: presence of sclerosis, joint space width, head sphericity, cyst size, and osteophyte formation, the last of which is described as “lipping at the joint margins.” None of these parameters includes quantitative definitions. For example, Tönnis Grade 2 arthritis includes “small” subchondral cysts, and Grade 3 is defined by “large” cysts, but there are no sizes designated in the original article.
In addition, Tönnis’ original article does not help the user decide which grade to use if findings from two different grades are present on one radiograph. For example, if a patient has moderate loss of sphericity of the femoral head (a Tönnis Grade 2 finding) alongside only slight narrowing of the joint (a Grade 1 finding)—such as might occur in a patient with early Legg-Calvé-Perthes disease—the Tönnis classification is unclear about whether this would be a patient with Grade 1 or Grade 2 arthritis. Instead, the classification assumes a stepwise progression in which all parameters radiographically worsen over time equally in accordance with the described classification, which is not always the case. As noted by Valera et al. [28], although sphericity of the femoral head is one of the more reliably reproducible parameters used by Tönnis, it could easily lead to confusion in grading in patients with nondegenerative cam deformity and no radiographic findings of OA.
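The ambiguity described above can be made explicit with a small sketch that maps each descriptor in the classification to the grade it belongs to and reports every grade implied by a set of findings. The dictionary keys are paraphrases of the descriptors quoted in this article, and the names `TONNIS_FINDINGS` and `candidate_grades` are hypothetical. No tie-break is applied (for example, "highest grade wins") because neither the original article nor this review specifies one.

```python
from typing import Iterable, List

# Descriptor -> Tönnis grade in which that descriptor appears (paraphrased).
TONNIS_FINDINGS = {
    "slight joint space narrowing": 1,
    "slight lipping at joint margin": 1,
    "slight sclerosis": 1,
    "small cysts": 2,
    "further joint space narrowing": 2,
    "moderate loss of head sphericity": 2,
    "large cysts": 3,
    "severe joint space narrowing": 3,
    "severe head deformity": 3,
    "avascular necrosis": 3,
}


def candidate_grades(findings: Iterable[str]) -> List[int]:
    """Return every Tönnis grade implied by the findings, in ascending order.

    More than one grade in the result means the classification itself does
    not say which grade to assign.
    """
    findings = list(findings)
    unknown = [f for f in findings if f not in TONNIS_FINDINGS]
    if unknown:
        raise ValueError(f"unrecognized findings: {unknown}")
    if not findings:
        return [0]  # Grade 0: no arthrosis
    return sorted({TONNIS_FINDINGS[f] for f in findings})


if __name__ == "__main__":
    # Moderate loss of sphericity with only slight joint space narrowing.
    print(candidate_grades([
        "moderate loss of head sphericity",
        "slight joint space narrowing",
    ]))
```

For a hip with moderate loss of sphericity but only slight joint space narrowing, the sketch returns both Grade 1 and Grade 2, which is exactly the case in which two observers may reasonably disagree.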
Conclusions
Although the Tönnis classification has limitations, it remains a simple system that provides a qualitative description of commonly obtained radiographic imaging of the hip and continues to be frequently used in clinical practice as well as research. It relies on the visual assessment of an AP pelvis radiograph and does not require additional time or resources to make digital or manual measurements of the radiographs.
The subjective nature of the classification and the lack of consensus on its reliability (particularly at early stages of hip arthritis, where the user wishes to distinguish Grade 0 from Grade 1 [11, 19, 28]) make it difficult to recommend its widespread use, particularly given that alternative classification schemes including the Kellgren-Lawrence and Croft systems have been demonstrated in studies to be more reliable measures.
In a reliability study by Reijman et al. [22], the Kellgren-Lawrence system had higher interobserver reliability (κ = 0.68) than Croft’s (κ = 0.52); it also demonstrated a stronger association with clinical symptoms of hip OA and was predictive of the eventual need for hip replacement. Similarly, additional studies have demonstrated support for the validity of the Kellgren-Lawrence scheme in particular [21].
Several studies have found that patients with higher Tönnis grades who undergo hip preservation surgery have poorer patient-reported outcome scores and are more likely to undergo premature conversion to THA [8]. This suggests that despite the problems of reliability, the Tönnis classification may be used to good effect as a tool of communication, prognosis, and research in the right circumstances.
In our opinion, the Tönnis classification is a straightforward qualitative description of a stepwise pattern of hip OA but has limited utility in the research setting. Its reliability has not consistently demonstrated superiority over other classification systems. Ultimately, without stronger evidence supporting its reliability or validity, it cannot be recommended as a tool with which to routinely guide management and treatment options.
Acknowledgments
We thank Michael Neininger for his work in providing English translations of the original German Tönnis papers on which the classification scheme was founded.
Footnotes
Each author certifies that he or she has no commercial associations (eg, consultancies, stock ownership, equity interest, patent/licensing arrangements, etc) that might pose a conflict of interest in connection with the submitted article.
All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.
References
- 1. Ackerman IN, Bohensky MA, de Steiger R, Brand CA, Eskelinen A, Fenstad AM, Furnes O, Graves SE, Haapakoski J, Mäkelä K. Lifetime risk of primary total hip replacement surgery for osteoarthritis from 2003-2013: a multi-national analysis using national registry data. Arthritis Care Res. 2017;69:1659–1667.
- 2. Ahlbäck S. Osteoarthrosis of the knee. A radiographic investigation. Acta Radiol Diagn (Stockh). 1968;Suppl 277:7–72.
- 3. Altman R, Asch E, Bloch D, Bole G, Borenstein D, Brandt K, Christy W, Cooke T, Greenwald R, Hochberg M. Development of criteria for the classification and reporting of osteoarthritis: classification of osteoarthritis of the knee. Arthritis Rheumatol. 1986;29:1039–1049.
- 4. Bittersohl B, Steppacher S, Haamberg T, Kim Y-J, Werlen S, Beck M, Siebenrock K-A, Mamisch TC. Cartilage damage in femoroacetabular impingement (FAI): preliminary results on comparison of standard diagnostic vs delayed gadolinium-enhanced magnetic resonance imaging of cartilage (dGEMRIC). Osteoarthritis Cartilage. 2009;17:1297–1306.
- 5. Brückl R, Hepp W, Tönnis D. [Differentiation of normal and dysplastic juvenile hip joints by means of the summarized hip factor] [in German]. Arch Orthop Unfallchir. 1971;74:13–32.
- 6. Busse J, Gasteiger W, Tönnis D. [A new method for roentgenologic evaluation of the hip joint—the hip factor] [in German]. Arch Orthop Unfallchir. 1971;72:1–9.
- 7. Carlisle JC, Zebala LP, Shia DS, Hunt D, Morgan PM, Prather H, Wright RW, Steger-May K, Clohisy JC. Reliability of various observers in determining common radiographic parameters of adult hip structural anatomy. Iowa Orthop J. 2011;31:52–58.
- 8. Chandrasekaran S, Darwish N, Gui C, Lodhia P, Suarez-Ahedo C, Domb BG. Outcomes of hip arthroscopy in patients with Tönnis grade-2 osteoarthritis at a mean 2-year follow-up: evaluation using a matched-pair analysis with Tönnis grade-0 and grade-1 cohorts. J Bone Joint Surg Am. 2016;98:973–982.
- 9. Cibulka MT, Threlkeld J. The early clinical diagnosis of osteoarthritis of the hip. J Orthop Sports Phys Ther. 2004;34:461–467.
- 10. Clohisy JC, Carlisle JC, Beaulé PE, Kim Y-J, Trousdale RT, Sierra RJ, Leunig M, Schoenecker PL, Millis MB. A systematic approach to the plain radiographic evaluation of the young adult hip. J Bone Joint Surg Am. 2008;90(Suppl 4):47–66.
- 11. Clohisy JC, Carlisle JC, Trousdale R, Kim Y-J, Beaule PE, Morgan P, Steger-May K, Schoenecker PL, Millis M. Radiographic evaluation of the hip has limited reliability. Clin Orthop Relat Res. 2009;467:666–675.
- 12. Cooper C, Javaid MK, Arden N. Epidemiology of osteoarthritis. In: Arden N, ed. Atlas of Osteoarthritis. Tarporley, UK: Springer Healthcare Communications; 2014:21–36.
- 13. Croft P, Cooper C, Wickham C, Coggon D. Defining osteoarthritis of the hip for epidemiologic studies. Am J Epidemiol. 1990;132:514–522.
- 14. Dagenais S, Garbedian S, Wai EK. Systematic review of the prevalence of radiographic primary hip osteoarthritis. Clin Orthop Relat Res. 2009;467:623–637.
- 15. Fairbank T. Knee joint changes after meniscectomy. Bone Joint J. 1948;30:664–670.
- 16. Kellgren J, Lawrence J. Radiological assessment of osteo-arthrosis. Ann Rheum Dis. 1957;16:494–502.
- 17. Kohn MD, Sassoon AA, Fernando ND. Classifications in Brief: Kellgren-Lawrence classification of osteoarthritis. Clin Orthop Relat Res. 2016;474:1886–1893.
- 18. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174.
- 19. Nepple JJ, Martell JM, Kim Y-J, Zaltz I, Millis MB, Podeszwa DA, Sucato DJ, Sink EL, Clohisy JC; ANCHOR Study Group. Interobserver and intraobserver reliability of the radiographic analysis of femoroacetabular impingement and dysplasia using computer-assisted measurements. Am J Sports Med. 2014;42:2393–2401.
- 20. Nho SJ, Kymes SM, Callaghan JJ, Felson DT. The burden of hip osteoarthritis in the United States: epidemiologic and economic considerations. J Am Acad Orthop Surg. 2013;21(Suppl 1):S1–S6.
- 21. Reijman M, Hazes J, Koes B, Verhagen A, Bierma-Zeinstra S. Validity, reliability, and applicability of seven definitions of hip osteoarthritis used in epidemiological studies: a systematic appraisal. Ann Rheum Dis. 2004;63:226–232.
- 22. Reijman M, Hazes J, Pols H, Bernsen R, Koes B, Bierma-Zeinstra S. Validity and reliability of three definitions of hip osteoarthritis: cross sectional and longitudinal approach. Ann Rheum Dis. 2004;63:1427–1433.
- 23. Steppacher SD, Tannast M, Ganz R, Siebenrock KA. Mean 20-year followup of Bernese periacetabular osteotomy. Clin Orthop Relat Res. 2008;466:1633–1644.
- 24. Terjesen T, Gunderson RB. Radiographic evaluation of osteoarthritis of the hip: an inter-observer study of 61 hips treated for late-detected developmental hip dislocation. Acta Orthop. 2012;83:185–189.
- 25. Tönnis D, Heinecke A. Acetabular and femoral anteversion: relationship with osteoarthritis of the hip. J Bone Joint Surg Am. 1999;81:1747–1770.
- 26. Troelsen A, Elmengaard B, Søballe K. Medium-term outcome of periacetabular osteotomy and predictors of conversion to total hip replacement. J Bone Joint Surg Am. 2009;91:2169–2179.
- 27. Troelsen A, Rømer L, Kring S, Elmengaard B, Søballe K. Assessment of hip dysplasia and osteoarthritis: variability of different methods. Acta Radiol. 2010;51:187–193.
- 28. Valera M, Ibañez N, Sancho R, Tey M. Reliability of Tönnis classification in early hip arthritis: a useless reference for hip-preserving surgery. Arch Orthop Trauma Surg. 2016;136:27–33.
- 29. Wittenauer R, Smith L, Aden K. Background Paper 6.12 Osteoarthritis. Geneva, Switzerland: World Health Organization; 2013.
- 30. Wright RW, Ross JR, Haas AK, Huston LJ, Garofoli EA, Harris D, Patel K, Pearson D, Schutzman J, Tarabichi M. Osteoarthritis classification scales: interobserver reliability and arthroscopic correlation. J Bone Joint Surg Am. 2014;96:1145–1151.
- 31. Zhang W, Doherty M, Arden N, Bannwarth B, Bijlsma J, Gunther K-P, Hauselmann HJ, Herrero-Beaumont G, Jordan K, Kaklamanis P. EULAR evidence based recommendations for the management of hip osteoarthritis: report of a task force of the EULAR Standing Committee for International Clinical Studies Including Therapeutics (ESCISIT). Ann Rheum Dis. 2005;64:669–681.
