Abstract
Background
Although the Neer and AO/OTA classifications have been widely accepted, observer reliability studies of these two classifications have questioned their reliability and reproducibility to date. We developed an entirely new classification, the Mitsuzawa classification, for dislocated and displaced proximal humeral fractures and tested all three classifications for their intra- and interobserver reliability.
Methods
Two experienced shoulder surgeons and two orthopedic residents independently evaluated the X-ray (xR) images of 100 proximal humeral fractures (PHFs). The inclusion criteria for PHFs were (1) fracture-dislocation of the glenohumeral joint, (2) severely displaced fracture that required arthroplasty, such as hemi-arthroplasty or reverse shoulder arthroplasty, and (3) age > 18 years. Four reviewers classified all 100 fractures according to the Neer, AO/OTA, and Mitsuzawa classifications on two occasions. The intraobserver reliability was calculated using a Cohen κ statistic, while the interobserver reliability was calculated using a Fleiss κ statistic.
Results
The average intraobserver agreements for the Neer, AO/OTA, and Mitsuzawa classifications were 0.57 (moderate), 0.67 (substantial), and 0.77 (substantial), respectively. The average interobserver agreements for the Neer, AO/OTA, and Mitsuzawa classifications were 0.49 (moderate), 0.56 (moderate), and 0.73 (substantial), respectively. The most common fracture type in each classification was an anterior dislocated fracture with a greater tuberosity fragment, which corresponded to A3a (57 cases) in the Mitsuzawa classification.
Conclusions
The Mitsuzawa classification of PHF incorporates different perspectives regarding glenohumeral compatibility, assessment before and after shoulder dislocation reduction, and the degree of displacement of the proximal stump of the humeral shaft. Compared with the Neer and AO/OTA classifications, our new classification system adopted a user-friendly flowchart format and provided satisfactory intra- and interobserver reliability.
Level of evidence
Level IV.
Keywords: Proximal humeral fracture, Neer classification, AO/OTA classification, Novel, Dislocation, Displacement
Introduction
Proximal humeral fracture (PHF) accounts for approximately 6% of all fractures and significantly impairs the activities of daily living that depend on the patient’s upper extremity [1]. PHFs occur frequently in patients over 65 years of age. While most PHFs are minimally displaced, some follow variable and complex patterns of displacement and may also be accompanied by shoulder dislocation [2]. Previous studies have suggested that surgery is superior to conservative treatment in terms of pain and range of motion, but this remains controversial [3, 4].
The classification of fractures should be a reliable tool of communication for physicians in clinical practice and education, and should also allow standardization in research. The two most commonly used classifications for PHFs are the Neer and AO/OTA classifications. In Neer’s original classification, a fragment was considered displaced when it was displaced by more than 1 cm or angulated by more than 45° [5]. Updated in 2002, the Neer classification identifies four main fragments and 16 fracture subtypes [6]. According to the Neer classification, 98–99% of PHFs are classifiable [7]. The AO/OTA classification is based on the original Müller classification and was last updated in 2018 [8, 9]. It identifies three main fracture types based on the number of fragments, which are then categorized into subgroups based on fragment location and degree of comminution, resulting in a total of 13 fracture subtypes.
Despite the widespread use of these two classifications, several intra- and interobserver reliability studies have questioned their reliability and reproducibility [10–12]. In addition, neither classification distinguishes subgroups before and after dislocation reduction. These classifications focus solely on fracture morphology and are not suitable for fracture-dislocations or fractures with severe displacement. Such fractures differ significantly from common PHFs, and their severity poses a potential risk of neurovascular injury. In these cases, further physical and imaging assessments are required, along with prompt collaboration with other medical specialists. However, existing classifications fail to effectively categorize dislocated or displaced PHFs. To address these limitations, we developed the Mitsuzawa classification, a user-friendly, comprehensive flowchart-based system.
The primary purpose of this study was to verify whether the new classification for dislocated and displaced PHFs produced satisfactory agreement; the secondary purpose was to compare the intra- and interobserver reliabilities of the new classification with those of the Neer and AO/OTA classifications.
Materials and methods
Study protocol
This study was performed in accordance with the principles of the Declaration of Helsinki (as revised in 2013) and was approved by the Ethics Committee of our institution (approval no. 22239). Informed consent was obtained from all patients before their inclusion in the study and before the anonymous publication of the results.
Two experienced shoulder surgeons and two orthopedic surgery residents independently evaluated the X-ray (xR) images of 100 PHFs treated at our institution between 2011 and 2023. The inclusion criteria for PHFs were (1) fracture-dislocation of the glenohumeral joint, (2) severely displaced fracture that required arthroplasty, such as hemi-arthroplasty or reverse shoulder arthroplasty, and (3) age > 18 years.
Before starting the evaluation, all four reviewers met for a 30-min training session to discuss the content and characteristics of the Neer (16 categories plus one "unclassifiable" category, 17 in total), AO/OTA (13 categories plus one "unclassifiable" category, 14 in total), and Mitsuzawa (21 categories in total) classifications.
Mitsuzawa classification
A flowchart and schematic of the Mitsuzawa classification are shown in Fig. 1. First, PHFs were divided into two types according to the relationship between the humeral head and shaft. If the humeral head is positioned more medially, it is classified as type A, whereas if the humeral shaft is positioned more medially, it is classified as type B. This new classification was designed specifically for dislocated and displaced PHFs; type A primarily applies to dislocated fractures, while type B is used for displaced fractures.
Fig. 1.
Flowchart and schema of Mitsuzawa classification
Type A was further divided into six subtypes according to the status of the glenohumeral joint. A1 shows no dislocation and A2 shows caudal subluxation. A3–5 show anterior dislocation. In A3, there is no fracture line between the head and the shaft. When there is a fracture line between the head and the shaft, the subtype is determined by the direction of the humeral head cartilage: A4 (caudal) or A5 (cranial). A6 shows posterior dislocation. In type A, if reduction is attempted, the xR after reduction determines the additional subtype: a (reduction possible), b (caudal subluxation), or c (reduction impossible).
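The type A decision path described above can be written as a short function. This is only an illustrative sketch in Python: the function name, argument names, and string labels are hypothetical and are not part of the published flowchart, and the reduction suffix is applied only to A3 through A6, in line with the category count given later in this section.

```python
from typing import Optional

# Hypothetical input labels for illustration; the paper defines the categories,
# not these strings.
REDUCTION_SUFFIX = {"possible": "a", "caudal_subluxation": "b", "impossible": "c"}


def type_a_subtype(
    joint: str,
    head_shaft_fracture_line: bool = False,
    cartilage_direction: Optional[str] = None,
    reduction: Optional[str] = None,
) -> str:
    """Return a type A label such as 'A2' or 'A3a' from radiographic findings.

    joint: 'none', 'caudal_subluxation', 'anterior', or 'posterior'.
    cartilage_direction: 'caudal' or 'cranial'; needed only for an anterior
        dislocation with a fracture line between head and shaft.
    reduction: None if no reduction was attempted, else a REDUCTION_SUFFIX key.
    """
    if joint == "none":
        return "A1"
    if joint == "caudal_subluxation":
        return "A2"
    if joint == "anterior":
        if not head_shaft_fracture_line:
            base = "A3"
        elif cartilage_direction == "caudal":
            base = "A4"
        elif cartilage_direction == "cranial":
            base = "A5"
        else:
            raise ValueError("cartilage_direction must be 'caudal' or 'cranial'")
    elif joint == "posterior":
        base = "A6"
    else:
        raise ValueError(f"unknown joint status: {joint!r}")
    # The suffix is added only for dislocations (A3 to A6) after attempted reduction.
    return base + (REDUCTION_SUFFIX[reduction] if reduction else "")


print(type_a_subtype("anterior", reduction="possible"))  # A3a
```

Type B is left out of the sketch because it depends on measured positions relative to the glenoid line rather than on categorical findings.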
Type B was further divided into three subtypes according to the extent of humeral shaft displacement. In B1, the medial point of the proximal humeral shaft is lateral to the glenoid line. In B2, the point of the proximal humeral shaft is medial to the glenoid line. In B3, the lateral point of the proximal humeral shaft is medial to the glenoid line.
After including all subtypes (a, b, and c) in A3–6, the total number of categories was 21.
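The total of 21 can be checked by listing the categories. The snippet below is a minimal sketch; the variable names are illustrative, and an empty suffix is assumed to stand for a dislocation in which reduction was not attempted.

```python
# Type A categories that never carry a reduction suffix.
type_a_plain = ["A1", "A2"]
# Dislocated subtypes: each may stand alone or carry a suffix a, b, or c.
dislocated = ["A3", "A4", "A5", "A6"]
suffixes = ["", "a", "b", "c"]  # "" is assumed to mean reduction not attempted
type_b = ["B1", "B2", "B3"]

categories = (
    type_a_plain
    + [base + suffix for base in dislocated for suffix in suffixes]
    + type_b
)
print(len(categories))  # 21
```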
Statistical analysis
Four reviewers classified all 100 fractures according to the Neer, AO/OTA, and Mitsuzawa classifications on two occasions. The second evaluation (Time 2) was conducted 3 to 4 months after the first evaluation (Time 1). Prior to the second evaluation, the 30-min training session was not repeated; instead, each reviewer independently reviewed the three classifications. The intraobserver reliability was calculated using a Cohen κ statistic, while the interobserver reliability was calculated using a Fleiss κ statistic. According to Landis and Koch [13], the strength of agreement is categorized as follows: a κ value of 0.00–0.20, slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; 0.81–1.00, almost perfect. All data analyses were performed using IBM SPSS Statistics v25 (IBM Corp., Armonk, NY, USA).
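For readers who want to see how the two statistics are defined, the following is a minimal pure-Python sketch of unweighted Cohen κ (one reviewer, Time 1 versus Time 2) and Fleiss κ (all reviewers at one time point). It is not the SPSS procedure used in this study; the function names and the toy labels at the end are made up for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Unweighted Cohen kappa for two ratings of the same cases."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("need two equally long, non-empty rating lists")
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement from the two marginal distributions.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    # Undefined (division by zero) if both ratings use a single category.
    return (observed - expected) / (1 - expected)


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Fleiss kappa; ratings[i] holds the labels all raters gave to case i."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    totals: Counter = Counter()
    agreement_sum = 0.0
    for case in ratings:
        counts = Counter(case)
        totals.update(counts)
        # Proportion of agreeing rater pairs for this case.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    observed = agreement_sum / n_cases
    expected = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    return (observed - expected) / (1 - expected)


# Toy data, invented for illustration only.
time1 = ["A3a", "A3a", "B2", "B2"]
time2 = ["A3a", "B2", "A3a", "B2"]
print(cohen_kappa(time1, time2))  # 0.0
print(fleiss_kappa([["A2"] * 4, ["B2"] * 4]))  # 1.0
```

In the design described above, `cohen_kappa` would be applied once per reviewer and classification, and `fleiss_kappa` once per time point and classification.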
Results
The intraobserver agreement data are presented in Table 1. The average intraobserver agreements among four observers for the Neer, AO/OTA, and Mitsuzawa classifications were 0.57 ± 0.02 (moderate), 0.67 ± 0.03 (substantial), and 0.77 ± 0.06 (substantial), respectively.
Table 1.
Intraobserver Kappa statistics for Neer, AO/OTA and Mitsuzawa classifications for proximal humeral fractures
| | Neer | AO/OTA | Mitsuzawa |
|---|---|---|---|
| Observer 1 | 0.58 | 0.69 | 0.82 |
| Observer 2 | 0.59 | 0.73 | 0.75 |
| Observer 3 | 0.56 | 0.63 | 0.85 |
| Observer 4 | 0.54 | 0.65 | 0.68 |
| Mean (SD) | 0.57 (0.02) | 0.67 (0.03) | 0.77 (0.06) |
The interobserver agreement data are presented in Table 2. At Time 1, the interobserver agreements for the Neer, AO/OTA, and Mitsuzawa classifications were 0.48 (0.44–0.53), 0.54 (0.49–0.60), and 0.72 (0.69–0.76), respectively. At Time 2, they were 0.50 (0.46–0.55), 0.58 (0.53–0.63), and 0.73 (0.69–0.76), respectively. The average interobserver agreements across Time 1 and 2 for the Neer, AO/OTA, and Mitsuzawa classifications were 0.49 (moderate), 0.56 (moderate), and 0.73 (substantial), respectively.
Table 2.
Interobserver Kappa statistics for Neer, AO/OTA and Mitsuzawa classifications for proximal humeral fractures
| | Neer | AO/OTA | Mitsuzawa |
|---|---|---|---|
| Time 1 | 0.48 (0.44–0.53) | 0.54 (0.49–0.60) | 0.72 (0.69–0.76) |
| Time 2 | 0.50 (0.46–0.55) | 0.58 (0.53–0.63) | 0.73 (0.69–0.76) |
| Average | 0.49 | 0.56 | 0.73 |
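The averages and their agreement labels can be reproduced from the Time 1 and Time 2 values in Table 2. The sketch below assumes that the average is the simple mean of the two time points rounded half-up to two decimals, which the text does not state explicitly; the names `table2`, `BANDS`, and `landis_koch` are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Interobserver kappa values reported in Table 2 (Time 1, Time 2).
table2 = {
    "Neer": ("0.48", "0.50"),
    "AO/OTA": ("0.54", "0.58"),
    "Mitsuzawa": ("0.72", "0.73"),
}

# Landis and Koch bands as quoted in the Statistical analysis section
# (upper bound of each band, label).
BANDS = [
    ("0.20", "slight"),
    ("0.40", "fair"),
    ("0.60", "moderate"),
    ("0.80", "substantial"),
    ("1.00", "almost perfect"),
]


def landis_koch(kappa: Decimal) -> str:
    """Map a kappa between 0 and 1 to its Landis and Koch label."""
    if kappa < 0:
        raise ValueError("bands quoted in the text start at 0.00")
    for upper, label in BANDS:
        if kappa <= Decimal(upper):
            return label
    raise ValueError("kappa must not exceed 1")


summary = {}
for name, (t1, t2) in table2.items():
    # Assumption: simple mean of the two time points, rounded half-up.
    mean = ((Decimal(t1) + Decimal(t2)) / 2).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    summary[name] = (mean, landis_koch(mean))
    print(name, mean, summary[name][1])
```

`Decimal` is used rather than `float` so that 0.725 rounds to 0.73 as reported, instead of being affected by binary floating-point representation.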
The proportion of 100 cases classified using the three classifications by the corresponding author (Time 1) is shown in Table 3. One case under the Neer classification and four cases under the AO/OTA classification were unclassifiable. The most frequent fracture type in all classifications was an anterior dislocated fracture with a greater tuberosity fragment, which corresponded to A3a (57 cases) in the Mitsuzawa classification. This was followed by A2 (12 cases) and B2 (7 cases).
Table 3.
The proportion of 100 cases classified (corresponding author, Time 1) using Neer, AO/OTA, and Mitsuzawa classifications
| Neer | n | AO/OTA | n | Mitsuzawa | n |
|---|---|---|---|---|---|
| 1part | 1 | 11A1.1 | 63 | A1 | 4 |
| 2part AN | 0 | 11A1.2 | 0 | A2 | 12 |
| 2part SN | 8 | 11A2.1 | 5 | A3 | 0 |
| 2part GT | 1 | 11A2.2 | 0 | A3a | 57 |
| 2part LT | 0 | 11A2.3 | 0 | A3b | 1 |
| 2part AD | 64 | 11A3 | 0 | A3c | 5 |
| 2part PD | 1 | 11B1.1 | 16 | A4 | 1 |
| 3part GT | 7 | 11B1.2 | 0 | A4a | 1 |
| 3part LT | 0 | 11C1.1 | 7 | A4b | 0 |
| 3part AD | 6 | 11C1.3 | 0 | A4c | 3 |
| 3part PD | 1 | 11C3.1 | 4 | A5 | 2 |
| 4part AD | 4 | 11C3.2 | 0 | A5a | 2 |
| 4part PD | 0 | 11C3.3 | 1 | A5b | 0 |
| 4part VI | 5 | unclassifiable | 4 | A5c | 2 |
| 4part LFD | 0 | | | A6 | 1 |
| 4part AS | 1 | | | A6a | 0 |
| unclassifiable | 1 | | | A6b | 0 |
| | | | | A6c | 1 |
| | | | | B1 | 0 |
| | | | | B2 | 7 |
| | | | | B3 | 1 |
AN, anatomical neck; SN, surgical neck; GT, greater tuberosity; LT, lesser tuberosity; AD, anterior dislocation; PD, posterior dislocation; VI, valgus impacted; LFD, lateral fracture-dislocation; AS, articular surface
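As a quick consistency check on Table 3, the Mitsuzawa counts can be tallied and ranked. The snippet is a minimal sketch with the counts copied from the table; the variable names are illustrative.

```python
# Mitsuzawa counts copied from Table 3 (corresponding author, Time 1).
mitsuzawa_counts = {
    "A1": 4, "A2": 12,
    "A3": 0, "A3a": 57, "A3b": 1, "A3c": 5,
    "A4": 1, "A4a": 1, "A4b": 0, "A4c": 3,
    "A5": 2, "A5a": 2, "A5b": 0, "A5c": 2,
    "A6": 1, "A6a": 0, "A6b": 0, "A6c": 1,
    "B1": 0, "B2": 7, "B3": 1,
}

total = sum(mitsuzawa_counts.values())
# Rank categories by frequency, most common first.
top_three = sorted(mitsuzawa_counts.items(), key=lambda kv: kv[1], reverse=True)[:3]

print(total)      # 100
print(top_three)  # [('A3a', 57), ('A2', 12), ('B2', 7)]
```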
Discussion
An ideal fracture classification system should not only be reliable and reproducible but also user-friendly. In this study, we developed an entirely new classification, the Mitsuzawa classification, for PHF from a completely different perspective that takes into consideration glenohumeral compatibility, assessment before and after the reduction of shoulder dislocation, and amount of displacement of the proximal stump of the humeral shaft. These characteristics are not included in the Neer or AO/OTA classifications. Compared with the Neer and AO/OTA classifications, our new classification system adopted a user-friendly flowchart format and provided satisfactory intra- and interobserver reliability.
The primary reason for developing this new classification was to evaluate the severity of dislocated and displaced proximal humeral fractures, which cannot be adequately classified using the Neer and AO/OTA classifications. These severe types of fractures carry a potential risk of neurovascular injury, necessitating further assessment and prompt collaboration with other medical specialists. For types A4, A5, B2, and B3, special attention should be given to the possibility of axillary artery and brachial plexus injuries. Additionally, in type b fractures, the presence of interposed soft tissue may impede reduction and cause caudal subluxation, which should also be addressed with caution. In practice, type A3a is the most common type of anterior fracture-dislocation with a greater tuberosity fragment that can be reduced successfully. Conversely, type A3c represents an unfortunate iatrogenic fracture between the head and the shaft. These situations should be avoided by considering the possibility of an occult fracture line before attempting reduction.
The Neer classification has been widely used since it was first described in 1970. There are two versions. The first, updated in 2002, contains 16 possible categories and is now used only for research purposes [6]. The second is a simplified short version that identifies the number of displaced fragments among the four main fragments and is the one most commonly used in clinical practice. However, according to previous studies, neither version achieved satisfactory agreement, reaching only fair to moderate levels [10, 11].
The AO/OTA classification was first published in 1996 and was updated in 2008 and 2018 [9]. The AO/OTA classification identifies three main fracture types based on the number of fragments, which are then categorized into subgroups based on fragment location and degree of comminution. The 2018 version addresses previous concerns and presents specific schemes for 13 possible categories. Whereas the Neer classification clearly defines displacement (more than 1 cm or 45°), the AO/OTA classification lacks such a definition and is inferior to the Neer classification in this respect.
Few studies have compared the Neer classification with the AO/OTA classification. Papakonstantinou et al. demonstrated that the overall interobserver agreement was moderate for the Neer classification (κ = 0.40–0.58) and fair to moderate for the AO/OTA classification (κ = 0.31–0.54) [11]. Marmor et al. recently reported worse results overall, finding fair agreement for both the Neer (κ = 0.27–0.33) and AO/OTA (κ = 0.21–0.27) classifications [12]. Our study showed better interobserver agreement for the Neer and AO/OTA classifications (0.49 and 0.56, respectively), both of which correspond to moderate agreement. As noted in previous studies, the high number of categories, the low quality of xR images, and differences in reviewers’ experience and expertise in the shoulder field may contribute to unsatisfactory agreement. Although the Mitsuzawa classification has more categories (21) and was assessed by four reviewers, its intra- and interobserver reliabilities were 0.77 and 0.73 (both substantial), respectively, which were better than those of the other two classifications.
The quality of the images significantly influenced the agreement between reviewers. Low-quality xR images are one cause of fracture misinterpretation and inappropriate classification; therefore, an optimal anteroposterior (AP) projection xR is essential for ensuring the reliability and reproducibility of the classification. This is particularly true for the Mitsuzawa classification, where assessing the glenohumeral compatibility and displacement of the humeral shaft requires a true AP view of the shoulder joint. Ideally, the radiation beam should pass through the glenohumeral joint as precisely as possible. Additionally, the shape of the calcar at the proximal end of the shaft should be evaluated with the forearm in 30° of external rotation. Gravity should also be considered, as shoulder joint compatibility (reduction, subluxation, and dislocation) can change depending on the standing or supine position and the use of an arm sling with a triangular bandage. For type A fractures, the standing position is recommended because glenohumeral incompatibility is most evident in this posture. For type B fractures, the supine position is preferred, as the absence of gravitational pull makes the displacement more apparent. Otherwise, the dislocation or degree of displacement may be underestimated. In practice, however, achieving these ideal positions may be difficult due to patient pain. Although the aforementioned conditions affect the three classification systems similarly, the Mitsuzawa classification achieved better intra- and interobserver agreements than the other two classifications.
There are several limitations in the present study. The first limitation is its nonrandomized design and the small number of cases and reviewers. Future randomized controlled trials with a larger number of cases and more reviewers could provide more robust evidence. The second limitation is the nonuniform quality of the xR series. The low quality of some xR images may have contributed to the disagreements among reviewers. However, this issue is common in clinical settings, and the clarity of the images likely affected all three classifications similarly. Third, the current study did not evaluate computed tomography (CT) images of PHFs. In cases involving fracture lines around the lesser tubercle, CT scans provide greater analytical power than plain radiography. Because of osseous overlap on plain radiographs, CT seems to be superior, particularly in three- or four-part fractures. Some studies have tried to improve the interobserver agreement of both the Neer and AO/OTA classifications by using CT scans and 3D reconstruction images; however, the improvements were not significant [14]. Nevertheless, these advanced imaging modalities play an important role in helping us understand fracture patterns comprehensively, plan the surgical approach, and recognize the orientation of fracture lines during surgery. As this is the first study of a newly developed classification system, we compared the three classification systems using only xR. In future studies, the agreement of the Mitsuzawa classification and the other two classification systems should be evaluated using CT scans and 3D reconstruction images.
At our institution, the Mitsuzawa classification is applied to cases involving dislocations or significant displacement of the humeral shaft. In severe cases where physical signs suggest neurovascular injury, we promptly perform CT angiography. If such injuries are confirmed, comprehensive management is essential. This includes determining whether to perform a closed or open reduction, deciding whether surgery should be urgent or elective, and consulting relevant specialists as needed.
To the best of our knowledge, there has been no classification specifically for dislocated and displaced PHFs. The patient cohort in the current study consisted of PHFs with fracture-dislocation of the glenohumeral joint or severely displaced fractures that required arthroplasty. Although rare, these fracture patterns can damage the axillary artery and brachial plexus [15]. Concomitant neurovascular injury can further increase the risk of unfavorable outcomes. One reason for developing this new classification is to predict neurovascular injury associated with PHFs. In future studies, we should attempt to identify the categories in which axillary artery injury should be suspected and CT angiography is required.
Conclusion
The new classification system for dislocated and displaced PHFs showed promising agreement in this study. Further evaluation with a larger patient cohort is required to confirm its reliability and utility in clinical practice.
Acknowledgements
None.
Author contributions
Sadaki Mitsuzawa: Conceptualization, Methodology, Investigation, Data curation, Writing – original draft, Writing – review & editing. Hisataka Takeuchi: Investigation, Validation. Kenta Ijiri: Investigation, Validation. Yuya Furusho: Investigation, Validation. Shinnosuke Yamashita: Validation. Yoshihiro Tsukamoto: Validation. Satoshi Ota: Validation. Eijiro Onishi: Validation. Tadashi Yasuda: Validation, Supervision.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Data availability
No datasets were generated or analysed during the current study.
Declarations
Ethics approval and consent to participate
This study was performed in line with the principles of the Declaration of Helsinki. This study was approved by the Institutional Review Board. Informed consent was obtained from all patients/parents before inclusion in the study and anonymous publication of the results.
Consent for publication
Consent for publication was obtained from all individual participants included in the study.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Court-Brown CM, Caesar B. Epidemiology of adult fractures: a review. Injury. 2006;37:691–7. 10.1016/j.injury.2006.04.130.
- 2. Passaretti D, Candela V, Sessa P, Gumina S. Epidemiology of proximal humeral fractures: a detailed survey of 711 patients in a metropolitan area. J Shoulder Elb Surg. 2017;26:2117–24. 10.1016/j.jse.2017.05.029.
- 3. Handoll H, Brealey S, Rangan A, Torgerson D, Dennis L, Armstrong A, et al. Protocol for the ProFHER (PROximal fracture of the Humerus: evaluation by Randomisation) trial: a pragmatic multi-centre randomised controlled trial of surgical versus non-surgical treatment for proximal fracture of the humerus in adults. BMC Musculoskelet Disord. 2009;10:140. 10.1186/1471-2474-10-140.
- 4. Misra A, Kapur R, Maffulli N. Complex proximal humeral fractures in adults – a systematic review of management. Injury. 2001;32:363–72. 10.1016/s0020-1383(00)00242-4.
- 5. Neer CS. Displaced proximal humeral fractures. I. Classification and evaluation. J Bone Joint Surg Am. 1970;52:1077–89.
- 6. Neer CS. Four-segment classification of proximal humeral fractures: purpose and reliable use. J Shoulder Elb Surg. 2002;11:389–400. 10.1067/mse.2002.124346.
- 7. Tamai K, Ishige N, Kuroda S, Ohno W, Itoh H, Hashiguchi H, et al. Four-segment classification of proximal humeral fractures revisited: a multicenter study on 509 cases. J Shoulder Elb Surg. 2009;18:845–50. 10.1016/j.jse.2009.01.018.
- 8. Müller ME, Koch P, Nazarian S, Schatzker J. Humerus = 1. The comprehensive classification of fractures of long bones. Berlin, Heidelberg: Springer; 1990. 10.1007/978-3-642-61261-9_4.
- 9. Meinberg EG, Agel J, Roberts CS, Karam MD, Kellam JF. Fracture and dislocation classification compendium-2018. J Orthop Trauma. 2018;32(Suppl 1):S1–170. 10.1097/bot.0000000000001063.
- 10. Sidor ML, Zuckerman JD, Lyon T, Koval K, Cuomo F, Schoenberg N. The Neer classification system for proximal humeral fractures: an assessment of interobserver reliability and intraobserver reproducibility. J Bone Joint Surg Am. 1993;75:1745–50. 10.2106/00004623-199312000-00002.
- 11. Papakonstantinou MK, Hart MJ, Farrugia R, Gabbe BJ, Moaveni AK, Bavel DV, et al. Interobserver agreement of Neer and AO classifications for proximal humeral fractures. ANZ J Surg. 2016;86:280–4. 10.1111/ans.13451.
- 12. Marmor MT, Agel J, Dumpe J, Kellam JF, Marecek GS, Meinberg E, et al. Comparison of the Neer classification to the 2018 update of the Orthopedic Trauma Association/AO fracture classification for classifying proximal humerus fractures. OTA Int. 2022;16(3):5. 10.1097/oi9.0000000000000200.
- 13. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–74.
- 14. Berkes MB, Dines JS, Little MTM, Garner MR, Shifflett GD, Lazaro LE, et al. The impact of three-dimensional CT imaging on intraobserver and interobserver reliability of proximal humeral fracture classifications and treatment recommendations. J Bone Joint Surg Am. 2014;96(15):1281–6. 10.2106/jbjs.m.00199.
- 15. Mitsuzawa S, Yamashita S, Tsukamoto Y, Takeuchi H, Ota S, Onishi E, et al. Axillary artery injury associated with dislocated or displaced proximal humeral fracture: a report of 3 cases. JBJS Case Connect. 2024;14(3):e24.00006. 10.2106/jbjs.cc.24.00006.

