Abstract
Background
Irrespective of the treatment method, union is the ultimate goal of any fracture treatment. However, patient-reported outcome measures that assess quality of life and function are now gaining popularity over physician-based clinico-radiological methods. This is especially true in parts of the world where patients need an almost complete degree of hip and knee flexion for social, cultural, religious or occupational reasons. The ability to squat reflects the mobility and stability of the joints, and thus the quality of squatting is a proxy for the functional outcome after fixation of a lower limb fracture. We therefore determined the inter-observer and intra-observer reliability of the Radiographic Union Score for Tibia (RUST) and of the Squat and Smile (S & S) test on clinical photographs. We further calculated the sensitivity and specificity of the S & S test in predicting healing of lower limb fractures fixed with an intramedullary interlocking nail, considering RUST as the gold standard.
Methods
This was a retrospective study of prospectively collected data on solid Surgical Implant Generation Network (SIGN) intramedullary interlocking nailing from a single, university-based, high-volume tertiary center. We included 56 consecutive adults with either tibial or femoral shaft fractures fixed with a SIGN nail within one year and not requiring any further surgery until a minimum of eighteen months of follow-up. Cases lacking an Anterior-Posterior (AP) view or Lateral (Lat.) view follow-up x-ray, or a proper S & S clinical photograph (at least 1.5 years post fixation), were excluded. The x-rays (RUST criteria) and clinical photographs (S & S grading) were each scored by two independent, blinded observers, and the scoring was repeated after 1 month.
Results
For RUST scoring, the overall intra-observer reliability ranged from 0.773 to 0.825 and the inter-observer reliability from 0.635 to 0.757. For S & S scoring, the corresponding ranges were 0.687 to 0.785 and 0.301 to 0.650. The sensitivity and specificity of the S & S test in predicting fracture healing were up to 82.22% and 63.64% respectively.
Conclusion
The S & S test is reliable for predicting the healing of lower limb fractures fixed with an intramedullary nail. The test is more useful for identifying healed fractures than non-healed ones (sensitivity being higher than specificity).
Keywords: Squat and smile test, RUST scoring, Healing, Prediction, SIGN nailing, Functional outcome
1. Introduction
Union is the ultimate goal of treating any fracture. Determining union is one of the fundamental clinical decisions in orthopedics: it guides weight-bearing status, the timing of implant removal, and the detection and management of abnormal bone healing such as delayed union and nonunion.
In current clinical practice, functional assessment tools that assess quality of life are gaining popularity alongside the physician-centered clinico-radiological methods of fracture outcome assessment. This is especially true in parts of the world where patients need complete or almost complete hip and knee flexion to carry out their daily and floor-level activities for social (food preparation, sanitation), cultural, religious or occupational reasons. The patient's ability to squat therefore reflects the mobility and stability of the joints, and the quality of squatting is a proxy for the functional outcome after operative fixation of a lower limb fracture.
Thus, the squat and smile (S & S) test was conceived by the Surgical Implant Generation Network (SIGN) as a functional assessment of squatting ability, to quickly assess fracture healing of the tibia and femur and to provide feedback to surgeons from a remote location,1 especially in low and middle-income countries (LMICs) where x-ray facilities are absent or limited and electricity may be unreliable. The S & S test can be assessed on a simple photograph taken with a smartphone or camera and shared with a surgeon for evaluation and feedback.1 Such an alternative assessment tool may be useful when physical follow-up is difficult for the patient (bad roads, costly transportation, residence far from cities, etc., especially in LMICs) and during travel restrictions (such as the coronavirus pandemic and the recent global lockdown). In one report, reaching 11 patients in an LMIC required a drive of 2006 km over 8 days on very rough, sometimes scarcely passable roads, about 183 km per patient.2 In addition to the variable clinical and radiological criteria for fracture healing used across studies, variation in inter-observer and intra-observer agreement3,4 has created significant disagreement among orthopedic surgeons in both clinical and research settings.5 Surprisingly, we are aware of only one study on prediction of fracture healing from the quality of squatting (with a shorter follow-up than our study).6
Thus, we aimed to determine whether the S & S test is a suitable proxy tool for assessing fracture healing of lower limb long bones after intramedullary interlocking nailing. We calculated the inter-observer and intra-observer reliability (κ) of the Radiographic Union Score for Tibia (RUST) and of S & S grading (on clinical photographs). We further calculated the sensitivity and specificity of S & S grading in predicting healing of lower limb long bone fractures using RUST as the gold standard.
2. Material and methods
This was a retrospective study of prospectively collected data related to solid Surgical Implant Generation Network (SIGN) intramedullary interlocking nail fixation of lower limb fractures. After the necessary approval from the Institutional Review Committee (IRC) and SIGN Fracture Care International, Richland, US, the study was done at a single, high-volume, university-based tertiary care center. Out of 61 patients assessed for eligibility within the study period of one year, 56 adults were included. Patients with traumatic, closed or Gustilo grade I open tibial or femoral shaft fractures fixed with a SIGN nail and not requiring any further surgery until the last follow-up (minimum eighteen months) were included in the study. Cases lacking an Anterior-Posterior (AP) view or Lateral (Lat.) view follow-up x-ray, or a proper S & S clinical photograph (at least 1.5 years post fixation), were excluded.
2.1. Data extraction/collection
The relevant clinico-demographic details, fracture personality (open or closed fracture and AO classification of fracture), surgical approach (antegrade femur, retrograde femur and tibia), follow-up x-rays and S & S clinical photographs of eligible patients were extracted in real time from the SIGN Online Surgical Database (SOSD) at www.signsurgery.org. All the observers were SIGN-trained and accredited orthopedic surgeons with about eight to ten years of experience in trauma care. Each observer was supplied with digital copies of the high-resolution x-rays and clinical photographs, which could be magnified as desired. A separate supplementary pro forma was used by each observer for evaluating the x-rays and the S & S clinical photographs at each setting. Data entry for every fifth patient was verified manually against the SIGN online surgical database.
2.1.1. Determination of Radiographic Union Score for Tibia (RUST)
Standard AP view and Lat. view x-rays of the leg (tibia) or thigh (femur) of the eligible patients showing whole or reasonable portion of the limb focused at fracture site were retrieved from SIGN online surgical database. A set of two independent observers (observer 1 and observer 2) scored the fracture healing of each of the four cortices separately (medial and lateral cortices in AP view; and anterior and posterior cortices in Lat. view) based on the RUST criteria (Table: 1)7.
Table 1.
Radiographic union score for tibia (RUST).
| Callus | Fracture Line | Score per Cortex |
|---|---|---|
| Absent | Visible | 1 |
| Present | Visible | 2 |
| Present | Not visible | 3 |
The scores for all four cortices were added to give a total RUST score ranging from 4 (definitely not healed) to 12 (definitely healed). A score of ≥9, which corresponds to bridging callus in a minimum of three out of four cortices, was considered radiological union of the fracture (Fig: 1). RUST was thus dichotomized, with a score of ≥9 classified as healed and <9 as not healed. Although RUST was initially used to assess healing of tibial fractures, it is now used extensively for other long bones such as the femur as well.8,9
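The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in Table 1 and the ≥9 cut-off; the function and variable names (`rust_total`, `is_healed`, `CORTICES`) are hypothetical and are not part of the study's workflow.

```python
from typing import Dict

# Per-cortex scores as defined in Table 1:
#   1 = callus absent, fracture line visible
#   2 = callus present, fracture line visible
#   3 = callus present, fracture line not visible
VALID_CORTEX_SCORES = {1, 2, 3}
CORTICES = ("medial", "lateral", "anterior", "posterior")
HEALED_THRESHOLD = 9  # total score >= 9 is classified as healed


def rust_total(cortex_scores: Dict[str, int]) -> int:
    """Add the four per-cortex scores into a total RUST score (4 to 12)."""
    if set(cortex_scores) != set(CORTICES):
        raise ValueError(f"expected scores for exactly {CORTICES}")
    for cortex, score in cortex_scores.items():
        if score not in VALID_CORTEX_SCORES:
            raise ValueError(f"{cortex}: score must be 1, 2 or 3, got {score}")
    return sum(cortex_scores.values())


def is_healed(total: int) -> bool:
    """Dichotomize a total RUST score: healed if >= 9, otherwise not healed."""
    return total >= HEALED_THRESHOLD


# Hypothetical reading: AP view gives medial and lateral, Lat. view gives
# anterior and posterior.
example = {"medial": 3, "lateral": 3, "anterior": 2, "posterior": 2}
total = rust_total(example)
print(total, is_healed(total))  # 10 True
```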
Fig. 1.
Radiographic Union Score for Tibia (RUST) assessment.
2.1.2. Determination of squat and smile (S & S) scoring (on clinical photograph)
The corresponding S & S photograph, with the patient's identity concealed and showing the lower body in the squat posture, was retrieved from the database. Similarly, another set of two independent observers (observer 3 and observer 4) scored each S & S clinical photograph on a 0 to 2 scale by visual assessment, based on the degree of knee/hip flexion and the need for support or an assistive device (e.g. table, chair, wall, floor, stool etc.) for squatting (Fig: 2).6 This domain was developed from the literature supporting the importance of the normal squat in activities of daily living. Because most of the smile ratings were unsure or inconsistent between observers, we decided not to use the smile in assigning a score (Table: 2).6 Functional scores from the S & S photograph were dichotomized, with a score of 2 classified as normal squatting and <2 as abnormal squatting.
Fig. 2.
Squat and smile (S & S) assessment.
Table 2.
Squat and Smile scoring.
| Knee Flexion/Squat break | Assistive device used | Score |
|---|---|---|
| Squat did not break 90° at knee | Yes or No | 0 |
| Squat breaks 90° at knee | Yes | 1 |
| Squat breaks 90° at knee | No | 2 |
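As a minimal sketch, the grading in Table 2 and the normal/abnormal dichotomy can be written as a small decision rule. The function names and boolean inputs below are hypothetical; in the study the judgment was made visually by the observers from the photograph.

```python
def squat_score(breaks_90_degrees: bool, uses_assistive_device: bool) -> int:
    """Grade a squat photograph on the 0 to 2 scale of Table 2."""
    if not breaks_90_degrees:
        # Knee flexion does not pass 90 degrees: score 0 with or without support.
        return 0
    # Squat passes 90 degrees: support or an assistive device lowers the score.
    return 1 if uses_assistive_device else 2


def is_normal_squat(score: int) -> bool:
    """Dichotomize: a score of 2 is normal squatting, below 2 is abnormal."""
    return score == 2


print(squat_score(True, False), is_normal_squat(squat_score(True, False)))  # 2 True
print(squat_score(True, True), is_normal_squat(squat_score(True, True)))    # 1 False
```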
The same two observers in each set repeated the radiological scoring and the S & S scoring after one month (setting 2), blinded to their own previous readings (setting 1) to avoid bias.
3. Calculations
3.1. Calculation of inter-observer and intra-observer reliability
The inter-observer and intra-observer reliability of RUST scoring and S & S scoring was calculated separately by pairwise comparison as below and expressed as weighted kappa (κ) along with 95% confidence interval and percentage agreement.
3.1.1. RUST scoring
Intra-observer reliability:
Observer 1's first reading (1A) with Observer 1's second reading (1B)
Observer 2's first reading (2A) with Observer 2's second reading (2B)
Inter-observer reliability:
Observer 1's first reading (1A) with Observer 2's first reading (2A)
Observer 1's first reading (1A) with Observer 2's second reading (2B)
Observer 1's second reading (1B) with Observer 2's first reading (2A)
Observer 1's second reading (1B) with Observer 2's second reading (2B)
3.1.2. S & S scoring
Intra-observer reliability:
Observer 3's first reading (3A) with Observer 3's second reading (3B)
Observer 4's first reading (4A) with Observer 4's second reading (4B)
Inter-observer reliability:
Observer 3's first reading (3A) with Observer 4's first reading (4A)
Observer 3's first reading (3A) with Observer 4's second reading (4B)
Observer 3's second reading (3B) with Observer 4's first reading (4A)
Observer 3's second reading (3B) with Observer 4's second reading (4B)
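For illustration, the pairwise agreement statistics can be sketched as follows. The sketch computes Cohen's kappa and percentage agreement from a square cross-tabulation; with only two categories (healed versus non-healed, or normal versus abnormal) weighted and unweighted kappa coincide, so no weighting scheme is needed here. The function name `cohen_kappa` is hypothetical, the statistical software actually used is not stated in the text, and the confidence interval is omitted because its method is not described.

```python
from typing import Sequence, Tuple


def cohen_kappa(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (kappa, percent agreement) for a square agreement table.

    table[i][j] is the number of cases placed in category i by one reading
    and in category j by the other reading.
    """
    size = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(table[i][i] for i in range(size)) / n
    # Chance agreement: product of the marginal proportions per category.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(size)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return kappa, 100 * observed


# Observer 1: second reading (rows) against first reading (columns),
# using the healed / non-healed counts reported in Table 5.
reading_1b_vs_1a = [
    [43, 2],  # healed at second reading
    [2, 9],   # non-healed at second reading
]
kappa, agreement = cohen_kappa(reading_1b_vs_1a)
print(f"kappa = {kappa:.4f}, agreement = {agreement:.2f}%")
```

Run on the Observer 1 counts from Table 5, this gives a kappa of about 0.7737, the value listed for that pair in Table 6.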
3.2. Calculation of sensitivity and specificity
The sensitivity and specificity of the S & S score for detecting fracture healing were calculated using the RUST score as the gold standard. We calculated sensitivity as the number of cases with both a healed RUST score and a normal S & S score divided by the total number of cases that were healed according to the RUST score. Similarly, specificity was calculated as the number of cases with both a non-healed RUST score and an abnormal S & S score divided by the total number of cases that were non-healed according to the RUST score.
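The two definitions above amount to the following arithmetic, shown here as a minimal Python sketch. The function and parameter names are hypothetical; the example counts are those reported in Table 10 for Observer 4 (setting 2) against Observer 1 (setting 1).

```python
from typing import Tuple


def sensitivity_specificity(
    healed_normal: int,
    healed_abnormal: int,
    nonhealed_normal: int,
    nonhealed_abnormal: int,
) -> Tuple[float, float]:
    """Return (sensitivity %, specificity %) with RUST as the reference.

    Sensitivity: healed fractures with a normal squat / all healed fractures.
    Specificity: non-healed fractures with an abnormal squat / all non-healed.
    """
    sensitivity = healed_normal / (healed_normal + healed_abnormal)
    specificity = nonhealed_abnormal / (nonhealed_normal + nonhealed_abnormal)
    return 100 * sensitivity, 100 * specificity


# Table 10, S4B against H1A: 37 healed/normal, 8 healed/abnormal,
# 4 non-healed/normal, 7 non-healed/abnormal.
sens, spec = sensitivity_specificity(37, 8, 4, 7)
print(f"sensitivity = {sens:.2f}%, specificity = {spec:.2f}%")
# sensitivity = 82.22%, specificity = 63.64%
```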
4. Results
Records of 43 males (aged 16 to 70 years) and 13 females (aged 16 to 60 years) from the SIGN online surgical database who satisfied the inclusion criteria were retrospectively analyzed. The cohort consisted predominantly of males (76.79%) and of tibial fractures (57.15%), and most fractures were closed (Table: 3). The mean follow-up period was 20.51 ± 2.34 months (range, 18 to 25 months).
Table 3.
Cohort characteristics.
| Characteristic | Category | Value |
|---|---|---|
| Age in years (Mean ± SD) | Total | 32.19 ± 15.33 |
| | Male | 32.37 ± 15.34 |
| | Female | 31.61 ± 14.91 |
| Gender (n, percentage) | Male | 43 (76.79%) |
| | Female | 13 (23.21%) |
| Fracture type (n, percentage) | Closed | 49 (87.50%) |
| | Gustilo grade I | 7 (12.50%) |
| Fracture classification (AO): | | |
| Femur diaphysis (32) | | |
| A (Simple): n | A1, A2, A3 | 3, 7, 4 |
| B (Wedge): n | B2, B3 | 5, 5 |
| C (Multifragmentary): n | C2, C3 | 0, 0 |
| Tibia diaphysis (42) | | |
| A (Simple): n | A1, A2, A3 | 6, 10, 5 |
| B (Wedge): n | B2, B3 | 7, 4 |
| C (Multifragmentary): n | C2, C3 | 0, 0 |
| Level of fracture: | | |
| Femur diaphysis (n) | Proximal, Middle, Distal | 4, 15, 5 |
| Tibia diaphysis (n) | Proximal, Middle, Distal | 3, 14, 15 |
| Approach (n, percentage) | Antegrade Femur | 18 (32.14%) |
| | Retrograde Femur | 6 (10.71%) |
| | Tibia (Transpatellar) | 32 (57.15%) |
The RUST score demonstrated overall radiological healing in 78.57% (44 out of 56) to 83.92% (47 out of 56) of cases as rated by two different observers (Observer 1 and Observer 2) at two different settings (Setting 1 and Setting 2) one month apart (Table: 4). The corresponding cross tabulation of non-healed against healed fractures showed relatively consistent results, since the minimum follow-up period was 18 months after fracture fixation and the observers were blinded to their previous readings (Table: 5). We further calculated the inter-observer and intra-observer reliability of RUST scoring (as weighted kappa, 95% confidence interval and percent agreement). Overall, there was substantial to almost perfect10 intra-observer reliability (κ = 0.773 to 0.825) and substantial inter-observer reliability10 (κ = 0.635 to 0.757) for RUST scoring (Table: 6).
Table 4.
RUST score among different observers in different settings.
| Observer | Setting | RUST ≤4 | RUST 5–6 | RUST 7–8 | RUST 9–11 | RUST 12 |
|---|---|---|---|---|---|---|
| Observer no. 1 | 1 (1A) | 1 | 2 | 8 | 15 | 30 |
| Observer no. 1 | 2 (1B) | 1 | 2 | 8 | 18 | 27 |
| Observer no. 2 | 1 (2A) | 1 | 2 | 6 | 19 | 28 |
| Observer no. 2 | 2 (2B) | 2 | 4 | 6 | 20 | 24 |

Values are frequencies (number of cases) in each RUST score band.
RUST = Radiographic Union Score for Tibia.
Table 5.
RUST scoring (Healed versus Non-healed).
RUST scoring: score <9 = non-healed (NH); score ≥9 = healed (H). Columns give the readings of Observer no. 1 at Setting 1 (H1A) and Setting 2 (H1B) and of Observer no. 2 at Setting 1 (H2A) and Setting 2 (H2B).

| Observer | Setting | Status | H1A: H | H1A: NH | H1B: H | H1B: NH | H2A: H | H2A: NH | H2B: H | H2B: NH |
|---|---|---|---|---|---|---|---|---|---|---|
| Observer no. 1 | Setting 1 (H1A) | H | -- | -- | -- | -- | -- | -- | -- | -- |
| | | NH | -- | -- | -- | -- | -- | -- | -- | -- |
| | Setting 2 (H1B) | H | 43 | 2 | -- | -- | -- | -- | -- | -- |
| | | NH | 2 | 9 | -- | -- | -- | -- | -- | -- |
| Observer no. 2 | Setting 1 (H2A) | H | 44 | 3 | 43 | 4 | -- | -- | -- | -- |
| | | NH | 1 | 8 | 2 | 7 | -- | -- | -- | -- |
| | Setting 2 (H2B) | H | 42 | 2 | 42 | 2 | 44 | 0 | -- | -- |
| | | NH | 3 | 9 | 3 | 9 | 3 | 9 | -- | -- |
Table 6.
Inter-observer and intra-observer reliability for RUST scoring: Weighted kappa (95% confidence interval; percent agreement).
| Observer | Setting | Observer no. 1, Setting 1 (H1A) | Observer no. 1, Setting 2 (H1B) | Observer no. 2, Setting 1 (H2A) | Observer no. 2, Setting 2 (H2B) |
|---|---|---|---|---|---|
| Observer no. 1 | Setting 1 (H1A) | 1.00 (1,1; 100%) | -- | -- | -- |
| Observer no. 1 | Setting 2 (H1B) | 0.7737 (0.56, 0.98; 92.85%) | 1.00 (1,1; 100%) | -- | -- |
| Observer no. 2 | Setting 1 (H2A) | 0.7570 (0.53, 0.98; 92.85%) | 0.6355 (0.37, 0.90; 89.28%) | 1.00 (1,1; 100%) | -- |
| Observer no. 2 | Setting 2 (H2B) | 0.7265 (0.50, 0.95; 91.07%) | 0.7265 (0.50, 0.95; 91.07%) | 0.8250 (0.64, 0.89; 94.64%) | 1.00 (1,1; 100%) |
Similarly, S & S score demonstrated overall normal squat in 69.64% (39 out of 56) to 73.21% (41 out of 56) of cases as rated by two different observers (Observer 3 and Observer 4) at two different settings (Setting 1 and Setting 2) one month apart (Table: 7). The corresponding cross tabulation of abnormal against normal squat showed relatively consistent results (Table: 8). We further calculated their inter-observer and intra-observer reliability in S & S scoring (as weighted kappa, 95% confidence interval and percent agreement). We found that overall there was substantial intra-observer reliability10 (κ = 0.687 to 0.785) and fair to substantial inter-observer reliability10 (κ = 0.301 to 0.650) for S & S scoring (Table: 9).
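The verbal labels used for the kappa values (fair, substantial, almost perfect) follow the Landis and Koch bands.10 A minimal sketch of that mapping is given below; the function name `landis_koch_label` is hypothetical and the cut-offs are the conventional ones from that reference, not values specific to this study.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch strength-of-agreement label."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Range end-points reported in the text for RUST and S & S scoring.
for value in (0.773, 0.825, 0.635, 0.757, 0.687, 0.785, 0.301, 0.650):
    print(value, landis_koch_label(value))
```

Applied to the reported ranges, 0.301 falls in the fair band, 0.825 in the almost perfect band, and the remaining end-points in the substantial band.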
Table 7.
Squat and Smile score among different observers in different settings.
| Observer | Setting | Score 0 | Score 1 | Score 2 |
|---|---|---|---|---|
| Observer no. 3 | 1 (3A) | 6 | 10 | 40 |
| Observer no. 3 | 2 (3B) | 8 | 9 | 39 |
| Observer no. 4 | 1 (4A) | 7 | 9 | 40 |
| Observer no. 4 | 2 (4B) | 6 | 9 | 41 |

Values are frequencies (number of cases) for each Squat and Smile score.
Table 8.
Squat and Smile scoring (Normal versus Abnormal).
Squat and Smile scoring: score <2 = abnormal (ABN); score 2 = normal (N). Columns give the readings of Observer no. 3 at Setting 1 (S3A) and Setting 2 (S3B) and of Observer no. 4 at Setting 1 (S4A) and Setting 2 (S4B).

| Observer | Setting | Status | S3A: N | S3A: ABN | S3B: N | S3B: ABN | S4A: N | S4A: ABN | S4B: N | S4B: ABN |
|---|---|---|---|---|---|---|---|---|---|---|
| Observer no. 3 | Setting 1 (S3A) | N | -- | -- | -- | -- | -- | -- | -- | -- |
| | | ABN | -- | -- | -- | -- | -- | -- | -- | -- |
| | Setting 2 (S3B) | N | 37 | 2 | -- | -- | -- | -- | -- | -- |
| | | ABN | 3 | 14 | -- | -- | -- | -- | -- | -- |
| Observer no. 4 | Setting 1 (S4A) | N | 36 | 4 | 35 | 5 | -- | -- | -- | -- |
| | | ABN | 4 | 12 | 4 | 12 | -- | -- | -- | -- |
| | Setting 2 (S4B) | N | 33 | 8 | 32 | 9 | 37 | 4 | -- | -- |
| | | ABN | 7 | 8 | 7 | 8 | 3 | 12 | -- | -- |
Table 9.
Inter-observer and intra-observer reliability for Squat and Smile scoring: Weighted kappa (95% confidence interval; percent agreement).
| Observer | Setting | Observer no. 3, Setting 1 (S3A) | Observer no. 3, Setting 2 (S3B) | Observer no. 4, Setting 1 (S4A) | Observer no. 4, Setting 2 (S4B) |
|---|---|---|---|---|---|
| Observer no. 3 | Setting 1 (S3A) | 1.00 (1,1; 100%) | -- | -- | -- |
| Observer no. 3 | Setting 2 (S3B) | 0.7852 (0.61, 0.96; 91.07%) | 1.00 (1,1; 100%) | -- | -- |
| Observer no. 4 | Setting 1 (S4A) | 0.6500 (0.43, 0.87; 85.71%) | 0.6135 (0.39, 0.84; 83.92%) | 1.00 (1,1; 100%) | -- |
| Observer no. 4 | Setting 2 (S4B) | 0.3312 (0.06, 0.60; 73.21%) | 0.3011 (0.03, 0.57; 71.42%) | 0.6879 (0.47, 0.90; 87.5%) | 1.00 (1,1; 100%) |
Furthermore, we tabulated the proportions of healed and not healed fractures as rated by observers 1 and 2 (each at settings 1 and 2) against normal and abnormal squatting as rated by observers 3 and 4 (each at settings 1 and 2); healed fractures and normal squatting predominated for every observer and setting (Table: 10). The sensitivity and specificity of the S & S assessment in predicting fracture healing were calculated for each observer/setting. The maximum sensitivity and specificity were 82.22% and 63.64% respectively, considering the RUST score as the gold standard (Table: 11).
Table 10.
RUST score versus Squat and Smile (S & S) score.
Columns give the RUST status as rated by Observer 1 at Setting 1 (H1A) and Setting 2 (H1B) and by Observer 2 at Setting 1 (H2A) and Setting 2 (H2B). Rows give the S & S status as rated by Observers 3 and 4.

| S & S observer | Setting | S & S status | H1A: Healed | H1A: Not healed | H1B: Healed | H1B: Not healed | H2A: Healed | H2A: Not healed | H2B: Healed | H2B: Not healed |
|---|---|---|---|---|---|---|---|---|---|---|
| Observer 3 | Setting 1 (S3A) | Normal | 33 | 7 | 35 | 5 | 34 | 6 | 33 | 7 |
| | | Abnormal | 12 | 4 | 10 | 6 | 13 | 3 | 11 | 5 |
| | Setting 2 (S3B) | Normal | 32 | 7 | 34 | 5 | 33 | 6 | 32 | 7 |
| | | Abnormal | 13 | 4 | 11 | 6 | 14 | 3 | 12 | 5 |
| Observer 4 | Setting 1 (S4A) | Normal | 34 | 6 | 35 | 5 | 34 | 6 | 33 | 7 |
| | | Abnormal | 11 | 5 | 10 | 6 | 13 | 3 | 11 | 5 |
| | Setting 2 (S4B) | Normal | 37 | 4 | 37 | 4 | 37 | 4 | 36 | 5 |
| | | Abnormal | 8 | 7 | 8 | 7 | 10 | 5 | 8 | 7 |
Table 11.
Sensitivity and Specificity of Squat and Smile (S & S) score.
All values are percentages. Columns give the RUST reference readings of Observer 1 at Setting 1 (H1A) and Setting 2 (H1B) and of Observer 2 at Setting 1 (H2A) and Setting 2 (H2B).

| S & S observer | Setting | Measure | H1A | H1B | H2A | H2B |
|---|---|---|---|---|---|---|
| Observer 3 | Setting 1 (S3A) | Sensitivity | 73.33 | 77.78 | 72.34 | 75.00 |
| | | Specificity | 36.36 | 54.55 | 33.33 | 41.67 |
| | Setting 2 (S3B) | Sensitivity | 71.11 | 75.56 | 70.21 | 72.73 |
| | | Specificity | 36.36 | 54.55 | 33.33 | 41.67 |
| Observer 4 | Setting 1 (S4A) | Sensitivity | 75.56 | 77.78 | 72.34 | 75.00 |
| | | Specificity | 45.45 | 54.55 | 33.33 | 41.67 |
| | Setting 2 (S4B) | Sensitivity | 82.22 | 82.22 | 78.72 | 81.82 |
| | | Specificity | 63.64 | 63.64 | 55.56 | 58.33 |
5. Discussion
Rather than the conventional physician-rated outcome assessment, patient reported outcome measures (PROM) considering the functional restoration to pre-injury status should be the ideal assessment tool to determine the quality of any fracture repair. Thus we studied the reliability, sensitivity and specificity of S & S test as a proxy tool to assess fracture healing. The necessary data was extracted from the SIGN online surgical database (SOSD) which is probably one of the largest trauma surgery databases of the world.11
Currently available tools for fracture healing assessment are imaging studies, mechanical assessment, serological markers and clinical examination. Although a Computed Tomography (CT) scan is superior to x-ray, with high sensitivity (almost 100%) and specificity (62%) for nonunion,5 we used x-rays because they are cheaper, readily available, fast and familiar, involve lower radiation exposure and are free of implant artefact. A few other studies have shown that x-ray does not define union (bridging callus in three out of four cortices) with sufficient accuracy (false positive rate of 20% and false negative rate of 11%), and it has therefore been recommended to combine the radiological tool with clinical assessment before interpretation, decision-making and patient counselling.3,12
Radiographic Union Score for Tibia (RUST) was derived from the scale developed by Hammer et al.,13 since the Hammer scale had shown poor correlation with mechanical strength.14 The modified version of RUST (mRUST)8 has recently been claimed to have better inter-observer and intra-observer agreement than RUST.7,15,16 Modified RUST is particularly useful for assessing healing in meta-diaphyseal fractures of long bones,8 in dysplastic or poor-quality bone (Osteogenesis Imperfecta)17 and for predicting nonunion of the tibia/femur after nailing.18 We used the conventional RUST criteria in our study because it is an already validated and popular tool that many surgeons consider the gold standard. Depending on the criteria used, variable inter-observer agreement for healing (0.57–0.89) has been reported; the overall impression of healing (κ = 0.89), the number of cortices bridged by callus (κ = 0.82) and the visibility of the fracture line (κ = 0.83) showed the highest agreement.19 RUST has also been validated in animal studies (rat tibial shaft fracture after external fixation,20 poly-ether-ether-ketone (PEEK) plate fixation of osteotomised rat femur21) and in biomechanical testing (sheep osteotomy model fixed with an intramedullary nail).22 Moreover, mRUST is popular mostly among researchers and is usually documented in current trials. In our study, each fracture was scored using the RUST criteria by two independent observers with nearly similar clinical experience, at two settings one month apart, to calculate inter-observer and intra-observer reliability; the observers were blinded to each other as well as to their own prior observations to avoid recall bias. Although grade 4 gait (normal ability to walk and jog) reliably distinguished healers (RUST ≥ 7) from non-healers (RUST < 7) at 6 months after nailing of the tibia (p = 0.056),23 we used the S & S test to make the study and assessment simpler, quicker and easier, especially for surgeons from LMICs.
It was possible to predict union and non-union reliably (p < 0.001, κ = 0.71 to 0.79) by the presence of bridging callus as early as 4 months after intramedullary nailing of the tibia/femur,24 but we assessed at a minimum of eighteen months, following the prevailing convention, which we considered cost effective and more accurate as well.
Although all the domains of the S & S test (squat depth, support needed and facial expression) had high test-retest reliability (κ = 0.95, 0.92 and 0.96 respectively), only the first two showed strong inter-rater reliability (κ = 0.75 and 0.78 respectively) with high specificity (0.95 and 0.97 respectively), and they also correlated with the reoperation rate at 1 year for femur fracture (p < 0.01) as well as with the EuroQol 5-Dimensions (EQ-5D) index (Swahili version).1 Thus, we did not consider facial expression when grading the quality of squat in our study. Moreover, smile/facial expression is a subjective perception of both the patient (variable pain threshold) and the observer (variable interpretation), and a few patients can squat comfortably despite nonunion (particularly hypertrophic transverse nonunion).
Surprisingly, the only available study6 correlating the Squat and Smile test with fracture healing, using the SIGN nailing follow-up database, showed substantial inter-observer reliability for S & S (κ = 0.73–0.78) and almost perfect inter-observer reliability for radiographs (κ = 0.94), whereas in our study these were fair to substantial (κ = 0.30–0.65) and substantial (κ = 0.63–0.75) respectively. The intra-observer reliability for the S & S test was substantial (0.75) in their study, similar to ours (κ = 0.68–0.78), while for radiographs it was substantial to almost perfect (κ = 0.77–0.82) in our study and was not reported in theirs.6 They found poor sensitivity (11%) and specificity (85%) of the S & S test in predicting non-healing of fracture at one-year follow-up,6 whereas we found overall good sensitivity (up to 82.22%) and specificity (up to 63.64%) in predicting fracture healing at a minimum follow-up of one and a half years.
Since a randomized study25 has shown no difference in the functional outcome of tibia fractures fixed with either a solid (SIGN) nail or a commercial hollow nail, we feel that the findings of this study on the solid (SIGN) nail should also be valid for tibia fractures fixed with a commercial hollow nail. However, irrespective of the bone (femur or tibia) or nail design (solid or hollow), we feel that the test is more useful in identifying healed fractures than non-healed ones (sensitivity being higher than specificity).
The strengths of our study are the steps taken to minimize measurement bias: the use of a standard tool for radiographic scoring (RUST),7 standardization of Squat and Smile grading between the two observers as described in the literature,6 and calculation of inter-observer and intra-observer reliability by two independent, blinded observers with almost equal clinical experience for each of the x-ray and squatting assessments, with re-evaluation after a month. Moreover, unlike the prior study, we evaluated at a minimum follow-up of eighteen months (against twelve months) and predicted the probability of fracture healing.
The weaknesses of our study are that it was a retrospective, single-center study without standardization of x-ray views or of image quality of the squat and smile photographs, and without power analysis. Both the RUST score and the S & S score were dichotomized (healed versus not healed and normal versus abnormal respectively), which is a crude measurement that neglects the possibility of intermediate outcomes. We have not studied the relationship between surgical approach related anterior knee pain and squatting quality. Thus, we suggest more standardized, objective and elaborate assessment tools in future studies.
“Negative selection bias” is a typical challenge in low and middle-income countries (LMICs), where patients may return only if they feel they have a problem, likely because transportation is unavailable or costly, or simply because they do not want to spend money when they have no complaints. If this proportion is large, it will over-represent those with a problem and/or unaddressed complications and suggest that surgery results in bad or inferior outcomes in LMICs.
6. Conclusions
The Squat and Smile (S & S) test is a reliable test and can predict the healing of lower limb fractures fixed with an intramedullary interlocking nail with substantial accuracy. The test is mostly useful in low-resource settings that lack x-ray facilities or face transportation and financial constraints, and in special situations of travel restriction (e.g. the current coronavirus pandemic and lockdowns).
Financial disclosure
None.
Declaration of competing interest
None.
References
- 1. Wu H.H., Liu M., Challa S.T., Morshed S., Eliezer E.N., Haonga B.T. Development of squat-and-smile test as proxy for femoral shaft fracture-healing in patients in Dar es Salaam, Tanzania. J Bone Jt Surg - Am. 2019;101(4):353–359. doi: 10.2106/JBJS.18.00387.
- 2. Young S. Orthopaedic trauma surgery in low-income countries. Acta Orthop. 2014;85(suppl 356) doi: 10.3109/17453674.2014.937924.
- 3. Bhandari M., Guyatt G.H., Swiontkowski M.F., Tornetta P., Sprague S., Schemitsch E.H. A lack of consensus in the assessment of fracture healing among orthopaedic surgeons. J Orthop Trauma. 2002;16(8):562–566. doi: 10.1097/00005131-200209000-00004.
- 4. Corrales L.A., Morshed S., Bhandari M., Miclau T. Variability in the assessment of fracture-healing in orthopaedic trauma studies. J Bone Jt Surg - Ser A. 2008;90(9):1862–1868. doi: 10.2106/JBJS.G.01580.
- 5. Morshed S., Corrales L., Genant H., Miclau T. Outcome assessment in clinical trials of fracture-healing. J Bone Jt Surg - Ser A. 2008;90(SUPPL. 1):62–67. doi: 10.2106/JBJS.G.01556.
- 6. Alves K.M., Lerner A., Silva G.S., Katz J.N. Surgical implant generation Network implant follow-up: assessment of squat and smile and fracture healing. J Orthop Trauma. 2020;34(4):174–179. doi: 10.1097/BOT.0000000000001671.
- 7. Whelan D.B., Bhandari M., Stephen D., Kreder H., McKee M.D., Zdero R. Development of the radiographic union score for tibial fractures for the assessment of tibial fracture healing after intramedullary fixation. J Trauma - Inj Infect Crit Care. 2010;68(3):629–632. doi: 10.1097/TA.0b013e3181a7c16d.
- 8. Litrenta J., Tornetta P., Mehta S., Jones C., O’Toole R.V., Bhandari M. Determination of radiographic healing: an assessment of consistency using RUST and modified RUST in metadiaphyseal fractures. J Orthop Trauma. 2015;29(11):516–520. doi: 10.1097/BOT.0000000000000390.
- 9. Leow J.M., Clement N.D., Simpson A.H.W.R. Application of the Radiographic Union Scale for Tibial fractures (RUST): assessment of healing rate and time of tibial fractures managed with intramedullary nailing. Orthop Traumatol Surg Res. 2020;106(1):89–93. doi: 10.1016/j.otsr.2019.10.010.
- 10. Landis J.R., Koch G.G. The measurement of observer agreement for categorical data. Biometrics. 1977;33(1):159–174. doi: 10.2307/2529310.
- 11.Shearer D., Cunningham B., Zirkle L.G. Population characteristics and clinical outcomes from the SIGN online surgical database. Tech Orthop. 2009;24(4):273–276. doi: 10.1097/BTO.0b013e3181c3e761.
- 12.Davis B.J., Roberts P.J., Moorcroft C.I., Brown M.F., Thomas P.B.M., Wade R.H. Reliability of radiographs in defining union of internally fixed fractures. Injury. 2004;35(6):557–561. doi: 10.1016/S0020-1383(03)00262-6.
- 13.Hammer R.R.R., Hammerby S., Lindholm B. Accuracy of radiologic assessment of tibial shaft fracture union in humans. Clin Orthop Relat Res. 1985;199:233–238. doi: 10.1097/00003086-198510000-00033.
- 14.Atwan Y., Schemitsch E.H. Radiographic evaluations: which are most effective to follow fracture healing? Injury. 2020. doi: 10.1016/j.injury.2019.12.028.
- 15.Kooistra B.W., Dijkman B.G., Busse J.W., Sprague S., Schemitsch E.H., Bhandari M. The radiographic union scale in tibial fractures: reliability and validity. J Orthop Trauma. 2010;24(suppl 1). doi: 10.1097/BOT.0b013e3181ca3fd1.
- 16.Çekiç E., Alici E., Yeşil M. Reliability of the radiographic union score for tibial fractures. Acta Orthop Traumatol Turcica. 2014;48(5):533–540. doi: 10.3944/AOTT.2014.14.0026.
- 17.Franzone J.M., Finkelstein M.S., Rogers K.J., Kruse R.W. Evaluation of fracture and osteotomy union in the setting of Osteogenesis Imperfecta: reliability of the modified radiographic union score for tibial fractures (RUST). J Pediatr Orthop. 2020;40(1):48–52. doi: 10.1097/BPO.0000000000001068.
- 18.Perlepe V., Cerato A., Putineanu D., Bugli C., Heynen G., Omoumi P. Value of a radiographic score for the assessment of healing of nailed femoral and tibial shaft fractures: a retrospective preliminary study. Eur J Radiol. 2018;98(October 2017):36–40. doi: 10.1016/j.ejrad.2017.10.020.
- 19.Whelan D.B., Bhandari M., McKee M.D., Guyatt G.H., Kreder H.J., Stephen D. Interobserver and intraobserver variation in the assessment of the healing of tibial fractures after intramedullary fixation. J Bone Joint Surg Br. 2002;84-B(1):15–18. doi: 10.1302/0301-620x.84b1.0840015.
- 20.Tawonsawatruk T., Hamilton D.F., Simpson A.H.R.W. Validation of the use of radiographic fracture-healing scores in a small animal model. J Orthop Res. 2014;32(9):1117–1119. doi: 10.1002/jor.22665.
- 21.Fiset S., Godbout C., Crookshank M.C., Zdero R., Nauth A., Schemitsch E.H. Experimental validation of the radiographic union score for tibia fractures (RUST) using micro-computed tomography scanning and biomechanical testing in an in-vivo rat model. J Bone Jt Surg. 2018:1871–1878. doi: 10.2106/JBJS.18.00035.
- 22.Litrenta J., Tornetta P., Ricci W., Sanders R.W., O’Toole R.V., Nascone J.W. In vivo correlation of radiographic scoring (radiographic union scale for tibia fractures) and biomechanical data in a sheep osteotomy model: can we define union radiographically? J Orthop Trauma. 2017;31(3):127–130. doi: 10.1097/BOT.0000000000000753.
- 23.Macri F., Marques L.F., Backer R.C., Santos M.J., Belangero W.D. Validation of a standardised gait score to predict the healing of tibial fractures. J Bone Jt Surg - Ser B. 2012;94-B(4):544–548. doi: 10.1302/0301-620X.94B4.27927.
- 24.DiSilvio F., Foyil S., Schiffman B., Bernstein M., Summers H., Lack W.D. Long bone union accurately predicted by cortical bridging within 4 months. JBJS Open Access. 2018;3(4):e0012, 1–5. doi: 10.2106/jbjs.oa.18.00012.
- 25.Maharjan R., Shrestha B.P., Chaudhary P., Rijal R., Shah Kalawar R.P. Functional outcome of patients of tibial fracture treated with solid nail (SIGN nail) versus conventional hollow nail – a randomized trial. J Clin Orthop Trauma. 2021;12(1):148–160. doi: 10.1016/j.jcot.2020.07.006.


