ABSTRACT
Purpose
This study aimed to translate the T‐14 questionnaire into Swedish and validate it in a Swedish pediatric population with tonsil‐related disease.
Methods
A prospective validation study was conducted across five Swedish otorhinolaryngology departments from October 2022 to June 2024. Translation followed a standardized forward–backward procedure. Caregivers of patients (< 16 years) scheduled for tonsil surgery completed the Swedish T‐14 and EQ‐5D‐Y VAS (EuroQol five‐Dimensional questionnaire for Youth, Visual Analog Scale [VAS]) at baseline, 2 weeks later, and 6 months postoperatively. A healthy control group was included for comparison. Psychometric evaluation included known‐groups validity (Mann–Whitney U), construct validity (confirmatory factor analysis), convergent validity (Spearman's correlation), internal consistency (Cronbach's α), test–retest reliability (intraclass correlation coefficient), and responsiveness (Wilcoxon signed‐rank, Cohen's d).
Results
A total of 249 participants were included (case group 169, control group 80). Known‐groups validity was confirmed, with significantly higher baseline T‐14 scores in the case group (p < 0.001). Confirmatory factor analysis supported the predefined two‐factor model (comparative fit index = 0.94, Tucker–Lewis index = 0.92, standardized root mean square residual = 0.061), although root mean square error of approximation (0.099) indicated some misfit. Convergent validity was demonstrated by a weak but significant negative correlation with EQ‐5D‐Y VAS (rs = −0.32; p < 0.001). Internal consistency was good (α = 0.83), and test–retest reliability showed good stability (ICC = 0.77). Responsiveness was strong, with a large reduction in T‐14 scores 6 months after surgery (p < 0.001; Cohen's d = 1.89).
Conclusion
T‐14 demonstrated good validity, reliability, and responsiveness, effectively distinguishing patients from healthy controls. It is suitable for clinical use and enables comparisons with international studies.
Level of Evidence
Level III—non‐randomized controlled cohort.
Keywords: child, patient‐reported outcome measure, T‐14, tonsil surgery, validation study
1. Introduction
Tonsil surgery ranks among the most frequently performed pediatric procedures under general anesthesia in Sweden. Despite its widespread use, the indications for tonsil and adenoid surgery remain a subject of debate. To standardize and evaluate indications and outcomes across providers and over time, it is crucial to measure the burden of disease before and after surgery. In Sweden, validated instruments for assessing the impact of tonsil and adenoid disease on quality of life (QoL), and how it changes after surgical intervention, are currently lacking.
The Swedish Quality Register for Tonsil Surgery (SQTS) serves as a nationwide database that systematically tracks clinical practices, postoperative complications, and patient‐reported outcome measures (PROMs) after benign tonsil surgery [1]. Analysis of SQTS longitudinal data from 2009 to 2018 demonstrated a 15%–20% reduction in the proportion of patients achieving complete symptom resolution at 6 months after surgery [2]. This decline was observed across all pediatric indications and types of surgery and could not be explained by changes in patient demographics, surgical indications, or techniques. The SQTS currently lacks validated PROMs.
Fundamentally, tonsil and adenoid surgery are performed on otherwise healthy children, and each procedure carries a small but real risk of complications, subsequent morbidity, and, in rare cases, mortality. Therefore, documenting and monitoring symptom burden both before and after surgery is essential to evaluate the benefits of the intervention. This is important not only from a health economic perspective but also for individual person‐centered care. The use of PROMs in clinical practice and research is rapidly increasing, underscoring the need for culturally and linguistically adapted instruments [3]. This is particularly relevant in the context of multinational and multicultural studies, which require validated translations of QoL tools to enable meaningful comparisons across populations.
The Paediatric Throat Disorders Outcome Test (T‐14), developed and validated in the United Kingdom by Hopkins et al., is a 14‐item caregiver‐reported questionnaire that assesses both infectious and obstructive tonsil symptoms and their impact on QoL in children under the age of 16 [4]. The T‐14 has also been translated and validated in Spanish [5]. The instrument has been utilized in multiple studies conducted in England and Australia to assess throat disorders and postoperative outcomes following tonsil surgery [6, 7, 8, 9]. Currently, no equivalent validated instrument is available in Swedish. A validated version of the T‐14 instrument would fulfill a critical need in both Swedish tonsil research and routine clinical practice.
This study aims to translate the T‐14 questionnaire into Swedish and validate the instrument in a Swedish pediatric population with tonsil‐related disease.
2. Material and Methods
2.1. Study Group and Data Collection
A prospective validation study was performed in five Swedish otorhinolaryngological departments: the hospitals of Region Jönköping County (Ryhov, Eksjö and Värnamo), Karolinska University Hospital, Helsingborg Hospital, Södra Älvsborg Hospital in Borås, and Capio Örebro. These sites represent a broad geographic distribution across Sweden and include both university and smaller regional hospitals, as well as institutions operating under public and private governance. Data collection took place between October 2022 and June 2024. Caregivers for children under 16 years scheduled for tonsil surgery, with or without concurrent adenoidectomy, were eligible for inclusion in the case group. The participants in the case group were recruited either during the visit when the indication for tonsil surgery was registered, from the surgical waiting list, or on the day of surgery. Exclusion criteria included malignant disease, syndromic conditions, and craniofacial malformations. A convenience sampling method was used to recruit caregivers of healthy children for a control group. For inclusion, caregivers confirmed that their children had no known history of tonsil‐related problems or symptoms. All caregivers were required to understand spoken and written Swedish.
In total, 249 caregivers were enrolled in the study, including 169 cases and 80 controls. At the two‐week retest, 62/111 (55.9%) caregivers in the case group and 77/80 (96.3%) in the control group participated (Figure 1). In the case group, 58 participants were recruited on the day of surgery, which prevented the collection of test–retest data for them and left 111 cases eligible for the retest. At 6 months post‐surgery, 114/169 (67.5%) in the case group provided follow‐up data. At inclusion, demographic information including age, sex, surgical type (i.e., tonsillectomy or tonsillotomy with or without adenoidectomy), and surgical indication was collected by the attending otolaryngologist. The indication was divided into two subgroups: surgery due to Obstruction or Infection. The Obstruction group included patients with indications such as habitual snoring/obstructive sleep apnea (OSA), or hypertrophic tonsils leading to dysphagia or oral motor dysfunction. The Infection group comprised children with recurrent or chronic tonsillitis, peritonsillar abscess, and tonsillitis associated with systemic complications. The indication for surgery was determined by the surgeon based on the patient's clinical history and physical examination findings. In line with standard Swedish clinical practice, polysomnography was not performed to confirm OSA in children older than 18 months when the clinical history and physical findings strongly suggested airway obstruction [10].
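The participation figures above can be reproduced from the reported counts. The following is a minimal sketch; the variable and function names are illustrative and not part of the study.

```python
# Counts reported in the text; names are illustrative assumptions.
cases_enrolled = 169
cases_recruited_day_of_surgery = 58  # no retest possible for these caregivers
controls_enrolled = 80

# Cases who could be approached for the two-week retest
cases_eligible_retest = cases_enrolled - cases_recruited_day_of_surgery


def response_rate(responders: int, eligible: int) -> float:
    """Return the response rate as a percentage."""
    return 100.0 * responders / eligible


retest_cases = response_rate(62, cases_eligible_retest)
retest_controls = response_rate(77, controls_enrolled)
followup_cases = response_rate(114, cases_enrolled)

print(f"{retest_cases:.2f} {retest_controls:.2f} {followup_cases:.2f}")
```

The retest denominator for the case group is 111 rather than 169 because caregivers recruited on the day of surgery could not complete a preoperative retest.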
FIGURE 1.

Flowchart illustrating participant recruitment and data collection at baseline, two‐week retest, and six‐month follow‐up.
Alongside the T‐14 questionnaire, the participants completed the full EuroQol five‐dimensional questionnaire for youth (EQ‐5D‐Y). For the present study, only the VAS component of the EQ‐5D‐Y was included in the analysis [11]. The VAS is structured as a vertical scale ranging from 0 “worst health imaginable” to 100 “best health imaginable.” Caregivers were instructed to indicate their child's current health status by marking a point directly on the VAS scale [12].
After 2 weeks, caregivers in the control and case groups completed the T‐14 and EQ‐5D‐Y VAS questionnaires again for test–retest analysis. Caregivers for patients who were recruited on the day of surgery did not complete follow‐up questionnaires after 2 weeks. Six months after surgery, caregivers in the case group were asked to complete the T‐14 and EQ‐5D‐Y VAS questionnaires, as well as respond to an ad hoc item from SQTS. This register question evaluates the outcome of surgery by asking caregivers to select one of four options regarding their child's symptoms: “My problems are gone,” “My problems are almost gone,” “My problems remain,” or “My problems have worsened.”
All questionnaires were provided in paper format and returned either directly to the clinic or by prepaid mail.
2.2. Paediatric Throat Disorders Outcome Test
The instrument consists of 14 items addressing a range of tonsil‐related symptoms and associated problems. Its underlying two‐factor structure, as proposed in the original validation study, categorizes the items into two distinct domains: Obstruction and Infection (see Table 1) [4]. Responses to each item are rated on a six‐point Likert scale from zero to five, where 0 represents “no problem” and 5 represents “as bad as it could be.” The total score, calculated by summing the individual item scores, ranges from zero to 70, where higher values indicate a greater severity of the condition. The questionnaire is completed by caregivers, who report their child's symptom burden over the preceding 6 months.
TABLE 1.
Original 14‐item Paediatric Throat Disorders Outcome Test (T‐14) with items categorized by domains as suggested by Hopkins et al. [4].
| Item number | Obstructive domain | Item number | Infection domain |
|---|---|---|---|
| 1 | Snoring loudly during sleep | 3 | Many visits to the family doctor or A&E department |
| 2 | Irregular or stopped breathing (apnea) during sleep | 4 | Many phone calls to the doctor or NHS Direct |
| 10 | Breathing through the mouth during the day | 5 | Taking antibiotics over and over for less than 2 weeks at a time |
| 11 | Noisy breathing during the day | 6 | Taking antibiotics for more than 2 weeks straight |
| 12 | Problems with poor appetite, or poor eating habits (choking on food, etc.) | 7 | Frequent earache or ear infections |
| 14 | Daytime sleepiness | 8 | Repeated short‐term throat infections that last less than 2 weeks |
| | | 9 | Constant or chronic throat infections that last more than 2 weeks |
| | | 13 | Missing school days due to sore throat |
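The scoring rule described above (14 items rated 0 to 5, summed to a total of 0 to 70) can be sketched as follows. The domain subtotals follow the item grouping in Table 1 and are shown for illustration only; the function name, the input format, and the choice to reject incomplete questionnaires are assumptions, not part of the published instrument.

```python
# Item-to-domain mapping taken from Table 1.
OBSTRUCTION_ITEMS = (1, 2, 10, 11, 12, 14)
INFECTION_ITEMS = (3, 4, 5, 6, 7, 8, 9, 13)


def score_t14(responses: dict[int, int]) -> dict[str, int]:
    """Score one questionnaire.

    responses maps item number (1-14) to a rating from 0 to 5.
    Incomplete questionnaires are rejected here (an assumption; the
    source does not describe how missing items were scored).
    """
    if set(responses) != set(range(1, 15)):
        raise ValueError("expected ratings for items 1 to 14")
    for item, rating in responses.items():
        if rating not in range(6):
            raise ValueError(f"item {item}: rating must be an integer 0-5")

    obstruction = sum(responses[i] for i in OBSTRUCTION_ITEMS)
    infection = sum(responses[i] for i in INFECTION_ITEMS)
    return {
        "obstruction": obstruction,
        "infection": infection,
        "total": obstruction + infection,
    }


# Hypothetical caregiver rating every item as 5 ("as bad as it could be")
worst_case = score_t14({item: 5 for item in range(1, 15)})
print(worst_case["total"])  # 70
```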
2.3. Translation and Adaptation
The original English version of the T‐14 was translated into Swedish by two independent translators. The authors synthesized these translations into a single version, which was then back‐translated into English by two independent translators. Both back translations were compared with the original instrument to ensure semantic and conceptual equivalence. Some items underwent linguistic adaptation. References to UK‐specific healthcare services, such as “NHS Direct” and “A&E department,” were generalized to “healthcare” and “doctor visits” in Swedish, making the items more contextually appropriate. Additionally, the Swedish version includes minor expansions, adding “preschool” to school absence and “tiredness” alongside daytime sleepiness. The Swedish version of the T‐14 is presented in Appendix 1.
2.4. Statistical Analysis
Known‐groups validity was assessed by comparing the T‐14 total scores between the case and control groups using a Mann–Whitney U test.
Confirmatory factor analysis (CFA) was used to evaluate the construct validity of the 14‐item questionnaire [13]. The CFA was performed based on the predefined two‐factor model, Infection and Obstruction, as specified by Hopkins et al. [4]. Model fit was evaluated using standard indices, including the comparative fit index (CFI), Tucker–Lewis index (TLI), root mean square error of approximation (RMSEA), and standardized root mean square residual (SRMR). For the CFI and TLI, values ≥ 0.90 were considered acceptable fit, and for both the RMSEA and SRMR, values ≤ 0.08 were considered acceptable [14].
Convergent validity was evaluated using Spearman's rank‐order correlation to examine the association between baseline T‐14 scores and EQ‐5D‐Y VAS scores. Additionally, Spearman's rank correlation was applied to assess the relationship between changes in the T‐14 scores and self‐reported improvement on the SQTS item 6 months after the intervention within the case group.
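For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their mean rank. The sketch below uses only the Python standard library and hypothetical scores; it is not the SPSS procedure used in the study and does not compute a p value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the tie
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: higher T-14 totals paired with lower VAS ratings
t14_totals = [35, 12, 28, 40, 5]
vas_ratings = [60, 90, 75, 50, 95]
rho = spearman_rho(t14_totals, vas_ratings)
```

A negative coefficient, as in this made-up example, is the direction expected when a higher symptom burden goes with poorer perceived health.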
The internal consistency was evaluated using Cronbach's alpha. Cronbach's alpha values ≥ 0.70 were considered acceptable, with values between 0.80 and 0.90 regarded as desirable, indicating that the items measured the same underlying construct without excessive redundancy [15]. Test–retest reliability was examined using the intraclass correlation coefficient (ICC), based on a two‐way random effects model to assess absolute agreement between repeated measurements [16]. The ICC was calculated for each individual item at baseline and at the two‐week follow‐up. To provide an overall reliability estimate, ICC values were aggregated and presented as the mean across all items, accompanied by the standard deviation (SD) and a 95% confidence interval (CI). ICC values equal to or greater than 0.75 were interpreted as good reliability over time [16]. Measurement error was evaluated using the standard error of measurement (SEM), calculated from the SD of baseline T‐14 scores and ICC in the case group [17].
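The two reliability statistics named above can be sketched in plain Python. The ICC shown is the single‐measure form of the two‐way random effects, absolute agreement model; whether the study used the single or average‐measure form is not stated, so that choice is an assumption, as are the function names and the toy data.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_absolute_agreement(rows):
    """Single-measure ICC, two-way random effects, absolute agreement.

    rows holds one list of repeated measurements per subject,
    e.g. [baseline, retest].
    """
    n, k = len(rows), len(rows[0])
    grand = mean(v for r in rows for v in r)
    row_means = [mean(r) for r in rows]
    col_means = [mean(r[j] for r in rows) for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, occasions, residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Toy data: every retest score is exactly one point above baseline
shifted = [[1, 2], [2, 3], [3, 4]]
icc_shifted = icc_absolute_agreement(shifted)
```

In the toy data the ordering of subjects is perfectly preserved, yet the coefficient falls below 1 because absolute agreement also penalizes the systematic one‐point shift between occasions.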
To assess the responsiveness of the outcome measure following surgical intervention, changes in scores between baseline and 6 months postoperatively were analyzed. The Wilcoxon signed‐rank test was applied to evaluate whether the observed changes over time were statistically significant. In addition, Cohen's d was calculated to quantify the magnitude of change between the two time points. The effect size was computed as the mean difference divided by the pooled SD. Interpretation of Cohen's d followed conventional thresholds, where values of 0.2, 0.5, and 0.8 represent small, moderate, and large responsiveness, respectively [18].
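The effect size definition above (mean difference divided by the pooled SD) can be illustrated as follows. The sketch assumes equal weighting of the two time‐point SDs and uses hypothetical scores; the exact pooling used in the study is not described, so results from this formula need not match the reported value.

```python
from statistics import mean, stdev


def cohens_d_pooled(before, after):
    """Mean change divided by the pooled SD of the two time points.

    Pooling here is the root mean square of the two SDs, which assumes
    the same number of observations at both time points.
    """
    pooled_sd = ((stdev(before) ** 2 + stdev(after) ** 2) / 2) ** 0.5
    return (mean(before) - mean(after)) / pooled_sd


# Hypothetical T-14 totals for three children before and after surgery
baseline_scores = [10, 20, 30]
followup_scores = [0, 10, 20]
d = cohens_d_pooled(baseline_scores, followup_scores)
print(d)  # 1.0
```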
The psychometric validation strategy was pre‐specified and aligned with international recommendations for PROMs, primarily guided by the core domains outlined in the COSMIN (COnsensus‐based Standards for the Selection of Health Measurement INstruments) framework [17].
All statistical analyses were performed using IBM SPSS Statistics, Version 29.0.2.0 (IBM Corp., Armonk, NY, USA), except for the CFA, which was conducted in jamovi (Version 2.2.5; The jamovi project, Sydney, Australia).
A significance level of p ≤ 0.05 was adopted for all statistical tests.
2.5. Missing Data
Internal non‐response, defined as missing answers to individual survey questions, was documented for each item. Of the 7028 expected responses, 38 were missing, representing approximately 0.5% of the dataset. A tabulated patterns analysis indicated that the distribution of missing data was random, with no evidence of systematic patterns [19].
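One way to reconcile the figure of 7028 expected responses with the participation counts reported in Section 2.1 is sketched below. It assumes that the total counts the 14 T‐14 items on every returned questionnaire at baseline, retest, and follow‐up; the text does not state this breakdown explicitly.

```python
ITEMS_PER_QUESTIONNAIRE = 14

# Returned questionnaires per time point (counts from Section 2.1)
baseline = 169 + 80   # cases + controls
retest = 62 + 77      # cases + controls
followup = 114        # cases only

questionnaires = baseline + retest + followup
expected_responses = questionnaires * ITEMS_PER_QUESTIONNAIRE

missing = 38
missing_pct = 100 * missing / expected_responses

print(questionnaires, expected_responses, round(missing_pct, 2))
# 502 7028 0.54
```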
3. Ethical Considerations
Ethical approval was granted by the Swedish Ethical Review Authority (Dnr 2022‐03966‐01). Both caregivers provided written informed consent, and the study adhered to the principles outlined in the Declaration of Helsinki.
4. Results
Baseline characteristics (age, sex, surgical indication, and type of procedure), T‐14 and EQ‐5D‐Y VAS scores at baseline, the two‐week retest, and the six‐month follow‐up, and responses to the SQTS question are presented in Table 2. There was no statistically significant difference in sex distribution between the groups; however, the mean age was significantly higher in the control group compared to the case group. At the six‐month follow‐up after surgery, 68.5% of parents reported that their child's throat‐related symptoms were completely gone, 28.8% indicated the symptoms were almost gone, and 2.7% stated they persisted, as indicated by the SQTS item. No parents reported worsening of symptoms.
TABLE 2.
Baseline characteristics of the studied population with T‐14 and EQ‐5D‐Y scores at baseline, retest and 6‐month follow‐up, and the SQTS question.
| | Case group (n = 169) | Control group (n = 80) | p |
|---|---|---|---|
| Age, mean (SD) | 5.2 (3.1) | 7.8 (3.5) | < 0.001 |
| Sex, boys, n (%) | 89 (52.7%) | 41 (51.2%) | 0.892 |
| Indication for surgery, n (%) | | | |
| Obstruction | 150 (88.8%) | NA | |
| Infection | 19 (11.2%) | NA | |
| Type of tonsil surgery, n (%) | | | |
| Tonsillectomy | 17 (10.1%) | NA | |
| Tonsillectomy + adenoidectomy | 42 (24.9%) | NA | |
| Tonsillotomy | 5 (3.0%) | NA | |
| Tonsillotomy + adenoidectomy | 105 (62.1%) | NA | |
| T‐14, mean (SD) | | | |
| Baseline | 29.0 (11.3) a | 2.0 (2.4) b | < 0.001 |
| Retest (2 weeks) | 29.5 (11.5) c | 1.3 (1.9) d | < 0.001 |
| Follow‐up (6 months) | 5.7 (6.4) e | NA | |
| EQ‐5D‐Y VAS, mean (SD) | | | |
| Baseline | 78.8 (16.1) a | 92.2 (9.6) b | < 0.001 |
| Retest (2 weeks) | 77.5 (14.8) c | 90.6 (9.8) d | < 0.001 |
| Follow‐up (6 months) | 91.7 (10.6) e | NA | |
| SQTS question, follow‐up (6 months), n (%) | | | |
| My problems are gone | 76 (68.5%) | NA | |
| My problems are almost gone | 32 (28.8%) | NA | |
| My problems remain | 3 (2.7%) | NA | |
| My problems have worsened | 0 (0%) | NA | |
Abbreviation: NA = not applicable.
a n = 169.
b n = 80.
c n = 62.
d n = 77.
e n = 114.
Baseline mean T‐14 scores for the case and control groups are presented in Table 2. When the case group was subdivided based on the primary indication for the planned surgery into an Obstruction group and an Infection group, the reported mean total scores were 28.6 (SD 11.4) and 32.2 (SD 11.1), respectively (p = 0.15). Figure 2 illustrates which questions each group scored highest on, highlighting the symptoms most prominent in each group. Compared with the controls, the case group demonstrated significantly higher scores across all items (p < 0.001). Within the case group, infection‐related items—including physician visits, telephone consultations, antibiotic use, ear and throat infections, and school absence—were more pronounced in the Infection subgroup (p < 0.001 to p = 0.05). In contrast, obstructive airway symptoms—such as snoring, mouth breathing, apnea, noisy breathing, daytime sleepiness, and problems with appetite or eating—were predominantly elevated in the Obstruction subgroup (p < 0.001 to p = 0.35).
FIGURE 2.

T‐14 mean item scores at baseline for the case group (n = 169), including the two subgroups: Obstruction (n = 150) and Infection (n = 19), and the control group (n = 80).
4.1. Known‐Groups Validity
At both baseline and retest, the participants in the case group reported significantly higher total T‐14 scores than the control group (p < 0.001; Table 2).
4.2. CFA
CFA factor loadings are presented in Table 3. All indicators exhibited p values below 0.001, confirming that each loading was statistically significant. Most factor loadings, both unstandardized and standardized, exceeded the recommended threshold of 0.70. The covariance between Factor 1 and Factor 2 was 1.33, with a standardized estimate of 0.64, which was statistically significant (p < 0.001).
TABLE 3.
Confirmatory factor analysis, standardized factor loadings for T‐14 items across the two predefined factors: Obstruction and Infection.
| Factor | Indicator | Estimate | SE | 95% CI lower | 95% CI upper | p | Stand. estimate |
|---|---|---|---|---|---|---|---|
| Factor 1 obstruction | Item 1 | 1.00 a | | | | | 0.89 |
| | Item 2 | 0.97 | 0.04 | 0.89 | 1.05 | < 0.001 | 0.87 |
| | Item 10 | 1.05 | 0.06 | 0.94 | 1.17 | < 0.001 | 0.88 |
| | Item 11 | 0.95 | 0.05 | 0.84 | 1.06 | < 0.001 | 0.85 |
| | Item 12 | 0.72 | 0.06 | 0.60 | 0.85 | < 0.001 | 0.66 |
| | Item 14 | 0.81 | 0.05 | 0.70 | 0.92 | < 0.001 | 0.76 |
| Factor 2 infection | Item 3 | 1.00 a | | | | | 0.84 |
| | Item 4 | 0.95 | 0.04 | 0.87 | 1.04 | < 0.001 | 0.80 |
| | Item 5 | 0.63 | 0.06 | 0.51 | 0.74 | < 0.001 | 0.65 |
| | Item 6 | 0.26 | 0.03 | 0.19 | 0.32 | < 0.001 | 0.47 |
| | Item 7 | 0.60 | 0.07 | 0.47 | 0.74 | < 0.001 | 0.56 |
| | Item 8 | 1.02 | 0.07 | 0.87 | 1.16 | < 0.001 | 0.82 |
| | Item 9 | 0.81 | 0.06 | 0.68 | 0.93 | < 0.001 | 0.75 |
| | Item 13 | 1.05 | 0.07 | 0.92 | 1.19 | < 0.001 | 0.86 |
a Fixed parameter.
The CFA indicated an overall acceptable model fit, with a CFI of 0.940 and a TLI of 0.921, both exceeding the commonly recommended threshold of 0.90. The SRMR was 0.061, below the 0.08 criterion. However, the RMSEA was 0.099 (90% CI: 0.085–0.112), which is above the preferred cutoff of 0.08 and approaches the poor‐fit range (> 0.10).
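The comparison of the reported fit indices with the cutoffs stated in Section 2.4 can be written out as a small check. This is a minimal sketch; the dictionary layout and function name are illustrative, and the cutoffs are those given in the Methods, not universal rules.

```python
# Cutoffs from Section 2.4: CFI/TLI >= 0.90, RMSEA/SRMR <= 0.08
FIT_CRITERIA = {
    "CFI": ("min", 0.90),
    "TLI": ("min", 0.90),
    "RMSEA": ("max", 0.08),
    "SRMR": ("max", 0.08),
}

# Values reported for the Swedish T-14 two-factor model
reported_fit = {"CFI": 0.940, "TLI": 0.921, "RMSEA": 0.099, "SRMR": 0.061}


def assess_fit(indices):
    """Return, per index, whether the value meets its acceptability cutoff."""
    result = {}
    for name, value in indices.items():
        direction, cutoff = FIT_CRITERIA[name]
        result[name] = value >= cutoff if direction == "min" else value <= cutoff
    return result


fit_ok = assess_fit(reported_fit)
print(fit_ok)
# {'CFI': True, 'TLI': True, 'RMSEA': False, 'SRMR': True}
```

Three of the four indices meet their cutoffs, which is the basis for describing the fit as acceptable overall while flagging the RMSEA.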
In the case group, baseline T‐14 scores showed a significant negative correlation with EQ‐5D‐Y VAS general health (Spearman's rs = −0.32, p < 0.001). Additionally, changes in T‐14 scores were positively associated with the self‐reported SQTS item (rs = 0.23, p < 0.05; Table 2), where a lower SQTS score indicates a better perceived postoperative outcome in terms of symptom relief. Both correlations were weak but statistically significant, supporting the convergent validity of the T‐14.
4.3. Reliability
Cronbach's alpha of the T‐14 items in the case group indicated good reliability (0.83). Subgroup analyses showed Cronbach's alpha values of 0.86 for the Obstruction group and 0.77 for the Infection group, both reflecting acceptable to good internal consistency. The ICC analysis demonstrated good stability of the T‐14 total score over a two‐week interval. For all participants, the mean ICC was 0.86 (SD = 0.07; 95% CI: 0.82–0.90). In the case group, reliability was slightly lower, with an ICC of 0.77 (SD = 0.09; 95% CI: 0.72–0.82). The baseline SD in the case group was 11.3 points, resulting in an SEM of 5.42 points.
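The SEM reported above is consistent with the conventional formula SEM = SD × √(1 − ICC). The Methods state only that the SEM was calculated from the baseline SD and the ICC, so the formula is an assumption here; it does reproduce the reported value.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """Conventional SEM formula: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - icc)


# Case-group values reported in the text
sem = standard_error_of_measurement(sd=11.3, icc=0.77)
share_of_scale = 100 * sem / 70  # T-14 total score ranges from 0 to 70

print(round(sem, 2), round(share_of_scale, 1))
# 5.42 7.7
```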
4.4. Responsiveness
The case group demonstrated a statistically significant improvement 6 months after surgery (p < 0.001), with mean T‐14 scores falling from 29.0 (SD = 11.3) at baseline to 5.7 (SD = 6.4). This indicates that the T‐14 is responsive to change over time. The reduction in T‐14 scores was consistent in both the Obstruction and Infection subgroups (p < 0.001). Cohen's d for the change in T‐14 scores from baseline to follow‐up was 1.89 (95% CI: 1.58–2.19), indicating a large effect size and providing further evidence of the instrument's responsiveness.
5. Discussion
This study explored the validity and psychometric performance of the Swedish version of the T‐14 questionnaire. The results show that the T‐14 is a suitable instrument for assessing tonsil problems in children and demonstrated good internal consistency, good test–retest reliability, and responsiveness to clinical change. Caregivers of patients scheduled for tonsil surgery reported a significantly greater tonsil‐related impact on QoL at baseline compared to caregivers of healthy controls. This finding confirms the ability of the T‐14 to differentiate individuals with tonsil‐related health problems from healthy controls. Although this study was not designed to evaluate surgical outcomes, the findings suggest that tonsil surgery was associated with a substantial reduction in the reported symptom burden related to tonsil disease. At 6 months postoperatively, EQ‐5D‐Y VAS scores had nearly normalized, and T‐14 scores approached levels observed in healthy controls. Furthermore, the majority of caregivers reported a complete or almost complete resolution of problems on the SQTS item, providing additional evidence of perceived benefit. The T‐14 has previously been used in several studies in England and Australia to evaluate throat disorders and outcomes following tonsil surgery [6, 7, 8, 9]. The present study now offers a Swedish version, enabling international comparisons.
CFA supported a two‐factor structure corresponding to the Obstruction and Infection domains [4], with most standardized loadings exceeding 0.70. Item 6, "Taking antibiotics for more than 2 weeks straight," yielded the lowest factor loading (0.47) within the Infection domain. This probably reflects Swedish clinical practice, where prolonged antibiotic courses (> 10 days) are uncommon. Model fit indices (CFI = 0.940, TLI = 0.921, SRMR = 0.061) were within acceptable ranges, indicating good overall fit. However, the RMSEA value (0.099) exceeded the recommended threshold, suggesting some degree of model misfit. This result should be interpreted with caution because RMSEA often overestimates lack of fit in models with few degrees of freedom, a situation typical for PROMs that include a limited number of factors and items [20]. Consequently, in the validation of PROMs, overall evidence from CFI, TLI, SRMR, and factor loadings should be considered collectively rather than rejecting models based solely on RMSEA.
The negative correlation between T‐14 scores and EQ‐5D‐Y VAS suggests that a higher symptom burden is associated with poorer perceived general health, supporting convergent validity. Although the correlation was weak, it was statistically significant, despite EQ‐5D‐Y VAS being influenced by multiple factors beyond tonsil disease. Similarly, the positive association between changes in T‐14 scores and the SQTS item indicates that improvements in disease‐specific symptoms align with patient‐reported successful outcomes after surgery.
Internal consistency was strong (Cronbach's α = 0.83), confirming that the items collectively measure a coherent construct. Test–retest reliability was acceptable, with an ICC of 0.77 (95% CI: 0.72–0.82) for the case group, indicating good stability over a two‐week interval. When both case and control groups were combined, the ICC values increased further. This pattern may be explained by the fact that individuals reporting symptoms tend to exhibit greater variability in their responses over time, as symptom perception can fluctuate even when referring to the same six‐month period. In contrast, control participants typically report scores near zero with minimal change, resulting in higher overall stability. As clinical interpretation primarily concerns symptomatic individuals, the ICC for the case group is presented as the main indicator of reliability. Furthermore, ICC estimates are sensitive to small sample sizes, which can lead to a greater influence of extreme values, providing an additional possible explanation for the lower ICC observed in the case group [16]. Since SEM is dependent on ICC, this value may also have been affected by the limited sample size in the case group. A SEM of 5.42 points indicates that individual scores may fluctuate by approximately ±5 points due to measurement error. Given the T‐14 total score range of 0–70, this represents 7.7% of the scale. Small changes over time should therefore be interpreted with care, as minor variations may not represent true clinical improvement [17].
Two previous validation studies have been published [4, 5]. The original article introducing the T‐14 was published by Hopkins et al. (UK) [4]. This study examined a case group of 126 children referred with throat disorders (mean age 6.5 years) and 40 unaffected controls (mean age 6.1 years). Of the case group, 113 were scheduled for surgery, while 13 were managed with ‘watchful waiting’. Diagnostic categories included 25 patients with obstructive sleep apnea (OSA), 47 with recurrent acute tonsillitis, and 41 with both OSA and recurrent tonsillitis. Larrosa et al. (Spain) translated and cross‐culturally adapted the original T‐14 into Spanish and validated the instrument [5]. The study included 50 pediatric patients scheduled for tonsil surgery (mean age 4.6 years) and 50 asymptomatic controls (mean age 6.0 years). The primary surgical indication was obstructive symptoms in 30 patients, while 20 underwent surgery mainly for tonsillar infections. In comparison, our study included a surgical cohort of 169 patients with a mean age of 5.2 years and a control group of 80 children (mean age 7.8 years). The indication for surgery was predominantly obstruction (150 patients, 88.8%), whereas infection was the main indication in 19 patients (11.2%), a smaller proportion than in the cohorts of Hopkins et al. and Larrosa et al.
The results of the present study are broadly consistent with previous published studies. Hopkins et al. (UK) reported a similar Cronbach's alpha (0.84) and an exceptionally high ICC (0.98) [4]. The Spanish validation by Larrosa et al. demonstrated even stronger internal consistency (α = 0.93) and a Goodman–Kruskal reliability coefficient of γ = 0.83 [5]. Taken together, these findings support the conclusion that the T‐14 provides reliable and reproducible measurements across different populations. Responsiveness was equally strong in the present study, with T‐14 scores declining markedly 6 months after surgery, from 29.0 to 5.7 (p < 0.001). The large effect size (Cohen's d = 1.89) highlights the clinical relevance of this change. Reductions in the T‐14 score were consistent across both the Obstruction and Infection subgroups, indicating that the T‐14 is sensitive to change regardless of surgical indication. These results align closely with previous research: Hopkins et al. observed a decrease in mean T‐14 scores from 31.3 to 7.4 at 6 months postoperatively (SMA = 1.53, calculated on the preliminary 16‐item version) [4], while Larrosa et al. reported a reduction from 30.4 to 2.3 and an effect size of 2.09 [5], reflecting a very large improvement in disease‐specific symptoms. The findings of the present study therefore support the conclusion that the T‐14 is responsive to change over time.
Despite potential differences in surgical indications and variations in clinical practices across countries (such as the duration of antibiotic treatment), the use of an instrument like the T‐14 holds significant scientific value, as it facilitates international comparisons across nations and language areas. Nevertheless, validation within the local context remains essential. The integration of the T‐14 into national quality registers, such as the SQTS, could provide significant scientific and clinical value. Currently, the SQTS lacks validated PROMs, limiting its ability to evaluate symptom burden and surgical benefit comprehensively. The eventual incorporation of the T‐14 could enable longitudinal monitoring of symptom burden in patients undergoing tonsil surgery, allowing for the identification of trends and a more accurate evaluation of postoperative outcomes.
5.1. Limitations
Some limitations of this study should be acknowledged. The test–retest response rate in the case group was less than optimal, which may have introduced non‐response bias and may limit the generalizability of the reliability estimates. Difficulties in achieving high response rates are known challenges in survey‐based research [21]. Another factor that reduced the amount of retest data was that baseline data for a subset of participants were collected on the day of surgery, precluding the possibility of obtaining retest data for these individuals. The relatively small subgroup with infectious indications may restrict generalizability; however, this proportion reflects the national distribution of surgical indications in Sweden, where approximately 10.6% of pediatric tonsil surgeries in 2024 were performed for infectious reasons, with the majority due to airway obstruction [22]. Another limitation is the higher mean age in the control group compared to the case group. This difference is unlikely to have substantially affected the study results, as no pattern of symptom variation related to age was observed in the control group (i.e., consistently low T‐14 scores regardless of age). Proxy reporting by caregivers may also introduce bias through differences in perception or recall, which could affect validity. Still, most children in this study were under 8 years old and generally considered too young to reliably self‐report, making caregiver input the most practical option for obtaining outcome data. A methodological limitation is the absence of an objective reference standard for symptom burden. While polysomnography could theoretically serve as a comparator for children with OSA, no validated Swedish instrument currently exists that simultaneously addresses both infection‐ and obstruction‐related problems attributable to tonsillar disease. This lack of a gold standard limits the evaluation of criterion validity.
Finally, the use of convenience sampling for the control group may reduce representativeness. Despite these limitations, the findings in the present study support the reliability, validity, and responsiveness of the Swedish version of the T‐14.
6. Conclusion
The Swedish version of the T‐14 questionnaire demonstrated good internal consistency, good test–retest reliability, and evidence of responsiveness to clinical change in a Swedish pediatric population undergoing tonsil surgery. It effectively differentiated patients with tonsillar disease from healthy controls. The T‐14 can be used to monitor indication thresholds and the effect of surgery in a Swedish context. As a standardized instrument, it also enables comparisons between Swedish and international studies.
Funding
This research was supported by grants from Futurum, the Academy for Healthcare, Jönköping County Council; FORSS, the Research Council of South‑Eastern Sweden; and the Acta Oto‑Laryngologica Foundation. The funding organizations had no influence on study design or execution.
Conflicts of Interest
The authors declare no conflicts of interest.
Acknowledgments
The authors would like to thank Bo Rolander, statistician at Futurum, Region Jönköping County, for his valuable statistical guidance and for performing the confirmatory factor analysis.
Appendix 1. The Swedish Version of the Pediatric Throat Disorders Outcome Test, T‐14
| Fråga | Inga problem | Obetydliga problem | Lindriga problem | Måttliga problem | Svåra problem | Problemen är så illa de kan bli |
|---|---|---|---|---|---|---|
| 1. Snarkar högljutt under sömn | 0 | 1 | 2 | 3 | 4 | 5 |
| 2. Oregelbunden andning eller andningsuppehåll (apné) under sömn | 0 | 1 | 2 | 3 | 4 | 5 |
| 3. Upprepade läkarbesök | 0 | 1 | 2 | 3 | 4 | 5 |
| 4. Upprepade telefonkontakter med sjukvården | 0 | 1 | 2 | 3 | 4 | 5 |
| 5. Upprepade antibiotikabehandlingar som pågått kortare än 2 veckor åt gången | 0 | 1 | 2 | 3 | 4 | 5 |
| 6. Antibiotikabehandlingar som pågått i längre än 2 veckor i sträck | 0 | 1 | 2 | 3 | 4 | 5 |
| 7. Ofta ont i örat eller haft infektioner i örat | 0 | 1 | 2 | 3 | 4 | 5 |
| 8. Upprepade kortvariga halsinfektioner som varat kortare än 2 veckor. | 0 | 1 | 2 | 3 | 4 | 5 |
| 9. Långvariga, eller kroniska, halsinfektioner som varat längre än 2 veckor. | 0 | 1 | 2 | 3 | 4 | 5 |
| 10. Andas genom munnen under dagtid | 0 | 1 | 2 | 3 | 4 | 5 |
| 11. Högljudd andning under dagtid | 0 | 1 | 2 | 3 | 4 | 5 |
| 12. Problem med dålig aptit eller besvär vid matintag (sätter mat i halsen etc.) | 0 | 1 | 2 | 3 | 4 | 5 |
| 13. Frånvaro från skola/förskola på grund av halsont | 0 | 1 | 2 | 3 | 4 | 5 |
| 14. Dagtrötthet eller dagsömnighet | 0 | 1 | 2 | 3 | 4 | 5 |
Data Availability Statement
Data are subject to Swedish data protection regulations and cannot be shared directly by the authors. Data may be available upon receipt of ethical approval and formally submitted application to the entity responsible for the research (www.rjl.se).
References
- 1. “Nationellt kvalitetsregister för öron‐ n‐oh,” The Swedish Quality Register for Tonsil Surgery, accessed January 14, 2026, https://ton.registercentrum.se/in‐english/the‐swedish‐quality‐register‐for‐tonsil‐surgery/p/HJV8b8hV.
- 2. Lundström F., Stalfors J., Østvoll E., and Sunnergren O., “Practice, Complications and Outcome in Swedish Tonsil Surgery 2009‐2018. An Observational Longitudinal National Cohort Study,” Acta Oto‐Laryngologica 140 (2020): 589–596.
- 3. Bonsel J. M., Itiola A. J., Huberts A. S., Bonsel G. J., and Penton H., “The Use of Patient‐Reported Outcome Measures to Improve Patient‐Related Outcomes ‐ a Systematic Review,” Health and Quality of Life Outcomes 22 (2024): 101.
- 4. Hopkins C., Fairley J., Yung M., Hore I., Balasubramaniam S., and Haggard M., “The 14‐Item Paediatric Throat Disorders Outcome Test: A Valid, Sensitive, Reliable, Parent‐Reported Outcome Measure for Paediatric Throat Disorders,” Journal of Laryngology and Otology 124 (2010): 306–314.
- 5. Larrosa F., Samara L., Esteller E., et al., “Cross‐Cultural Adaptation and Validation of the Spanish Version of the Paediatric Throat Disorders Outcome Test (T‐14),” Clinical Otolaryngology 43 (2018): 598–603.
- 6. Bennett S., Gao J., Osen E., and Myuran T., “Long‐Term Impact of Paediatric Tonsillectomy and Adenotonsillectomy on the Paediatric Throat Disorders Outcome Test: A Prospective Observational Study,” Journal of Laryngology and Otology 136 (2022): 1066–1070.
- 7. Hogg E. S., Hampton T., Wright K., and Sharma S. D., “The Effect on the T‐14 Paediatric Throat Disorders Outcome Score of Delaying Adenotonsillectomy Surgery due to COVID‐19,” Annals of the Royal College of Surgeons of England 105 (2023): S18–S21.
- 8. Huynh J., Woods C. M., and Ooi E. H., “An Observational Pragmatic Quality‐Of‐Life Study on Paediatric Tonsillectomy and Waiting for Surgery,” European Archives of Oto‐Rhino‐Laryngology 280 (2023): 885–890.
- 9. Hopkins C., Almeyda R., Alreefy H., et al., “Multicentre Prospective Clinical Application of the T14 Paediatric Outcome Tool,” Journal of Laryngology and Otology 129 (2015): 980–985.
- 10. “Vårdgivare. Obstruktiv sömnrelaterad andningsstörning (OSDB) hos barn,” accessed December 16, 2025, https://vardpersonal.1177.se/kunskapsstod/vardforlopp/obstruktiv‐somnrelaterad‐andningsstorning‐OSDB‐hos‐barn/.
- 11. Wille N., Badia X., Bonsel G., et al., “Development of the EQ‐5D‐Y: A Child‐Friendly Version of the EQ‐5D,” Quality of Life Research 19 (2010): 875–886.
- 12. Kreimeier S. and Greiner W., “EQ‐5D‐Y as a Health‐Related Quality of Life Instrument for Children and Adolescents: The Instrument's Characteristics, Development, Current Use, and Challenges of Developing Its Value Set,” Value in Health: The Journal of the International Society for Pharmacoeconomics and Outcomes Research 22 (2019): 31–37.
- 13. DiStefano C. and Hess B., “Using Confirmatory Factor Analysis for Construct Validation: An Empirical Review,” Journal of Psychoeducational Assessment 23 (2005): 225–241.
- 14. Hu L. and Bentler P. M., “Cutoff Criteria for Fit Indexes in Covariance Structure Analysis: Conventional Criteria Versus New Alternatives,” Structural Equation Modeling: A Multidisciplinary Journal 6 (1999): 1–55.
- 15. Streiner D. L., “Starting at the Beginning: An Introduction to Coefficient Alpha and Internal Consistency,” Journal of Personality Assessment 80 (2003): 99–103.
- 16. Koo T. K. and Li M. Y., “A Guideline of Selecting and Reporting Intraclass Correlation Coefficients for Reliability Research,” Journal of Chiropractic Medicine 15 (2016): 155–163.
- 17. Mokkink L. B., Elsman E. B. M., and Terwee C. B., “COSMIN Guideline for Systematic Reviews of Patient‐Reported Outcome Measures Version 2.0,” Quality of Life Research 33 (2024): 2929–2939.
- 18. Cohen J., Statistical Power Analysis for the Behavioral Sciences (Lawrence Erlbaum Associates, 1988).
- 19. Forer B., “Missing Data,” in Encyclopedia of Quality of Life and Well‐Being Research, ed. Michalos A. C. (Springer Netherlands, 2014), 4078–4082.
- 20. Kenny D. A., Kaniskan B., and McCoach D. B., “The Performance of RMSEA in Models With Small Degrees of Freedom,” Sociological Methods & Research 44 (2015): 486–507.
- 21. National Research Council, Nonresponse in Social Science Surveys: A Research Agenda (National Academies Press, 2013).
- 22. The Swedish Quality Register for Tonsil Surgery, “Statistik [Pediatric Tonsil Surgeries Distributed by Indication],” accessed March 3, 2025, https://ton.registercentrum.se/.
