Ultrasound: Journal of the British Medical Ultrasound Society. 2019 Aug 1;28(1):4–13. doi: 10.1177/1742271X19865001

British Thyroid Association 2014 classification ultrasound scoring of thyroid nodules in predicting malignancy: Diagnostic performance and inter-observer agreement

Alexander Weller 1, Ban Sharif 1, Mohammad H Qarib 2, Dominic St Leger 1, Hakkini SL De Silva 1, Ravi K Lingam 1
PMCID: PMC6987506  PMID: 32063989

Abstract

Objectives

To assess the inter-observer agreement amongst five observers of differing levels of expertise in applying the British Thyroid Association (2014) guidelines for ultrasound scoring of thyroid nodules (BTA-U score) in the management of thyroid cancer, and to assess the U-score diagnostic performance in predicting malignancy.

Method

A total of 73 consecutive patients were included over a two-year period (July 2012 to July 2014), after referral to a tertiary head and neck oncology centre for ultrasound plus fine needle aspiration and cytology. Our five observers retrospectively and independently reviewed static ultrasound images on PACS and scored the thyroid nodules according to BTA-U classification. The observers were blinded to each other’s scoring, cytology and histology results. Either the Kappa-statistic or intra-class correlation was used to assess the level of inter-observer agreement, plus agreement between the radiological and cytological diagnoses. The diagnostic performance of U-scoring for predicting final histological diagnosis was assessed with sensitivity, specificity, positive and negative predictive values.

Results

A Kappa-value of 0.73 (95% CI: 0.68–0.77) confirmed substantial inter-observer agreement amongst the five observers. All 17 histology confirmed malignant nodules were correctly classified as potentially malignant by all observers. The sensitivity and negative predictive value of BTA-U score in detecting and predicting malignancy were 100%, whereas the specificity and positive predictive values were 34% and 32%, respectively.

Conclusions

There is good inter-observer agreement in using the BTA-U score amongst different observers at differing levels of expertise. Adhering to BTA-U scoring can potentially achieve 100% sensitivity in selecting malignant nodules for sampling.

Keywords: Thyroid nodules, British Thyroid Association, U-grade, thyroid nodule classification, reproducibility

Introduction

The incidence of thyroid nodules is 40% in the general population (increasing with age).1 The primary diagnostic concern in characterising thyroid nodules is to identify malignancy, ultimately so as to prognosticate. Despite the high incidence of thyroid nodules, the annual incidence of thyroid cancer in the UK is 5.1 per 100,000 women and 1.9 per 100,000 men, of which papillary carcinoma accounts for 80–90%. In the remaining 10–20%, follicular carcinoma is the most common.1,2 Over recent years, the incidence of thyroid cancer has increased in western populations, an increase that is largely attributable to detection on imaging performed for unrelated clinical indications. This increase has not been accompanied by a proportionate decrease in thyroid cancer related mortality, highlighting the need for improved prognostic markers at diagnosis.3

Ultrasound is an extremely sensitive, non-invasive and cost-effective technique for detecting thyroid nodules2,4,5 and allows targeted ultrasound-guided fine needle aspiration for cytological analysis (FNAC). The resulting thyroid cytopathology is currently reported using the Bethesda system, which categorises malignancy risk (Table 1),6–8 and since its widespread adoption in 2009 has been shown to impart prognostic information about cancer type, variant and risk of recurrence.9 Within this system, Thy2 and Thy5 are definitive for non-neoplastic nodules and carcinoma, respectively. Thy3 (possible neoplasia) and Thy4 (suspicious) are considered potentially malignant, but final diagnosis requires either repeat FNAC or surgical histological confirmation.2

Table 1.

Bethesda 5-point categorisation for reporting risk of malignancy on thyroid cytopathology.

Bethesda category | Risk of malignancy
Thy1/Thy1c | Non-diagnostic for cytological diagnosis
Thy2/Thy2c | Non-neoplastic
Thy3 | Neoplasm possible
Thy3a | Atypia
Thy3f | Follicular neoplasm
Thy4 | Suspicious of malignancy
Thy5 | Malignant

Although certain features of thyroid nodules on ultrasound are predictive for malignancy, no single feature in isolation has sufficient diagnostic accuracy to reliably detect or discount cancer.10–12 As a result, several groups have proposed standardised systems aimed at combining multiple features so as to maximise diagnostic performance and aid decision making about which nodules to target for FNAC.2,5,13 Among these are the TIRADS13 and EU-TIRADS,5 and the British Thyroid Association (BTA) U-Classification.2,14 Given the high background incidence of thyroid nodules and overall limited diagnostic accuracy of ultrasound, the main value of these systems lies in nodule selection, so as to safely detect all thyroid cancers, while minimising the number subject to cytological sampling, with associated patient anxiety and clinical cost.2

As part of general guidelines on managing thyroid cancer, the British Thyroid Association (BTA, 2014) ultrasound (U-) classification of thyroid nodules was introduced, primarily to facilitate the decision of whether or not to proceed to FNAC.2,14,15 Under this system, nodules are classified into categories U1 to U5, based on features including echogenicity, contour, halo, colloid artefact, calcification and vascularity (Figure 1). Under this classification, U1 represents normal thyroid parenchyma, U2 a benign nodule, U3 an indeterminate/equivocal nodule, U4 a suspicious nodule and U5 a malignant nodule. FNAC is recommended for nodules classified as U3 or above. One clear clinical advantage of the standardised nomenclature offered by all classification systems is in providing a common language for communicating ultrasound findings, reducing variability between operators in radiological reports. As outlined in Figure 1, the abbreviated BTA U-score (U = 1–5) is sub-divided according to specific sonographic features, giving a detailed 17-point U-score: U2 (a–f), U3 (a–c), U4 (a–d) and U5 (a–e).

Figure 1.

British Thyroid Association 2014 classification ultrasound scoring of thyroid nodules, reproduced from the literature.2

As for any quantitative or semi-quantitative clinical scoring system or imaging biomarker, scoring variability (or reproducibility) must be defined prior to widespread clinical adoption. In the context of the BTA U-score, reproducibility is defined by inter-observer variability between different operators at different levels of expertise. Therefore, the aim of this study was to evaluate inter-observer agreement of the U-score, and in particular to evaluate agreement for determining which thyroid nodules require FNAC. This is an important initial step in validating the system.16 The secondary aim was to assess the diagnostic accuracy of the BTA U-score for detecting malignancy against final pathological diagnosis from either cytological or histological results. Final pathological diagnosis was available at FNAC for nodules classified as Thy2/Thy5, but required histological confirmation from surgery for Thy3–4 cytology.

Method

A total of 73 patients (age 18 to 88, average 51 years) with thyroid nodules that underwent diagnostic ultrasound and FNAC following referral to a tertiary head and neck oncology centre between July 2012 and July 2014 were included in this study. As this comprised retrospective analysis of data acquired for standard clinical care, ethics committee review and informed consent were waived.

The targeted ultrasound images of 73 individual nodules were retrospectively reviewed and classified according to the BTA 2014 U-score (Figure 1). Seventy-two of the 73 examinations were performed prior to regional adoption of the BTA U-classification in July 2014. Ultrasound examination, image acquisition and FNAC were performed at the time of each patient visit by an experienced head and neck radiologist (10 years’ experience), using the ‘Thyroid’ presets and high-frequency linear probe (6–15 MHz) on a General Electric LOGIC-E9 ultrasound scanner (GE Healthcare, Chicago, USA). All examinations were performed at a frequency between 9 and 13 MHz, with 1–2 focal zones centred on the region of interest and with spatial compound (GE ‘Cross-Beam’) imaging enabled. In addition to B-Mode (Brightness-Mode) grey-scale images, colour Doppler was used as a problem-solving tool in selected examinations. Where used, Doppler colour-flow region of interest was limited to the nodule under investigation and the following parameters applied: power (acoustic output) = 100%, gain = 10–13, pulse repetition frequency = 1.2 kHz.

Five observers with differing levels of expertise retrospectively reviewed the captured and stored images for all 73 patients on a proprietary picture archiving and communication system (Sectra PACS, Linköping, Sweden), using the vendor-specific default grey-scale window centre and width. The five observers were: a consultant head and neck radiologist with 10 years’ head and neck radiology experience (RL); a general radiology consultant (HDS); a head and neck radiology sub-specialty trainee (DSL); a junior general radiology trainee (BS); and a senior sonographer experienced in thyroid ultrasound (HQ). With the exception of the consultant head and neck radiologist and senior sonographer, all observers underwent at least three months’ training in the BTA U-scoring system prior to the study. Training comprised direct supervision, where all thyroid examinations from ≥1 head and neck ultrasound session per week were reviewed with the experienced radiologist (as instructor). For the study, each observer independently scored the nodules according to both the abbreviated and detailed BTA classification U-scores, while blinded to: the radiology reports generated at the time of the ultrasound examination; the scoring applied by the other four observers; and the final cytology / histology results.

Pathological classification of malignant nodules was determined by histology from surgical specimens. Benign nodules were classified by obtaining either two consecutive Thy2 cytology results, or benign histology on surgical specimens following either Thy1 or Thy3–4 FNAC.

Statistical analysis

Statistical analysis was performed using Graph-Pad Prism Version 6 (Graph-Pad Software Inc. CA, USA) and Microsoft Excel. Clinical and laboratory patient characteristics were expressed as mean, standard deviation (SD) and range. The inter-observer agreement between independent observers’ BTA U-scoring was calculated using Fleiss’ Kappa coefficient (κ) or intra-class correlation (ICC) coefficient (as appropriate), where a kappa value of 0 corresponds with no agreement, values between 0 and 0.2 with slight agreement, 0.21 and 0.4 with fair agreement, 0.41 and 0.6 with moderate agreement, 0.61 and 0.80 with substantial agreement and 0.81 and 1.0 with excellent agreement. This was performed for both the abbreviated U-scoring (U1–U5) and more detailed U-scoring (e.g. U3c).
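For readers who wish to see how Fleiss' kappa is formed for several observers, the following is a minimal Python sketch. It is not the authors' workflow (the analysis above was run in Prism and Excel), the ratings shown are hypothetical rather than study data, and the "fair" band (0.21–0.40) follows the usual Landis and Koch convention, which is an assumption here.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of nodules, each rated once by every observer.

    ratings: list of lists, one inner list of category labels per nodule.
    Undefined (division by zero) if every rating falls in a single category.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # How many observers assigned each category to each nodule.
    counts = [[Counter(row)[c] for c in categories] for row in ratings]
    # Observed agreement per nodule, then averaged over nodules.
    p_i = [
        (sum(k * k for k in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Chance agreement from the overall proportion of each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def interpret_kappa(kappa):
    """Map kappa onto the agreement bands described in the text."""
    if kappa <= 0:
        return "no agreement"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "excellent"


# Hypothetical abbreviated U-scores: four nodules, five observers each.
example_ratings = [
    [2, 2, 2, 2, 2],
    [3, 3, 3, 3, 3],
    [3, 3, 4, 3, 3],
    [5, 5, 5, 5, 4],
]
example_kappa = fleiss_kappa(example_ratings, categories=[1, 2, 3, 4, 5])
print(round(example_kappa, 3), interpret_kappa(example_kappa))
```

Fleiss' kappa treats the U-score categories as unordered labels, which is one reason an ICC, which respects ordering, may be reported alongside it.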

As a secondary aim, to evaluate the diagnostic performance of BTA U-scoring for differentiating benign from malignant nodules, sensitivity, specificity, negative and positive predictive values and diagnostic accuracy of the BTA U-score for predicting final pathological diagnosis were calculated. These analyses were performed using the scores of each observer, from most (RL), to least experienced (DSL and BS). For the purpose of this analysis, U = 1–2 was considered benign (not needing fine needle aspiration), whereas a U = 3–5 was considered potentially malignant (and requiring fine needle aspiration). The performance of the BTA U-score for differentiating benign from malignant nodules was represented graphically using receiver-operating-characteristic (ROC) analysis and from the ROC analysis plot (sensitivity versus (1-specificity)) the area under the curve was calculated.
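The ROC construction described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Prism analysis: each cut-off rule "U ≥ c" gives one (1 − specificity, sensitivity) point, and the area is taken by the trapezoidal rule. The scores and labels are hypothetical, and the function names are illustrative.

```python
def roc_points(scores, is_malignant, cutoffs=(6, 5, 4, 3, 2, 1)):
    """Return (1 - specificity, sensitivity) for each rule 'U >= cutoff'.

    A cutoff of 6 calls nothing positive (point 0,0); a cutoff of 1 calls
    everything positive (point 1,1).
    """
    n_pos = sum(1 for m in is_malignant if m)
    n_neg = len(is_malignant) - n_pos
    points = []
    for c in cutoffs:
        tp = sum(1 for s, m in zip(scores, is_malignant) if m and s >= c)
        fp = sum(1 for s, m in zip(scores, is_malignant) if not m and s >= c)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(pts, pts[1:])
    )


# Hypothetical abbreviated U-scores: five benign and four malignant nodules.
example_scores = [2, 2, 3, 3, 4, 3, 4, 5, 5]
example_labels = [False, False, False, False, False, True, True, True, True]

example_points = roc_points(example_scores, example_labels)
example_auc = trapezoid_auc(example_points)
print(example_points)
print(round(example_auc, 3))
```

With an ordinal five-level score the curve has only a handful of points, so the area depends heavily on how the scores are distributed across U3 to U5 and not only on the single U ≥ 3 cut-off used for the FNAC decision.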

Finally, agreement between radiological (BTA U-) and cytological (Bethesda Thy-) scores was calculated using the kappa statistic. Where applicable, all tests were two sided, using a significance level of α = 5%.

Results

Patient and nodule characteristics are summarised in Table 2. Seventy-three nodules were included from ultrasound examinations of 73 patients. The patient cohort comprised 42 females and 31 males, mean age 51 years (range 18–88 years). Of the 73 nodules evaluated, 19 (26%) received a benign BTA U-score (U2), whereas 54 (74%) were classified as U3–5 (equivocal, suspicious or malignant and warranting FNAC). At FNAC, of the 73 nodules, 46 were classified as benign (Thy2), 25 as potentially malignant / malignant and 2 as non-diagnostic (Thy1). Following ultrasound and FNAC, histological diagnosis from surgical specimens was obtained for 27 nodules in total (10 benign at surgery, 17 malignant).

Table 2.

Patient and nodule characteristics.

Patient and nodule characteristics
Patient demographics
• Patient age (years): Mean; sd; range 51; 15; 18–88
• Female: Male (n) 42: 31
BTA U-score n = 73
• Benign U-score (U2) 19/73 (26%)
• Equivocal, suspicious or malignant U-score (U3–5) 54/73 (74%)
Cytology Bethesda score n = 71
• Insufficient sample (Thy1) 2/73 (3%)
• Benign nodule cytology (Thy2) 46/73 (63%)
• Equivocal, suspicious or malignant cytology (Thy 3a – Thy5) 25/73 (34%)
Benign histology on surgical specimen n = 10
• Colloid / adenomatoid / lymphocytic thyroiditis / other 7
• Follicular adenoma 3
Malignant histology on surgical specimen (carcinoma) n = 17
• Papillary carcinoma 10 /17 (59%)
• Follicular carcinoma 7/17 (41%)

At final pathological diagnosis (either cytology where definitive or histology following surgery), 56 of 73 (77%) nodules were benign and 17 of 73 (23%) malignant (10 papillary carcinomas and 7 follicular carcinomas). Thy2 to Thy5 diagnostic cytology results were available in 71 of the 73 nodules included. The remaining two patients proceeded to surgery following two non-diagnostic (Thy1) FNAC results. Of these two Thy1 nodules, one was graded U5 radiologically and confirmed as follicular carcinoma after surgery, whereas the second was graded U3 and received a final pathological diagnosis of benign follicular adenoma/hyperplasia.

Inter-observer agreement

Substantial inter-observer agreement was confirmed between the five independent observers’ abbreviated BTA U-scoring (U1–5), as demonstrated by Fleiss’ kappa value (κ) of 0.73 (95% confidence interval (CI): 0.68–0.77) and ICC of 0.81 (95% CI: 0.75–0.86). As expected intuitively, this agreement reduced when applying the more detailed U-scoring (U1–5; (a–f)); nonetheless, substantial agreement was confirmed between the five observers: ICC = 0.74 (95% CI: 0.66–0.81) (Table 3). All five observers reached complete unanimity for 64 of 73 cases (88%). For the nine cases in which unanimity was not reached, no nodule had malignant final pathological diagnosis.

Table 3.

Inter-observer agreement between five independent observers for the BTA U-scoring of thyroid nodules (see Figure 1).

Variable | Statistic | Estimate (95% CI)
Abbreviated U-scoring (U1–5) | ICC | 0.81 (0.75–0.86)
Abbreviated U-scoring (U1–5) | Kappa | 0.73 (0.68–0.77)
Detailed U-scoring (U1a – U5e) | ICC | 0.74 (0.66–0.81)

U-score accuracy for identifying thyroid malignancy

The diagnostic accuracy of the BTA U-score for identifying thyroid malignancy is summarised in Tables 4 to 6, where most importantly all 17 cases with a final pathological diagnosis of thyroid carcinoma were classified by all observers on ultrasound as either equivocal or suspicious and therefore requiring FNAC (U = 3–5). The BTA U-score therefore had 100% sensitivity and 100% negative predictive value for malignancy (Table 4, Figure 2).

Table 4.

Diagnostic accuracy of BTA U-score for differentiating benign (U = 1–2) from malignant (U = 3–5) thyroid nodules, for the most experienced and less experienced observers and including mixed vascularity as a ‘potentially malignant’ feature.

Statistic | n/N (most experienced observers, highest specificity) | Estimate (95% CI) | n/N (less experienced observer, lowest specificity) | Estimate (95% CI)
Sensitivity | 17 / 17 | 100% (81%, 100%) | 17 / 17 | 100% (82%, 100%)
Specificity | 19 / 56 | 34% (22%, 48%) | 16 / 56 | 29% (17%, 42%)
Positive predictive value | 17 / 54 | 32% (28%, 36%) | 17 / 57 | 30% (26%, 33%)
Negative predictive value | 19 / 19 | 100% (82%, 100%) | 16 / 16 | 100% (81%, 100%)
Accuracy | 36 / 73 | 49% (37%, 61%) | 33 / 73 | 45% (34%, 57%)

Table 6.

Abbreviated BTA U-score (from the most experienced observer) for differentiating benign from malignant final pathology (cytology / histology) results.

Final pathological diagnosis | Radiological diagnosis: Benign (U2) | Radiological diagnosis: Equivocal or malignant (U3–5) | Total
Benign final pathology | 19 | 37 | 56
Malignant final pathology | 0 | 17 | 17
Total | 19 | 54 | 73
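As a worked check of how the summary statistics follow from a 2×2 table, the short Python sketch below derives them from the Table 6 counts and, for comparison, from the counts implied by the n/N columns of Table 5 (mixed vascularity treated as benign). The function name is illustrative and confidence intervals are not computed; the results should agree with the tabulated percentages to within rounding.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures as proportions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Table 6: U3-5 counted as test-positive, most experienced observer.
with_vascularity = diagnostic_metrics(tp=17, fp=37, fn=0, tn=19)

# Table 5: mixed vascularity in isoechoic homogenous nodules counted as benign.
without_vascularity = diagnostic_metrics(tp=9, fp=9, fn=8, tn=47)

for name, metrics in [
    ("with vascularity", with_vascularity),
    ("without vascularity", without_vascularity),
]:
    print(name, {k: round(100 * v, 1) for k, v in metrics.items()})
```

Placing the two scenarios side by side makes the trade-off explicit: reclassifying vascularity-only nodules as benign removes 28 false positives but converts 8 true positives into false negatives.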

Figure 2.

ROC curve for BTA short U-score for differentiating benign from malignant thyroid nodules.

However, this was at the expense of a low specificity (34% for the most experienced observer) and positive predictive value (32%). Of the 37 false positive (FP) cases, 28 were scored U3 (U3c) and 9 were scored U4 (6 U4a, 1 U4c, 2 U4d) by the most experienced observer. Of the 56 nodules with a final benign diagnosis, 37 (66%) were classified as requiring FNAC (U = 3–5).

These results are represented graphically in Figure 2, where the ROC analysis area under the curve (Az) = 0.97 is potentially misleading as it is skewed by the lack of false negative cases – the overall diagnostic accuracy of the BTA U-score was low, with only 49% of cases correctly classified as either benign or malignant. The reason for this high sensitivity and poor specificity was that radiological diagnosis greatly overestimated malignancy and suggested equivocal or suspicious features in 54 patients (74% of the cohort), compared to a final malignant diagnosis in only 17 of these (23% of the cohort).

Considering the range of diagnostic performance across the observers, all five observers achieved 100% sensitivity for detecting malignancy. While the specificity obtained by the senior sonographer matched that of the most senior observer, it was lower for all three other less experienced observers where between 55 and 57 patients were classified as having equivocal or suspicious features at ultrasound (compared with 54 – Table 4).

The diagnostic accuracy for the most experienced observer above was based upon the abbreviated BTA U-scoring (U1–5), for which 34 of the nodules that received a scoring of U3 (and warranting FNAC) did so on the basis of mixed vascularity alone, with otherwise benign features. These nodules would have received a detailed U-score subcategory of U3c (Figure 1). If this mixed vascularity was removed from the sonographic evaluation (i.e. mixed vascularity in otherwise isoechoic homogenous nodules considered as benign), specificity for a final diagnosis of malignancy increased to 84% (NPV 86%), but at the expense of a reduced sensitivity of 53% (PPV 50% – Table 5).

Table 5.

Diagnostic accuracy of BTA U-score (from the most experienced observer) for differentiating benign (U = 1–2) from malignant (U = 3–5) thyroid nodules, but by considering mixed internal vascularity in isoechoic homogenous nodules as a benign feature.

Statistic | n/N | Estimate (95% CI)
Sensitivity | 9 / 17 | 53% (28%, 77%)
Specificity | 47 / 56 | 84% (72%, 92%)
Positive predictive value | 9 / 18 | 50% (26%, 74%)
Negative predictive value | 47 / 55 | 86% (73%, 94%)
Accuracy | 56 / 73 | 77% (64%, 86%)

BTA U-score correlation with cytology

Table 7 compares U-grading with Bethesda Thy-grading. The kappa statistic for the strength of agreement between BTA U-score and Bethesda Thy-grade was 0.33 (95% CI: 0.16–0.50), suggesting only fair agreement between the two systems. This relatively poor agreement was largely due to the high predicted rate of malignancy in the radiological diagnosis (U3–5, 74%) compared with Thy-grading (Thy3–5, 34%; Table 2). For the 71 patients assessed by both ultrasound and cytology, 52 were considered potentially malignant radiologically (U3–5), compared with only 25 at cytology (Thy3–5) (and 17 with confirmed malignancy at final pathological diagnosis).

Table 7.

BTA U-score for differentiating benign (Thy2) from equivocal / malignant (Thy3–5) cytology results for the 71 nodules in which diagnostic cytology was available (Thy2–5).

Cytological diagnosis | Radiological diagnosis: Benign (U2) | Radiological diagnosis: Equivocal or malignant (U3–5)
Benign (Thy2) | 19 | 27
Equivocal or malignant (Thy3–5) | 0 | 25
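The agreement statistic for Table 7 can be illustrated with a short Python sketch. It assumes the reported value is an unweighted Cohen's kappa on the dichotomised 2×2 table; the text says only "the kappa statistic", so this is an assumption, and the function name is illustrative. Under that assumption the Table 7 counts give a value that rounds to the reported 0.33.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square agreement table.

    table[i][j] is the number of nodules placed in category i by one
    method (rows) and category j by the other (columns).
    """
    n = sum(sum(row) for row in table)
    # Observed agreement: proportion on the diagonal.
    p_o = sum(table[i][i] for i in range(len(table))) / n
    # Chance agreement from the row and column totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Table 7: rows are cytology (Thy2, Thy3-5), columns are ultrasound (U2, U3-5).
table7 = [
    [19, 27],
    [0, 25],
]
table7_kappa = cohen_kappa(table7)
print(round(table7_kappa, 2))  # 0.33
```

Observed agreement here is 44/71 (62%), but because ultrasound labels 52 of 71 nodules as U3–5, agreement expected by chance is already about 43%, which is what pulls kappa down into the fair range.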

Discussion

Our results confirm substantial to excellent reproducibility of BTA U-scoring for thyroid nodules: with adequate training, excellent inter-observer agreement is achieved for observers at levels of expertise ranging from specialist radiologist, general radiologist and sonographer to radiology trainees. This is a promising result as the abbreviated BTA-U score categories (rather than the subcategories) are used as a determinant of whether or not to proceed to FNAC. The good inter-observer agreement for the BTA U-score compares favourably with, or is at least equal to, that reported for the Thyroid Imaging Reporting and Data System (TIRADS) and other current guidelines.17–20 Our study confirms that satisfactory inter-observer agreement for BTA U-grading is achievable within a single centre following structured training. With similar training, these findings should be reproducible across institutions.

By applying the BTA U-score, all malignant nodules in our study were correctly identified as either equivocal or concerning for malignancy and therefore requiring sampling for cytological analysis. The sensitivity and negative predictive value of BTA U-score in detecting malignant nodules was 100%. This is clinically desirable as all malignant nodules were detected and selected for FNAC, making this an extremely safe test. The perfect sensitivity compares favourably with that of other guidelines, with which a high sensitivity for detecting thyroid carcinoma has been reported.17 However, within our cohort, this comes at the expense of a much lower specificity, leading to oversampling of benign nodules, with added costs and potential issues resulting from an invasive sampling and increased reliance on cytological analysis.8,12

Our results demonstrate that the high false positive rate obtained for BTA U-scoring was largely attributable to U3c nodules, for which mixed or internal vascularity was the only suspicious feature for malignancy. The inclusion or exclusion of Doppler vascularity in detecting thyroid carcinoma is currently stimulating debate. Although some studies have reported that internal vascularity tends toward a significant association with malignancy,12,21–23 a recent meta-analysis24 reported that vascularity is not an independent predictor and that even if used alongside other suspicious sonographic features, it does not increase the likelihood of malignancy. Of note, vascularity is not included in the American College of Radiology (ACR) TIRADS,25 revised American Thyroid Association (ATA) guidelines,26 Korean Thyroid Society Consensus statement (K-TIRADS)27 or European Thyroid Association guidelines EU-TIRADS5 for sonographic thyroid nodule grading. However, the European Thyroid Association Guidelines acknowledge that colour Doppler can help differentiate solid nodules from thick colloid and improve delineation of nodule margins when isoechoic to thyroid. That said, including colour Doppler imaging in our cohort improved sensitivity for detecting malignancy from 53% (without vascularity) to 100% (with vascularity), thereby detecting an extra eight (24%) thyroid carcinomas (two cystic papillary cancers, six follicular cancers (Figures 3 and 4)), from within 34 ‘indeterminate’ (BTA U3c) nodules. Excluding vascularity would have risked not subjecting these eight nodules to FNAC and misdiagnosing them as benign, whereas including vascularity was essential to maintaining a high sensitivity for malignancy. In summary, when triaging which nodules require FNAC, although including mixed vascularity increased false-positives at ultrasound, it eliminated false-negative results (i.e. missed cancer) and our data support the routine use of Doppler vascularity in thyroid nodule evaluation.

Figure 3.

(a) B-mode; and (b) colour Doppler overlay images of a cystic thyroid nodule with solid isoechoic mural component, for which demonstration of internal vascularity allowed confident BTA classification as U3. The echogenic foci are non-specific for either colloid or fine calcification. This nodule was confirmed as papillary carcinoma at surgery, highlighting the value of Doppler vascularity in detecting malignancy.

Figure 4.

Colour Doppler overlay on B-mode images of mildly heterogeneous thyroid nodules for two different patients: (a) near isoechoic nodule, the borders of which are delineated by the Doppler vascularity; (b) mildly hypoechoic nodule. The presence of internal vascularity in the absence of other concerning features led to BTA classification as U3 (indicating FNAC). Both of these nodules were confirmed as follicular carcinoma at surgery.

Interestingly, the proportion of follicular variant thyroid carcinoma in most published data (compared with papillary variant) is lower than that observed in the UK, where follicular variant accounts for 16–24% of all carcinomas,26,28–32 and one in four patients with Thy3 cytology has either follicular or Hurthle cell carcinoma.33 In the sonographic evaluation of follicular neoplasms specifically, internal and mixed vascularity has been confirmed as a predictor of malignancy in a number of studies.12,27,34–37 The fact that several other groups have failed to confirm the utility of Doppler vascularity for thyroid carcinoma in general is likely due to the lower proportion of follicular variant (as low as 2.5%) and higher proportion of papillary variant in their study populations.26,28–30,38 The malignant nodules within our cohort comprised a much higher proportion of follicular carcinomas (41% of malignant nodules) compared with prior reports, meaning that our data are potentially more representative of the UK population.

With respect to our low specificity for BTA U-scoring in identifying thyroid carcinoma, our cohort was drawn from a tertiary referral centre, which skewed the sample towards U3–5 nodules, with underrepresentation of nodules showing clearly benign (U2) features. This selection is expected to diminish U-scoring specificity for our cohort compared with the wider population, as a result of a disproportionately high contribution from ‘false-positive’ U3–5 nodules and an underrepresentation of ‘true-negative’ U2 nodules. Further potential selection bias within this study arose from only including nodules identified as requiring FNAC. However, this was in part mitigated by analysing data from patients treated prior to adoption of the BTA U-classification, at a time when a higher proportion of nodules with benign ultrasound features (U2) were subject to FNAC at our institution.

The fair agreement observed between sonographic BTA U-score and Bethesda Thy-score highlights the potential for ultrasound not only to detect malignancy, but also to provide prognostic information about cancer type and recurrence risk – this has already been shown for the Bethesda score.9 Data on the prognostic capacity of any sonographic scoring system are, however, pending. In addition, data are currently being accumulated on the potential of ultrasound elastography to improve sonographic diagnostic performance.39,40 In the future, this could readily be analysed in conjunction with the BTA-U score at the time of scanning.

The main limitation of our study was sample size. Also, this study comprised a retrospective analysis for which static ultrasound images were reviewed, without the opportunity to obtain dynamic evaluation (as is commonly performed in clinical practice). Being performed at a tertiary referral centre, a portion of the nodules evaluated were pre-selected on the basis of suspicious features found at referring centres, biasing the cohort sample towards indeterminacy or malignancy (U3–5). This would likely account for the slightly higher malignancy rate in our cohort compared with the background UK population. Although a higher proportion of the carcinomas in our cohort were follicular variant compared with prior non-UK data, the proportion of follicular variant was similar to that observed in prior UK-based reports.27–32 If this observed geographic variation represents a true difference in carcinoma types between populations, it should be reflected in regional practice and this area deserves further study. Most benign nodules included in our sample did not have histology from surgical excision, as excision is seldom justified here unless symptomatic. Benignancy was, however, established robustly with two consecutive benign Thy2 cytology results.

Despite these limitations, the value of our data lies in being the first study to report the reproducibility of the BTA 2014 U-scoring of thyroid nodules and to confirm that by applying this scoring, the ultrasound operator can readily identify nodules requiring sampling with a very high sensitivity for malignancy. A prospective multicentre trial is warranted to confirm the generalisability of these results.

Conclusion

There is good reproducibility and inter-observer agreement in the use of the BTA-U classification amongst different observers at differing levels of expertise. In this cohort, adhering to the BTA-U scoring system achieved a sensitivity of 100% in selecting malignant nodules for sampling.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

Ethics Approval

Ethical approval was not sought; the requirement was waived by the Department of Radiology, Northwick Park Hospital, London North West University Healthcare NHS Trust, Watford Road, London HA1 3UJ.

Guarantor

RL

Contributors

RL, HQ, DSL and HDS conceived the study. DSL, AW and HDS designed and developed the study protocol. DSL, HDS, RL, HQ and BS performed data extraction and interpretation. AW and BS performed data analysis and wrote the first draft of the manuscript. All authors reviewed and edited the manuscript and approved the final version.

Acknowledgements

None

Articles from Ultrasound: Journal of the British Medical Ultrasound Society are provided here courtesy of SAGE Publications