Abstract
Context: Recently published guidelines are discordant regarding diagnostic approaches to small (10–14 mm) thyroid nodules.
Objective: The objective of the study was to explore the relative desirability of alternative diagnostic approaches to small thyroid nodules using decision analysis.
Design: Four diagnostic approaches to a 10- to 14-mm thyroid nodule are modeled: 1) observation only; 2) routine fine-needle aspiration biopsy (FNAB), an approach traditionally chosen by many endocrinologists and consistent with American Thyroid Association guidelines; 3) FNAB only when microcalcifications are present, as recommended by Society of Radiologists in Ultrasound guidelines; and 4) FNAB only when the nodule is hypoechoic and has at least one other ultrasonographic risk factor, as endorsed by American Association of Clinical Endocrinologists guidelines.
Main Outcome Measures: Measures included expected values; a priori likelihoods of prespecified outcomes; and two-way sensitivity analyses based on the utility of observation only in the setting of thyroid cancer and thyroid surgery for benign, asymptomatic thyroid disease.
Results: Expected values (EVs) were similar among decision alternatives modeling Society of Radiologists in Ultrasound guidelines, American Association of Clinical Endocrinologists guidelines, and routine observation (EVs from 0.912 to 0.947). Routine FNAB had the lowest EV (0.757–0.861), primarily related to a high a priori likelihood of having surgery for a benign nodule.
Conclusions: As a general approach to 10- to 14-mm thyroid nodules, routine FNAB appears to be the least desirable. This analysis offers additional data that physicians can use when choosing diagnostic approaches to small thyroid nodules based on perceived risks of delayed cancer diagnosis and unnecessary thyroid surgery.
A decision analysis reveals that routine fine needle aspiration biopsy of small (10-14 mm) thyroid nodules is a less favorable approach than biopsy only when suspicious ultrasound characteristics are present.
Thyroid nodules are common, with a prevalence of 3–8% by palpation and 20–76% by thyroid ultrasonography (US) (1,2,3). Although thyroid nodules may cause local compressive symptoms or hyperthyroidism, they are often asymptomatic, being discovered incidentally during physical or radiological examination. Nonetheless, clinical evaluation should be considered for all thyroid nodules given a 5–13% risk of thyroid malignancy (1,2,3).
High-resolution thyroid US has enhanced evaluation of nodular thyroid disease, and several US characteristics (e.g. microcalcifications) are associated with an increased risk of malignancy (1,2,3). Nonetheless, none of these US findings is diagnostic, and fine-needle aspiration biopsy (FNAB) remains the cornerstone of thyroid cancer diagnosis. However, exactly which nodules should be targeted for FNAB remains controversial (1,2,3,4,5,6,7).
Routine FNAB of all thyroid nodules 10 mm or greater in diameter is an approach traditionally chosen by many thyroidologists. This approach is consistent with American Thyroid Association (ATA) guidelines (3) and with other expert recommendations (4,5). The Society of Radiologists in Ultrasound (SRU) guidelines recommend FNAB of nodules 10 mm or greater only when microcalcifications are present; 15 mm or greater if completely or predominantly solid or if coarse calcifications are present; and 20 mm or greater if predominantly cystic with a solid component (1). American Association of Clinical Endocrinologists (AACE) guidelines state that FNAB should be performed on all hypoechoic nodules 10 mm or greater with one or more of the following US characteristics: irregular margins, chaotic intranodular vascular spots, a more-tall-than-wide shape, and microcalcifications (2). The National Comprehensive Cancer Network Thyroid Carcinoma Clinical Practice Guidelines offer similar recommendations (8).
Thus, the approach to small (10–14 mm) thyroid nodules could vary, depending on which guidelines are followed. This discordance in part reflects disagreement regarding the balance between two primary risks: the risk of delaying appropriate surgical treatment when thyroid cancer is present and the risk of undergoing hemithyroidectomy/isthmusectomy for a benign, asymptomatic thyroid nodule.
We applied decision analysis to the frequently encountered clinical scenario in which a solitary, solid (or predominantly solid), 10- to 14-mm thyroid nodule is discovered in an asymptomatic and euthyroid patient with no clinical evidence of metastases (e.g. no abnormal cervical lymphadenopathy by palpation or US). We estimated the relative desirability of four decision alternatives: 1) observation only; 2) routine FNAB; 3) FNAB only when microcalcifications are present on US; and 4) FNAB only when the nodule is hypoechoic and has one or more additional US risk factors.
Materials and Methods
Decision analysis is a method of quantifying and comparing outcomes resulting from decision alternatives (9). In decision analysis, models are used to weigh the probability of events that follow decisions together with the relative desirability of possible outcomes, thus allowing an exploration of risk-benefit relationships in settings in which existing clinical data are incomplete. Specifically, each possible outcome is assigned a utility value (ranging from 0 to 1) that quantifies the desirability of the outcome, with 0 and 1 representing the worst and best, respectively, possible outcomes. Utilities for each possible outcome are multiplied by the probability of their occurring. Conditional probabilities are thus obtained for each outcome, with each outcome weighted by its relative utility value. The sum of the weighted probabilities for all events that follow a decision alternative is its expected value (EV), which represents a summary measure of the overall benefit obtained by a decision alternative. The decision alternative that obtains the highest EV is the preferred alternative.
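The EV calculation described above can be illustrated with a minimal Python sketch. The function name `expected_value` and the list-of-pairs layout are illustrative assumptions, not part of the published model; the numbers are the baseline values for the observation-only alternative (8% cancer with utility 0, 92% benign with utility 1).

```python
def expected_value(outcomes):
    """Sum of probability x utility over the terminal outcomes of one decision alternative.

    `outcomes` is a list of (probability, utility) pairs; the name and layout
    are illustrative assumptions.
    """
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)


# Observation-only alternative at baseline:
# cancer observed (probability 0.08, utility 0), benign nodule observed (0.92, utility 1).
observation_only = [(0.08, 0.0), (0.92, 1.0)]
ev_observation = expected_value(observation_only)
print(round(ev_observation, 3))
```

Because the benign outcome carries all of the utility at baseline, the EV of this alternative is simply the probability that the nodule is benign.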
Decision analysis model
We developed a decision analysis model (see Fig. 1 and Table 1) representing the four decision alternatives discussed above, primarily based on algorithms offered in AACE and ATA guidelines (2,3). The model was developed for the baseline case of an asymptomatic patient with a solitary, solid (or predominantly solid), 10- to 14-mm thyroid nodule, with no concerning clinical features (e.g. no history of neck irradiation, no cervical lymphadenopathy), and a normal TSH. The model assumes that nodule characteristics will be adequately assessed by an experienced thyroid ultrasonographer and that FNAB will be performed under US guidance. The model incorporates events that follow four decision alternatives emanating from the model’s decision node (represented by the open square).
Table 1.
Parameter | Abbreviation | Baseline value | Range for sensitivity analysis | References |
---|---|---|---|---|
Probability of thyroid cancer | P(cancer) | 0.08 | 0.05–0.13 | 1,2,3 |
Probability of persistently inadequate FNAB | P(inad_FNAB) | 0.10a/0.05b | 0.01–0.20 | 2,10,11 |
Probability of indeterminate FNAB | P(indet_FNAB) | 0.15a/0.10b | 0.10–0.30 | 2,3,10,11,12 |
Probability of hyperfunctioning nodule if indeterminate FNAB | P(hyperfunction) | 0.05 | 0.00–0.16 | 11,13,14,15 |
Probability of cancer if indeterminate FNAB | P(indet_cancer) | 0.20 | 0.05–0.30 | 8,12,16,17 |
Sensitivity of FNAB | Sens(FNAB) | 0.95a/0.98b | 0.89–0.99 | 2,3,10,11,12 |
Specificity of FNAB | Spec(FNAB) | 0.95a/0.98b | 0.93–1.00 | 2,10,11,12 |
Sensitivity of microcalcifications | Sens(mCa) | 0.45 | 0.26–0.59 | 1,2 |
Specificity of microcalcifications | Spec(mCa) | 0.90 | 0.86–0.95 | 2 |
Sensitivity of hypo-plus | Sens(hypo-plus) | 0.91 | 0.87–0.94 | 18,19 |
Specificity of hypo-plus | Spec(hypo-plus) | 0.70 | 0.66–0.74 | 18,19 |
Utility of observation if thyroid cancer | Observation_cancer | 0.00 | 0.00–1.00 | NA |
Utility of surgery for benign nodule | Surgery_benign | 0.00 | 0.00–1.00 | NA |
We were unaware of any published data suggesting that US findings influence the likelihood that a nodule will yield indeterminate FNAB results or the likelihood that an indeterminate nodule will be hyperfunctioning on I-123 scintigraphy. Thus, these probabilities were assumed to be equal across all arms. Two studies were used to estimate the testing characteristics of hypoechoic plus at least one other ultrasonographic risk factor: in the study by Papini et al. (18), additional concerning findings included microcalcifications, blurred margins, and intranodular vascularization, whereas in the study by Kim et al. (19), additional concerning findings included microcalcifications, irregular/microlobulated margin, and a more-tall-than-wide appearance. hypo-plus, Hypoechoic plus at least one other ultrasonographic risk factor; NA, not applicable.
a, Set of conservative FNAB parameter estimates (see Materials and Methods).
b, Set of optimistic FNAB parameter estimates (see Materials and Methods).
The first decision alternative is to forgo FNAB, regardless of US characteristics (7). The likelihood that the nodule is malignant is represented by a chance node (open circle) for two possible events: cancer or benign nodule. These potential outcomes of the decision alternative are each denoted by a triangle. The probability of a missed (delayed) malignancy diagnosis is represented by the variable P(cancer), which estimates the prevalence of thyroid cancer among solitary nodules. The value (utility) of this potential outcome is also represented by a variable (observation_cancer). The likelihood that the nodule is benign is denoted by #, which is shorthand for one minus the sum of all other alternatives at a particular chance node. The utility of the potential outcome of a benign nodule is represented by the constant 1.0 because observation alone is highly desirable for a benign nodule.
The second alternative is routine FNAB regardless of US characteristics. With this approach, FNAB is persistently inadequate (i.e. yields too few epithelial cells for interpretation after one or more repeat FNAB attempts) in 5–10% of cases, and these patients proceed to surgery (hemithyroidectomy/isthmusectomy). The likelihood of cancer at surgery is equal to the prevalence of cancer among solitary nodules; the remainder of surgeries will disclose a benign nodule (this utility value is assigned the variable surgery_benign). FNAB is indeterminate (e.g. follicular neoplasm) in 10–15% of cases. In this case, a radioactive iodine (I-123) scan is performed. A hyperfunctioning nodule is assumed to be benign, but all other cases go to surgery (hemithyroidectomy/isthmusectomy), which reveals the nodule to be benign in 80% (utility = surgery_benign) and malignant in 20% (utility = 1.0). When FNAB is neither inadequate nor indeterminate, it is either positive (suggesting cancer) or negative (suggesting benign disease). A positive FNAB prompts surgery: this reveals the FNAB to be a true positive (utility = 1.0) or a false positive (utility = surgery_benign). In the case of a negative FNAB result, the nodule is observed (e.g. follow-up US in 1 yr). A negative FNAB represents either a true negative (utility = 1.0) or false negative (utility = observation_cancer).
The third alternative involves the application of SRU guidelines (1) and starts with US assessment for microcalcifications. If US is negative (i.e. microcalcifications absent), the nodule is observed. This can be a true negative [probability = negative predictive value (NPV) of microcalcifications] or a false negative (probability = one minus NPV of microcalcifications). If US is positive (microcalcifications present), FNAB is performed. The remaining events in the decision tree are identical with those presented for routine FNAB, except that the pre-FNAB probability of cancer is approximated by the positive predictive value (PPV) of microcalcifications (i.e. approximately 28% based on baseline model assumptions).
The fourth subtree models the AACE guidelines (2) and approximates National Comprehensive Cancer Network guidelines (8). Evaluation starts with an US assessment for hypoechogenicity, microcalcifications, irregular margins, chaotic intranodular vascular spots, and a more-tall-than-wide shape. If US does not disclose a hypoechoic nodule with at least one other concerning US feature (i.e. is negative), then the nodule is observed. This can be a true negative (probability = NPV of hypoechoic plus at least one other concerning US feature) or false negative (probability = one minus NPV). If US reveals a hypoechoic nodule with at least one other concerning US feature, FNAB is performed. The decision tree is thereafter identical with that for routine FNAB, except the pre-FNAB test probability of cancer approximates the PPV of hypoechoic plus at least one other concerning US feature (i.e. approximately 21% based on baseline model assumptions).
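The four subtrees described above can be sketched in Python as follows. This is a simplified reimplementation under stated assumptions, not the TreeAge model itself: the function and dictionary names are hypothetical, hyperfunctioning indeterminate nodules are counted as observed benign nodules (utility 1.0), and the cancers remaining for the adequate and determinate FNAB subtree are obtained by subtraction so that each arm totals the baseline 8% prevalence (the correction described under Bayesian revision of probabilities). Parameter values are the Table 1 baselines.

```python
CONSERVATIVE = dict(cancer=0.08, inad=0.10, indet=0.15, hyper=0.05,
                    indet_cancer=0.20, sens=0.95, spec=0.95)
OPTIMISTIC = dict(CONSERVATIVE, inad=0.05, indet=0.10, sens=0.98, spec=0.98)

# (sensitivity, specificity) of the US screen; None means FNAB for every nodule.
ARMS = {"routine FNAB": None, "SRU (mCa)": (0.45, 0.90), "AACE (hypo-plus)": (0.91, 0.70)}


def terminal_probabilities(p, us_test):
    """A priori probability of each terminal outcome for one FNAB-based arm."""
    prev = p["cancer"]
    # With no US screen, every nodule is treated as "US positive".
    sens_us, spec_us = us_test if us_test else (1.0, 0.0)
    p_fnab = sens_us * prev + (1 - spec_us) * (1 - prev)   # P(positive US)
    cancer_us_neg = (1 - sens_us) * prev                    # cancers observed after negative US
    benign_us_neg = spec_us * (1 - prev)
    pre_fnab = sens_us * prev / p_fnab                      # PPV of the US screen

    inadequate = p_fnab * p["inad"]                         # persistently inadequate -> surgery
    indet_surgery = p_fnab * p["indet"] * (1 - p["hyper"])  # indeterminate, not hyperfunctioning
    indet_hyper = p_fnab * p["indet"] * p["hyper"]          # assumed benign, observed
    adequate = p_fnab * (1 - p["inad"] - p["indet"])

    cancer_inad = inadequate * pre_fnab
    cancer_indet = indet_surgery * p["indet_cancer"]
    # Cancers left for the adequate, determinate FNAB subtree (keeps the arm total at prev).
    cancer_adeq = prev - cancer_us_neg - cancer_inad - cancer_indet
    benign_adeq = adequate - cancer_adeq
    return {
        "surgery_benign": (inadequate - cancer_inad) + (indet_surgery - cancer_indet)
                          + benign_adeq * (1 - p["spec"]),
        "observation_benign": benign_us_neg + indet_hyper + benign_adeq * p["spec"],
        "surgery_cancer": cancer_inad + cancer_indet + cancer_adeq * p["sens"],
        "observation_cancer": cancer_us_neg + cancer_adeq * (1 - p["sens"]),
    }


def expected_value(probs, u_observation_cancer=0.0, u_surgery_benign=0.0):
    """EV with utility 1.0 for correct outcomes and variable utilities for the two errors."""
    return (probs["observation_benign"] + probs["surgery_cancer"]
            + u_observation_cancer * probs["observation_cancer"]
            + u_surgery_benign * probs["surgery_benign"])


results = {}
for label, params in (("conservative", CONSERVATIVE), ("optimistic", OPTIMISTIC)):
    for arm, us_test in ARMS.items():
        results[(label, arm)] = expected_value(terminal_probabilities(params, us_test))
    results[(label, "routine observation")] = 1 - params["cancer"]

for key, ev in results.items():
    print(key, round(ev, 3))
```

Under these assumptions the sketch should land close to the EVs reported in Results (for example, approximately 0.757 for routine FNAB and 0.927 for the SRU arm with conservative FNAB estimates); small differences from the published figures would reflect rounding or modeling details not captured here.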
Decision analysis model parameters
Table 1 lists parameter estimates for the events, outcomes, and utilities represented in the model. These estimates were primarily derived from clinical guidelines (1,2,3,8) or literature cited within the guidelines. The utility of missed (delayed) cancer diagnosis and the utility of surgery for a benign nodule were both assigned values of zero for the baseline case.
There was uncertainty regarding the appropriate estimate for some parameters. AACE guidelines cite FNAB sensitivity and specificity of 95% (2), whereas some literature suggests 98% (10,11,12). AACE guidelines suggest a 10–20% likelihood of inadequate FNAB (2), but it may be 5% when US guidance and repeat FNAB are routinely used (10,11,20,21,22,23). Whereas ATA guidelines endorse a 15–30% probability of indeterminate FNAB reading (3), AACE guidelines and other sources suggest a 10% likelihood (2,10,11). Because of these uncertainties, and because these estimates substantially influence model results, we first performed decision analysis using conservative FNAB parameter estimates (i.e. 95% FNAB sensitivity and specificity; 10% likelihood of persistently inadequate FNAB; and 15% likelihood of indeterminate FNAB). We subsequently repeated the analysis using more optimistic FNAB parameter estimates (i.e. 98% FNAB sensitivity and specificity; 5% likelihood of persistently inadequate FNAB; and 10% likelihood of indeterminate FNAB).
Bayesian revision of probabilities
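The key diagnostic tests in the model are FNAB, US for microcalcifications, and US for hypoechoic plus at least one other ultrasonographic risk factor. Bayesian revision was used to calculate the probability of a positive test result, the PPV of a positive test result, and the NPV of a negative test result using the following formula (Bayes' theorem), in which A denotes the presence of cancer and B denotes a positive test result:
P(A|B) = (P(B|A) × P(A))/((P(B|A) × P(A)) + (P(B|not A) × P(not A)))
These quantities were calculated using the estimates of test sensitivity P(B|A), test specificity [1 − P(B|not A)], and the pretest probability of cancer P(A) (i.e. the underlying prevalence of thyroid cancer in the group undergoing testing) with the following formulae:
P(positive test) = (sensitivity × P(A)) + ((1 − specificity) × (1 − P(A)))
PPV = (sensitivity × P(A))/P(positive test)
NPV = (specificity × (1 − P(A)))/(1 − P(positive test))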
In the SRU and AACE arms, the pre-US probability of cancer equaled the prevalence of cancer among solitary thyroid nodules. In the routine FNAB arm, the pre-FNAB probability approximated the prevalence of cancer among solitary thyroid nodules. In the SRU and AACE arms, the pre-FNAB probability of cancer approximated the PPV of microcalcifications and hypoechoic plus at least one other ultrasonographic risk factor, respectively.
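As an illustration of the Bayesian revision step, the following Python sketch computes the probability of a positive US result, the PPV, and the NPV from the baseline sensitivity, specificity, and cancer prevalence in Table 1. The function name `bayes_revision` is a hypothetical helper introduced here for illustration.

```python
def bayes_revision(sens, spec, prevalence):
    """Return (P(positive test), PPV, NPV) from sensitivity, specificity, and pretest probability."""
    p_pos = sens * prevalence + (1 - spec) * (1 - prevalence)
    ppv = sens * prevalence / p_pos
    npv = spec * (1 - prevalence) / (1 - p_pos)
    return p_pos, ppv, npv


P_CANCER = 0.08  # baseline prevalence of cancer among solitary nodules (Table 1)

# Microcalcifications (SRU arm): sensitivity 0.45, specificity 0.90.
p_pos_mca, ppv_mca, npv_mca = bayes_revision(0.45, 0.90, P_CANCER)
# Hypoechoic plus at least one other risk factor (AACE arm): sensitivity 0.91, specificity 0.70.
p_pos_hypo, ppv_hypo, npv_hypo = bayes_revision(0.91, 0.70, P_CANCER)

print(f"mCa:       P(pos)={p_pos_mca:.3f}  PPV={ppv_mca:.3f}  NPV={npv_mca:.3f}")
print(f"hypo-plus: P(pos)={p_pos_hypo:.3f}  PPV={ppv_hypo:.3f}  NPV={npv_hypo:.3f}")
```

The resulting PPVs (about 0.28 for microcalcifications and about 0.21 for hypo-plus) correspond to the pre-FNAB probabilities of cancer quoted for the SRU and AACE arms.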
Inadequate and indeterminate FNAB results were considered to be neither positive nor negative and thus were not subjected to Bayesian revision. The baseline likelihood of cancer in these situations was considered to be 8 and 20%, respectively. Because the probability of cancer for indeterminate FNABs was fixed at 20% (rather than 8%), the overall percentage of cancer in each decision alternative subtree could vary slightly, depending on the proportion entering the indeterminate FNAB subtree. To correct this small inequality, the prevalence of thyroid cancer in those with adequate and determinate FNABs was calculated as the overall risk of cancer minus the a priori likelihood of having cancer after a negative US (if applicable), the a priori likelihood of finding cancer after surgery for persistently inadequate FNAB, and the a priori likelihood of finding cancer after surgery for indeterminate FNAB; all divided by the a priori likelihood of being in the adequate and determinate FNAB subtree. This correction ensured that the overall likelihood of reaching a terminal node denoting thyroid cancer was exactly 8% for each decision alternative subtree. The following equation represents this calculation for the SRU arm:
Probability of cancer in an adequate and determinate FNAB = (P(cancer) − ((P(pos_mCa) × P(inad_FNAB) × PPV(mCa)) + (P(pos_mCa) × P(indet_FNAB) × (1 − P(hyperfunction)) × P(indet_cancer)) + ((1 − P(pos_mCa)) × (1 − NPV(mCa)))))/(P(pos_mCa) × (1 − (P(inad_FNAB) + P(indet_FNAB)))).
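The correction for the SRU arm can be checked numerically with a short Python sketch. It uses the baseline values from Table 1 with the conservative FNAB estimates, takes the probability of cancer after an indeterminate FNAB to be P(indet_cancer) from Table 1, and uses variable names that are illustrative only.

```python
# Baseline parameters (Table 1, conservative FNAB estimates).
P_CANCER, P_INAD_FNAB, P_INDET_FNAB = 0.08, 0.10, 0.15
P_HYPERFUNCTION, P_INDET_CANCER = 0.05, 0.20
SENS_MCA, SPEC_MCA = 0.45, 0.90

# Bayesian revision for the microcalcification screen.
p_pos_mca = SENS_MCA * P_CANCER + (1 - SPEC_MCA) * (1 - P_CANCER)
ppv_mca = SENS_MCA * P_CANCER / p_pos_mca
npv_mca = SPEC_MCA * (1 - P_CANCER) / (1 - p_pos_mca)

# A priori likelihoods of cancer outside the adequate, determinate FNAB subtree.
cancer_inadequate = p_pos_mca * P_INAD_FNAB * ppv_mca
cancer_indeterminate = p_pos_mca * P_INDET_FNAB * (1 - P_HYPERFUNCTION) * P_INDET_CANCER
cancer_us_negative = (1 - p_pos_mca) * (1 - npv_mca)

# A priori likelihood of reaching the adequate, determinate FNAB subtree.
p_adequate_determinate = p_pos_mca * (1 - (P_INAD_FNAB + P_INDET_FNAB))

p_cancer_adequate = (
    P_CANCER - (cancer_inadequate + cancer_indeterminate + cancer_us_negative)
) / p_adequate_determinate

# Check: cancers across all branches of the arm should add back up to P_CANCER.
total_cancer = (cancer_inadequate + cancer_indeterminate + cancer_us_negative
                + p_cancer_adequate * p_adequate_determinate)
print(round(p_cancer_adequate, 4), round(total_cancer, 4))
```

With these inputs the corrected prevalence in the adequate and determinate FNAB subtree comes out near 30%, slightly above the uncorrected PPV of microcalcifications (about 28%), and the arm total returns to 8%.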
Sensitivity analyses
One- and two-way sensitivity analyses were conducted to assess the extent to which results from the model depend on estimates for specific parameters in the model. Sensitivity analysis compares the EV obtained by decision alternatives across the plausible range of estimates for specific parameters in the model. One-way sensitivity analysis evaluates changes in the model results across values for a single model parameter while holding all other estimates constant. Two-way analysis considers changes for alternative combinations of two variable estimates while holding all others constant. The range of values (minimum to maximum) used for each parameter is shown in Table 1.
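A one-way sensitivity analysis of this kind can be sketched in Python as below. The sketch is an assumption-laden simplification rather than the TreeAge implementation: `arm_ev`, `all_evs`, and `one_way` are hypothetical helpers, the tree is collapsed into the two error outcomes (observation of a cancer, surgery for a benign nodule), and the swept parameter is the utility of observation in the setting of cancer, with all other inputs held at the conservative Table 1 baselines.

```python
BASE = dict(cancer=0.08, inad=0.10, indet=0.15, hyper=0.05, indet_cancer=0.20,
            sens=0.95, spec=0.95, obs_cancer=0.0, surg_benign=0.0)
# (sensitivity, specificity) of the US screen; None means FNAB for every nodule.
US_TEST = {"routine FNAB": None, "US (SRU)": (0.45, 0.90), "US (AACE)": (0.91, 0.70)}


def arm_ev(p, us):
    """EV of one FNAB-based arm; all outcomes other than the two errors have utility 1."""
    prev = p["cancer"]
    sens_us, spec_us = us if us else (1.0, 0.0)      # no screen: every nodule is "positive"
    p_fnab = sens_us * prev + (1 - spec_us) * (1 - prev)
    missed_by_us = (1 - sens_us) * prev
    pre_fnab = sens_us * prev / p_fnab
    indet_surgery = p_fnab * p["indet"] * (1 - p["hyper"])
    cancer_inad = p_fnab * p["inad"] * pre_fnab
    cancer_indet = indet_surgery * p["indet_cancer"]
    adequate = p_fnab * (1 - p["inad"] - p["indet"])
    cancer_adeq = prev - missed_by_us - cancer_inad - cancer_indet
    obs_cancer = missed_by_us + cancer_adeq * (1 - p["sens"])
    surg_benign = ((p_fnab * p["inad"] - cancer_inad) + (indet_surgery - cancer_indet)
                   + (adequate - cancer_adeq) * (1 - p["spec"]))
    return 1 - obs_cancer * (1 - p["obs_cancer"]) - surg_benign * (1 - p["surg_benign"])


def all_evs(p):
    evs = {name: arm_ev(p, us) for name, us in US_TEST.items()}
    evs["routine observation"] = 1 - p["cancer"] * (1 - p["obs_cancer"])
    return evs


def one_way(param, values, base=BASE):
    """Preferred alternative (highest EV) at each value of one parameter, others at baseline."""
    out = []
    for v in values:
        evs = all_evs({**base, param: v})
        out.append((v, max(evs, key=evs.get)))
    return out


sweep = one_way("obs_cancer", [i / 100 for i in range(101)])
# Values at which the preferred alternative differs from the previous grid point.
switches = [(v, best) for (v, best), (_, prev_best) in zip(sweep[1:], sweep[:-1])
            if best != prev_best]
print(sweep[0], switches)
```

On a 0.01 grid, the preferred alternative in this sketch changes from the SRU approach to routine observation once the utility of observation_cancer exceeds roughly 0.21. The same `one_way` helper can be pointed at any other key in `BASE` to sweep a probability instead of a utility.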
Decision analysis software
All analyses were performed using TreeAge Pro 2006 (TreeAge Software, Inc., Williamstown, MA).
Results
The decision analysis model using conservative FNAB estimates demonstrated that the approach recommended by SRU guidelines has the highest EV (0.927). Slightly lower EVs are obtained by routine observation (0.920) and AACE guidelines (0.912), but routine FNAB has a much lower EV of 0.757. When more optimistic FNAB estimates are used, AACE guidelines have the highest EV (0.947), followed by SRU (0.939), routine observation (0.920), and routine FNAB (0.861).
One-way sensitivity analyses suggest that the highest EV obtained for the decision alternatives is sensitive to the parameters P(cancer), P(inad_FNAB), P(indet_FNAB), sens(mCa), and spec(mCa) when conservative FNAB parameters are used in the model, and P(cancer), P(inad_FNAB), P(indet_FNAB), spec(FNAB), and sens(mCa) when more optimistic FNAB parameters are used. Tables 2 and 3 detail the specific thresholds at which there is a change in the decision alternative with the highest EV. Routine FNAB had the lowest EV throughout all plausible parameter estimates used in these one-way sensitivity analyses.
Table 2.
Parameter | Threshold | Effect on EV |
---|---|---|
P(cancer) | >0.06 | US (SRU) > routine observation |
P(cancer) | >0.11 | US (AACE) > US (SRU) |
P(inad_FNAB) | >0.01 | US (SRU) > US (AACE) |
P(inad_FNAB) | >0.19 | Routine observation > US (SRU) |
P(indet_FNAB) | >0.23 | Routine observation > US (SRU) |
Sens(mCa) | >0.34 | US (SRU) > routine observation |
Spec(mCa) | >0.87 | US (SRU) > routine observation |
Utility of observation_cancer | >0.21 | Routine observation > US (SRU) |
Utility of surgery_benign | >0.29 | US (AACE) > US (SRU) |
Utility of surgery_benign | >0.95 | Routine FNAB > US (AACE) |
The specific thresholds at which the decision alternative with the highest expected value (EV) changes are presented. For example, the decision alternative with the highest EV is different depending on whether the probability of thyroid cancer is greater than or less than 0.06: below this threshold value, the highest EV is obtained by routine observation; but above this threshold value, the highest EV is obtained by US (SRU). For all other variables that are not shown, there was no threshold value at which the decision alternative with the highest EV changes. US, FNAB based on ultrasound findings as recommended by American Association of Clinical Endocrinologists (AACE) or Society of Radiologists in Ultrasound (SRU); P(cancer), probability of cancer; P(inad_FNAB), probability of repeatedly inadequate FNABs; P(indet_FNAB), probability of indeterminate FNAB; sens(mCa), sensitivity of microcalcifications; spec(mCa), specificity of microcalcifications; surgery_benign, utility of surgery for a benign nodule; observation_cancer, utility of observation in the setting of thyroid cancer.
Table 3.
Parameter | Threshold | Effect on EV |
---|---|---|
P(cancer) | >0.06 | US (AACE) > US (SRU) |
P(inad_FNAB) | >0.09 | US (SRU) > US (AACE) |
P(indet_FNAB) | >0.14 | US (SRU) > US (AACE) |
Spec(FNAB) | >0.94 | US (AACE) > US (SRU) |
Sens(mCa) | >0.55 | US (SRU) > US (AACE) |
Utility of observation_cancer | >0.19 | US (SRU) > US (AACE) |
Utility of observation_cancer | >0.55 | Routine observation > US (SRU) |
Utility of surgery_benign | >0.92 | Routine FNAB > US (AACE) |
Spec(FNAB), specificity of FNAB; Sens(mCa), sensitivity of microcalcifications. Please see Table 2 legend for the remaining abbreviations and for a brief explanation regarding threshold value interpretation.
Two-way sensitivity analyses are shown in Fig. 2. Either the AACE or SRU approach has the highest EV for most combinations of estimates for the variables observation_cancer and surgery_benign. Routine observation has the highest EV when both the utility of observation_cancer is high and the utility of surgery_benign is low. Routine FNAB has the highest EV only when the utility of surgery_benign is very high and that of observation_cancer relatively low.
Estimates of the a priori probabilities that outcomes of interest will occur in a group of patients participating in the various diagnostic schemata are presented in Table 4. For example, only 0.1–0.2% of all patients in the routine FNAB approach will have a thyroid cancer that is missed; this represents 1.3–2.5% of those with thyroid cancer. However, 13.8–24.1% of those entering a routine FNAB paradigm will have surgery for a benign nodule. Conversely, the SRU approach is unlikely to result in surgery for benign disease (1.6–2.7% of all participating patients), but cancer will be missed in 4.5% (i.e. ∼56% of all cancer cases). An intermediate proportion (4.5–7.8%) of patients participating in the AACE approach will have surgery for benign disease, with few having a missed malignancy (0.8–1.0% of participating patients, or 10–12.5% of all cancer cases).
Table 4.
A priori probability (%) of each outcome, by clinical strategy.
Clinical strategy (FNAB assumptions) | Surgery for benign nodule | Observation for benign nodule | Surgery for cancer | Observation for cancer |
---|---|---|---|---|
Routine observation | ||||
A | 0.0 | 92.0 | 0.0 | 8.0 |
B | 0.0 | 92.0 | 0.0 | 8.0 |
Routine FNAB | ||||
A | 24.1 | 67.9 | 7.8 | 0.2 |
B | 13.8 | 78.2 | 7.9 | 0.1 |
FNAB only if mCa | ||||
A | 2.7 | 89.3 | 3.5 | 4.5 |
B | 1.6 | 90.4 | 3.5 | 4.5 |
FNAB only if hypo-plus | ||||
A | 7.8 | 84.2 | 7.0 | 1.0 |
B | 4.5 | 87.5 | 7.2 | 0.8 |
These estimates are predicated on baseline model parameters (Table 1). A, Set of conservative FNAB parameter estimates (95% FNAB sensitivity and specificity; 10% likelihood of persistently inadequate FNAB; and 15% likelihood of indeterminate FNAB). B, Set of more optimistic FNAB parameter estimates (98% FNAB sensitivity and specificity; 5% likelihood of persistently inadequate FNAB; and 10% likelihood of indeterminate FNAB). mCa, Microcalcifications; hypo-plus, hypoechoic plus at least one other ultrasonographic risk factor.
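The link between the a priori probabilities in Table 4 and the EVs, and the logic of the two-way sensitivity analysis on the two utilities, can be illustrated with the Python sketch below. It uses the rounded Table 4 values for the conservative FNAB estimates (set A), so results can differ from the published EVs and thresholds in the third decimal place; the names `TABLE4_A`, `ev`, and `preferred` are illustrative assumptions.

```python
# Table 4, set A (%): surgery benign, observation benign, surgery cancer, observation cancer.
TABLE4_A = {
    "routine observation": (0.0, 92.0, 0.0, 8.0),
    "routine FNAB": (24.1, 67.9, 7.8, 0.2),
    "FNAB only if mCa": (2.7, 89.3, 3.5, 4.5),
    "FNAB only if hypo-plus": (7.8, 84.2, 7.0, 1.0),
}


def ev(row, u_surgery_benign=0.0, u_observation_cancer=0.0):
    """EV from terminal probabilities; correct outcomes have utility 1."""
    surg_benign, obs_benign, surg_cancer, obs_cancer = (x / 100 for x in row)
    return (obs_benign + surg_cancer
            + u_surgery_benign * surg_benign
            + u_observation_cancer * obs_cancer)


def preferred(u_surgery_benign, u_observation_cancer):
    """Strategy with the highest EV for one combination of the two utilities."""
    return max(TABLE4_A,
               key=lambda k: ev(TABLE4_A[k], u_surgery_benign, u_observation_cancer))


baseline = {name: round(ev(row), 3) for name, row in TABLE4_A.items()}
print(baseline)

# Coarse two-way grid over the two utilities (0.9 avoids the all-equal tie at 1.0, 1.0).
for u_sb in (0.0, 0.5, 0.9):
    for u_oc in (0.0, 0.5, 0.9):
        print(u_sb, u_oc, preferred(u_sb, u_oc))
```

At baseline utilities of zero, each EV is simply the share of patients who are either observed with a benign nodule or operated on for cancer. Raising the utility of surgery_benign favors the strategies that perform more FNABs, whereas raising the utility of observation_cancer favors routine observation.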
Discussion
This analysis suggests that routine FNAB of 10- to 14-mm nodules has a substantially lower EV, compared with approaches that limit FNAB to 10- to 14-mm nodules with concerning ultrasonographic characteristics. This conclusion is consistent across the range of plausible parameter estimates included in the model. The lower EV of routine FNAB is primarily driven by the high (13.8–24.1%) a priori risk of surgery for a benign nodule. On the other hand, routine FNAB is associated with the lowest risk of a missed cancer diagnosis (1.3–2.5% of those with malignancy), which is related to the high sensitivity of FNAB for detecting cancer. Two-way sensitivity analysis suggests that routine FNAB would have the highest EV only when having surgery for a benign nodule is considered to be a highly desirable final outcome.
Compared with routine FNAB, the approach outlined in SRU guidelines is associated with a much lower rate of surgery for benign disease but a substantially higher risk of missed malignancy. Approximately 56% of cancers would be observed for a period of time (e.g. until nodule growth is observed) under SRU guidelines. Both the lower risk of surgery_benign and the higher risk of observation_cancer reflect the low sensitivity of microcalcifications, which limits the number of FNABs performed. Given the higher sensitivity of hypoechoic plus at least one other ultrasonographic risk factor, AACE guidelines yield a lower risk of missed malignancy (10–12.5% of those with cancer). However, because more FNABs will be performed under AACE guidelines, the likelihood of surgery for benign disease exceeds that of the SRU approach. The EV of the AACE approach increases relative to that of the SRU approach as the estimated utility for missed malignancy decreases, whereas the EV of the SRU approach increases relative to that of the AACE approach as the utility of surgery for benign disease decreases.
Although the EVs of routine observation, SRU guidelines, and AACE guidelines are very similar at baseline, two-way sensitivity analysis suggests that the AACE or SRU approaches are preferred for most plausible combinations of utility estimates for surgery_benign and observation_cancer. Although routine observation will yield no cases of surgery for benign disease over the short term, cancer diagnosis will always be delayed with this approach. Routine observation has the highest EV only when the utility of surgery for benign disease is lower than that of missed cancer (Fig. 2). Because such assignments seem generally unlikely, we anticipate that few would prefer routine observation.
We specified a solitary, solid (or predominantly solid) nodule for this analysis. However, this analysis is relevant to any nodule (solitary or part of a multinodular gland) with a pretest probability of cancer approximating 8%. We also assumed that FNAB would be performed under US guidance, but we recognize that not all endocrinologists use US guidance for every FNAB. If FNAB without US guidance were to substantially increase nondiagnostic and false-negative rates, compared with the estimated rates used in this analysis, then the EV results reported herein may not be applicable.
The time horizon considered in this analysis was approximately 1 yr. This time frame was chosen because many endocrinologists would perform follow-up US at 1 yr, and because observational data suggest that a greater than 1 yr delay of cancer diagnosis can negatively impact prognosis (24). Ideally, the decision analysis model would consider disease progression for patients with missed cancers and other events occurring over several years. However, we felt that such a model would be excessively complex and speculative. For example, the natural history of untreated 10- to 14-mm thyroid cancers remains ill defined, and it is unclear what proportion of initially observed cancers would grow sufficiently to prompt FNAB, over what time frame said growth would occur, what proportion would exhibit metastatic spread over time, etc. Also, many benign nodules would grow and prompt FNAB, adding to model complexity.
FNAB is a safe and relatively noninvasive diagnostic tool for assessing thyroid nodules. Use of FNAB has reduced the numbers of thyroid surgeries (thus reducing cost of care) and increased cancer yields at surgery (10). In general, FNAB is highly accurate (2,3,10,11,12). However, a major limitation of FNAB is inadequate or indeterminate results, which occur in 10–25% of cases (2,3,10,11,12,20,21,22). In such cases, surgery is usually recommended for diagnosis (2,3,23), and the majority of these nodules will prove to be benign. Thus, the likelihood of having surgery for benign disease is proportional to the percentage of patients undergoing FNAB. There is great interest in molecular markers to distinguish benign from malignant nodules in the setting of an indeterminate FNAB; and to the extent that they would reduce the likelihood of surgery for benign disease, the EV of routine FNAB would increase. However, such markers are not currently reliable enough for clinical use (23). Given these limitations of FNAB, many propose that FNAB is best limited to nodules with high-risk US characteristics. Unfortunately, some thyroid cancers do not manifest concerning US characteristics (18,19), and the risk of delayed diagnosis will increase when FNAB is restricted. In such cases, repeat FNAB should be pursued if significant nodule growth is observed (3,11). However, nodule growth does not reliably indicate malignancy, nor does the absence of growth reliably indicate benign disease (2,3,10,12).
Surgery for benign disease and delayed cancer diagnosis have different morbidity, mortality, and cost outcomes (see below). However, because utility values are by nature subjective, we assigned baseline utility values of 0 for both parameters. In this regard, the two-way sensitivity analysis is especially useful because it allows individual physicians and patients to determine a most-desired approach based on individually assigned utility values. To this end, relevant participants would first need to assign utilities to undergoing surgery for benign disease and missed (delayed) cancer diagnosis. A method by which individuals can reasonably assign utility values is required. Such methods include standard gamble, time trade-off, willingness to pay, and visual thermometer rating scales (25).
Before the assignment of utilities, the known and potential risks of thyroid surgery and delayed cancer diagnosis should be reviewed. Hemithyroidectomy/isthmusectomy is often recommended when FNAB is indeterminate or repeatedly inadequate. In experienced hands, this is a low-risk procedure: the risks of general anesthesia are low for most patients; and the risk of permanent hypoparathyroidism or recurrent laryngeal nerve injury is very low (<1%) (2,26,27). However, thyroid hormone supplementation will be required in 20–30% of patients (28,29,30,31). Moreover, the inconvenience and cost of surgery are substantial.
In contrast to the well-defined risks of thyroid surgery, the risks of delayed treatment of a 10- to 14-mm thyroid cancer (without clinical metastases) are unclear. Thyroid cancers are usually indolent, and the long-term prognosis of differentiated thyroid cancer is excellent, at least when treated promptly. The 5-yr survival rate for differentiated thyroid cancer is 99.7% when confined to the thyroid and 96.9% when there is regional spread (32). Similarly, 30-yr mortality is low (<1%) in patients with primary tumors less than 1.5 cm (24). Also, small (<15 mm, but usually <5 mm) papillary cancers may be seen in up to 6% of autopsies in the United States (33). This represents an estimated two-logarithm difference, compared with clinical cancer rates (34), suggesting that the majority of small thyroid cancers are not clinically relevant.
Size criteria for FNAB are largely based on the association between primary cancer size and prognosis (24). Routine FNAB of nodules less than 10 mm is often discouraged (3) because these microcarcinomas infrequently metastasize and carry a small risk of recurrence and mortality after surgical removal (35); and one study found no clear difference in cancer persistence, recurrence, or distant metastases when comparing papillary cancers 11–15 mm to those 10 mm or less (36). On the other hand, a minority of microcarcinomas follow a more aggressive course (2,35,37). One study suggested that the risk of local invasion/metastases increases with papillary cancers greater than 5 mm (38), whereas another study observed distant metastases with primary tumors as small as 8 mm (39). It is unknown to what extent early detection and surgical treatment of small cancers would prevent local or distant metastatic spread. Observational data suggest that differentiated thyroid cancer mortality rates may be doubled if initial treatment is delayed greater than 1 yr after discovery (24), although this analysis included primary tumors of all sizes and therefore may not apply to 10- to 14-mm cancers specifically. Thus, despite a 20–50% likelihood of lymph node involvement at primary surgical treatment (3), it remains unclear whether delayed diagnosis substantially impacts prognosis in the setting of a 10- to 14-mm thyroid cancer with no clinical evidence of metastases.
Lastly, one must also consider monetary costs and other implications of adopting any one approach as general practice. For example, if FNAB were performed on all of the estimated 11–12% of the population with thyroid nodules 10 mm or greater (40), 1–2% of the population could receive hemithyroidectomy/isthmusectomy for benign disease.
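The population figures above can be checked with simple arithmetic. The following is a minimal sketch, not part of the published model: it back-calculates the share of biopsied patients who would undergo surgery for benign disease, using only the two ranges quoted in the text. The function name and the pairing of range endpoints are illustrative assumptions, and the derived rate is not a figure reported in the analysis.

```python
# Back-of-the-envelope check of the population figures quoted in the text.
# Inputs are the two ranges stated there; the implied per-patient rate is
# derived here for illustration and is NOT a figure reported by the authors.

def implied_benign_surgery_rate(pop_surgery_frac, pop_nodule_frac):
    """Share of biopsied patients who would have surgery for benign disease,
    assuming everyone with a nodule >= 10 mm undergoes FNAB."""
    return pop_surgery_frac / pop_nodule_frac

nodule_prevalence = (0.11, 0.12)  # population share with nodules >= 10 mm (ref. 40)
benign_surgery = (0.01, 0.02)     # population share having surgery for benign disease

# Widest plausible bounds: smallest numerator over largest denominator, and vice versa.
low = implied_benign_surgery_rate(benign_surgery[0], nodule_prevalence[1])
high = implied_benign_surgery_rate(benign_surgery[1], nodule_prevalence[0])

print(f"Implied benign-surgery rate among biopsied patients: {low:.1%} to {high:.1%}")
```

Under these assumptions, roughly 8% to 18% of people undergoing routine FNAB would go on to surgery for a benign nodule. Because the 1–2% figure is rounded and refers to all nodules 10 mm or greater, this range is only an order-of-magnitude check.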
In conclusion, routine FNAB has the lowest EV of the approaches evaluated in this decision analysis. The low EV of routine FNAB is primarily driven by the high percentage of patients who would have surgery for benign nodules. These results suggest that US criteria should be used to determine which 10- to 14-mm nodules should undergo FNAB.
Footnotes
This work was supported by the National Institute of Child Health and Human Development, National Institutes of Health, through Grant K23-HD-044742 (to C.R.M.).
Disclosure Statement: The authors have nothing to disclose. Neither author is a current member of the organizations responsible for the guidelines compared herein.
First Published Online May 27, 2008
Abbreviations: AACE, American Association of Clinical Endocrinologists; ATA, American Thyroid Association; EV, expected value; FNAB, fine-needle aspiration biopsy; NPV, negative predictive value; PPV, positive predictive value; SRU, Society of Radiologists in Ultrasound; US, ultrasonography.
References
- Frates MC, Benson CB, Charboneau JW, Cibas ES, Clark OH, Coleman BG, Cronan JJ, Doubilet PM, Evans DB, Goellner JR, Hay ID, Hertzberg BS, Intenzo CM, Jeffrey RB, Langer JE, Larsen PR, Mandel SJ, Middleton WD, Reading CC, Sherman SI, Tessler FN 2005 Management of thyroid nodules detected at US: Society of Radiologists in Ultrasound consensus conference statement. Radiology 237:794–800
- 2006 American Association of Clinical Endocrinologists and Associazione Medici Endocrinologi medical guidelines for clinical practice for the diagnosis and management of thyroid nodules. Endocr Pract 12:63–102
- Cooper DS, Doherty GM, Haugen BR, Kloos RT, Lee SL, Mandel SJ, Mazzaferri EL, McIver B, Sherman SI, Tuttle RM 2006 Management guidelines for patients with thyroid nodules and differentiated thyroid cancer. Thyroid 16:109–142
- Burman KD 2006 Micropapillary thyroid cancer: should we aspirate all nodules regardless of size? J Clin Endocrinol Metab 91:2043–2046
- Baskin HJ, Duick DS 2006 The endocrinologists’ view of ultrasound guidelines for fine needle aspiration. Thyroid 16:207–208
- Davies TF 2006 Is consensus a good thing in the management of thyroid nodules? Thyroid 16:205
- Tan GH, Gharib H 1997 Thyroid incidentalomas: management approaches to nonpalpable nodules discovered incidentally on thyroid imaging. Ann Intern Med 126:226–231
- The NCCN Thyroid Carcinoma Clinical Practice Guidelines in Oncology 2007 National Comprehensive Cancer Network, Inc. (version 2.2007). Available at: http://www.nccn.org. Accessed August 6, 2007 (to view the most recent and complete version of the guideline, go online to www.nccn.org)
- Petitti DB 1994 Meta-analysis, decision analysis, and cost-effectiveness analysis. New York: Oxford University Press
- Hegedus L, Bonnema SJ, Bennedbaek FN 2003 Management of simple nodular goiter: current status and future perspectives. Endocr Rev 24:102–132
- Kaplan MM 2005 Clinical evaluation and management of solitary thyroid nodules. In: Braverman LE, Utiger RD, eds. Werner and Ingbar’s the thyroid: a fundamental and clinical text. 9th ed. Philadelphia: Lippincott Williams & Wilkins; 996–1010
- Filetti S, Durante C, Torlontano M 2006 Nonsurgical approaches to the management of thyroid nodules. Nat Clin Pract Endocrinol Metab 2:384–394
- McHenry CR, Slusarczyk SJ, Askari AT, Lange RL, Smith CM, Nekl K, Murphy TA 1998 Refined use of scintigraphy in the evaluation of nodular thyroid disease. Surgery 124:656–661; discussion 661–662
- Gharib H, Goellner JR, Zinsmeister AR, Grant CS, Van Heerden JA 1984 Fine-needle aspiration biopsy of the thyroid. The problem of suspicious cytologic findings. Ann Intern Med 101:25–28
- Sabel MS, Staren ED, Gianakakis LM, Dwarakanathan S, Prinz RA 1997 Effectiveness of the thyroid scan in evaluation of the solitary thyroid nodule. Am Surg 63:660–663; discussion 663–664
- Tuttle RM, Lemar H, Burch HB 1998 Clinical features associated with an increased risk of thyroid malignancy in patients with follicular neoplasia by fine-needle aspiration. Thyroid 8:377–383
- Block MA, Dailey GE, Robb JA 1983 Thyroid nodules indeterminate by needle biopsy. Am J Surg 146:72–78
- Papini E, Guglielmi R, Bianchini A, Crescenzi A, Taccogna S, Nardi F, Panunzi C, Rinaldi R, Toscano V, Pacella CM 2002 Risk of malignancy in nonpalpable thyroid nodules: predictive value of ultrasound and color-Doppler features. J Clin Endocrinol Metab 87:1941–1946
- Kim EK, Park CS, Chung WY, Oh KK, Kim DI, Lee JT, Yoo HS 2002 New sonographic criteria for recommending fine-needle aspiration biopsy of nonpalpable solid nodules of the thyroid. AJR Am J Roentgenol 178:687–691
- Danese D, Sciacchitano S, Farsetti A, Andreoli M, Pontecorvi A 1998 Diagnostic accuracy of conventional versus sonography-guided fine-needle aspiration biopsy of thyroid nodules. Thyroid 8:15–21
- Cochand-Priollet B, Guillausseau PJ, Chagnon S, Hoang C, Guillausseau-Scholer C, Chanson P, Dahan H, Warnet A, Tran Ba Huy PT, Valleur P 1994 The diagnostic value of fine-needle aspiration biopsy under ultrasonography in nonfunctional thyroid nodules: a prospective study comparing cytologic and histologic findings. Am J Med 97:152–157
- Yang GC, Liebeskind D, Messina AV 2001 Ultrasound-guided fine-needle aspiration of the thyroid assessed by Ultrafast Papanicolaou stain: data from 1135 biopsies with a two- to six-year follow-up. Thyroid 11:581–589
- Gharib H, Papini E 2007 Thyroid nodules: clinical importance, assessment, and treatment. Endocrinol Metab Clin North Am 36:707–735, vi
- Mazzaferri EL, Jhiang SM 1994 Long-term impact of initial surgical and medical therapy on papillary and follicular thyroid cancer. Am J Med 97:418–428
- Froberg DG, Kane RL 1989 Methodology for measuring health-state preferences—II: scaling methods. J Clin Epidemiol 42:459–471
- Shrime MG, Goldstein DP, Seaberg RM, Sawka AM, Rotstein L, Freeman JL, Gullane PJ 2007 Cost-effective management of low-risk papillary thyroid carcinoma. Arch Otolaryngol Head Neck Surg 133:1245–1253
- Delbridge L 2006 Solitary thyroid nodule: current management. ANZ J Surg 76:381–386
- McHenry CR, Slusarczyk SJ 2000 Hypothyroidism following hemithyroidectomy: incidence, risk factors, and management. Surgery 128:994–998
- Miller FR, Paulson D, Prihoda TJ, Otto RA 2006 Risk factors for the development of hypothyroidism after hemithyroidectomy. Arch Otolaryngol Head Neck Surg 132:36–38
- Piper HG, Bugis SP, Wilkins GE, Walker BA, Wiseman S, Baliski CR 2005 Detecting and defining hypothyroidism after hemithyroidectomy. Am J Surg 189:587–591; discussion 591
- Rosario PW, Pereira LF, Borges MA, Alves MF, Purisch S 2006 Factors predicting the occurrence of hypothyroidism after hemithyroidectomy. Thyroid 16:707
- Ries LAG, Melbert D, Krapcho M, Mariotto A, Miller BA, Feuer EJ, Clegg L, Horner MJ, Howlader N, Eisner MP, Reichman M, Edwards BK SEER cancer statistics review, 1975–2004. Bethesda, MD: National Cancer Institute. http://seer.cancer.gov/csr/1975_2004/, based on November 2006 SEER data submission, posted to the SEER web site, 2007
- Sampson RJ, Woolner LB, Bahn RC, Kurland LT 1974 Occult thyroid carcinoma in Olmsted County, Minnesota: prevalence at autopsy compared with that in Hiroshima and Nagasaki, Japan. Cancer 34:2072–2076
- Wang C, Crapo LM 1997 The epidemiology of thyroid disease and implications for screening. Endocrinol Metab Clin North Am 26:189–218
- Pazaitou-Panayiotou K, Capezzone M, Pacini F 2007 Clinical features and therapeutic implication of papillary thyroid microcarcinoma. Thyroid 17:1085–1092
- Pellegriti G, Scollo C, Lumera G, Regalbuto C, Vigneri R, Belfiore A 2004 Clinical behavior and outcome of papillary thyroid cancers smaller than 1.5 cm in diameter: study of 299 cases. J Clin Endocrinol Metab 89:3713–3720
- Mazzaferri EL 2006 Managing small thyroid cancers. JAMA 295:2179–2182
- Machens A, Holzhausen HJ, Dralle H 2005 The prognostic value of primary tumor size in papillary and follicular thyroid carcinoma. Cancer 103:2269–2273
- Roti E, Rossi R, Trasforini G, Bertelli F, Ambrosio MR, Busutti L, Pearce EN, Braverman LE, Degli Uberti EC 2006 Clinical and histological characteristics of papillary thyroid microcarcinoma: results of a retrospective study in 243 patients. J Clin Endocrinol Metab 91:2171–2178
- Ross DS 2002 Nonpalpable thyroid nodules—managing an epidemic. J Clin Endocrinol Metab 87:1938–1940