The purpose of this study was to validate the role of the total malignancy score (TMS) in identifying thyroid nodules suspicious for malignancy through the sum of their ultrasound features.
The local ethical committee approved this prospective observational study. We examined 231 nodules in 231 consecutive patients (164 females and 67 males; age range, 20 to 87 years; median age, 59 years; interquartile range, 48 to 70 years) who underwent ultrasound followed by fine-needle aspiration cytology (FNAC). The nodules were further classified using the TMS, which considers ultrasound features (number, echogenicity, structure, halo, margins, Doppler signal, calcifications, and growth), and the Bethesda System for Reporting Thyroid Cytopathology (TBSRTC), which considers cytological features. Patients with non-negative nodules (TBSRTC categories III to VI) underwent histological analysis, repeated FNAC, or 2 years of regular ultrasound follow-up. The associations between the final diagnosis, each of the ultrasound features, and the TMS were estimated using the chi-square test, the Mann-Whitney U test, and multivariate logistic regression. A receiver operating characteristic (ROC) curve was used to evaluate the diagnostic accuracy of the TMS.
On ultrasound, 47% of the nodules (108 of 231) had a TMS <3, 18% (42 of 231) had a TMS of 3, and 35% (81 of 231) had a TMS >3. The FNAC results of 85% of the nodules (196 of 231) were benign, while 15% (35 of 231) had non-negative results. Hypoechogenicity, solid structure, the presence of microcalcifications, and the number of nodules were independent predictors of the final diagnosis, and the diagnostic accuracy of the TMS was good (area under the ROC curve, 0.82).
Introduction
Thyroid nodules are very common findings in the adult population, especially in women. The prevalence on clinical examinations is about 4%-7%, but the widespread use of ultrasound (US) has raised the detected prevalence to as high as 67% [2-5]. The discovery of non-palpable nodules raises concerns about their possible malignancy. Most of the detected nodules are benign, while 3%-7% are malignant [6-8], mainly affecting patients younger than 20 years or older than 60 years [4,9]. Moreover, given the financial burden on the health system and the unnecessary anxiety it induces in patients, it is unrealistic to biopsy every thyroid nodule to confirm the diagnosis. In addition, thyroid cancer has an excellent prognosis and long-term survival rate, and aggressive early interventions may not affect the clinical outcomes, as is the case for other kinds of cancer. Therefore, the challenge is to correctly identify malignant nodules and to avoid unnecessary procedures for benign ones. Suspicious US features may be useful for selecting patients for fine-needle aspiration cytology (FNAC) when incidental nodules are discovered and when multiple nodules are present.
No single accepted guideline exists for the management of thyroid nodules. FNAC is considered the method of choice to diagnose thyroid cancer, but it is an invasive procedure. Thus, it can be avoided in nodules without suspicious characteristics in favor of a less invasive test. Many studies have attempted to identify US features pathognomonic for malignancy, leading to a system of risk stratification based on the following predictors: solid or mostly solid structure, hypoechogenicity, irregular margins, microcalcifications, a discontinuous halo, taller-than-wide (TTW) shape, intralesional flow on Doppler examinations, and a >20% size increase in 6 months [5,7-14].
In a previous study of thyroid follicular proliferation, the US-based total malignancy score (TMS) was proposed for risk stratification, and it was suggested that US follow-up may be appropriate for nodules with a TMS <3 (low malignant potential) (Table 1). This score has the potential to simplify the practical management of thyroid nodules.
The aim of this study was to validate the role of the TMS in identifying thyroid nodules suspicious for malignancy through the sum of their US features and to estimate its usefulness in a large population. We evaluated the predictive value of US findings in the diagnosis of malignant thyroid nodules and attempted to identify features that might be useful for making practical decisions about the management of malignant thyroid nodules.
Materials and Methods
The Intercompany Ethics Committee of Zone A, Milan approved this prospective observational study. All patients enrolled in this study provided written informed consent before each invasive procedure. Informed consent was not required for the diagnostic ultrasound.
All patients who underwent FNAC of a suspicious thyroid nodule between September 2012 and April 2015 were analyzed. US was performed immediately before the FNAC procedure. For each patient, the TMS of the suspicious nodule was calculated, as previously described (Table 1). A cytological report classifying each nodule according to the six general diagnostic categories of the Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) was produced. Nodules reported as inadequate (TBSRTC category I) were excluded from the study. Nodules classified as category II were considered definitively negative and the patients did not undergo further examinations. Nodules classified as category III, IV, V, or VI were considered non-negative, and those patients were managed through histologic analysis, repeated FNAC, or ≥2 years of regular US follow-up.
US examinations were performed by three board-certified radiologists (G.G.P., S.T., and N.F.) with more than 10 years of experience in neck imaging, using a General Electric Logiq 9 US system (GE Healthcare, Milwaukee, WI, USA) with a linear 9-14 MHz frequency probe. A pulse repetition frequency of 2400 Hz and a color gain of 40%-50% were used for Doppler US, performed with a linear high-frequency probe (7.5-10 MHz). The radiologist who performed the diagnostic US also performed US-guided FNAC of the suspicious nodules. A cytopathologist was on site during the FNAC procedures and quickly assessed sample adequacy (i.e., the presence in the sample of enough thyroid cells to reach a diagnosis) by means of an ultrafast Papanicolaou stain. Aspiration was performed with 10-cm-long 23G or 25G needles, first by capillarity and, if necessary, by forced syringe suction. Papanicolaou staining and hematoxylin and eosin staining were the most commonly used stains for diagnosis.
According to the cytopathologist’s requirements, 1-4 aspirations were performed on nodules with suspicious US features. For each nodule, the points assigned to the individual US features were summed to obtain the TMS; the cytopathologist was blinded to the TMS. Based on emerging evidence, while the study was already in progress, we also decided to evaluate whether lesions had a TTW shape. Unfortunately, because US images are not archived on the picture archiving and communication system at our hospital, we were only able to consider this feature for slightly less than half of the nodules, and it was not included in the score of any nodule.
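The summation described above can be sketched in a few lines of Python. This is only an illustration: the point values in `example_nodule` are hypothetical and are not taken from Table 1, and the function names are our own. The three risk bands follow the TMS groups used in this study (<3, 3, and >3).

```python
from typing import Mapping


def total_malignancy_score(feature_points: Mapping[str, int]) -> int:
    """Sum the points assigned to each ultrasound feature of one nodule."""
    return sum(feature_points.values())


def tms_category(score: int) -> str:
    """Map a TMS to the three risk bands discussed in this study."""
    if score < 3:
        return "TMS <3: no or negligible risk"
    if score == 3:
        return "TMS 3: low risk"
    return "TMS >3: medium or high risk"


# Hypothetical per-feature points for a single nodule (NOT the Table 1 values).
example_nodule = {
    "number": 1,
    "echogenicity": 1,
    "structure": 1,
    "halo": 0,
    "margins": 0,
    "doppler_signal": 0,
    "calcifications": 1,
    "growth": 0,
}

score = total_malignancy_score(example_nodule)
print(score, "->", tms_category(score))
```

In practice, the per-feature points would be read from Table 1 for each nodule before summing.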
The relationships between each US feature, the TMS, the nodule size (maximum diameter), the patient’s age, and the final diagnosis were estimated.
Continuous variables were expressed as median and interquartile range (IQR), whereas categorical variables were expressed as counts and percentages. Associations between the final diagnosis and each of the US features were estimated using the chi-square test, whereas associations of the diagnosis with the nodule’s diameter and the patient’s age were estimated using the Mann-Whitney U test. Multivariate logistic regression was performed using variables found to be statistically significant in the bivariate analysis as predictors of the final diagnosis. A receiver operating characteristic (ROC) curve was obtained for the diagnostic accuracy of TMS.
All analyses were performed using SPSS ver. 17 software (SPSS Inc., Chicago, IL, USA), and P-values <0.05 were considered to indicate statistical significance.
Results
US-guided FNAC was performed in 249 thyroid nodules in 249 patients. The FNAC results for 18 nodules were excluded due to non-diagnostic or unsatisfactory samples (TBSRTC category I). Thus, we analyzed 231 nodules in 231 patients (164 females, 67 males; age range, 20 to 87 years; median age, 59 years; IQR, 48 to 70 years). No patients were lost to follow-up.
The FNAC results were benign (TBSRTC category II) for 85% of the nodules (196 of 231) (Fig. 1) and non-negative (TBSRTC category ≥III) for 15% (35 of 231) (Table 2). Among the 35 non-negative nodules, 17 were malignant tumors (11 papillary carcinomas, 2 medullary carcinomas, 2 squamous carcinomas, and 2 anaplastic carcinomas) (Fig. 2), 10 were adenomas (Fig. 3), one was an intraparenchymal parathyroid, and two further nodules were a hyperplasia (TBSRTC category III) and a multinodular goiter (TBSRTC category IV). The remaining five nodules, all classified as TBSRTC category III, underwent follow-up instead of being resected. Of these nodules, only one increased in size during US follow-up; this patient underwent a second FNAC, which confirmed the primary diagnosis of follicular lesion of undetermined significance. The other four nonresected nodules remained stable over 2 years of US follow-up; one of them was aspirated again, with a benign cytological diagnosis.
Bivariate Analysis, Multivariate Analysis, and TMS Accuracy
Among the 231 nodules that were examined, 108 (47%) had a TMS <3, 42 (18%) had a TMS of 3, and 81 (35%) had a TMS >3. Of the 198 definitively benign nodules (196 on cytology and 2 on histology), 107 (54%) had a TMS <3, 35 (18%) had a TMS of 3, and 56 (28%) had a TMS >3. Among the 33 confirmed non-negative nodules, one (3%) had a TMS <3, seven (21%) had a TMS of 3, and 25 (76%) had a TMS >3. None of the malignant nodules had a TMS <3. The median TMS was 2 (IQR, 1 to 4) for benign nodules and 4 (IQR, 4 to 6) for non-negative nodules, which was a statistically significant difference (P<0.001).
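As a rough cross-check of the association between TMS group and final diagnosis, the counts above can be arranged in a 2×3 contingency table and tested with Pearson's chi-square. This sketch is illustrative only and does not reproduce the reported analysis, which compared the TMS distributions with the Mann-Whitney U test. The function name is our own, and the closed-form P-value used here is valid only for exactly 2 degrees of freedom.

```python
import math


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: benign, non-negative. Columns: TMS <3, TMS 3, TMS >3.
counts = [
    [107, 35, 56],
    [1, 7, 25],
]

statistic, dof = pearson_chi_square(counts)
# With exactly 2 degrees of freedom, the upper tail of the
# chi-square distribution is exp(-x / 2).
p_value = math.exp(-statistic / 2)
print(f"chi-square = {statistic:.1f}, df = {dof}, P = {p_value:.1e}")
```

The resulting P-value is far below 0.001, in line with the significant difference in median TMS reported above.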
The distribution of US features in non-negative and negative thyroid nodules and the results of the bivariate analysis are reported in detail in Table 3, and Table 4 presents the results of the multivariate analysis. In particular, in the multivariate analysis, hypoechogenicity, a solid structure, the presence of microcalcifications, and the number of nodules were independent predictors of the final diagnosis. The diagnostic accuracy of TMS was good, with an area under the ROC curve of 0.82 (Fig. 4). In contrast, the associations of the final diagnosis with nodule size and patient age were not statistically significant (P=0.715 and P=0.894, respectively). The shape of 44% of the nodules (102 of 231) was recorded. A TTW shape was registered in 8% of them (8 of 102), all of which had non-negative cytology (6 carcinomas and 2 adenomas) (Fig. 5).
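For readers who wish to see how an area under the ROC curve relates to the reported counts, the sketch below computes it from the three TMS groups (<3, 3, >3) as the probability that a randomly chosen non-negative nodule scores higher than a randomly chosen benign one, counting ties as one half. Because the grouping is coarser than the full score, the result is only an approximation and is not expected to match the reported value of 0.82 exactly. The function name is our own.

```python
def grouped_auc(positive_counts, negative_counts):
    """AUC from counts per ordered score group (lowest group first).

    Computed as P(positive > negative) + 0.5 * P(tie), which is the
    Mann-Whitney formulation of the area under the ROC curve.
    """
    concordant_pairs = 0
    tied_pairs = 0
    negatives_below = 0
    for pos, neg in zip(positive_counts, negative_counts):
        concordant_pairs += pos * negatives_below  # positives outrank lower groups
        tied_pairs += pos * neg                    # same group: counted as half
        negatives_below += neg
    total_pairs = sum(positive_counts) * sum(negative_counts)
    return (concordant_pairs + 0.5 * tied_pairs) / total_pairs


# Counts for TMS <3, TMS 3, and TMS >3, as reported in this study.
non_negative = [1, 7, 25]
benign = [107, 35, 56]

auc = grouped_auc(non_negative, benign)
print(f"Approximate AUC from three TMS groups: {auc:.2f}")  # 0.79
```

The grouped estimate of about 0.79 is slightly lower than the AUC obtained from the full score, as expected when adjacent score values are pooled.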
Discussion
Many studies have investigated whether the ultrasound characteristics of thyroid nodules are useful indicators of histological malignancy. Overall, these investigations have identified a few ultrasound features that are significantly more frequent in malignant than in benign thyroid nodules; some authors have tried to define a set of characteristics that identify nodules at a higher risk of malignancy, while other studies have investigated the feasibility of applying the Breast Imaging Reporting and Data System (BI-RADS) concept to the US evaluation of thyroid nodules, calling it the Thyroid Imaging Reporting and Data System (TI-RADS) [5,6,10,13,16-19]. The goal was to group thyroid lesions in different categories with percentages of malignancy similar to those of the BI-RADS categories. However, no standard TI-RADS classification system is yet widely accepted and used in routine clinical practice [13,14,20]. It is well known that no single US feature allows the differentiation of malignant and benign thyroid lesions. However, the presence of multiple suspicious features in a nodular lesion is closely correlated with the nodule’s risk of malignancy, suggesting that the global appearance of each nodule, including all its features, should be considered for diagnosis [6,21,22].
The aim of our study was to validate the TMS in a large series of consecutive patients, in order to optimize the diagnostic and therapeutic management of thyroid nodular lesions. The TMS, which is based on widely adopted indications already described in the literature, is simple to use and to interpret [6,9,13,15,17,23]. It constitutes a basis to stratify patients into groups with different risks of malignancy. Adamczewski and Lewinski recently described a similar attempt, evaluating 200 lesions in 111 patients, with encouraging results that were helpful in making proper diagnostic and therapeutic decisions. Those authors proposed a scoring system based on US features that grouped patients into three prognostic categories. They found microcalcifications (odds ratio [OR], 24.67), a TTW shape (OR, 301), increased intranodular vascularity (OR, 20.44), and hypoechogenicity (OR, 18.61) to be features suggestive of malignancy. Other authors have stated that the presence of at least two US features suspicious for malignancy may be the best criterion for recommending FNAC.
The reliability of our scoring system was good, with a significant difference between the median TMS of benign and non-negative nodules (P<0.001). Approximately 75% of the non-negative nodules had a score >3 and none of the malignant ones had a score <3. Our experience confirmed that we could assess the risk of malignancy in a graduated manner according to the TMS category, with no risk or negligible risk for a TMS <3, low risk for a TMS of 3, and medium or high risk for higher TMS values. An innovation in our scoring system compared to other studies was that we not only included the US features of the nodule, but also accounted for the fact that a single nodule, regardless of its appearance, in a normal gland could be more dangerous than a nodule in a multi-nodular gland [10,22,24,25]. This was confirmed by the results of the multivariate analysis, in which the presence of a single nodule was an independent predictor of non-negativity. Our results confirmed the previously established reliability of certain US characteristics as predictors of malignancy, such as hypoechogenicity, a solid structure, irregular margins, and microcalcifications [5,7,9,12]. Moreover, our results showed that a TTW shape was an exclusive characteristic of non-negative nodules, as it was not detected in any benign nodules. This is in agreement with a recent meta-analysis that showed a TTW shape to be the strongest predictor of malignancy, despite its fairly low sensitivity.
This study has several limitations. First, selection bias could have been present in the choice of patients, but only to an extent that reflects clinical practice. Second, all nodules with negative cytology results were considered to be definitively benign, and no further procedures or follow-up were performed, but this also reflects clinical practice. Moreover, we do not think it would be ethically sound to force patients with nodules highly likely to be benign to undergo surgery solely to obtain histological proof of the benignity of the nodule. Thus, false-negatives could have been present among the cytology results, potentially affecting our conclusions. Third, the results for the TTW shape are limited to approximately half of the population, as we began to consider this feature later, based on updated research findings. Finally, interobserver variability in the interpretations of US images among the radiologists was not evaluated; however, we think that all properly trained sonographers can easily recognize the features considered in the TMS.
In conclusion, although the proposed TMS still requires scientific validation in a larger cohort of patients, we are confident that it will become useful for all sonographers dealing with nodular thyroid disease, and its simplicity should ensure its wider diffusion. According to our results, we suggest limiting FNAC to nodules with a TMS ≥3. In our population, this would have limited the number of interventional procedures to 53% (123 of 231) of the nodules, without missing any carcinomas. In the presence of economic and/or organizational reasons to strongly restrict the number of FNACs, a good compromise between the risk of missing carcinomas and the need to avoid unnecessary procedures would probably be represented by restricting FNAC to nodules with a TMS ≥4, with follow-up required for nodules with a TMS of 3. In our population, this would have limited the number of interventional procedures to 35% (81 of 231), missing 24% of the non-negative nodules (8 of 33). We therefore advocate this scoring system as a simple-to-use, reliable, and easily reproducible tool.
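The trade-off between the two proposed cut-offs can be recomputed directly from the counts per TMS group reported in the Results. The short sketch below is illustrative arithmetic only; the function and variable names are our own, and it assumes integer scores, so that a TMS ≥4 corresponds to the TMS >3 group.

```python
# Counts per TMS group as reported in the Results of this study.
BENIGN = {"<3": 107, "3": 35, ">3": 56}
NON_NEGATIVE = {"<3": 1, "3": 7, ">3": 25}


def triage_summary(biopsied_groups):
    """Return (procedures, all nodules, missed non-negative, all non-negative)."""
    procedures = sum(BENIGN[g] + NON_NEGATIVE[g] for g in biopsied_groups)
    all_nodules = sum(BENIGN.values()) + sum(NON_NEGATIVE.values())
    missed = sum(n for g, n in NON_NEGATIVE.items() if g not in biopsied_groups)
    return procedures, all_nodules, missed, sum(NON_NEGATIVE.values())


# Assuming integer scores, "TMS >=4" is the same group as "TMS >3".
for label, groups in (("TMS >=3", ("3", ">3")), ("TMS >=4", (">3",))):
    procedures, all_nodules, missed, non_neg = triage_summary(groups)
    print(
        f"{label}: {procedures}/{all_nodules} procedures "
        f"({procedures / all_nodules:.0%}), "
        f"{missed}/{non_neg} non-negative nodules not sampled "
        f"({missed / non_neg:.0%})"
    )
```

With a cut-off of TMS ≥3, the single non-negative nodule left unsampled was not a carcinoma, since none of the malignant nodules had a TMS <3.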
1. Wolinski K, Szkudlarek M, Szczepanek-Parulska E, Ruchala M. Usefulness of different ultrasound features of malignancy in predicting the type of thyroid lesions: a meta-analysis of prospective studies. Pol Arch Med Wewn 2014;124:97–104.
2. Rahimi M, Farshchian N, Rezaee E, Shahebrahimi K, Madani H. To differentiate benign from malignant thyroid nodule comparison of sonography with FNAC findings. Pak J Med Sci 2013;29:77–80.
3. Papini E, Guglielmi R, Bianchini A, Crescenzi A, Taccogna S, Nardi F, et al. Risk of malignancy in nonpalpable thyroid nodules: predictive value of ultrasound and color-Doppler features. J Clin Endocrinol Metab 2002;87:1941–1946.
4. Rorive S, D'Haene N, Fossion C, Delpierre I, Abarguia N, Avni F, et al. Ultrasound-guided fine-needle aspiration of thyroid nodules: stratification of malignancy risk using follicular proliferation grading, clinical and ultrasonographic features. Eur J Endocrinol 2010;162:1107–1115.
5. Kwak JY, Han KH, Yoon JH, Moon HJ, Son EJ, Park SH, et al. Thyroid imaging reporting and data system for US features of nodules: a step in establishing better stratification of cancer risk. Radiology 2011;260:892–899.
6. Xie C, Cox P, Taylor N, LaPorte S. Ultrasonography of thyroid nodules: a pictorial review. Insights Imaging 2016;7:77–86.
7. Hoang JK, Lee WK, Lee M, Johnson D, Farrell S. US features of thyroid malignancy: pearls and pitfalls. Radiographics 2007;27:847–860.
8. Davies L, Welch HG. Current thyroid cancer trends in the United States. JAMA Otolaryngol Head Neck Surg 2014;140:317–322.
9. Frates MC, Benson CB, Charboneau JW, Cibas ES, Clark OH, Coleman BG, et al. Management of thyroid nodules detected at US: Society of Radiologists in Ultrasound consensus conference statement. Radiology 2005;237:794–800.
10. Horvath E, Majlis S, Rossi R, Franco C, Niedmann JP, Castro A, et al. An ultrasonogram reporting system for thyroid nodules stratifying cancer risk for clinical management. J Clin Endocrinol Metab 2009;94:1748–1751.
11. Hambly NM, Gonen M, Gerst SR, Li D, Jia X, Mironov S, et al. Implementation of evidence-based guidelines for thyroid nodule biopsy: a model for establishment of practice standards. AJR Am J Roentgenol 2011;196:655–660.
12. Andrioli M, Carzaniga C, Persani L. Standardized ultrasound report for thyroid nodules: the endocrinologist's viewpoint. Eur Thyroid J 2013;2:37–48.
13. Koh J, Kim SY, Lee HS, Kim EK, Kwak JY, Moon HJ, et al. Diagnostic performances and interobserver agreement according to observer experience: a comparison study using three guidelines for management of thyroid nodules. Acta Radiol 2018;59:917–923.
14. Park JW, Kim DW, Kim D, Baek JW, Lee YJ, Baek HJ. Korean Thyroid Imaging Reporting and Data System features of follicular thyroid adenoma and carcinoma: a single-center study. Ultrasonography 2017;36:349–354.
15. Pompili G, Tresoldi S, Primolevo A, De Pasquale L, Di Leo G, Cornalba G. Management of thyroid follicular proliferation: an ultrasound-based malignancy score to opt for surgical or conservative treatment. Ultrasound Med Biol 2013;39:1350–1355.
16. Cibas ES, Ali SZ; NCI Thyroid FNA State of the Science Conference. The Bethesda System for Reporting Thyroid Cytopathology. Am J Clin Pathol 2009;132:658–665.
17. Cappelli C, Castellano M, Pirola I, Cumetti D, Agosti B, Gandossi E, et al. The predictive value of ultrasound findings in the management of thyroid nodules. QJM 2007;100:29–35.
18. Lingam RK, Qarib MH, Tolley NS. Evaluating thyroid nodules: predicting and selecting malignant nodules for fine-needle aspiration (FNA) cytology. Insights Imaging 2013;4:617–624.
19. Maia FF, Matos PS, Pavin EJ, Zantut-Wittmann DE. Thyroid imaging reporting and data system score combined with Bethesda system for malignancy risk stratification in thyroid nodules with indeterminate results on cytology. Clin Endocrinol (Oxf) 2015;82:439–444.
20. Srinivas MN, Amogh VN, Gautam MS, Prathyusha IS, Vikram NR, Retnam MK, et al. A prospective study to evaluate the reliability of thyroid imaging reporting and data system in differentiation between benign and malignant thyroid lesions. J Clin Imaging Sci 2016;6:5.
21. Adamczewski Z, Lewinski A. Proposed algorithm for management of patients with thyroid nodules/focal lesions, based on ultrasound (US) and fine-needle aspiration biopsy (FNAB); our own experience. Thyroid Res 2013;6:6.
22. Mohammadi A, Hajizadeh T. Evaluation of diagnostic efficacy of ultrasound scoring system to select thyroid nodules requiring fine needle aspiration biopsy. Int J Clin Exp Med 2013;6:641–648.
23. Bae JM, Hahn SY, Shin JH, Ko EY. Inter-exam agreement and diagnostic performance of the Korean thyroid imaging reporting and data system for thyroid nodule assessment: real-time versus static ultrasonography. Eur J Radiol 2018;98:14–19.
24. Lee YH, Kim DW, In HS, Park JS, Kim SH, Eom JW, et al. Differentiation between benign and malignant solid thyroid nodules using an US classification system. Korean J Radiol 2011;12:559–567.
Adapted from Pompili et al. Ultrasound Med Biol 2013;39:1350-1355, with permission of Elsevier.