Inconclusive cytology results of fine-needle aspiration for thyroid nodules: the importance of strict guideline implementation
Article information
Abstract
Purpose
This study investigated whether strict adherence to the Korean Thyroid Imaging Reporting and Data System (K-TIRADS) biopsy indications could reduce inconclusive cytology results and evaluated associated clinical factors.
Methods
This retrospective study included 2,440 nodules from 2,256 patients who underwent initial fine-needle aspiration (FNA) for thyroid nodules from January to December 2022. Inconclusive specimens were defined as Bethesda categories I and III, while conclusive specimens comprised Bethesda categories II, IV, V, and VI. Nodules smaller than the K-TIRADS biopsy threshold were considered FNA not-indicated nodules. Clinical factors included patient age, sex, ultrasound features, nodule size, number of needle passes, and operator experience. Univariate and multivariate logistic regression analyses were performed to assess the associations between clinical factors and inconclusive results.
Results
Among 2,440 nodules, 900 yielded initial inconclusive biopsy results, while 1,540 provided conclusive results. Independent predictors of inconclusive biopsy results included FNA not-indicated nodules, nodules sized 0 to 5 mm, operator experience of less than 1 year, older age, and K-TIRADS category 4 (P<0.001, P=0.006, P=0.003, P=0.001, and P<0.001, respectively). Among K-TIRADS category 4 nodules, the presence of suspicious ultrasound characteristics was negatively associated with inconclusive biopsy results (P=0.004).
Conclusion
FNA not-indicated nodules, nodule size of 0 to 5 mm, operator experience of less than 1 year, older age, and K-TIRADS category 4 were factors associated with inconclusive biopsy results. Strict adherence to the K-TIRADS biopsy indications may reduce inconclusive cytology results.
Introduction
Thyroid nodules are common, affecting up to 25% of the general population, with an even higher prevalence in women (36.5%) and older adults (44.7%) [1,2]. Advances in diagnostic technologies have increased thyroid nodule detection rates, contributing to the global rise in thyroid cancer prevalence [3]. With a marked increase in incidence over recent decades, thyroid cancer is currently the most common cancer in Korea, with 68.8 cases per 100,000 population according to the National Cancer Information Center [4,5]. However, the mortality rate of thyroid cancer has remained stable, suggesting potential overdiagnosis and overtreatment of thyroid nodules and cancers [5,6]. In 2020, Korea exhibited the highest incidence-to-mortality rate ratio among 185 countries, underscoring that thyroid cancer overdiagnosis remains a critical health concern in Korea [7].
Excessive diagnosis and subsequent treatment of thyroid nodules might offer minimal therapeutic benefit compared to active surveillance, and could even cause harm, economic burdens, and psychological distress for patients. These issues include surgical complications, the need for long-term follow-up, and anxiety throughout the diagnosis, treatment, and remission stages [8-10]. To ensure optimal management of thyroid nodules, several research groups and societies have recommended risk stratification systems and size criteria for fine-needle aspiration (FNA) [11-16]. Adhering to these guidelines is crucial for minimizing overdiagnosis and overtreatment in clinical practice. Nevertheless, a substantial number of thyroid nodules at tertiary hospitals undergo biopsy despite not meeting Korean Thyroid Imaging Reporting and Data System (K-TIRADS) size thresholds, indicating that biopsy guidelines are loosely implemented in Korea [16,17].
Although ultrasound (US)-guided FNA is the standard diagnostic method for thyroid nodules due to its cost-effectiveness and safety, inconclusive cytology results remain relatively frequent [11,18,19]. As K-TIRADS aims to improve the efficiency and cost-effectiveness of thyroid nodule management, reducing inconclusive biopsy rates would significantly improve diagnostic accuracy and decrease patient burdens. However, little is known about the specific association between adherence to K-TIRADS guidelines and inconclusive biopsy outcomes.
Definitions of "inconclusive biopsy results" vary among studies, encompassing either category III alone, categories I and III, or categories I, III, and IV [20-23]. Despite this variability, investigating inconclusive biopsy results remains essential, especially given interobserver variability among cytopathologists in Bethesda category III and the intrinsic characteristics of nodules, such as low cellularity, cystic changes, or architectural atypia, which may influence cytological outcomes [20,24]. In this study, Bethesda categories I and III were categorized as inconclusive, as the Bethesda system specifically recommends repeat FNA for managing these categories [22,23]. To examine the clinical relevance of adhering to K-TIRADS biopsy criteria, this study investigated factors associated with inconclusive biopsy results and evaluated whether strict adherence to K-TIRADS biopsy guidelines could decrease their occurrence.
Materials and Methods
Compliance with Ethical Standards
This retrospective study was approved by the institutional review board (Severance Hospital, IRB No. 4-2024-0305), which waived the requirements for patient approval and informed consent due to the retrospective nature of reviewing patient records.
Study Population
This study was conducted at a tertiary referral hospital between January 2022 and December 2022. A total of 2,268 patients were referred to the authors’ affiliated institution for nodule examination, and initial FNA was performed on 2,452 nodules. Although FNA was intended to follow the K-TIRADS guidelines, some nodules that did not meet size thresholds underwent biopsy based on patient requests or the clinical judgment of referring physicians. Six patients with diffuse microcalcifications were excluded. Another six patients with completely cystic nodules who received FNA for purposes other than diagnosis were also excluded. Ultimately, 2,440 nodules in 2,256 patients were included in the final analysis. Among these patients, 2,076 underwent FNA for one nodule, 176 underwent FNA for two nodules, and four patients had three nodules biopsied. Fig. 1 illustrates the diagnostic flowchart of all inconclusive nodules. Supplementary Fig. 1 provides the diagnostic flowchart of all included nodules.
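The cohort arithmetic described above can be checked with a few lines of Python. This is only a sanity check on the reported counts; the variable names are illustrative and do not come from the study.

```python
# Patients grouped by the number of nodules biopsied per patient (counts from the text).
patients_by_nodule_count = {1: 2076, 2: 176, 3: 4}

total_patients = sum(patients_by_nodule_count.values())
total_nodules = sum(n * p for n, p in patients_by_nodule_count.items())

# Exclusions reported in the text: 6 + 6 patients. The drop from 2,452 to 2,440
# nodules implies one excluded nodule per excluded patient (an inference, not stated).
referred_patients = 2268
biopsied_nodules = 2452
excluded = 6 + 6

print(total_patients, total_nodules)
print(referred_patients - excluded, biopsied_nodules - excluded)
```

Both lines should agree with the final cohort of 2,256 patients and 2,440 nodules.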
US Examinations and FNA
Real-time US examinations were performed by seven radiologists specializing in thyroid imaging, using a 5-12 MHz linear array transducer (iU22, Philips Medical Systems or EPIQ, Philips Medical Systems, Bothell, WA, USA). Five radiologists had more than 1 year of experience performing thyroid FNA, whereas two had less than 1 year of experience. The radiologist who conducted the US documented the following US features: composition, echogenicity, margin, shape, and vascularity [14]. Nodules were then classified into K-TIRADS categories based on these US features [16]. Nodule size was defined as the largest measured diameter on US. Following nodule evaluation, US-guided FNA was performed freehand by the same radiologist who had assessed the nodule, using a 23-gauge needle attached to a 2-mL syringe, without a cytopathologist present. Most nodules underwent a single needle pass (2,156 nodules, 88.4%), while two passes were performed for 277 nodules (11.4%), and three passes were done for six nodules (0.2%). The number of needle passes was determined by the radiologist based on the visual appearance of the aspirated material. For patients with multiple nodules, FNA was performed on nodules showing suspicious features or, if none were suspicious, on the largest nodule. Aspirated material was smeared onto glass slides and placed in 95% alcohol for Papanicolaou staining. Cytology results were interpreted by one of 20 institutional cytopathologists according to Bethesda system categories [25]. Inconclusive specimens were defined as Bethesda category I (nondiagnostic/unsatisfactory) or category III (atypia of undetermined significance). Conclusive specimens included Bethesda category II (benign), IV (follicular neoplasm or suspicious for follicular neoplasm), V (suspicious for malignancy), and VI (malignant).
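The conclusive/inconclusive definition above amounts to a simple mapping from Bethesda category to a binary outcome. The following is a minimal sketch of that mapping; the function and constant names are hypothetical, and the per-category counts are those reported in the Results of this article.

```python
# Bethesda categories that this study treats as inconclusive (I and III).
INCONCLUSIVE = frozenset({"I", "III"})
ALL_CATEGORIES = ("I", "II", "III", "IV", "V", "VI")


def is_inconclusive(bethesda_category: str) -> bool:
    """Return True for Bethesda I or III, False for II, IV, V, and VI."""
    category = bethesda_category.strip().upper()
    if category not in ALL_CATEGORIES:
        raise ValueError(f"Unknown Bethesda category: {bethesda_category!r}")
    return category in INCONCLUSIVE


# Nodule counts per Bethesda category as reported in the Results.
counts = {"I": 379, "II": 855, "III": 521, "IV": 10, "V": 271, "VI": 404}
n_total = sum(counts.values())
n_inconclusive = sum(n for cat, n in counts.items() if is_inconclusive(cat))
print(n_inconclusive, n_total, f"{100 * n_inconclusive / n_total:.1f}%")
```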
Data and Statistical Analysis
All US examination data were prospectively documented by the seven radiologists. Clinical characteristics of thyroid nodules and patients—including US features, nodule size, number of needle passes, operator experience, patient age, and sex—were evaluated. Radiologist experience was categorized as less than 1 year or more than 1 year. The number of needle passes was grouped into single needle pass or more than one pass. According to K-TIRADS guidelines, indications for FNA were nodules greater than 10 mm for K-TIRADS 5 (high suspicion), greater than 15 mm for K-TIRADS 4 (intermediate suspicion), and greater than 20 mm for K-TIRADS 3 (low suspicion) (Supplementary Table 1). Each nodule was classified as either FNA indicated or FNA not-indicated based on these guideline criteria.
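The size-based rule used to label each nodule can be written as a short function. This is a sketch under the thresholds stated in the paragraph above (strictly greater than 10, 15, and 20 mm for K-TIRADS 5, 4, and 3); the names are hypothetical, and categories not covered by the text raise an error instead of guessing a threshold.

```python
# Biopsy is indicated when the nodule is LARGER than the threshold (mm) for its category.
FNA_SIZE_THRESHOLD_MM = {5: 10, 4: 15, 3: 20}


def fna_indicated(k_tirads: int, size_mm: float) -> bool:
    """Return True if the nodule exceeds the size threshold for its K-TIRADS category."""
    if k_tirads not in FNA_SIZE_THRESHOLD_MM:
        # The text gives thresholds only for categories 3 to 5.
        raise ValueError(f"No threshold stated for K-TIRADS {k_tirads}")
    return size_mm > FNA_SIZE_THRESHOLD_MM[k_tirads]


print(fna_indicated(4, 3))   # a 3 mm K-TIRADS 4 nodule
print(fna_indicated(5, 12))  # a 12 mm K-TIRADS 5 nodule
print(fna_indicated(4, 15))  # exactly at the threshold, so not "greater than"
```

Because the comparison is strict, a nodule measuring exactly the threshold size is labeled FNA not-indicated in this sketch; the source does not state how such boundary cases were handled.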
Data are presented as number and percentage or as median (range). Categorical variables were analyzed using the chi-square test. Continuous variables underwent normality testing using the Kolmogorov-Smirnov test. Patient age and nodule size were not normally distributed and thus were compared using the Mann-Whitney U test. Associations between the number of needle passes and factors such as nodule size, US classification, and operator experience were analyzed using the Mann-Whitney U test and the chi-square test.
Univariate and multivariate logistic regression analyses were conducted to evaluate associations between clinical factors and inconclusive biopsy results. Crude odds ratios (cORs) and adjusted odds ratios (aORs) were calculated. Two separate multivariate logistic regression models were constructed: one model included biopsy indication, and the other included nodule size and US classification. Age, sex, operator experience, and the number of FNA needle passes were included for confounding adjustment. In the analysis involving FNA indications, nodule size was excluded because size is integral to biopsy indication criteria. In the analysis including nodule size, nodules were categorized into three groups: 0-5 mm, 6-10 mm, and ≥11 mm.
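For a single binary factor, the crude odds ratio from univariate logistic regression equals the cross-product ratio of the 2x2 table. The sketch below computes it with a Wald 95% confidence interval, using the FNA indication counts reported in the Results (609/1,543 inconclusive among not-indicated nodules vs. 291/897 among indicated nodules). The function name and the choice of a Wald interval are assumptions; the adjusted estimates in the article come from multivariate models and will differ.

```python
import math


def crude_odds_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total, z=1.96):
    """Crude odds ratio with a Wald confidence interval from a 2x2 table."""
    a = exposed_events
    b = exposed_total - exposed_events
    c = unexposed_events
    d = unexposed_total - unexposed_events
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Exposed = FNA not-indicated nodules; event = inconclusive cytology.
cor, ci_low, ci_high = crude_odds_ratio(609, 1543, 291, 897)
print(f"cOR {cor:.2f} (95% CI, {ci_low:.2f} to {ci_high:.2f})")
```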
Additional analyses were conducted for K-TIRADS 4 nodules based on specific US characteristics. Nodules with punctate echogenic foci, irregular margins, or nonparallel orientation were classified as having suspicious characteristics. Univariate and multivariate logistic regression analyses were performed to calculate cORs and aORs, evaluating associations between suspicious US characteristics and inconclusive biopsy results. Results were presented with 95% confidence intervals (CI). Statistical analyses were performed using SPSS version 27 (IBM Corp., Armonk, NY, USA), and statistical significance was defined as P<0.05.
Results
Inconclusive Biopsy Results According to Clinical Characteristics
Among the 2,440 nodules from 2,256 included patients, 900 nodules (36.9%) yielded inconclusive biopsy results, while 1,540 nodules (63.1%) yielded conclusive results (Table 1). The distribution of nodules according to Bethesda categories was as follows: category I (nondiagnostic), 379 nodules (15.5%); category II (benign), 855 nodules (35.0%); category III (atypia of undetermined significance, AUS), 521 nodules (21.4%); category IV (follicular neoplasm or suspicious for follicular neoplasm [FN/SFN]), 10 nodules (0.4%); category V (suspicious for malignancy), 271 nodules (11.1%); and category VI (malignant), 404 nodules (16.6%). There was a significant association between inconclusive biopsy results and K-TIRADS categories (P<0.001) (Fig. 2). Nodules without indication for FNA had a significantly higher rate of inconclusive results compared to those indicated for FNA (39.5%, 609/1,543 vs. 32.4%, 291/897; P<0.001). Nodules biopsied by operators with less than 1 year of experience had a significantly higher rate of inconclusive results compared to those biopsied by operators with more than 1 year of experience (43.4%, 164/378 vs. 35.7%, 736/2,062; P=0.004). The median age of patients with inconclusive biopsy results was significantly higher (51 years [range, 15 to 89]) compared to patients with conclusive results (49 years [range, 16 to 86]) (P<0.001). No significant differences were observed in median nodule size, sex distribution, or number of FNA needle passes between the inconclusive and conclusive biopsy groups (P=0.442, P=0.087, and P=0.579, respectively).
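The two group comparisons reported above with explicit counts can be reproduced approximately with a Pearson chi-square test on the corresponding 2x2 tables. The sketch below uses only the standard library and assumes no continuity correction (the article does not state whether one was applied); for one degree of freedom, the P value equals erfc(sqrt(chi2 / 2)). The function name is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: group; columns: inconclusive, conclusive (conclusive = total - inconclusive).
chi2_ind, p_ind = chi_square_2x2(609, 1543 - 609, 291, 897 - 291)    # FNA not-indicated vs. indicated
chi2_exp, p_exp = chi_square_2x2(164, 378 - 164, 736, 2062 - 736)    # <1 year vs. >1 year experience

print(f"indication: chi2={chi2_ind:.2f}, P={p_ind:.4f}")
print(f"experience: chi2={chi2_exp:.2f}, P={p_exp:.4f}")
```

Under these assumptions, the resulting P values should fall in line with the reported P<0.001 and P=0.004.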

Examples of a subcentimeter fine-needle aspiration (FNA) not-indicated nodule according to Korean Thyroid Imaging Reporting and Data System (K-TIRADS).
Ultrasonography (US) from a 35-year-old woman shows a solid hypoechoic nodule without any suspicious US features. The nodule was classified as K-TIRADS 4 and measured 3 mm, below the size threshold for FNA. FNA was performed at the patient’s request and identified atypia of undetermined significance (A, transverse scan; B, longitudinal scan). Arrows indicate the aspirated nodule.
Association of Multiple Needle Passes with Nodule Size and US Characteristics
Nodules receiving two or more needle passes were significantly smaller (median 11 mm [range, 2 to 67 mm]) compared to those receiving only a single needle pass (median 12 mm [range, 1 to 109 mm]) (P=0.010). Among nodules with two or more needle passes, K-TIRADS 4 was the most common category (102 nodules, 35.9%), followed by K-TIRADS 5 (97 nodules, 34.2%) and K-TIRADS 3 (85 nodules, 29.9%). The number of needle passes was significantly associated with K-TIRADS categories (P<0.001). Furthermore, a significantly higher proportion of nodules receiving two or more needle passes were biopsied by operators with less than 1 year of experience compared to those receiving a single pass (20.1%, 57/284 vs. 14.9%, 321/2,156; P=0.023).
Association of Various Clinical Characteristics and Inconclusive Biopsy Results
Multivariate logistic regression analysis identified older patient age (aOR, 1.010; 95% CI, 1.004 to 1.016; P=0.001), operators with less than 1 year of experience (aOR, 1.401; 95% CI, 1.120 to 1.754; P=0.003), and FNA not-indicated nodules (aOR, 1.397; 95% CI, 1.173 to 1.663; P<0.001) as independent predictors of inconclusive biopsy results (Table 2). In a separate multivariate analysis including nodule size and US classifications, K-TIRADS 4 (aOR, 1.502; 95% CI, 1.224 to 1.844; P<0.001) and nodules sized 0-5 mm (aOR, 1.451; 95% CI, 1.114 to 1.889; P=0.006) emerged as independent predictors of inconclusive biopsy results (Table 3). Conversely, K-TIRADS 5 was a negative predictor of inconclusive biopsy results (aOR, 0.384; 95% CI, 0.294 to 0.501; P<0.001). Nodule size between 6-10 mm did not show a significant association with inconclusive results (aOR, 1.190; 95% CI, 0.957 to 1.480; P=0.117). The number of nodules according to the size groups was 384 nodules (15.7%) for 0-5 mm, 640 nodules (26.2%) for 6-10 mm, and 1,416 nodules (58.0%) for ≥11 mm.
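An odds ratio close to 1, such as the one reported for age, can be easier to interpret when rescaled to a larger increment. The sketch below assumes that age was entered as a continuous variable in years with a linear effect on the log odds (the article does not state this explicitly), in which case the odds ratio for a k-year difference is the per-year odds ratio raised to the power k. The function name is hypothetical.

```python
def scale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Rescale a per-unit odds ratio to a difference of `units` (log-linear assumption)."""
    return or_per_unit ** units


# Reported per-year aOR for age with its 95% CI bounds.
aor_per_year, ci_low_per_year, ci_high_per_year = 1.010, 1.004, 1.016

aor_per_decade = scale_odds_ratio(aor_per_year, 10)
ci_low_per_decade = scale_odds_ratio(ci_low_per_year, 10)
ci_high_per_decade = scale_odds_ratio(ci_high_per_year, 10)

print(f"aOR per 10 years: {aor_per_decade:.3f} "
      f"({ci_low_per_decade:.3f} to {ci_high_per_decade:.3f})")
```

Under this assumption, a 10-year age difference corresponds to roughly 10% higher odds of an inconclusive result.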
Association of K-TIRADS 4 Nodules and Biopsy Results According to US Features
Among K-TIRADS 4 nodules, 235 (31.2%, 235/753) exhibited suspicious US features (punctate echogenic foci, irregular margins, or nonparallel orientation). Nodules with these suspicious characteristics had a significantly lower rate of inconclusive biopsy results compared to nodules without these features (41.3%, 97/235 vs. 52.9%, 274/518; P=0.003). Multivariate logistic regression analysis confirmed that suspicious US features significantly reduced the odds of inconclusive biopsy results (aOR, 0.631; 95% CI, 0.460 to 0.866; P=0.004).
Discussion
Since repeated biopsies and surveillance due to inconclusive results can cause increased anxiety and financial burdens for patients, reducing unnecessary biopsy procedures is essential in the management of thyroid nodules [9,17]. In this study, inconclusive biopsy results were defined as Bethesda category I (nondiagnostic) and category III (AUS), while conclusive biopsy results included Bethesda categories II (benign), IV (FN/SFN), V (suspicious for malignancy), and VI (malignant). This study aimed to evaluate the relationship between adherence to existing FNA guidelines and inconclusive biopsy outcomes to highlight the importance of strict guideline implementation. It also examined various clinical factors that could influence biopsy results. Nodules not indicated for FNA, operators with less than 1 year of experience, nodule sizes ranging from 0 to 5 mm, older patient age, and K-TIRADS category 4 were all identified as independent risk factors for inconclusive biopsy results.
Of the 2,440 nodules analyzed in this study, 900 (36.9%) yielded inconclusive biopsy results, and 1,543 (63.2%) did not satisfy the established FNA indications. Nodules not indicated for biopsy significantly increased the odds of inconclusive results (P<0.001). Furthermore, nodules sized between 0 and 5 mm emerged as independent predictors of inconclusive results, whereas those sized 6 to 10 mm did not show a significant difference compared to nodules of 11 mm or larger. These findings are consistent with previous studies reporting high rates of inconclusive biopsy results in smaller nodules, especially subcentimeter nodules [26,27]. Therefore, strict adherence to nodule size-based biopsy guidelines is effective in reducing inconclusive biopsy rates, particularly for micronodules, except in special circumstances where FNA may provide direct clinical benefit, such as nodules closely adjacent to critical structures like the trachea or esophagus.
When analyzed by US characteristics, K-TIRADS category 4 nodules showed a high rate of inconclusive biopsy results (64.1%), significantly higher than nodules categorized as K-TIRADS 3 or 5. This finding aligns with previous studies demonstrating that TIRADS 4 nodules tend to have the highest inconclusive biopsy rates compared to categories 3 or 5 [28]. Biopsy thresholds for TIRADS 4 (intermediate-risk) nodules vary across guidelines: 15 mm for the American College of Radiology Thyroid Imaging Reporting and Data System (ACR-TIRADS) and the European Thyroid Imaging Reporting and Data System (EU-TIRADS), 10-15 mm for the revised K-TIRADS, and 10 mm for the American Thyroid Association (ATA) guidelines. Despite these differences, reported malignancy rates remain similar among ACR-TIRADS (12.7%-29.8%), EU-TIRADS (6.5%-17%), revised K-TIRADS (13.8%-29.6%), and ATA guidelines (17%-25.3%) [11-13,16,29-31]. Compared with the fixed 10 mm threshold of the 2016 K-TIRADS, the 2021 revision recommends biopsy for K-TIRADS 4 nodules measuring 10-15 mm, allowing clinicians to make final decisions based on US features and other risk factors. This revision significantly reduced unnecessary biopsies while maintaining high sensitivity [16]. In this study, despite employing a fixed 15 mm biopsy threshold for K-TIRADS 4 nodules rather than a size range, these nodules still comprised the highest proportion of inconclusive biopsy results. Additionally, among nodules not indicated for biopsy, K-TIRADS 4 nodules accounted for the largest proportion (595/1,543, 38.6%). This significant association between K-TIRADS categories and biopsy indications (P<0.001) may have further contributed to the high inconclusive rate for K-TIRADS 4 nodules, reinforcing the need to maintain relatively larger size thresholds for biopsy decisions.
A similar pattern was noted with other TIRADS classifications: applying the 15 mm threshold from ACR-TIRADS to intermediate-risk nodules defined by ATA guidelines improved diagnostic accuracy and specificity and reduced unnecessary biopsies compared to using the original 10 mm threshold [29].
Within K-TIRADS 4 nodules specifically, suspicious US features significantly reduced the odds of inconclusive biopsy results (aOR, 0.631; 95% CI, 0.460 to 0.866; P=0.004). Consistent with previous studies associating suspicious US features with higher malignancy rates in K-TIRADS 4 nodules, the present findings emphasize the relevance of integrating US features into biopsy decision-making [32]. Although the 2021 K-TIRADS guidelines recommend that biopsy decisions for K-TIRADS 4 nodules sized between 10 and 15 mm be based on additional US findings, this study applied a uniform 15 mm threshold for K-TIRADS 4 nodules because the retrospective nature of data collection precluded individualized threshold determination [16]. However, supplementary analyses demonstrating lower inconclusive rates among K-TIRADS 4 nodules with suspicious features provide additional support for considering US characteristics when setting biopsy thresholds, especially in intermediate-risk nodules.
Although nodules that underwent two or more needle passes tended to be smaller in size and categorized as K-TIRADS 4, multiple needle passes were not significantly associated with inconclusive biopsy results. Operators typically performed multiple needle passes when previous aspirates visually appeared insufficient for cytological evaluation. Prior studies present conflicting views regarding the relationship between the number of needle passes and inconclusive biopsy outcomes; some studies indicate no significant impact on cytological adequacy, while others suggest that multiple needle passes may reduce inadequate samples [33,34]. In this study, performing multiple needle passes did not lower the rate of inconclusive biopsy results. Thus, multiple passes may not necessarily improve cytological adequacy for nodules inherently prone to inconclusive results, such as micronodules or K-TIRADS category 4 nodules. Additionally, biopsies conducted by operators with less than 1 year of experience showed higher rates of inconclusive results and more frequent use of multiple needle passes. This aligns with previous research demonstrating that variability in operator experience and technique directly influences biopsy outcomes [33,35]. Methods that provide assistance or guidance to less experienced physicians, such as the presence of an on-site cytopathologist (though not currently implemented at the authors’ affiliated institution), may help increase sample adequacy [34,35]. Further research is needed to precisely identify factors associated with inconclusive biopsy results, thus refining biopsy indications based on US characteristics and minimizing unnecessary repeated FNAs.
Several limitations of this study must be acknowledged. First, the study employed a retrospective design conducted at a single institution, raising the possibility of selection bias. Second, the rate of inconclusive biopsy results in the study sample (36.9%) was relatively high compared to previous reports (ranging from 5% to 34.7%) [27,36]. Notably, 63.2% of nodules in this study were classified as not indicated for FNA, which likely contributed to the high rate of inconclusive results, alongside several other influencing factors. As the authors’ institution is a tertiary medical center, staging US prior to surgery frequently prompts biopsy of nodules that do not strictly meet FNA guidelines, to determine surgical extent. Additionally, repeat FNAs are often requested by primary care clinics following inconclusive initial results, or patients independently seek tertiary centers for further evaluation. Such circumstances likely influenced the frequency of inconclusive outcomes observed in this study. Third, interobserver variability in thyroid US interpretation might limit the reproducibility of these findings in other institutions. Although interobserver agreement is typically adequate, particularly when standardized US classification systems are used, interobserver consistency across different centers remains lower, and variability persists among examiners in US reporting and classification [37,38]. Nevertheless, given the statistical significance of these findings, the identified factors related to inconclusive biopsy results should meaningfully assist Korean institutions in applying the K-TIRADS guidelines.
In conclusion, performing FNA on nodules that do not meet the K-TIRADS biopsy criteria was associated with a higher rate of inconclusive biopsy results, along with other factors such as nodule size of 0-5 mm, operators with less than 1 year of experience, older patient age, and K-TIRADS category 4. Among K-TIRADS 4 nodules, suspicious US characteristics were associated with malignancy and lower inconclusive biopsy rates, supporting the K-TIRADS biopsy guidelines. Therefore, strict adherence to the K-TIRADS biopsy guidelines is essential for reducing inconclusive biopsy outcomes. Additionally, providing education and guidance for less experienced operators and conducting continued research aimed at refining biopsy indications based on nodule size and US characteristics may further improve clinical outcomes.
Notes
Author Contributions
Conceptualization: Cho S, Kwak JY. Data acquisition: Cho S, Yoon JH, Kwak JY. Data analysis or interpretation: Cho S, Han K, Rho M, Yoon J, Kwak JY. Drafting of the manuscript: Cho S, Kwak JY. Critical revision of the manuscript: Cho S, Han K, Yoon JH, Rho M, Yoon J, Kwak JY. Approval of the final version of the manuscript: all authors.
Conflict of Interest
No potential conflict of interest relevant to this article was reported.
Supplementary Material
Supplementary Table 1. US categories and corresponding biopsy size thresholds in the 2021 K-TIRADS (https://doi.org/10.14366/usg.24216).
Supplementary Fig. 1. Diagnostic flowchart of the included nodules (https://doi.org/10.14366/usg.24216).
References
Key point
Strict adherence to Korean Thyroid Imaging Reporting and Data System biopsy indications can reduce inconclusive cytology results. Subcentimeter nodule size, operator experience, older age, and ultrasound categorization are associated with inconclusive biopsy results.