Abstract
Purpose
This study aimed to evaluate our institution's experience in using artificial intelligence (AI) decision support (DS) as part of the clinical workflow to triage patients with Breast Imaging Reporting and Data System (BI-RADS) 3 sonographic lesions whose follow-up was delayed during the coronavirus disease 2019 (COVID-19) pandemic, against subsequent imaging and/or pathologic follow-up results.
Methods
This retrospective study included patients with a BI-RADS category 3 (i.e., probably benign) breast ultrasound assessment from August 2019 to December 2019 whose follow-up was delayed during the COVID-19 pandemic and whose breast ultrasounds were re-reviewed using Koios DS Breast AI as part of the clinical workflow for triaging these patients. The output of Koios DS was compared with the true presence or absence of breast cancer, defined by resolution/stability on imaging follow-up for at least 2 years or by pathology results.
Results
The study included 161 women (mean age, 52 years) with 221 BI-RADS category 3 sonographic lesions. Of the 221 lesions, there were two confirmed cancers (0.9% malignancy rate). Koios DS assessed 112/221 lesions (50.7%) as benign, 42/221 lesions (19.0%) as probably benign, 64/221 lesions (29.0%) as suspicious, and 3/221 lesions (1.4%) as probably malignant. Koios DS had a sensitivity of 100% (2/2; 95% confidence interval [CI], 16% to 100%), specificity of 70% (154/219; 95% CI, 64% to 76%), negative predictive value of 100% (154/154; 95% CI, 98% to 100%), and false-positive rate of 30% (65/219; 95% CI, 24% to 36%).
Introduction
Artificial intelligence (AI) in breast imaging is a growing field, with numerous studies having demonstrated its value for imaging interpretation [1]. Indeed, multiple AI applications have been developed for breast imaging, spanning mammography, magnetic resonance imaging, and ultrasound modalities. Regarding the role of AI applications in breast ultrasound specifically, studies have shown that AI applications have a high sensitivity in predicting malignancy [2-4].
Koios decision support (DS) for Breast (Koios DS Breast; Koios Medical, Inc., New York, NY, USA) is a proprietary, Food and Drug Administration (FDA)–cleared AI software program designed to aid radiologists’ interpretation of lesions identified on breast ultrasound. Koios DS generates a Breast Imaging Reporting and Data System (BI-RADS)–aligned probability of malignancy for any lesion indicated by a region of interest (ROI) on the static breast ultrasound image drawn manually by the radiologist. Koios DS has been reported to improve radiologists’ accuracy in diagnosing breast cancer [3,5].
Breast ultrasound is an essential tool in the detection and diagnosis of breast cancer. The sonographic characteristics of a lesion help drive the level of suspicion and therefore the predicted likelihood of benignity or malignancy [6]. In some cases, a sonographic lesion is given a probably benign (BI-RADS category 3) assessment and short-interval follow-up is recommended for the patient to ensure stability. While most BI-RADS category 3 sonographic lesions will prove benign on follow-up, follow-up remains essential to identify early changes in the lesion that may indicate malignancy. Studies have shown that BI-RADS category 3 sonographic lesions have a malignancy rate ranging from 0.2% to 2.2%, which varies with the presence or absence of a mammographic finding [7,8]. The malignancy rate increases to 4.9% in lesions that demonstrate interval growth on follow-up ultrasound [8]. Moreover, morphologic changes in growing lesions are more highly associated with malignancy [8]. Appropriate follow-up can detect these early changes and allow for timely diagnosis and treatment, which are essential to improve patient outcomes.
Follow-up imaging for an individual patient may be delayed for a variety of reasons, but environmental constraints and resource limitations may also affect the follow-up of a large population of patients at the same time, as occurred during the coronavirus disease 2019 (COVID-19) pandemic [9]. During the COVID-19 pandemic, our institution, like many across the country, had to meet the challenge of scheduling breast imaging follow-up appointments for many patients with delayed follow-up. Notably, triaging care may be particularly challenging when assessing patients with a similar risk profile, as with patients with BI-RADS category 3 sonographic lesions.
To our knowledge, no prior study has assessed the use of breast ultrasound AI DS to triage care once follow-up is delayed. We evaluated our institution's experience in using Koios DS as part of the clinical workflow to triage patients with BI-RADS category 3 sonographic lesions whose follow-up was delayed during the COVID-19 pandemic, against subsequent imaging and/or pathologic follow-up results. While our institution’s process of using Koios DS for triaging was set during the COVID-19 pandemic, the process and findings may be extrapolated to breast ultrasound follow-up in other scenarios where resources are limited.
Materials and Methods
Compliance with Ethical Standards
This retrospective study was approved by the institutional review board at Memorial Sloan Kettering Cancer Center (IRB protocol 19-119). The need for informed consent was waived.
Patient Inclusion
To formally evaluate our institution's experience in using Koios DS as part of the clinical workflow to triage patients with BI-RADS 3 sonographic lesions during the COVID-19 pandemic, we performed a retrospective study that was approved by the institutional review board and compliant with the Health Insurance Portability and Accountability Act. The study assessed Koios DS output, obtained as part of the clinical workflow when follow-up was delayed, against subsequent imaging and/or pathologic follow-up results. From 212 women with 295 BI-RADS category 3 sonographic lesions, we excluded 51 women with 74 BI-RADS category 3 sonographic lesions who were lost to follow-up or who had incomplete follow-up (defined as at least one imaging follow-up appointment but not 2-year imaging follow-up or pathology results to confirm benignity or malignancy) as of October 2023. Thus, the final study sample comprised 161 women with 221 BI-RADS category 3 sonographic lesions with complete imaging and/or pathology follow-up results.
Clinical Workflow during the COVID-19 Pandemic
In May 2020, as part of the clinical workflow to help triage patient follow-up during the COVID-19 pandemic, the institutional database was searched for consecutive patients who received a BI-RADS category 3 breast ultrasound assessment from August 2019 to December 2019 and whose follow-up breast ultrasound had been delayed for more than 1 month. For patients with a 1-3-month ultrasound follow-up delay (7-9 months from their initial ultrasound evaluation with a BI-RADS 3 assessment), the breast ultrasound was re-reviewed using Koios DS (Koios DS Breast; Koios Medical Inc.). Koios DS has been trained on over two million ultrasound images and uses AI and machine learning to analyze over 17,000 features per breast ultrasound image to provide DS for physicians and technologists [10]. A rectangular ROI was drawn manually around each lesion by one of two breast fellowship-trained radiologists (K.C. and V.L.M., with 4 and 12 years of experience, respectively) using orthogonal gray-scale breast ultrasound images. Koios DS output was recorded for each lesion as one of four categories indicating the likelihood of malignancy: benign, probably benign, suspicious, and probably malignant. Patients were then contacted by our institution and encouraged to return for follow-up, with patients receiving probably malignant Koios DS output triaged to the earliest follow-up.
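The prioritization step described above can be illustrated with a short sketch. This is not the institution's scheduling software. The source states only that patients with a probably malignant output were triaged to the earliest follow-up, so the ranking of the other three categories, the use of each patient's most concerning lesion, and all names (`URGENCY`, `Lesion`, `triage_order`) are assumptions made for illustration.

```python
from dataclasses import dataclass

# Higher rank = earlier follow-up. Only the top rank ("probably malignant")
# is stated in the source; the ordering of the other categories is assumed.
URGENCY = {
    "benign": 0,
    "probably benign": 1,
    "suspicious": 2,
    "probably malignant": 3,
}


@dataclass(frozen=True)
class Lesion:
    patient_id: str      # hypothetical identifier
    koios_output: str    # one of the four Koios DS categories


def triage_order(lesions):
    """Return patient IDs from most to least urgent.

    A patient with several lesions is ranked by the most concerning one
    (an assumption; the source does not describe multi-lesion handling).
    """
    worst = {}
    for lesion in lesions:
        rank = URGENCY[lesion.koios_output]
        worst[lesion.patient_id] = max(worst.get(lesion.patient_id, -1), rank)
    # Sort by descending urgency, then by ID so ties are deterministic.
    return sorted(worst, key=lambda pid: (-worst[pid], pid))


# Hypothetical example data, not taken from the study.
example = [
    Lesion("A", "benign"),
    Lesion("B", "suspicious"),
    Lesion("A", "probably malignant"),
    Lesion("C", "probably benign"),
]
print(triage_order(example))
```

Ranking by the worst lesion per patient is a conservative choice: one concerning lesion moves the whole appointment forward.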
Statistical Analysis
For this retrospective study, Koios DS output obtained per the clinical workflow was compared against the true presence or absence of breast cancer, defined by stability/resolution on follow-up imaging for at least 2 years or by pathology results if the lesion was biopsied. The Koios DS outputs "suspicious" and "probably malignant" were considered a positive assessment. Sensitivity, specificity, negative predictive value (NPV), and the false-positive rate of Koios DS were determined. All analyses were done using R 4.2 (R Foundation for Statistical Computing, Vienna, Austria).
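The metric definitions above can be sketched in a few lines. The following Python is an illustration only and is not the authors' R code. The function names are hypothetical, and the lesion list is rebuilt from the counts reported in the Results (two cancers, both rated suspicious; 62 benign lesions rated suspicious; 3 benign lesions rated probably malignant; 154 benign lesions rated benign or probably benign).

```python
from collections import Counter

# Koios DS outputs counted as a positive assessment, per the text above.
POSITIVE_OUTPUTS = {"suspicious", "probably malignant"}


def confusion_counts(lesions):
    """Tally TP/FP/TN/FN from (koios_output, is_cancer) pairs."""
    counts = Counter()
    for output, is_cancer in lesions:
        if output in POSITIVE_OUTPUTS:
            counts["tp" if is_cancer else "fp"] += 1
        else:
            counts["fn" if is_cancer else "tn"] += 1
    return counts


def diagnostic_metrics(c):
    """Point estimates only; assumes every denominator is non-zero."""
    return {
        "sensitivity": c["tp"] / (c["tp"] + c["fn"]),
        "specificity": c["tn"] / (c["tn"] + c["fp"]),
        "npv": c["tn"] / (c["tn"] + c["fn"]),
        "false_positive_rate": c["fp"] / (c["fp"] + c["tn"]),
    }


# Lesion list rebuilt from the counts reported in the Results section.
lesions = (
    [("suspicious", True)] * 2
    + [("suspicious", False)] * 62
    + [("probably malignant", False)] * 3
    + [("probably benign", False)] * 42
    + [("benign", False)] * 112
)

counts = confusion_counts(lesions)
study_metrics = diagnostic_metrics(counts)
print(dict(counts))
print(study_metrics)
```

Because the false-positive rate is the complement of specificity, the two reported values (30% and 70%) carry the same information.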
Results
Patient and Lesion Characteristics
The study sample comprised 161 patients (all female; mean age, 52 years) with 221 BI-RADS category 3 sonographic lesions assessed from August 2019 to December 2019 and with complete imaging and/or pathology follow-up results. Slightly more than half of the study sample (88/161, 54.7%) were considered at increased risk of breast cancer due to factors including personal history of breast cancer, family history of breast cancer, personal history of high-risk breast lesions, and genetic mutation (Table 1).
Of the 221 lesions, 219/221 (99.1%) were benign and 2/221 (0.9%) were malignant. Of the 219 benign lesions, 212/219 were confirmed as benign by imaging follow-up and 7/219 were confirmed as benign by pathology results. Of the seven biopsied benign lesions, 1/7 (14.3%) was given a BI-RADS category 4 assessment on follow-up ultrasound imaging with recommendation to biopsy, while 6/7 (85.7%) were biopsied at the patient’s request. Both malignant lesions were confirmed as malignant by pathology results (Fig. 1).
Koios DS Performance
Of the 221 lesions, Koios DS assessed 112/221 (50.7%) lesions as benign, 42/221 (19.0%) lesions as probably benign, 64/221 (29.0%) lesions as suspicious, and 3/221 (1.4%) lesions as probably malignant (Fig. 2). The two biopsy-proven cancers were accurately assessed as suspicious by Koios DS. Overall, among those with adequate follow-up imaging or pathology results, Koios DS had a sensitivity of 100% (2/2; 95% CI, 16% to 100%), specificity of 70% (154/219; 95% CI, 64% to 76%), NPV of 100% (154/154; 95% CI, 98% to 100%), and false-positive rate of 30% (65/219; 95% CI, 24% to 36%).
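The width of these intervals, especially for sensitivity, follows directly from the small counts. The source does not state which interval method was used. The sketch below assumes the exact (Clopper-Pearson) binomial interval, which gives lower bounds of about 15.8% for 2/2 and 97.6% for 154/154, in line with the reported 16% and 98%. It is written in pure Python rather than the authors' R, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing, iters=100):
    """Bisection on [0, 1] for f(p) = target, where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for k successes in n trials."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, increasing=True)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, increasing=False)
    return lower, upper


# Counts as reported in the text above.
reported = [
    ("sensitivity", 2, 2),
    ("specificity", 154, 219),
    ("NPV", 154, 154),
    ("false-positive rate", 65, 219),
]
for name, k, n in reported:
    low, high = clopper_pearson(k, n)
    print(f"{name}: {k}/{n} = {k / n:.1%} (95% CI, {low:.1%} to {high:.1%})")
```

With only two cancers, no interval method can place the lower bound on sensitivity much above 16%, which is why the 100% point estimate should be read with caution.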
Regarding the two biopsy-proven cancers that were accurately assessed as suspicious by Koios DS, one cancer was in a 76-year-old female patient with a history of contralateral right mastectomy and ipsilateral left breast reduction mammoplasty who presented with a palpable left breast lump. Breast ultrasound findings demonstrated a 1.2-cm isoechoic mass assessed as probably benign fat necrosis secondary to post-surgical reduction changes (Fig. 3). Because she was scheduled for right implant removal (secondary to capsular contracture and discomfort), she opted for excisional biopsy of the left breast palpable lump at the same time. Surgical biopsy yielded adenoid cystic carcinoma. The second cancer was in an 82-year-old female patient with a 0.8-cm left axillary tail mass assessed as a probably benign lymph node (Fig. 4). Six-month follow-up breast ultrasound demonstrated an increase in size to 1 cm with change in echogenicity prompting a BI-RADS 4 assessment and biopsy recommendation. Breast ultrasound-guided core biopsy revealed invasive ductal carcinoma.
Regarding three lesions that were assessed as probably malignant (more than a 50% likelihood of malignancy) by Koios DS, all three were benign as assessed by imaging follow-up. Two of the three lesions were assessed as stable and thought to likely reflect fat necrosis at a site of a prior lumpectomy. The third lesion was an axillary mass, also demonstrated to be stable and consistent with a lipoma or fat lobule on imaging.
Discussion
Our study demonstrates that AI DS may be a helpful triaging tool when large groups of patients have simultaneously delayed follow-up, as experienced during the COVID-19 pandemic. In our study, we evaluated the output of the Koios DS software, which was obtained as part of the clinical workflow to triage patients with probably benign sonographic lesions and delayed follow-up during the COVID-19 pandemic, against the true presence or absence of breast cancer, defined by stability/resolution on follow-up imaging for at least 2 years or by pathology results if the lesion was biopsied. We found that Koios DS had a sensitivity and NPV of 100%, while the false-positive rate was 30%.
The false-positive rate is higher than would be acceptable if the AI software were acting in isolation to diagnose breast cancer; however, in this context, we used the AI software to assess lesions already interpreted as probably benign by an experienced radiologist in order to help triage probably benign lesions that had delayed follow-up. The NPV of 100% reassures us that in patients with a benign assessment by the AI system, the likelihood of malignancy is very low (Fig. 5). A sensitivity of 100% is also helpful in this setting of triage, assuring us that breast ultrasounds with cancer are captured appropriately by the AI system as suspicious or probably malignant. All three of the lesions that were assessed as probably malignant by the AI system were benign, two of which were likely fat necrosis at a site of prior lumpectomy. Of note, the AI system is not specifically trained to evaluate fat necrosis or postoperative changes.
Conversely, two cancers assessed as probably benign by the radiologist were accurately assessed as suspicious by Koios DS. On retrospective review, one case assessed as probably benign fat necrosis does not meet classic criteria for fat necrosis. However, at our institution we see a large volume of postoperative patients, in whom early evolving fat necrosis may fail to show a typical probably benign or benign sonographic appearance. Given the patient’s recent left breast mastopexy, and based upon the real-time appearance, which the radiologist described as mixed echogenicity with central areas of hypoechogenicity, the interpreting radiologist assessed the finding as probably benign fat necrosis. The other case, misinterpreted as a probably benign lymph node, does appear to have morphology suggestive of a lymph node even on retrospective review.
The potential of AI DS in breast ultrasound imaging has been demonstrated in a variety of publications [2-5,11,12]. However, to our knowledge, none has trialed such software for patient triaging purposes. Mango et al. [5] demonstrated that Koios DS improves the accuracy of breast ultrasound interpretation in a retrospective reader study that included 900 breast ultrasound lesions. They also found that Koios DS had an extremely high sensitivity of 98%. The study included ultrasound findings assessed as BI-RADS category 2-5 by an interpreting radiologist, unlike our cohort of solely BI-RADS category 3 lesions. Shen et al. [11] highlighted the utility of an AI system developed and evaluated in a New York University ultrasound dataset. In their retrospective reader study of 600 breast ultrasound exams, the AI system improved the diagnostic accuracy of breast ultrasound interpretation by the radiologist by 4.4 percentage points (from 90.1% to 94.5%). In Browne et al.’s study [2], which used Koios DS, 50 BI-RADS category 3 lesions were biopsied, with seven demonstrating malignant pathology. These lesions underwent cytology assessment at the radiologist’s discretion with subsequent core biopsy based on pathology results, which differs from the BI-RADS atlas recommendation for BI-RADS category 3 lesions, which typically involves short-interval imaging follow-up [6]. In our practice, if the radiologist deems histologic assessment warranted, then a BI-RADS category 4 assessment is given. This difference in practice may account in part for the difference in malignancy rate within the BI-RADS category 3 study samples (0.9% malignancy rate in our study compared with 15.4% in Browne et al.’s study [2]). Coffey et al. [4] used Koios DS to analyze 345 triple-negative breast cancers. The authors found that Koios DS misclassified 12 of the 345 cancers as probably benign or benign, but correctly classified six out of six cancers that were assessed as probably benign or benign by the radiologist. As in our study, Koios DS had a very high sensitivity (96%) in that cohort. The difference in performance metrics between our study and the previous studies of Coffey et al. [4] and Browne et al. [2] (in addition to those listed above) is likely, at least in part, a result of the low absolute number of cancers in our study. Lastly, Lee et al. [12] retrospectively reviewed 492 breast masses and found that while specificity, positive predictive value, and accuracy significantly improved with the use of AI computer-assisted diagnosis, the area under the curve only improved with inexperienced radiologists.
In breast cancers that initially present with probably benign appearing features, timely follow-up is essential to identify changes and ensure prompt diagnosis [8]. However, as experienced during the COVID-19 pandemic, external factors may affect the follow-up of many patients at the same time [13]. Resource limitations and environmental safety considerations may arise with minimal warning or ability to prepare. As a safety measure during the COVID-19 pandemic, in March 2020, the Society of Breast Imaging recommended that 6-month follow-up breast imaging exams be delayed until risks from the pandemic had decreased [14], and patients reported experiencing such delays in breast cancer imaging [15]. As imaging facilities and institutions quickly reopened, there was pressure to accommodate an increased demand for breast imaging appointments for those with delayed screening and diagnostic appointments, while simultaneously mitigating transmission risk, ensuring social distancing in waiting rooms, and limiting the number of patients in clinic [13]. The ability to triage patients in such circumstances became essential, and our results support that AI DS may be able to identify which patients to prioritize for follow-up. Of note, the potential utility of AI DS to triage extends beyond the circumstances of the COVID-19 pandemic. Such a tool may prove beneficial in a number of situations in which patient care triaging needs are present, such as natural disasters, equipment or material supply chain challenges, or communities that struggle to adequately meet healthcare needs, both within the United States and abroad.
There are several limitations to our study. First, this is a retrospective single-institutional study. Further studies, particularly prospective trials, assessing change in management and clinical outcomes based upon integrated breast imager and AI DS assessment are needed to fully evaluate the potential of AI DS in triaging patients with breast ultrasound findings. Second, we are a dedicated cancer institute with breast imaging subspecialists. As such, the impact of a breast ultrasound AI DS tool on patients with BI-RADS category 3 assessments may vary across institutions and may be greater in settings with fewer resources. Third, we may also have a larger population of patients with prior lumpectomies or surgical breast excisions. As Koios DS is not trained to evaluate fat necrosis and postoperative changes, Koios DS may perform better in a population of patients without prior breast surgery. Fourth, 51 patients with 74 lesions were lost to follow-up or had incomplete follow-up. We recognize that some patients may have incomplete follow-up secondary to factors independent of the COVID-19 pandemic, such as seeking healthcare services at another institution. However, we also know that the pandemic did not affect people uniformly. Data have shown that the pandemic disproportionately affected minorities and people of lower socioeconomic status [16]. Our final dataset excluding those lost to follow-up is therefore biased toward a patient population that was able to return for their imaging appointments and does not include those who lost health insurance and/or other resources directly or indirectly because of the pandemic. Lastly, in our sample of 221 lesions with completed follow-up, there were two cancers. Our malignancy rate is appropriate for BI-RADS category 3 lesions at less than 2%; however, the low absolute count of cancers is a limitation when calculating statistical performance metrics. Specifically, positive predictive value (PPV) is not reported, as it depends strongly on the prevalence of cancer. The low prevalence of cancer in this population could lead to unreliable estimates of PPV.
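The dependence of PPV on prevalence follows from Bayes' rule and can be made concrete with a short sketch. The function below is a standard textbook formula, not part of the authors' analysis. The first call plugs in the study's own counts purely as an illustration, and the other prevalence values are hypothetical. With only two cancers, any PPV derived this way is as unstable as the text above warns.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule.

    PPV = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Point estimates taken from the counts reported in the Results.
SENSITIVITY = 2 / 2
SPECIFICITY = 154 / 219

# At the observed prevalence (2/221) this reduces to 2 true positives
# out of 67 positive Koios DS assessments.
observed = ppv(SENSITIVITY, SPECIFICITY, 2 / 221)
print(f"prevalence {2 / 221:.1%}: PPV {observed:.1%}")

# Hypothetical prevalences, to show how strongly PPV moves with prevalence
# when sensitivity and specificity are held fixed.
for prevalence in (0.002, 0.022, 0.10):
    value = ppv(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {value:.1%}")
```

Holding sensitivity and specificity fixed while varying prevalence is itself a simplification, since both estimates would also shift in a different patient population.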
In conclusion, breast ultrasound AI DS can be used to triage probably benign breast ultrasound findings by identifying those cases that are more likely to be malignant. This may be particularly useful when large numbers of follow-up appointments are simultaneously delayed, such as during the COVID-19 pandemic, and when radiologists are prioritizing patient care with limited resources.
Notes
Author Contributions
Conceptualization: Coffey K, Mango VL. Data acquisition: Coffey K, Mango VL. Data analysis or interpretation: Amir T, Reiner JS, Sevilimedu V. Drafting of the manuscript: Amir T. Critical revision of the manuscript: Amir T, Coffey K, Reiner JS, Sevilimedu V, Mango VL. Approval of the final version of the manuscript: all authors.
References
1. Fruchtman Brot H, Mango VL. Artificial intelligence in breast ultrasound: application in clinical practice. Ultrasonography 2024;43:3–14.
2. Browne JL, Pascual MA, Perez J, Salazar S, Valero B, Rodriguez I, et al. AI: can it make a difference to the predictive value of ultrasound breast biopsy? Diagnostics (Basel) 2023;13:811.
3. Amir T, Coffey K, Sevilimedu V, Fardanesh R, Mango VL. A role for breast ultrasound artificial intelligence decision support in the evaluation of small invasive lobular carcinomas. Clin Imaging 2023;101:77–85.
4. Coffey K, Aukland B, Amir T, Sevilimedu V, Saphier NB, Mango VL. Artificial intelligence decision support for triple-negative breast cancers on ultrasound. J Breast Imaging 2024;6:33–44.
5. Mango VL, Sun M, Wynn RT, Ha R. Should we ignore, follow, or biopsy? Impact of artificial intelligence decision support on breast ultrasound lesion assessment. AJR Am J Roentgenol 2020;214:1445–1452.
6. D’Orsi CJ, Sickles EA, Mendelson EB, Morris EA. ACR BI-RADS Atlas: Breast Imaging Reporting and Data System. Reston, VA: American College of Radiology, 2013.
7. Berg WA, Berg JM, Sickles EA, Burnside ES, Zuley ML, Rosenberg RD, et al. Cancer yield and patterns of follow-up for BI-RADS category 3 after screening mammography recall in the national mammography database. Radiology 2020;296:32–41.
8. Ha SM, Chae EY, Cha JH, Shin HJ, Choi WJ, Kim HH. Growing BI-RADS category 3 lesions on follow-up breast ultrasound: malignancy rates and worrisome features. Br J Radiol 2018;91:20170787.
9. Chamadoira J, Au F, Ghai S, Kulkarni S, Grant A, Fleming R, et al. Effects of delayed callback from screening mammography due to the COVID-19 pandemic. Clin Imaging 2023;99:41–46.
10. Koios Medical. Koios DS Breast [Internet]. New York, NY: Koios Medical, 2024 [cited 2024 Mar 21]. Available from: https://koiosmedical.com/products/koios-ds-breast/.
11. Shen Y, Shamout FE, Oliver JR, Witowski J, Kannan K, Park J, et al. Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams. Nat Commun 2021;12:5645.
12. Lee SE, Han K, Youk JH, Lee JE, Hwang JY, Rho M, et al. Differing benefits of artificial intelligence-based computer-aided diagnosis for breast US according to workflow and experience level. Ultrasonography 2022;41:718–727.
13. Nguyen DL, Ambinder EB, Myers KS, Oluyemi E. Addressing disparities related to access of multimodality breast imaging services before and during the COVID-19 pandemic. Acad Radiol 2022;29:1852–1860.
14. Society of Breast Imaging. Society of Breast Imaging statement on breast imaging during the COVID-19 pandemic [Internet]. Reston, VA: Society of Breast Imaging, 2020 [cited 2024 Mar 21]. Available from: https://assets-002.noviams.com/novi-file-uploads/sbi/pdfsand-documents/policy-and-position-statements/2020/society-ofbreast-imaging-statement-on-breast-imaging-during-COVID19-pandemic-fd7aaf6a.pdf.
Fig. 1. Patients included and excluded from analysis. BI-RADS, Breast Imaging Reporting and Data System.
Fig. 2. Artificial intelligence decision support (AI DS) output and ground truth for 221 Breast Imaging Reporting and Data System category 3 ultrasound lesions with delayed follow-up. The center circle shows the different output categories of the AI DS system. The outer boxes show the ground truth for each of the AI DS system’s output categories based on 2-year imaging follow-up results or pathology results.
Fig. 3. A 76-year-old female patient who presented with a palpable left breast lump. Ultrasonography demonstrated a 1.2-cm isoechoic mass assessed as probably benign fat necrosis secondary to post-surgical reduction changes. Surgical biopsy yielded adenoid cystic carcinoma. The artificial intelligence decision support system assessed the finding as suspicious.
Fig. 4. An 82-year-old female patient with a 0.8-cm left axillary tail mass initially assessed as a probably benign lymph node. Ultrasound-guided core biopsy was recommended at 6-month follow-up due to an increase in size. Pathology revealed invasive ductal carcinoma. The artificial intelligence decision support system assessed the finding as suspicious on the initial ultrasound.
Fig. 5. A 40-year-old female patient with prior mastectomy for breast cancer and flap reconstruction who presented with a palpable lump. Ultrasonography shows a 0.8-cm mass, which the interpreting radiologist assessed as probably benign, likely fat necrosis. Ultrasound follow-up showed a decrease in the size of the mass, consistent with benign etiology. The artificial intelligence decision support system accurately assessed the finding as benign based on the initial ultrasound.
Table 1. Patient characteristics