Predictive performance of ultrasonography-based radiomics for axillary lymph node metastasis in the preoperative evaluation of breast cancer
Abstract
Purpose
The purpose of this study was to evaluate the predictive performance of ultrasonography (US)-based radiomics for axillary lymph node metastasis and to compare it with that of a clinicopathologic model.
Methods
A total of 496 patients (mean age, 52.5±10.9 years) who underwent breast cancer surgery between January 2014 and December 2014 were included in this study. Among them, 306 patients who underwent surgery between January 2014 and August 2014 were enrolled as a training cohort, and 190 patients who underwent surgery between September 2014 and December 2014 were enrolled as a validation cohort. To predict axillary lymph node metastasis in breast cancer, we developed a preoperative clinicopathologic model using multivariable logistic regression and constructed a radiomics model using 23 radiomic features selected via least absolute shrinkage and selection operator regression.
Results
In the training cohort, the areas under the curve (AUC) were 0.760, 0.812, and 0.858 for the clinicopathologic, radiomics, and combined models, respectively. In the validation cohort, the AUCs were 0.708, 0.831, and 0.810, respectively. The combined model showed significantly better diagnostic performance than the clinicopathologic model.
Conclusion
A radiomics model based on the US features of primary breast cancers showed additional value when combined with a clinicopathologic model to predict axillary lymph node metastasis.
Introduction
Axillary lymph node metastasis is an important prognostic factor in patients with breast cancer [1]. Sentinel lymph node biopsy is the standard method for diagnosing axillary lymph node metastasis in patients with non-palpable lymph nodes and for determining whether axillary lymph node dissection is indicated in these patients [2]. Although sentinel lymph node biopsy is less invasive than axillary lymph node dissection, patients still report symptoms such as numbness, pain, and restricted movement [3]. Sentinel lymph node biopsy also has a false-negative rate of 5% to 10% [4]. In patients with suspicious axillary lymph nodes on ultrasonography (US), US-guided fine-needle aspiration is commonly used to confirm axillary lymph node metastasis and to evaluate the need for neoadjuvant chemotherapy [5]. US assessment and US-guided fine-needle aspiration may have the potential to replace sentinel lymph node biopsy in planning the surgical approach; however, the diagnostic performance of these methods has not been found to be satisfactory [6-9].
Several nomograms have been published for the preoperative prediction of axillary lymph node metastasis. These include clinical factors and post-biopsy information, such as patient age, tumor size, tumor location, multiplicity, tumor type, and receptor status. Previous studies have also associated several US features of primary tumors with axillary lymph node metastasis [10,11]. Recently, magnetic resonance imaging-based radiomics features extracted from primary tumors showed high predictive performance for axillary lymph node metastasis [12-15]. However, fewer studies have been conducted on US-based radiomics than on magnetic resonance imaging-based radiomics [16-18]. In the few studies that have been published, US-based radiomic features have displayed good diagnostic performance, but these results were not validated in a separate cohort [16,17]. A well-established radiomics model may be able to assist or even replace US assessment with fine-needle aspiration or sentinel lymph node biopsy in the planning of the surgical approach.
Thus, the purpose of this study was to evaluate the preoperative predictive performance of US-based radiomics for axillary lymph node metastasis and to compare it with the predictive performance of a clinicopathologic model.
Materials and Methods
This retrospective study was approved by the institutional review board of Severance Hospital (Seoul, Korea). The requirement for informed consent was waived.
Patient Population
Between January 2014 and December 2014, 793 patients underwent surgery for breast cancer at our institution. We excluded the following patients: (1) 175 patients with ductal carcinoma in situ, (2) 51 patients with masses larger than 4 cm at preoperative US that could not be fully included in a single standard US image, (3) 21 patients who underwent re-operation or surgery for recurrence, (4) 20 patients referred to our institution after excisional biopsy at an outside clinic, (5) 17 patients with non-mass lesions whose uncertain boundaries made the region of interest difficult to define, and (6) 13 patients who underwent neoadjuvant chemotherapy without initial histological confirmation of the axillary lymph nodes. After exclusion, 496 patients (mean age, 52.5±10.9 years) with breast cancer were included in our study. Among them, 306 patients who underwent surgery between January 2014 and August 2014 were enrolled as the training cohort, and 190 patients who underwent surgery between September 2014 and December 2014 were enrolled as the validation cohort (Fig. 1).
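The patient flow described above can be checked with simple arithmetic. The following is a minimal sketch that uses only the counts reported in this section; the dictionary keys are shorthand labels introduced here for illustration, not terms from the study protocol.

```python
# Counts as reported in the Patient Population section.
surgical_patients = 793
exclusions = {
    "ductal carcinoma in situ": 175,
    "mass larger than 4 cm on US": 51,
    "re-operation or surgery for recurrence": 21,
    "excisional biopsy at an outside clinic": 20,
    "non-mass lesion with uncertain boundary": 17,
    "neoadjuvant chemotherapy without nodal histology": 13,
}
training, validation = 306, 190

excluded = sum(exclusions.values())
included = surgical_patients - excluded

print(f"excluded: {excluded}, included: {included}")
print(f"training + validation: {training + validation}")
```

The six exclusion counts sum to 297, which leaves the 496 included patients, and the two cohorts together account for all of them.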
Clinicopathologic Data Acquisition
US examinations were performed by 10 radiologists using two different ultrasound machines (iU22, Philips Medical Systems, Bothell, WA, USA; LOGIQ E9, GE Healthcare, Milwaukee, WI, USA) with linear array transducers. If a patient underwent multiple US examinations prior to surgery, we selected the examination performed when a suspicious mass was first detected at our institution. The median interval between the initial US examination and surgery was 15 days (range, 2 to 335 days). A radiologist (E.K.K.) retrospectively reviewed the US images and collected data regarding mass size, tumor location, multiplicity in a single breast, and skin-to-tumor distance.
We reviewed post-biopsy pathologic reports to investigate cancer type (ductal, lobular, or other) and estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2 (HER2), and Ki67 status. Mixed ductal and lobular cancer, mucinous cancer, invasive micropapillary carcinoma, tubular carcinoma, and other mixed types were classified as "other" with regard to cancer type. Estrogen receptor and progesterone receptor positivity were defined as immunoreactivity of 1% or higher for tumor cell nuclei, and Ki67 positivity was defined as immunoreactivity of 14% or higher. In cases of equivocal HER2 overexpression, an amplification ratio of 2 or higher on fluorescence in situ hybridization testing was considered to indicate HER2 positivity. We also collected data from electronic medical records regarding whether each patient underwent neoadjuvant chemotherapy, and we obtained information regarding lymphovascular invasion and histologic grade from postoperative pathologic reports; these data were used for baseline comparison of clinicopathologic features between the training and validation cohorts. These postoperative variables were excluded from the development of the clinicopathologic model because we aimed to develop the model in a preoperative setting. The reference standard for lymph node status was the result of sentinel lymph node biopsy or axillary lymph node dissection. In patients who received neoadjuvant chemotherapy, we instead used the results of fine-needle aspiration performed before treatment began.
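The positivity cutoffs stated above can be written as simple threshold rules. This is an illustrative sketch only: the function names are hypothetical, and the HER2 rule covers just the equivocal cases resolved by fluorescence in situ hybridization, since the text gives no other HER2 criterion.

```python
def hormone_receptor_positive(percent_nuclei_stained: float) -> bool:
    """Estrogen or progesterone receptor: positive at 1% or higher."""
    return percent_nuclei_stained >= 1.0


def ki67_positive(percent_stained: float) -> bool:
    """Ki67: positive at immunoreactivity of 14% or higher."""
    return percent_stained >= 14.0


def her2_positive_if_equivocal(fish_amplification_ratio: float) -> bool:
    """Equivocal HER2 overexpression: positive at a FISH ratio of 2 or higher."""
    return fish_amplification_ratio >= 2.0


# Example values chosen to sit on either side of each cutoff.
print(hormone_receptor_positive(1.0), hormone_receptor_positive(0.5))
print(ki67_positive(14.0), ki67_positive(13.9))
print(her2_positive_if_equivocal(2.0), her2_positive_if_equivocal(1.8))
```

All three rules are inclusive at the cutoff, matching the "or higher" wording of the definitions.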
Extraction of Radiomic Features
A radiologist with 1 year of experience in breast imaging (S.E.L.) selected one axial image among the US images of each breast mass and cropped the image to remove the space used for informative text. After the image was resampled to a pixel size of 0.2 mm, the same radiologist semi-automatically drew a region of interest along the mass margin using MIPAV software version 8.0.2 (NIH, Bethesda, MD, USA; open-source, https://mipav.cit.nih.gov) and converted it into a mask file for feature extraction. A third-year resident radiologist (Y.S.) independently segmented 50 randomly chosen masses to evaluate interobserver reproducibility.
A radiologist with 4 years of experience in data science (S.K.) extracted features from the mask files using Pyradiomics software (version 2.0.0, open-source, https://pyradiomics.readthedocs.io/en/latest). A total of 444 radiomic features were extracted from the original and derived (wavelet-transformed) images. For each radiomic feature of the 50 randomly selected masses, the intraclass correlation coefficient was calculated between the segmentations of the two radiologists, and 39 features with coefficients of less than 0.75 were excluded. For the remaining 405 radiomic features, z-score normalization was applied to standardize the values. Features with Spearman correlation coefficients greater than 0.95 were grouped through hierarchical clustering, and each cluster was represented by the single feature that showed the widest range within it. In total, 125 features were selected to ensure reproducibility and reduce redundancy. They consisted of 40 features from the original images (4 shape features, 10 first-order features, 15 gray-level co-occurrence matrix features, 4 gray-level run-length matrix features, 5 gray-level size-zone matrix features, and 3 gray-level dependence matrix features) and 85 features from the wavelet-filtered images. Image processing was performed using ITK Python packages (version 4.13.2, open-source, https://itk.org/ITK/resources/software.html). Fig. 2 shows the process from US image acquisition to radiomics model development.
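The normalization and redundancy-reduction steps described above can be sketched as follows. This is not the study's code; it is a minimal standard-library illustration under several assumptions that the text does not state: the population standard deviation is used for the z-score, the correlation threshold is applied to the signed coefficient exactly as written, and single-linkage clustering is assumed (which is equivalent to grouping features connected by any chain of correlations above the threshold). The feature names and values are toy data.

```python
from itertools import combinations
from statistics import mean, pstdev


def zscore(values):
    """Standardize to mean 0 and SD 1 (population SD; an assumption)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd for v in values]


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def select_representatives(features, threshold=0.95):
    """Group features linked by rho > threshold; keep the widest-range one."""
    parent = {name: name for name in features}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(features, 2):
        if spearman(features[a], features[b]) > threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in features:
        clusters.setdefault(find(name), []).append(name)

    def value_range(name):
        return max(features[name]) - min(features[name])

    return sorted(max(members, key=value_range) for members in clusters.values())


# Toy data: "b" is a monotone transform of "a" (rho = 1); "c" is unrelated.
raw = {"a": [1, 2, 3, 4, 5], "b": [2, 4, 6, 8, 30], "c": [5, 1, 4, 2, 3]}
z = {name: zscore(values) for name, values in raw.items()}
kept = select_representatives(z)
print(kept)
```

In this toy example "a" and "b" fall into one cluster, and "a" is kept because its standardized range is wider; "c" forms its own cluster and is also kept.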
Statistical Analysis
Finally, we selected radiomic features using penalized logistic regression under the least absolute shrinkage and selection operator (LASSO) model with 5-fold cross-validation in the training cohort. A rad-score was computed as a linear combination of the selected features, each weighted by its coefficient. The area under the curve (AUC) was calculated in the training cohort using the selected features with a 95% confidence interval (CI). A preoperative clinicopathologic model was established using multivariable logistic regression with the variables that had P-values less than 0.05. We calculated the predictive performance levels of the clinicopathologic and combined clinicopathologic-radiomics models to evaluate the incremental value of the radiomics model via the DeLong test for two receiver operating characteristic curves. Performance was independently evaluated in the validation cohort. Differences in clinicopathologic characteristics between the training and validation cohorts were assessed using the Mann-Whitney U test and the chi-square test.
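To make the rad-score and AUC definitions concrete, the following minimal sketch computes a rad-score as a weighted sum of feature values and estimates the AUC as the proportion of positive-negative patient pairs that are ranked correctly. The coefficients, feature names, and patient values are hypothetical and are not taken from Table 3; the intercept is an assumed optional term, and the LASSO fitting step itself is not shown.

```python
def rad_score(feature_values, coefficients, intercept=0.0):
    """Linear combination of the selected features weighted by their coefficients."""
    return intercept + sum(
        coefficients[name] * feature_values[name] for name in coefficients
    )


def auc(scores, labels):
    """Rank-based AUC: share of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical coefficients for two z-scored features (illustration only).
coefficients = {"firstorder_Kurtosis": 0.8, "glcm_Contrast": -0.3}

# Hypothetical patients: (z-scored feature values, nodal status 1 = metastasis).
patients = [
    ({"firstorder_Kurtosis": 1.2, "glcm_Contrast": -0.5}, 1),
    ({"firstorder_Kurtosis": 0.4, "glcm_Contrast": 0.1}, 1),
    ({"firstorder_Kurtosis": -0.2, "glcm_Contrast": 0.9}, 0),
    ({"firstorder_Kurtosis": 0.6, "glcm_Contrast": 0.2}, 0),
    ({"firstorder_Kurtosis": -1.0, "glcm_Contrast": -0.4}, 0),
]

scores = [rad_score(values, coefficients) for values, _ in patients]
labels = [label for _, label in patients]
toy_auc = auc(scores, labels)
print(round(toy_auc, 3))
```

In this toy set, five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6. Confidence intervals and the DeLong comparison would in practice come from a statistics package, as in the R analysis reported here.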
Statistical analyses were performed using R software version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria; http://www.R-project.org). P-values of less than 0.05 were considered to indicate statistical significance.
Results
Patient Characteristics
The clinicopathologic characteristics of the patients in the training and validation cohorts are summarized in Table 1. Lymph node positivity was present in 30.1% (92 of 306) of the patients in the training cohort and in 32.1% (61 of 190) of those in the validation cohort, which did not constitute a statistically significant difference (P=0.689). Similarly, no significant difference was observed between the training and validation cohorts for any other factor.
Clinicopathologic Predictors and Performance
Based on multivariable logistic regression, the independent preoperative clinicopathologic factors identified as predictors of axillary lymph node metastasis were mass size on US, tumor location (outer, medial, or subareolar), tumor type (ductal, lobular, or other), and multiplicity (Table 2). Age, skin-to-tumor distance, distance from the nipple, and receptor status showed no significant association with lymph node metastasis. The predictive performance of the clinicopathologic model was moderate, with AUC values of 0.760 (95% CI, 0.703 to 0.817) in the training cohort and 0.708 (95% CI, 0.630 to 0.786) in the validation cohort.
Radiomics Model Development and Comparison
Of the 125 features that were originally chosen, 23 were selected in the training cohort using the LASSO logistic regression model (Table 3, Fig. 3). Among the 23 radiomics features, 'first order_kurtosis' was the dominant feature in our radiomics model, as it had the largest coefficient. A clinicopathologic model was developed with four factors: tumor size, location, type, and multiplicity. The radiomics model achieved AUCs of 0.812 (95% CI, 0.760 to 0.864) in the training cohort and 0.831 (95% CI, 0.773 to 0.889) in the validation cohort. Its predictive performance did not differ significantly from that of the clinicopathologic model in the training cohort, but it was significantly better in the validation cohort (P=0.013) (Table 4).
To evaluate the incremental value of the radiomics model, we developed a combined model using the radiomics score and the four aforementioned clinicopathologic factors. The AUC of the combined model was 0.858 (95% CI, 0.814 to 0.902) in the training cohort, which was significantly better than the performance of the clinicopathologic model alone (AUC, 0.760; P=0.007).
When we applied these models to the validation cohort, the AUC of the combined model was 0.810 (95% CI, 0.745 to 0.876). The combined model performed significantly better than the clinicopathologic model in the prediction of axillary lymph node metastasis (AUC, 0.708; P=0.048) (Table 4, Fig. 4).
Discussion
We developed a radiomics model consisting of 23 features selected using LASSO logistic regression and a preoperative clinicopathologic model consisting of four factors (tumor size, location, type, and multiplicity) to predict axillary lymph node metastasis in patients with breast cancer. Because adding the US-based radiomics model significantly improved the predictive performance of the clinicopathologic model, the radiomics model provides additional value in the prediction of axillary lymph node metastasis. This result implies that US-based intratumoral characteristics of primary breast cancer, represented by radiomic features, are associated with axillary lymph node metastasis, although this association has not been clearly identified for conventional US features such as shape, margin, echogenicity, or orientation [10]. In the future, this model may help identify patients who need sentinel lymph node biopsy or axillary dissection before surgery, and it could even potentially indicate which patients require aspiration or core-needle biopsy of lymph nodes at the staging workup.
In our preoperative clinicopathologic model, multivariable logistic regression was used to identify tumor size, tumor location, tumor type, and multiplicity as predictors of axillary lymph node metastasis; these factors have similarly been shown to be predictive factors in previous studies [19]. Based on previous reports, we also added skin-to-tumor distance and the distance from the nipple in the analysis; however, these were not found to be predictive factors in our study [10,20]. Our preoperative clinicopathologic model did not include histologic grade or lymphovascular invasion of the tumor, since this information is obtained after surgery. Instead, data regarding tumor type and hormone receptor status were included, because they could be readily obtained from biopsy results. The internationally-validated nomogram developed by the Memorial Sloan-Kettering Cancer Center from nine variables (age, tumor size, type, location, lymphovascular invasion, nuclear grade, multifocality, estrogen receptor status, and progesterone receptor status) showed slightly higher predictive performance (AUC of 0.71-0.78) than our preoperative clinicopathologic model [21,22]. Because histologic grade and lymphovascular invasion are known to be influential factors, this could be a reason for the slightly lower performance exhibited by our model.
Although the radiomics model performed significantly better than the preoperative clinicopathologic model in the prediction of axillary lymph node metastasis in the validation cohort, it did not exhibit statistically higher performance than the clinicopathologic model in the training cohort. Since radiomic features are developed from intratumoral characteristics only, this model did not contain clinical characteristics or extratumoral information such as posterior shadowing, echogenic halo, or peritumoral distortion beyond the region of interest. Thus, the radiomics model may complement the clinicopathologic model, as the combined model significantly improved the predictive performance of the clinicopathologic model.
Among the few published studies that have used US-based radiomics to predict axillary lymph node metastasis in patients with breast cancer, two have not been verified with a validation cohort, and overfitting remained a problem for those studies [16,17,23]. Recently, Yu et al. [18] analyzed 426 patients (300 in a training cohort and 126 in a validation cohort) and found the combined model to have additional value over the clinical model. In that study, the dominant radiomic feature was first-order kurtosis, which aligned with our results. The conclusion of that study was also consistent with ours; however, its clinical model consisted of age, mass size, and US-reported lymph node status, and the known predictors of tumor location and multiplicity were not included in its analysis [18]. Additionally, more than 40% of patients in the study by Yu et al. [18] were found to have axillary lymph node metastasis, which was higher than the percentage observed in our study (30.8%, 153 of 496). The incidence of axillary metastases in patients with invasive breast cancer was previously reported to be 30%-40%, but this value has decreased gradually because the size of detected breast cancer has decreased since regular cancer screening has been established [24-26]. The relatively low proportion of patients with axillary lymph node metastasis in the present study may be a reflection of clinical practice, which may have been facilitated by our use of two different US machines operated by 10 radiologists.
This study has several limitations, the most notable of which is its retrospective single-institution design. Future multicenter studies, ideally with prospective data collection obtained via population-based screening, are warranted to confirm our findings. Second, we utilized the results of fine-needle lymph node aspiration in the 72 patients who received neoadjuvant chemotherapy, since surgical pathology is affected by chemotherapy. Fine-needle aspiration has been found to have high diagnostic performance, but its performance may still be lower than that of surgical biopsy. Third, we could not include information regarding the palpability of axillary lymph nodes in the clinicopathologic model; although most nodes were recorded as non-palpable (478 of 496; 96.4%) or palpable (15 of 496; 3.0%), palpability was not documented in the electronic medical records for a few patients (3 of 496; 0.6%). We also tried to focus on the clinicopathologic features of primary breast tumors. Finally, we utilized images obtained from different US systems and radiologists. Radiomic features have been reported to be affected by vendor dependency and operator dependency, which may have affected our results.
In conclusion, a radiomics model based on the US features of primary breast cancers showed additional value in the prediction of axillary lymph node metastasis when combined with a preoperative clinicopathologic model.
Notes
Author Contributions
Conceptualization: Kim EK. Data acquisition: Lee SE, Sim Y. Data analysis or interpretation: Kim S, Lee SE. Drafting of the manuscript: Lee SE. Critical revision of the manuscript: Kim EK, Kim S. Approval of the final version of the manuscript: all authors.
No potential conflict of interest relevant to this article was reported.