Current affiliation: Department of Radiology, Yongin Severance Hospital, Yonsei University College of Medicine, Yongin, Korea
The purpose of this study was to evaluate the predictive performance of ultrasonography (US)-based radiomics for axillary lymph node metastasis and to compare it with that of a clinicopathologic model.
A total of 496 patients (mean age, 52.5±10.9 years) who underwent breast cancer surgery between January 2014 and December 2014 were included in this study. Among them, 306 patients who underwent surgery between January 2014 and August 2014 were enrolled as a training cohort, and 190 patients who underwent surgery between September 2014 and December 2014 were enrolled as a validation cohort. To predict axillary lymph node metastasis in breast cancer, we developed a preoperative clinicopathologic model using multivariable logistic regression and constructed a radiomics model using 23 radiomic features selected via least absolute shrinkage and selection operator regression.
In the training cohort, the areas under the curve (AUC) were 0.760, 0.812, and 0.858 for the clinicopathologic, radiomics, and combined models, respectively. In the validation cohort, the AUCs were 0.708, 0.831, and 0.810, respectively. The combined model showed significantly better diagnostic performance than the clinicopathologic model.
A radiomics model based on the US features of primary breast cancers showed additional value when combined with a clinicopathologic model to predict axillary lymph node metastasis.
Axillary lymph node metastasis is an important prognostic factor in patients with breast cancer [
Several nomograms have been published for the preoperative prediction of axillary lymph node metastasis. These include clinical factors and post-biopsy information, such as patient age, tumor size, tumor location, multiplicity, tumor type, and receptor status. Previous studies have also associated several US features of primary tumors with axillary lymph node metastasis [
Thus, the purpose of this study was to evaluate the preoperative predictive performance of US-based radiomics for axillary lymph node metastasis and to compare it with the predictive performance of a clinicopathologic model.
This retrospective study was approved by the institutional review board of Severance Hospital (Seoul, Korea). The requirement for informed consent was waived.
Between January 2014 and December 2014, 793 patients underwent surgery for breast cancer at our institution. The exclusion criteria were as follows: (1) 175 patients with ductal carcinoma
US examinations were performed by 10 radiologists using two different ultrasound machines (iU22, Philips Medical Systems, Bothell, WA, USA; LOGIQ E9, GE Healthcare, Milwaukee, WI, USA) with linear array transducers. If a patient underwent multiple US examinations prior to surgery, we selected the examination performed when the suspicious mass was first detected at our institution. The median interval between the initial US examination and surgery was 15 days (range, 2 to 335 days). A radiologist (E.K.K.) retrospectively reviewed the US images and collected data regarding mass size, tumor location, multiplicity in a single breast, and skin-to-tumor distance.
We reviewed post-biopsy pathologic reports to investigate cancer type (ductal, lobular, or other) and estrogen receptor, progesterone receptor, human epidermal growth factor receptor 2 (HER2), and Ki67 status. Mixed ductal and lobular cancer, mucinous cancer, invasive micropapillary carcinoma, tubular carcinoma, and other mixed types were classified as "other" with regard to cancer type. Estrogen receptor and progesterone receptor positivity were defined as immunoreactivity of 1% or higher for tumor cell nuclei, and Ki67 positivity was defined as immunoreactivity of 14% or higher. In cases of equivocal HER2 overexpression, an amplification ratio of 2 or higher on fluorescence
A radiologist with 1 year of experience in breast imaging (S.E.L.) selected one axial image among the US images of each breast mass and cropped the image to remove the area containing text annotations. After the image was resampled to a pixel size of 0.2 mm, a region of interest along the mass margin was delineated semi-automatically using MIPAV software version 8.0.2 (NIH, Bethesda, MD, USA; open-source,
A radiologist with 4 years of experience in data science (S.K.) extracted features from the mask files using Pyradiomics software (version 2.0.0, open-source,
Finally, we selected radiomic features using penalized logistic regression under the least absolute shrinkage and selection operator (LASSO) model with 5-fold cross-validation in the training cohort. A rad-score was computed as a linear combination of the selected features, each weighted by its coefficient. The area under the curve (AUC) with a 95% confidence interval (CI) was calculated in the training cohort using the selected features. A preoperative clinicopathologic model was established using multivariable logistic regression with the variables that had P-values less than 0.05. To evaluate the incremental value of the radiomics model, we compared the predictive performance of the clinicopathologic model with that of the combined clinicopathologic-radiomics model using the DeLong test for two receiver operating characteristic curves. Performance was independently evaluated in the validation cohort. Differences in clinicopathologic characteristics between the training and validation cohorts were assessed using the Mann-Whitney U test and the chi-square test.
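The model comparison described above can be illustrated with a minimal, dependency-free sketch of the AUC and the DeLong test for two correlated receiver operating characteristic curves. This is not the code used in the study (the analyses were run in R); the function names are hypothetical, labels are assumed to be coded 1 for metastasis and 0 otherwise, and at least two cases per class are assumed.

```python
import math


def _psi(pos_score, neg_score):
    """Mann-Whitney kernel: 1 if the positive case outranks the negative, 0.5 on ties."""
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0


def _components(labels, scores):
    """Return (AUC, V10, V01), where V10 and V01 are DeLong's per-case components."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    v10 = [sum(_psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, q) for p in pos) / len(pos) for q in neg]
    return sum(v10) / len(v10), v10, v01


def auc(labels, scores):
    """Area under the ROC curve, computed as the Mann-Whitney statistic."""
    return _components(labels, scores)[0]


def _sample_var(values):
    """Unbiased sample variance (requires at least two values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test for two models scored on the same cases; returns (z, p)."""
    auc_a, v10_a, v01_a = _components(labels, scores_a)
    auc_b, v10_b, v01_b = _components(labels, scores_b)
    # The variance of the AUC difference equals the variance of the paired
    # component differences, which already accounts for the covariance term.
    d10 = [a - b for a, b in zip(v10_a, v10_b)]
    d01 = [a - b for a, b in zip(v01_a, v01_b)]
    variance = _sample_var(d10) / len(d10) + _sample_var(d01) / len(d01)
    if variance == 0.0:
        raise ValueError("the two score sets rank the cases identically")
    z = (auc_a - auc_b) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return z, p
```

Calling `delong_test(labels, combined_scores, clinicopathologic_scores)` with the predicted probabilities of two models on the same cohort (both argument names are placeholders) would give the z statistic and P-value for the difference between their AUCs.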
Statistical analyses were performed using R software version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria;
The clinicopathologic characteristics of the patients in the training and validation cohorts are summarized in
Based on multivariable logistic regression, the independent preoperative clinicopathologic factors identified as predictors of axillary lymph node metastasis were mass size on US, tumor location (outer, medial, or subareolar), tumor type (ductal, lobular, or other), and multiplicity (
Of the 125 features that were originally chosen, 23 were selected in the training cohort using the LASSO logistic regression model (
To evaluate the incremental value of the radiomics model, we developed a combined model using the radiomics score and the four aforementioned clinicopathologic factors. The AUC of the combined model was 0.858 (95% CI, 0.814 to 0.902) in the training cohort, which was significantly better than the performance of the clinicopathologic model alone (AUC, 0.760; P=0.007).
When we applied these models to the validation cohort, the AUC of the combined model was 0.810 (95% CI, 0.745 to 0.876). The combined model performed significantly better than the clinicopathologic model in the prediction of axillary lymph node metastasis (AUC, 0.708; P=0.048) (
We developed a radiomics model consisting of 23 features selected using LASSO logistic regression and a preoperative clinicopathologic model consisting of four factors (tumor size, location, subtype, and multiplicity) to predict axillary lymph node metastasis in patients with breast cancer. As combination with the US-based radiomics model significantly improved the predictive performance of the clinicopathologic model, the radiomics model can be said to provide additional value in the prediction of axillary lymph node metastasis. This result implies that US-based intratumoral characteristics of primary breast cancer, represented by radiomic features, are associated with axillary lymph node metastasis, although this has not been clearly identified in the context of US features such as shape, margin, echogenicity, or orientation [
In our preoperative clinicopathologic model, multivariable logistic regression was used to identify tumor size, tumor location, tumor type, and multiplicity as predictors of axillary lymph node metastasis; these factors have similarly been shown to be predictive factors in previous studies [
Although the radiomics model performed significantly better than the preoperative clinicopathologic model in the prediction of axillary lymph node metastasis in the validation cohort, its performance was not significantly higher than that of the clinicopathologic model in the training cohort. Since radiomic features are derived from intratumoral characteristics only, this model did not contain clinical characteristics or extratumoral information such as posterior shadowing, echogenic halo, or peritumoral distortion beyond the region of interest. Thus, the radiomics model may complement the clinicopathologic model, as the combined model significantly improved the predictive performance of the clinicopathologic model.
Among the few published studies that have used US-based radiomics to predict axillary lymph node metastasis in patients with breast cancer, two have not been verified with a validation cohort, and overfitting remained a problem for those studies [
This study has several limitations, the most notable of which is its retrospective single-institution design. Future multicenter studies, ideally with prospective data collection obtained via population-based screening, are warranted to confirm our findings. Second, we utilized the results of fine-needle lymph node aspiration in the 72 patients who received neoadjuvant chemotherapy, since surgical pathology is affected by chemotherapy. Fine-needle aspiration has been found to have high diagnostic performance, but it may still be lower than that of surgical biopsy. Third, we could not include information regarding the palpability of axillary lymph nodes in the clinicopathologic model; although most nodes were specified as non-palpable (478 of 496; 96.4%) or palpable (15 of 496; 3.0%), a few (3 of 496; 0.6%) were not identified on the electronic medical records. We also tried to focus on the clinicopathologic features of primary breast tumors. Finally, we utilized images obtained from different US systems and radiologists. Radiomic features have been reported to be affected by vendor dependency and operator dependency, which may have affected our results.
In conclusion, a radiomics model based on the US features of primary breast cancers showed additional value in the prediction of axillary lymph node metastasis when combined with a preoperative clinicopathologic model.
Conceptualization: Kim EK. Data acquisition: Lee SE, Sim Y. Data analysis or interpretation: Kim S, Lee SE. Drafting of the manuscript: Lee SE. Critical revision of the manuscript: Kim EK, Kim S. Approval of the final version of the manuscript: all authors.
No potential conflict of interest relevant to this article was reported.
US, ultrasonography; GLCM, gray-level co-occurrence matrix features; GLRLM, gray-level run-length matrix features; GLSZM, gray-level size-zone matrix features; GLDM, gray-level dependence matrix features; LASSO, least absolute shrinkage and selection operator.
A. The area under the receiver operating characteristic curve (AUC) is plotted against log(λ). Dotted vertical lines mark the optimal values according to the minimum criterion and 1 standard error (SE) of the minimum criterion (1-SE criterion), based on 5-fold cross-validation. B. LASSO coefficient profiles of the 125 features are plotted against the log(λ) sequence. The vertical line marks the selected optimal λ, which resulted in 23 nonzero coefficients.
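The two criteria named in this caption can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the cross-validated AUC values are assumed to be already available for each candidate λ, and the caption does not state which of the two criteria gave the final 23-feature model.

```python
def select_lambda(lambdas, mean_auc, se_auc):
    """Return (lambda_min, lambda_1se) from cross-validated AUC values.

    lambda_min is the penalty with the highest mean cross-validated AUC.
    lambda_1se is the largest penalty (sparsest model) whose mean AUC is
    still within one standard error of that best AUC.
    """
    best = max(range(len(lambdas)), key=lambda i: mean_auc[i])
    threshold = mean_auc[best] - se_auc[best]
    lambda_1se = max(
        lam for lam, score in zip(lambdas, mean_auc) if score >= threshold
    )
    return lambdas[best], lambda_1se
```

The 1-SE criterion trades a small loss in cross-validated AUC for a larger penalty, which leaves fewer nonzero coefficients.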
A. In the training cohort, the areas under the curve (AUC) were 0.760, 0.812, and 0.858 for the clinicopathologic, radiomics, and combined models, respectively. B. In the validation cohort, the AUC values were 0.708, 0.831, and 0.810, respectively.
Clinicopathologic characteristics of the training and validation cohorts
Characteristic | Training cohort (n=306) | Validation cohort (n=190) | P-value |
---|---|---|---|
Axillary LN metastasis | 92 (30.1) | 61 (32.1) | 0.689 |
Age (y) | 50 (45-60) | 52 (46-59) | 0.307 |
Mass size on US (mm) | 16 (11-22) | 16 (11-23) | 0.663 |
Skin-to-tumor distance (mm) | 6 (4-8) | 7 (4-9) | 0.238 |
Distance from nipple (cm) | 3 (2-5) | 3 (2-4) | 0.603 |
Tumor location | 0.147 | ||
Outer | 182 (59.4) | 126 (66.3) | |
Medial | 109 (35.6) | 60 (31.6) | |
Center | 15 (4.9) | 4 (2.1) | |
Tumor type | 0.443 | ||
Ductal | 255 (83.3) | 156 (82.1) | |
Lobular | 8 (2.6) | 9 (4.7) | |
Other | 43 (14.1) | 25 (13.2) | |
Multiplicity | 68 (22.2) | 48 (25.3) | 0.447 |
ER-positive | 235 (76.8) | 145 (76.3) | 0.913 |
PR-positive | 142 (46.4) | 87 (45.8) | 0.926 |
HER2-positive | 39 (12.8) | 25 (13.2) | 0.891 |
Ki67-positive | 106 (34.6) | 78 (41.1) | 0.153 |
Neoadjuvant chemotherapy | 39 (12.7) | 33 (17.4) | 0.190 |
Histologic grade | 267 | 157 | 0.236 |
1 | 76 (28.5) | 55 (35.0) | |
2 | 134 (50.2) | 66 (42.0) | |
3 | 57 (21.3) | 36 (22.9) | |
Lymphovascular invasion | 20 (7.5) | 11 (7.0) | >0.99 |
Values are presented as number (%) or median (interquartile range).
LN, lymph node; US, ultrasonography; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2.
The "other" tumor type includes mixed ductal and lobular cancer (in the training and validation cohorts, n=17 and n=5, respectively), mucinous cancer (n=7 and n=9, respectively), tubular carcinoma (n=9 and n=7, respectively), invasive micropapillary carcinoma (n=5 and n=2, respectively), and others (n=5 and n=2, respectively).
Histologic grade and lymphovascular invasion were analyzed in patients who did not receive neoadjuvant chemotherapy.
Preoperative clinicopathologic predictors of axillary lymph node metastasis
Variable | Metastasis (-) | Metastasis (+) | Univariable P-value | Multivariable P-value | Estimate |
---|---|---|---|---|---|
Age (y) | 50 (44-60) | 51 (45-60) | 0.863 | | |
Mass size on US (mm) | 14 (10-20) | 19 (14-26) | <0.001 | <0.001 | 0.072 |
Skin-to-tumor distance (mm) | 6 (4-9) | 5 (4-8) | 0.297 | | |
Distance from nipple (cm) | 3 (2-5) | 3 (2-5) | 0.980 | | |
Tumor location | | | | | |
Outer | 118 | 64 | | | |
Medial | 84 | 25 | 0.030 | 0.018 | -0.733 |
Subareolar | 12 | 3 | 0.243 | 0.143 | -1.024 |
Tumor type | | | | | |
Ductal | 172 | 83 | | | |
Lobular | 6 | 2 | 0.655 | 0.751 | -0.296 |
Other | 36 | 7 | 0.036 | 0.027 | -1.101 |
Multiplicity | 30 | 38 | <0.001 | <0.001 | 1.450 |
ER-positive | 165 | 70 | 0.847 | | |
PR-positive | 102 | 40 | 0.501 | | |
HER2-positive | 21 | 18 | 0.021 | 0.629 | |
Ki67-positive | 66 | 40 | 0.034 | 0.757 | |
Values are presented as median (interquartile range) or number.
US, ultrasonography; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2.
Variables used in the clinicopathologic model.
The "other" tumor type includes mixed ductal and lobular cancer, mucinous cancer, invasive micropapillary carcinoma, tubular carcinoma, and other mixed types.
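For readers who prefer odds ratios, the estimates in this table can be exponentiated. The sketch below assumes that each estimate is a log-odds coefficient from the multivariable logistic regression and that outer location and ductal type, which have no estimate, are the reference categories. Neither assumption is stated explicitly in the table, and the dictionary keys and function name are illustrative.

```python
import math

# Estimates copied from the table above; labels and reference categories are assumptions.
ESTIMATES = {
    "Mass size on US (per mm)": 0.072,
    "Medial location (vs. outer)": -0.733,
    "Subareolar location (vs. outer)": -1.024,
    "Lobular type (vs. ductal)": -0.296,
    "Other type (vs. ductal)": -1.101,
    "Multiplicity": 1.450,
}


def odds_ratios(estimates):
    """Convert log-odds coefficients to odds ratios."""
    return {name: math.exp(beta) for name, beta in estimates.items()}


for name, ratio in odds_ratios(ESTIMATES).items():
    print(f"{name}: odds ratio {ratio:.2f}")
```

Under these assumptions, a positive estimate corresponds to an odds ratio above 1 (higher odds of metastasis) and a negative estimate to an odds ratio below 1.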
Radiomics features selected via LASSO logistic regression
Feature | Coefficient |
---|---|
Intercept | -1.014046390 |
Shape_Elongation | -0.232204952 |
Firstorder_TotalEnergy | 0.056019024 |
Firstorder_Kurtosis | -0.530412652 |
Firstorder_Maximum | -0.012711923 |
Firstorder_RootMeanSquared | -0.118752125 |
GLRLM_RunLengthNonUniformity | 0.315401837 |
GLRLM_ShortRunEmphasis | -0.281044343 |
GLSZM_ZoneVariance | -0.011315798 |
GLSZM_LargeAreaLowGrayLevelEmphasis | -0.001872964 |
GLSZM_LowGrayLevelZoneEmphasis | 0.305594347 |
GLSZM_SmallAreaEmphasis | -0.226977032 |
Wavelet.LH_firstorder_Kurtosis | 0.059095450 |
Wavelet.LH_firstorder_Median | -0.241255217 |
Wavelet.LH_firstorder_Skewness | -0.235081848 |
Wavelet.LH_GLCM_Correlation | 0.234716517 |
Wavelet.LH_GLCM_Imc1 | 0.012029529 |
Wavelet.LH_GLSZM_LargeAreaHighGrayLevelEmphasis | -0.037115355 |
Wavelet.HL_GLCM_Imc1 | 0.059703461 |
Wavelet.HH_firstorder_Median | -0.319908947 |
Wavelet.HH_GLCM_Imc1 | 0.066029891 |
Wavelet.LL_GLRLM_LongRunLowGrayLevelEmphasis | 0.097798404 |
Wavelet.LL_GLDM_SmallDependenceLowGrayLevelEmphasis | 0.009686394 |
Wavelet.LL_GLDM_DependenceEntropy | 0.193182701 |
LASSO, least absolute shrinkage and selection operator; GLRLM, gray-level run-length matrix; GLSZM, gray-level size-zone matrix; GLCM, gray-level co-occurrence matrix; GLDM, gray-level dependence matrix.
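As an illustration of how a rad-score could be computed from this table, the sketch below forms the linear combination of the 23 selected features with their coefficients. Several points are assumptions rather than details given in the text: that the intercept is included in the score, that the feature values are scaled in the same way as when the model was fitted, and that a logistic transform maps the score to a probability. The function names are hypothetical.

```python
import math

INTERCEPT = -1.014046390

# Coefficients copied from the table above.
COEFFICIENTS = {
    "Shape_Elongation": -0.232204952,
    "Firstorder_TotalEnergy": 0.056019024,
    "Firstorder_Kurtosis": -0.530412652,
    "Firstorder_Maximum": -0.012711923,
    "Firstorder_RootMeanSquared": -0.118752125,
    "GLRLM_RunLengthNonUniformity": 0.315401837,
    "GLRLM_ShortRunEmphasis": -0.281044343,
    "GLSZM_ZoneVariance": -0.011315798,
    "GLSZM_LargeAreaLowGrayLevelEmphasis": -0.001872964,
    "GLSZM_LowGrayLevelZoneEmphasis": 0.305594347,
    "GLSZM_SmallAreaEmphasis": -0.226977032,
    "Wavelet.LH_firstorder_Kurtosis": 0.059095450,
    "Wavelet.LH_firstorder_Median": -0.241255217,
    "Wavelet.LH_firstorder_Skewness": -0.235081848,
    "Wavelet.LH_GLCM_Correlation": 0.234716517,
    "Wavelet.LH_GLCM_Imc1": 0.012029529,
    "Wavelet.LH_GLSZM_LargeAreaHighGrayLevelEmphasis": -0.037115355,
    "Wavelet.HL_GLCM_Imc1": 0.059703461,
    "Wavelet.HH_firstorder_Median": -0.319908947,
    "Wavelet.HH_GLCM_Imc1": 0.066029891,
    "Wavelet.LL_GLRLM_LongRunLowGrayLevelEmphasis": 0.097798404,
    "Wavelet.LL_GLDM_SmallDependenceLowGrayLevelEmphasis": 0.009686394,
    "Wavelet.LL_GLDM_DependenceEntropy": 0.193182701,
}


def rad_score(features):
    """Linear combination of the selected features (intercept included by assumption)."""
    missing = set(COEFFICIENTS) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return INTERCEPT + sum(
        coef * features[name] for name, coef in COEFFICIENTS.items()
    )


def predicted_probability(score):
    """Logistic transform of the rad-score (an assumption, not stated in the text)."""
    return 1.0 / (1.0 + math.exp(-score))
```

Here `features` is a mapping from the feature names in the table to the values extracted for one lesion. A higher rad-score corresponds to a higher predicted likelihood of axillary lymph node metastasis under these assumptions.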
Comparison of predictive performance between the models in the training and validation cohorts
Model | AUC (95% CI), training cohort | AUC (95% CI), validation cohort |
---|---|---|
Clinicopathologic model | 0.760 (0.703-0.817) | 0.708 (0.631-0.786) |
Radiomics model | 0.812 (0.760-0.864) | 0.831 (0.773-0.889) |
P-value (radiomics vs. clinicopathologic) | 0.184 | 0.013 |
Combined model | 0.858 (0.814-0.902) | 0.810 (0.745-0.876) |
P-value (combined vs. clinicopathologic) | 0.008 | 0.048 |
AUC, area under the receiver operating characteristic curve; CI, confidence interval.
The P-value below the radiomics model is for the comparison between the clinicopathologic model and the radiomics model; the P-value below the combined model is for the comparison between the clinicopathologic model and the combined model.