Application of a quality threshold to improve liver shear wave elastography measurements in free-breathing pediatric patients
Article information
Abstract
Purpose
This study assessed the benefits of quality threshold (QT) implementation for liver shear wave elastography (SWE) in children during free breathing.
Methods
The QT, which adjusts the SWE map display based on shear wave quality, was set at 55%. Phantom measurements (PMs) were taken with a fixed probe using QT (termed PM-1); a moving probe without QT (PM-2); and a moving probe with QT (PM-3). Each measurement was subjected to random samplings of various sizes. Clinical measurements (CMs) were obtained from children with biliary atresia using the following protocols: CM-1, manually defined regions of interest (ROIs); CM-2, default ROIs without QT; and CM-3, default ROIs with QT. Elasticity measurements were compared across fibrosis grades, and color patterns on the SWE maps were analyzed.
Results
In the phantom experiments, the moving probe produced lower elasticity measurements; this difference decreased upon QT application. With the moving probe, random sampling indicated fewer interquartile range-to-median ratios exceeding 30% upon QT application (4% vs. 14% when five values were sampled, P=0.004). In clinical experiments, QT improved the differentiation of fibrosis grade in patients over 5 years old, with a significant difference between moderate and severe fibrosis (P=0.004). Elasticity variability was positively correlated with fibrosis grade (τ=0.376, P<0.001). Certain apparent errors, termed artificial stripe patterns, were not eliminated by QT.
Conclusion
Applying QT to exclude low-quality pixels can minimize measurement error and improve differentiation of liver fibrosis grades. The presence of an artificial stripe pattern on the SWE map may indicate images requiring exclusion.
Introduction
Liver fibrosis can develop in pediatric patients due to a variety of medical conditions, including biliary atresia, Wilson disease, non-alcoholic fatty liver disease, and Fontan-associated liver disease. Chronic liver disease can lead to persistent liver inflammation, which may ultimately result in liver fibrosis [1]. Clinicians must assess and monitor the progression of liver fibrosis to prevent the disease from advancing and avoid potential complications [2]. Since liver diseases can present early in life, non-invasive monitoring of liver fibrosis is crucial, as surveillance is a lifelong necessity for pediatric patients following their initial diagnosis. Although biopsy is considered the gold standard for diagnosis, its invasive nature poses challenges for pediatric patients, particularly due to the need for anesthesia. Moreover, it is not feasible to perform a biopsy at every follow-up [3,4].
Fortunately, non-invasive evaluation methods such as ultrasound (US) and magnetic resonance imaging are available to assess liver fibrosis. US shear wave elastography (SWE) is a well-established imaging technique that provides a rapid and straightforward assessment of liver stiffness. In both adults and children with various chronic liver diseases, SWE can be applied during abdominal US, without the need for sedation [5]. For pediatric patients, the use of SWE instead of biopsy enables a quantitative assessment of liver fibrosis through non-invasive, real-time US, thereby eliminating the need for anesthesia [6]. SWE also alleviates concerns about sampling error, as measurements can be taken from multiple sites [7].
However, the application of SWE in pediatric patients presents certain challenges, as these individuals are not always capable of holding their breath. The associated movement can lead to motion artifacts, potentially introducing measurement errors. A recent study highlighted the potential for non-negligible errors in children under 5 years old who are unable to maintain a proper breath hold during SWE [8]. To obtain stable SWE values, guidelines recommend performing measurements with patients holding their breath [9]. However, pediatric patients may be unable to cooperate, and attempts to hold their breath can sometimes result in even more erratic, stressed breathing patterns. Several studies have documented a higher failure rate among patients during free breathing, as indicated by a higher coefficient of variation or a greater ratio of the interquartile range to the median (hereafter indicated as "IQR/median") [10,11]. Although the current recommendation is to capture a cine loop and select the most stable image for analysis [9], adhering to this guideline can meaningfully increase the time required to collect an adequate number of images.
To address these issues, various vendors have proposed methods to evaluate the quality of the SWE wave and to minimize motion errors. One such option is the quality threshold (QT), developed by GE HealthCare (Wauwatosa, WI, USA). However, this approach has not been assessed in the context of pediatric elasticity measurements. Furthermore, the acquisition of stable values through multiple measurements, as well as the optimal number of measurements required, remain topics of debate. Consequently, this study aimed to evaluate the effectiveness of QT application for SWE under conditions of free breathing, to explore methods for minimizing motion errors, and ultimately to improve the accuracy and reliability of SWE in the assessment of liver fibrosis in pediatric patients.
Materials and Methods
Compliance with Ethical Standards
This retrospective study was approved by the institutional review board of Severance Hospital (4-2023-1269). The requirement for obtaining informed consent was waived due to the retrospective nature of the study.
Image Acquisition
SWE was performed using a LOGIQ E10 US scanner (GE HealthCare) equipped with a 3.5-MHz convex probe. During the examination, a quality map (Fig. 1, left) and an SWE map (Fig. 1, right) were displayed concurrently to facilitate the acquisition of elasticity measurements. The quality map assesses the integrity of shear wave propagation through pixel-by-pixel analysis of the shape of each wave and displays this information as a two-dimensional color map. Additionally, the average quality value (termed "Q") of the shear wave pixels within a selected region of interest (ROI) on the SWE map is calculated and displayed (Figs. 1, 2), irrespective of whether those pixels are visible on the map. The QT, a parameter that controls which areas are displayed on the SWE map, was set to 55% in accordance with the manufacturer's recommendations. With this setting, lower-quality waves are excluded, and only areas with a calculated quality of at least 55% are shown on the SWE map. If a measurement failed because too few SWE pixels remained after QT application, the acquisition was repeated.
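The effect of the QT on an ROI measurement can be illustrated with a minimal sketch. The scanner's actual processing is proprietary and is not described here, so the function names, the nested-list map layout, and the assumption that the reported elasticity is a plain mean over the displayed ROI pixels are all hypothetical.

```python
from statistics import mean

QT_PERCENT = 55.0  # threshold used in this study


def roi_elasticity(elasticity_kpa, quality_pct, roi_mask, qt=None):
    """Mean elasticity over ROI pixels; if qt is given, keep only pixels with quality >= qt.

    Returns None when no pixel survives, which corresponds to a failed
    measurement that would have to be repeated.
    """
    kept = [
        e
        for e_row, q_row, m_row in zip(elasticity_kpa, quality_pct, roi_mask)
        for e, q, m in zip(e_row, q_row, m_row)
        if m and (qt is None or q >= qt)
    ]
    return mean(kept) if kept else None


def roi_quality(quality_pct, roi_mask):
    """Average quality Q over all ROI pixels, whether or not they are displayed."""
    vals = [
        q
        for q_row, m_row in zip(quality_pct, roi_mask)
        for q, m in zip(q_row, m_row)
        if m
    ]
    return mean(vals) if vals else None


# Synthetic 2x3 maps (illustrative values only, not study data).
elasticity = [[7.0, 7.2, 3.0], [6.8, 2.5, 7.1]]
quality = [[80, 90, 40], [75, 30, 60]]
mask = [[True, True, True], [True, True, True]]

without_qt = roi_elasticity(elasticity, quality, mask)
with_qt = roi_elasticity(elasticity, quality, mask, qt=QT_PERCENT)
q_value = roi_quality(quality, mask)
```

In this toy example the two low-quality pixels pull the unthresholded mean down, whereas the thresholded mean uses only the four pixels at or above 55%.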
Phantom Experiments
A commercially available shear wave liver fibrosis phantom with a published elasticity of 7.84 kPa (Model 039, CIRS Inc., Norfolk, VA, USA) was utilized for the measurements. The probe was secured in a stationary position on a clip stand to simulate the breath-holding state; alternatively, it was manually moved up and down 20-30 times per minute to replicate the respiratory rate of preschool children. This movement was designed to emulate the technician’s efforts to follow the liver’s motion during breathing, thus simulating the free-breathing state. For all methods, circular ROIs with a diameter of 1.5 cm were centered on trapezoidal SWE maps at a depth of 4 cm within a color box measuring 1.5 cm×3.0 cm. Phantom measurements (PM) were conducted using three methods: PM-1, in which a fixed probe was applied along with QT; PM-2, involving a moving probe used without QT; and PM-3, involving a moving probe used with QT (Fig. 1). The data for PM-3 were recalculated using the raw dataset from PM-2. A QT of 55% was applied to both PM-1 and PM-3. Measurements were repeated 200 times for each method.
To determine the optimal number of acquisitions, three, five, or seven values were randomly sampled from each set of 200 measurements (PM-1 to PM-3), with this sampling process repeated 150 times [8,12]. Across those 150 samples, the proportion of IQR/median values exceeding 30% was used to indicate and compare measurement quality.
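The resampling procedure can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code used in the study: sampling without replacement, the "inclusive" quartile definition, the fixed seed, and the function names are all choices made here for the example.

```python
import random
from statistics import median, quantiles


def iqr_over_median(values):
    """Ratio of the interquartile range to the median for one sample."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / median(values)


def high_variability_rate(measurements, sample_size, repeats=150, cutoff=0.30, seed=0):
    """Fraction of random samples whose IQR/median exceeds the cutoff."""
    rng = random.Random(seed)
    flagged = 0
    for _ in range(repeats):
        # Assumption: values are drawn without replacement.
        sample = rng.sample(measurements, sample_size)
        if iqr_over_median(sample) > cutoff:
            flagged += 1
    return flagged / repeats


# Synthetic stand-in for one set of 200 phantom measurements (not study data).
demo_rng = random.Random(1)
synthetic = [max(0.1, demo_rng.gauss(6.5, 1.5)) for _ in range(200)]
rates = {k: high_variability_rate(synthetic, k) for k in (3, 5, 7)}
```

Running the same function on the 200 values from each method would give, per sample size, the proportions that are compared between PM-2 and PM-3.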
Clinical Experiments
All abdominal US images obtained at the authors’ institution in children (defined as patients under 19 years of age) with risk factors for liver fibrosis due to biliary atresia were initially included in the study. These images were captured using the LOGIQ E10 US scanner with a 3.5-MHz convex probe between December 2021 and May 2022. The exclusion criteria were as follows: (1) patients presenting with gross hepatobiliary lesions other than liver fibrosis and (2) patients with laboratory findings indicative of acute cholestasis or hepatitis, which can influence liver elasticity values irrespective of the presence of fibrosis.
Regarding the grading of liver fibrosis, mild fibrosis was characterized by an absence of imaging evidence of portal hypertension. Moderate fibrosis was indicated by the presence of compensated portal hypertension, such as splenomegaly, without signs of complications; in contrast, severe fibrosis was marked by non-compensated portal hypertension with associated complications, including portal hypertensive gastroenteropathy, varices with or without a history of bleeding, or ascites. Patients who had undergone partial splenic embolization were categorized as exhibiting severe fibrosis [13].
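The grading rules above amount to a simple decision order, sketched below. The field names are hypothetical, and splenomegaly stands in for compensated portal hypertension only because it is the example sign named in the text.

```python
from dataclasses import dataclass


@dataclass
class ImagingFindings:
    splenomegaly: bool = False  # example sign of compensated portal hypertension
    gastroenteropathy: bool = False  # portal hypertensive gastroenteropathy
    varices: bool = False  # with or without a history of bleeding
    ascites: bool = False
    partial_splenic_embolization: bool = False  # prior procedure


def fibrosis_grade(f: ImagingFindings) -> str:
    """Return 'mild', 'moderate', or 'severe' following the imaging-based criteria."""
    # Complications, or prior partial splenic embolization, take precedence.
    if f.partial_splenic_embolization or f.gastroenteropathy or f.varices or f.ascites:
        return "severe"
    # Portal hypertension without complications.
    if f.splenomegaly:
        return "moderate"
    # No imaging evidence of portal hypertension.
    return "mild"
```

Checking for complications first ensures that a patient with both splenomegaly and varices is graded as severe rather than moderate.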
All abdominal US examinations, including liver SWE measurements, were performed by two pediatric radiologists with 18 and 10 years of experience (M.J.L. and H.Y., respectively) during the study period. As part of routine abdominal US, SWE data acquisition was performed 10 times for each patient in an intercostal scan targeting liver segment V under free-breathing conditions. Circular ROIs were placed on the trapezoidal SWE maps, centered at a depth of 4 cm within a color box measuring 1.5 cm×1.5 cm for each method. Clinical measurements (CM) entailed three methods of calculating elasticity using the acquired raw data. CM-1 involved manual placement of ROIs of varying sizes and locations to avoid low-quality pixels within the trapezoidal SWE maps. CM-2 employed default ROIs of uniform size (diameter, 1.5 cm) centered at a depth of 4 cm without QT application, and CM-3 employed default ROIs of the same size and location as CM-2, but with QT applied (Fig. 2). In the review of CMs, the color patterns and artifacts on the SWE maps were visually analyzed. Since the patients in this study did not have focal liver lesions, the randomly distributed colors on the SWE map were attributed to heterogeneity due to liver fibrosis. Any area with a distinctly different color, a clear border, and a marked contrast to the surrounding liver parenchyma was considered an artifact.
Statistical Analyses
Statistical analysis was performed using SPSS version 26 (IBM Corp., Armonk, NY, USA). Continuous variables were expressed as means±standard deviations (SDs). Additionally, median values and IQRs were calculated and compared. An IQR/median value exceeding 30% was deemed indicative of high variability. For the phantom experiments, repeated measures analysis of variance (ANOVA) and the McNemar test were employed to assess the impact of applying QT to the moving probe. To determine the optimal number of acquisitions, random sampling was conducted for the phantom experiment; the result from CM-1, in which 14% of patients exhibited an IQR/median above 30%, was referenced. Assuming that the application of QT would decrease the proportion of IQR/median ratios above 30% by 5%, a set of 150 repetitions was deemed necessary according to the McNemar test, with the ratio of intra-participant to inter-participant variability set at 0.6.
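For readers unfamiliar with the McNemar test, the paired comparison of "IQR/median above 30%" flags with and without QT can be sketched as below. The analysis in this study was run in SPSS; the exact binomial form shown here, the function name, and the example counts are assumptions for illustration only.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: pairs flagged as highly variable without QT but not with QT
    c: pairs flagged with QT but not without QT
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts (not taken from the study).
p_example = mcnemar_exact(9, 1)
```

Only discordant pairs enter the test: samples flagged under both conditions, or under neither, carry no information about the effect of QT.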
For the clinical experiments, the t-test, ANOVA, and the Kendall rank correlation were used. P-values below 0.05 were considered to indicate statistical significance.
Results
Phantom Experiments and the Impact of Motion on SWE
The elasticity values obtained for PM-1, PM-2, and PM-3 in the phantom experiments are summarized in Table 1. The mean elasticity values exhibited significant differences when comparing any two of the three methods (PM-1 vs. PM-2, P<0.001; PM-2 vs. PM-3, P<0.001; PM-1 vs. PM-3, P=0.017). The elasticity for PM-1 was measured at 7.01 kPa, which was the closest to the phantom’s official elasticity of 7.84 kPa. The methods involving motion (PM-2 and PM-3) yielded lower elasticity measurements. The discrepancy in elasticity relative to the PM-1 results diminished from 0.61 kPa (7.01 kPa − 6.40 kPa) for PM-2 to 0.24 kPa (7.01 kPa − 6.77 kPa) for PM-3, in which QT was applied to the moving probe.
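Written out, the offsets from the fixed-probe reference are as follows. The symbol Δ is introduced here only for illustration, and the figures are computed from the rounded means reported above.

```latex
\begin{align*}
\Delta_{\text{PM-2}} &= 7.01 - 6.40 = 0.61\ \text{kPa} \\
\Delta_{\text{PM-3}} &= 7.01 - 6.77 = 0.24\ \text{kPa} \\
\Delta_{\text{PM-2}} - \Delta_{\text{PM-3}} &= 0.61 - 0.24 = 0.37\ \text{kPa}
\end{align*}
```

On these rounded values, QT application removed roughly 0.37/0.61, or about 61%, of the motion-related offset relative to PM-1.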
To evaluate the impact of motion on the variability of SWE, the SD values for elasticity were compared. The mean SD was greater for the moving probe without QT (PM-2), at 3.67 kPa, compared with the fixed condition (PM-1) at 1.19 kPa. However, relative to PM-2, the mean SD decreased to 3.17 kPa when QT was applied (PM-3). Additionally, the IQR/median ratio was assessed as an indicator of variability, and this value was less than 30% for all three methods (Table 1).
Random Sampling of Phantom Experiments
When performing random samplings of three, five, and seven values from each set of 200 measurements taken—with this sampling process repeated 150 times per PM—the measured elasticity showed a consistent trend across all sampling sizes for PM-1 through PM-3. The value for PM-1 most closely matched the published elasticity. With the application of QT to the moving probe (that is, transitioning from PM-2 to PM-3), the elasticity reached a value similar to that of PM-1 (Table 2).
When comparing the rate of IQR/median ratios greater than 30% for all sampling sizes, PM-3 generally displayed a lower proportion relative to PM-2. This trend reached statistical significance only when five values were sampled, with the proportion decreasing from 14% (PM-2) to 4% (PM-3) and displaying a P-value of 0.004 (Table 2). When sampling seven values, the proportion decreased from 14% to 6.7%, with borderline significance (P=0.054). The other comparisons did not achieve statistical significance.
Clinical Experiments on Liver Fibrosis Grade
A total of 79 abdominal US studies from 78 patients with liver fibrosis (mean age, 8.9±5.1 years; male-to-female ratio, 44:34) were included in the clinical experiments. This included 39 patients (40 US studies) with mild fibrosis, 32 patients with moderate fibrosis, and seven patients with severe fibrosis (Supplementary Table 1). Four elasticity values for CM-1 were unavailable due to a software error and were thus excluded. Otherwise, the analysis involved no exclusions or missing data. The results are summarized in Table 3.
When comparing clinical measurements of elasticity (CM-1 to CM-3) across liver fibrosis grades, significant differences were observed for all grades, with the exception of moderate and severe fibrosis for CM-1 (P=0.476). For CM-2, a significant difference was noted in the mean SWE values between moderate (7.8±1.9 kPa) and severe (9.8±1.9 kPa) fibrosis (P=0.015). CM-3 demonstrated significantly different mean elasticity values between the moderate (7.9±1.7 kPa) and severe (10.6±2.8 kPa) fibrosis cases (P<0.001) (Supplementary Table 2).
Patients were categorized into two age-based groups: those under 5 years old, comprising 26 patients, and those 5 years old and above, representing 52 patients. No significant difference in liver fibrosis grade was observed between these groups (P=0.298). Subgroup analysis of patients under 5 years of age revealed that all three methods—CM-1, CM-2, and CM-3—demonstrated significant differences only between mild and severe fibrosis (P=0.044, P=0.017, and P=0.003, respectively). Among patients over 5 years old, CM-1 exhibited a significant difference only between mild and moderate fibrosis (P=0.002). For CM-2, significant differences were observed between mild and moderate as well as mild and severe fibrosis (P=0.003 and P=0.001, respectively), but not between moderate and severe fibrosis (P=0.116). CM-3 was the only method that differentiated all fibrosis grades, with significant differences between mild and moderate (P=0.002), moderate and severe (P=0.004), and mild and severe fibrosis (P<0.001) (Table 4, Supplementary Table 3).
Clinical Experiments on SWE Variability and Color Patterns
To evaluate the effects of liver fibrosis grade and age on SWE variability, a correlation analysis was performed. A significant positive correlation was observed between the SD of SWE measurements and liver fibrosis grade (τ=0.376, P<0.001) (Fig. 3). Additionally, the IQR/median ratio increased with advancing fibrosis grade (Table 3). In contrast, no significant correlation was observed between SD and age when age was treated as a continuous variable (P=0.080). However, the SD was significantly higher in the subgroup of patients younger than 5 years compared to those older than 5 years (5.7±3.8 kPa vs. 3.6±2.3 kPa, P=0.014). In the analysis of color patterns and artifacts, artificial stripe patterns were identified, seemingly caused by the time lag associated with the interleaved scan. These errors could not be corrected by QT (Fig. 4).
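The rank correlation between SD and fibrosis grade can be illustrated with a small Kendall τ implementation. The tau-b variant, which corrects for the many ties produced by a three-level grade, is assumed here because the text does not state which variant was used; the data in the example are synthetic.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall tau-b for two equal-length sequences, with tie correction."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: contributes to no term
        elif dx == 0:
            ties_x += 1  # tied in x only
        elif dy == 0:
            ties_y += 1  # tied in y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Synthetic example: grade coded 0 = mild, 1 = moderate, 2 = severe (not study data).
grades = [0, 0, 0, 1, 1, 2]
sd_kpa = [2.1, 3.0, 2.6, 3.4, 2.9, 5.2]
tau = kendall_tau_b(grades, sd_kpa)
```

A positive τ means that patients with a higher fibrosis grade tend to show a larger SD.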
Discussion
Prior studies have attempted to identify the optimal number of SWE acquisitions for pediatric patients. However, no study has yet evaluated the quality of the acquired images or determined the appropriate area for measuring elasticity with SWE in children during free breathing. This study incorporated both phantom data, acquired with motion that simulated respiration, and clinical data from pediatric patients with chronic liver disease. The results demonstrated that applying QT with default ROI positioning reduced measurement variability by excluding poor-quality pixels. Additionally, a new artifact was identified: an artificial stripe pattern on SWE maps that could not be corrected using QT.
When assessing elasticity with SWE, multiple acquisitions are recommended. Additionally, variability should be evaluated using the IQR-to-median ratio of the obtained values to ensure appropriate quality [9]. Several studies have indicated that the variability of SWE measurements may be influenced by factors such as patient movement, age, and fibrosis grade [8,10,11,14]. A previous phantom study that compared elasticity variability between stationary and moving states revealed a significantly higher rate of unreliable measurements and a greater coefficient of variation under the motion condition [11]. Similarly, a study involving pediatric patients demonstrated increased variability in elasticity measurements during free breathing as opposed to breath-holding [10]. These findings align with the present results, as the phantom studies with a moving probe (that is, PM-2 and PM-3) yielded higher SD and IQR/median values compared to PM-1. In contrast, another study evaluated SWE variability by calculating the intraclass correlation coefficient (ICC) for measurements across age groups and numbers of acquisitions, using 15 repetitions as a reference. The results indicated that the breathing condition did not significantly impact measurement agreement. That study noted that the ICC exceeded 0.8 only when measurements were taken seven times in patients younger than 5 years, suggesting that at least seven measurements are recommended for this age group [8]. Along similar lines, the present results revealed a significantly higher SD for patients under 5 years old, although no significant correlation was observed between age and SD when age was treated as a continuous variable. This study also revealed a significant difference in SD and IQR/median values depending on the motion status in the phantom experiments. The impact of motion is an inherent challenge when assessing SWE in pediatric patients or those with cognitive impairments who are unable to control their breathing. Therefore, practitioners must be aware of these effects when performing SWE in such patients.
Fibrosis grade may also impact SWE variability. The present study demonstrated a significant positive correlation between the SD of elasticity and the liver fibrosis grade. This finding aligns with prior research suggesting a positive correlation between SD and METAVIR grade [14]. However, the results contradict a previous report in which liver disease status did not appear to influence measurement agreement [8]. Given that the mean elasticity of the liver disease group in that study was 8.0±2.2 kPa, which is near the normal range, the impact of liver disease on the measurements may have been underestimated. The observed correlation between the variability of elasticity and liver fibrosis grade could stem from the increasing heterogeneity of the liver parenchyma as fibrosis progresses [15]. Consequently, when dealing with young patients, those who struggle with breath control, or those with a high expected fibrosis grade, it is necessary to account for the anticipated variability of elasticity by increasing the number of acquisitions. This study indicated that the application of QT could reduce this variability, improve the differentiation of liver fibrosis, and potentially decrease the total time required for assessment in such cases.
Several vendors have introduced quality assessment methods for SWE [9], including the quality map by Siemens Healthcare (Erlangen, Germany) [16], the confidence map by Philips Healthcare (Best, The Netherlands) [17], the reliability measurement index by Samsung Medison (Seoul, Korea) [18], and the stability index by SuperSonic Imagine (Aix-en-Provence, France) [19]. However, studies evaluating the utility of these techniques are limited. Further research is required for an objective comparison of the various quality assessment methods.
On US images, various artifacts are generated by the physical characteristics of US [20]. To date, no studies or reports have documented the "artificial stripe pattern" artifact observed in the SWE images of the present study. Typically, liver SWE in adults is conducted while the patient holds their breath, minimizing the likelihood of motion-related artifacts. However, this artifact may be encountered in pediatric patients who struggle with breath control. Notably, this artifact cannot be mitigated by QT application. Consequently, an SWE map exhibiting such an artifact should be considered invalid and excluded from analysis. Additional SWE measurements should be obtained to compensate for this loss of data.
This study has several limitations. First, in the phantom experiments, only a single kPa value was used. This may not fully capture the complexity of real liver tissue elasticity, which varies across regions and stages of fibrosis. Second, the study categorized liver fibrosis based solely on imaging findings from the clinical experiments, without histopathological confirmation. This approach may introduce uncertainty in the grading of fibrosis. However, the criteria used to categorize liver fibrosis are easily and objectively applied in a clinical setting, and such a limitation is often unavoidable during routine follow-up in pediatric patients. Third, the study did not evaluate the SWE measurements when artificial stripe patterns were excluded, limiting the understanding of how these patterns affect the results. Fourth, the sample size for the clinical experiments, which included 79 abdominal US studies from 78 patients, was relatively small. Further research with larger samples should consider factors impacting diagnostic accuracy with the use of QT analysis.
Although SWE is frequently used as a non-invasive method in the evaluation of pediatric patients with chronic liver disease, limited research is available on the quality of the images produced. Based on this study, the application of QT to exclude poor-quality pixels is an effective strategy for reducing measurement errors and improving efficiency. This is particularly useful since the ROI can only be assigned a default size and a default position. Additionally, the use of QT was beneficial in differentiating liver fibrosis grades in children, especially patients over 5 years old. The presence of an artificial stripe pattern on the SWE map is believed to result from the interleaved scans and indicates that the image should be excluded due to insufficient data quality.
Notes
Author Contributions
Conceptualization: Lee MJ. Data acquisition: Kamiyama N, Tanigawa S, Yoon H, Lee MJ. Data analysis or interpretation: Kim J, Kamiyama N, Tanigawa S, Yoon H, Lim HJ, Lee MJ. Drafting of the manuscript: Kim J, Kamiyama N. Critical revision of the manuscript: Kamiyama N, Tanigawa S, Yoon H, Lim HJ, Lee MJ. Approval of the final version of the manuscript: all authors.
Conflict of Interest
No potential conflict of interest relevant to this article was reported.
Supplementary Material
References
Key point
Applying a quality threshold when performing shear wave elastography (SWE) can help minimize measurement errors. Using a quality threshold with SWE can also improve the differentiation of fibrosis grades in patients older than 5 years. The presence of an artificial stripe pattern on the SWE map may indicate that the image should be excluded for accurate results.