Ultrasound-guided transient elastography and two-dimensional shear wave elastography for assessment of liver fibrosis: emphasis on technical success and reliable measurements
Abstract
Purpose
This study investigated whether the use of ultrasound (US) guidance in transient elastography (TE) improved the technical success and reliability of liver stiffness (LS) measurements and whether 2-dimensional (2D) shear wave elastography (SWE) provided reliable LS measurements if TE measurements failed.
Methods
In this prospective study, 292 participants (male:female, 189:103; median age, 60 years) with chronic liver disease (CLD) were enrolled. LS was measured via the consecutive use of conventional TE, 2D-SWE, and US-guided TE. The technical success rates and reliable LS measurement rates of the three elastography techniques were compared. The risk factors for TE failure were assessed through univariate and multivariate logistic regression models.
Results
US-guided TE was associated with a higher technical success rate (281 of 292, 96.2%) and a higher reliable measurement rate (266 of 292, 91.1%) than conventional TE (technical success: 256 of 292, 87.7%; reliable measurements: 231 of 292, 79.1%; P<0.001 for both). In participants for whom conventional TE failed, 2D-SWE provided high rates of technical success (36 of 36, 100%) and reliable measurements (30 of 36, 83.3%). TE failure was associated with female sex (odds ratio [OR], 5.85; 95% confidence interval [CI], 1.30 to 26.40), severe reverberation artifacts (OR, 8.79; 95% CI, 3.93 to 19.69), and high skin-to-liver capsule depth (OR, 1.23; 95% CI, 1.09 to 1.39).
Conclusion
US guidance in TE improved the technical success and reliable measurement rates in the assessment of LS in patients with CLD. In participants for whom TE failed, subsequent 2D-SWE successfully delivered reliable LS measurements.
Introduction
Chronic liver disease (CLD) of various etiologies leads to hepatic fibrosis and, eventually, to cirrhosis. The evaluation of liver fibrosis and the diagnosis of cirrhosis in the course of CLD play an important role in determining the prognosis and treatment plan in patients with CLD [1-3]. Additionally, with the increasing use of antiviral therapy, the evaluation of patients' responses to such treatment is becoming increasingly important [4-6]. Liver biopsy is a direct diagnostic test for liver fibrosis, but its invasiveness makes it unsuitable for disease monitoring. Several noninvasive methods involving serum markers and imaging techniques have been used as surrogates [7,8]. To date, transient elastography (TE) is the most validated noninvasive modality for the evaluation of hepatic fibrosis [7]. While TE is relatively inexpensive and widely available, its rates of technical failure and unreliable measurements have been reported to be as high as approximately 10%-15% and 15%, respectively [9-11].
Known risk factors for TE failure include high body mass index (BMI), central obesity, old age, female sex, narrow intercostal spaces, and high liver stiffness (LS) [9,12-16]. However, the potential risk factors related to the sonic window or anatomical structures are poorly understood because TE involves M-mode imaging alone, and imaging findings that may be associated with TE failure are inaccessible [7,9,17]. Previous studies have identified significantly higher technical success rates for two-dimensional shear wave elastography (2D-SWE) than for TE, perhaps due to the use of real-time B-mode guidance in 2D-SWE [10,18]. Recently, an ultrasound (US) scanner equipped with both TE and 2D-SWE (GE LOGIQ S8, GE Healthcare, Milwaukee, WI, USA) has been developed and is being used clinically. In such clinical settings, LS can be measured with TE, US-guided TE, or 2D-SWE depending on the situation, which may make it possible to improve the quality of LS measurements.
Therefore, the purpose of this study was to determine whether US guidance in TE improved the technical success and reliable LS measurement rates and to evaluate whether 2D-SWE provided reliable LS measurements in participants for whom TE failed.
Materials and Methods
Study Population
This prospective study was approved by the Institutional Review Board of Seoul National University Hospital, and written informed consent was obtained from all participants. Participants were enrolled between December 2018 and September 2019 according to the following eligibility criteria: (1) individuals 20 years of age or older; (2) patients who underwent abdominal US scanning under suspicion of diffuse liver disease or chronic hepatitis; (3) potential donors who underwent abdominal US scanning as part of preoperative evaluation for liver transplantation; and (4) individuals who provided written informed consent. Participants who had undergone right hepatectomy were excluded from the study population. For all participants, age, sex, height, weight, and BMI, as well as the presence of underlying liver disease, diabetes mellitus, and severe cardiac dysfunction, were recorded. The aspartate aminotransferase-to-platelet ratio index (APRI) was calculated from the level of aspartate aminotransferase (AST, U/L) and the platelet count (10⁹ cells per liter) using the following formula: APRI = (AST / upper limit of normal range) / platelet count × 100 [19]. The upper limit of the normal range for AST was 40 U/L. The recommended cut-off levels of APRI (0.5 and 2.0) were used to indicate significant fibrosis and cirrhosis, respectively [20].
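As an illustration only, the APRI calculation and the cut-off levels described above can be sketched in Python. The function names and category labels are hypothetical choices made for this sketch; the 40 U/L upper limit of normal and the 0.5 and 2.0 cut-offs are taken from the text.

```python
AST_UPPER_LIMIT_U_PER_L = 40.0  # upper limit of normal for AST stated in the text


def apri(ast_u_per_l: float, platelets_1e9_per_l: float,
         ast_uln: float = AST_UPPER_LIMIT_U_PER_L) -> float:
    """APRI = (AST / upper limit of normal) / platelet count (10^9/L) x 100."""
    if ast_u_per_l < 0 or platelets_1e9_per_l <= 0 or ast_uln <= 0:
        raise ValueError("AST must be >= 0; platelet count and ULN must be > 0")
    return 100.0 * (ast_u_per_l / ast_uln) / platelets_1e9_per_l


def apri_category(score: float) -> str:
    """Map an APRI score to the categories used in this study (labels are illustrative)."""
    if score >= 2.0:
        return "cirrhosis"
    if score >= 0.5:
        return "significant fibrosis"
    return "no significant fibrosis"


if __name__ == "__main__":
    # Hypothetical participant: AST 80 U/L, platelets 100 x 10^9/L
    score = apri(80, 100)
    print(score, apri_category(score))
```

In this sketch, a hypothetical participant with an AST of 80 U/L and a platelet count of 100 × 10⁹/L has an APRI of 2.0 and falls into the cirrhosis category.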
Preparation of Participants for US Examinations
All US examinations were performed using a LOGIQ S8 US system (GE Healthcare) by two radiologists (J.M.L. and H.J.K., with 21 and 8 years of experience in abdominal imaging and 7 and 4 years of experience in US elastography, respectively). All participants fasted for 6 hours before the examination. During the US examination, the participant remained supine with the right arm maximally abducted above the head to stretch the right intercostal muscles. After the first LS measurements were obtained using TE, a standard abdominal US examination was performed; additionally, LS measurements using US-guided TE and 2D-SWE were serially obtained.
Transient Elastography
LS measurements were made with TE (Fibroscan, EchoSens, Paris, France) using a LOGIQ S8 US scanner in two sessions. The standard M probe was used for all participants unless the use of an XL probe was recommended by the scanner [21].
In the first session (conventional LS measurement with TE), LS measurements were taken of the right anterior section of the liver through an intercostal space at its intersection with the midaxillary line in accordance with the manufacturer's recommendation. In the second session (US-guided TE), the probe was placed on the right lobe of the liver under B-mode US guidance so as to avoid any large vessels or interfering structures.
The examination was performed until 10 valid LS measurements were acquired with both conventional and US-guided TE. During the US-guided examination, the following B-mode parameters were documented: the degree of reverberation artifacts, the presence of any large vessel (>3 mm in diameter) in the scan range, the skin-to-liver capsule depth, the presence of cardiac motion or pulsations, and the breath-holding ability of the participant. Reverberation artifacts were graded as severe when the liver was not clearly visible due to acoustic shadowing of the lung, and as mild or moderate when the affected region was ≤4 cm or >4 cm deep relative to the capsule, respectively [22]. The skin-to-liver capsule depth was measured using the electronic caliper of the US scanner. Cardiac motion or pulsations were considered to be present when gross motion of the right lobe of the liver or distension of the right hepatic vein was observed. The breath-holding ability of the participant was classified as excellent (>5 seconds), good (3 to 5 seconds), or poor (<3 seconds).
2D Shear Wave Elastography
Finally, LS measurements were made with 2D-SWE using the same scanner. 2D-SWE utilizes time-aligned sequential tracking and combpush US shear elastography technology [23,24]. Since an adequate B-mode image is required for LS measurements with 2D-SWE [25], a well-visualized 1×1 cm² region of interest was established on the right anterior section of the liver using an intercostal approach [26]. This was accomplished by placing the right arm across the sternum (so that the intercostal space was not strained or stretched) while avoiding large vessels and areas with artifacts, approximately 1.5-2.0 cm away from the Glisson capsule (4-5 cm from the transducer) and less than 6 cm deep relative to the capsule. The intention was to avoid reverberation artifacts and areas of increased subcapsular stiffness (Fig. 1).
While the LS was being measured, participants were instructed to hold their breath for approximately 5-7 seconds. At least 10 valid measurements were obtained for each participant. To reduce recall bias, the summary of the serial measurements taken with each system was not made available to the operator until all examinations were completed. For both the TE and 2D-SWE examinations, the median value of the LS measurements was automatically calculated and expressed in kilopascals (kPa). The interquartile range was calculated using statistical software.
Definition of Technical Failure and Reliable Measurements
Technical failure of TE or 2D-SWE was defined as the failure to acquire three valid LS measurements in 10 trials. Reliable measurements of LS were defined as a set of 10 measurements in which the interquartile range divided by the median LS of the set of measurements was less than 30%.
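The two definitions above can be expressed as a short classification routine. This is a minimal sketch under stated assumptions: the function names are hypothetical, the quartiles are computed with Python's inclusive interpolation method (the quartile method used by the statistical software in this study is not stated), and any session with at least three valid measurements that does not meet the reliability criterion is labeled unreliable.

```python
from statistics import median, quantiles

MIN_VALID = 3          # fewer valid measurements than this counts as technical failure
REQUIRED_VALID = 10    # reliability is judged on a set of 10 valid measurements
MAX_IQR_RATIO = 0.30   # IQR / median must be below 30%


def iqr_over_median(values_kpa):
    """Interquartile range divided by the median (quartile method is an assumption)."""
    q1, _, q3 = quantiles(values_kpa, n=4, method="inclusive")
    return (q3 - q1) / median(values_kpa)


def classify_session(valid_kpa):
    """Classify one elastography session from its valid LS measurements (kPa)."""
    if len(valid_kpa) < MIN_VALID:
        return "technical failure"
    if len(valid_kpa) >= REQUIRED_VALID and iqr_over_median(valid_kpa) < MAX_IQR_RATIO:
        return "reliable"
    return "unreliable"


if __name__ == "__main__":
    # Hypothetical measurement sets, not data from this study
    tight = [5.0, 5.2, 5.1, 4.9, 5.3, 5.0, 5.1, 5.2, 4.8, 5.0]
    scattered = [4, 12, 5, 15, 6, 3, 14, 7, 16, 5]
    print(classify_session(tight), classify_session(scattered), classify_session([5.0, 5.1]))
```

The measurement sets in the example are hypothetical and serve only to exercise the three possible outcomes.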
Outcome
Our primary endpoint was to compare conventional TE and US-guided TE with regard to technical success rates and reliable measurement rates. Our secondary endpoints were factors associated with conventional TE failure, the reliability of 2D-SWE in participants with conventional TE failure, and the correlations between LS values measured with conventional TE, US-guided TE, and 2D-SWE.
Statistical Analysis
The technical success rates and reliable measurement rates of the different elastography techniques were compared using chi-square analysis or the Fisher exact test as appropriate. To determine the risk factors associated with TE failure, both univariate and multivariate analyses were performed using logistic regression models. Factors with P-values ≤0.15 based on the univariate analyses were entered into the multivariate logistic regression models. Receiver operating characteristic curve analysis and the Youden index were used to identify cut-off values for quantitative parameters that maximized the average of sensitivity and specificity for predicting TE failure. The non-parametric Wilcoxon signed-rank test was used to compare the LS values obtained via TE and 2D-SWE. Subsequently, the Pearson correlation coefficient was calculated to determine the level of correlation between the techniques. To assess the level of agreement among the three methods, the intraclass correlation coefficient (ICC) was calculated from the mean LS values. An ICC greater than 0.75 indicated good agreement. A 95% confidence interval (CI) was calculated for each predictive test, and a P-value <0.05 was considered to indicate statistical significance. All statistical analyses were performed using a commercially available software program (MedCalc version 19.1.3, MedCalc Software, Mariakerke, Belgium).
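The cut-off selection by the Youden index can be illustrated with a short sketch. The code below is not the MedCalc procedure used in this study; it is a plain Python approximation that scans every observed value as a candidate cut-off and predicts failure when the value exceeds the cut-off. The function name and the toy depth values are hypothetical.

```python
def youden_cutoff(values, failed):
    """Return (cutoff, J) that maximizes J = sensitivity + specificity - 1.

    A case is predicted to fail when its value is strictly greater than the cutoff.
    """
    n_pos = sum(1 for f in failed if f)
    n_neg = len(failed) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one failed and one successful case")

    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, f in zip(values, failed) if f and v > cutoff)
        true_neg = sum(1 for v, f in zip(values, failed) if not f and v <= cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Hypothetical skin-to-liver capsule depths (mm) and TE failure flags
    depths = [12, 15, 18, 20, 22, 24, 26, 30]
    failed = [False, False, False, False, True, False, True, True]
    print(youden_cutoff(depths, failed))
```

Maximizing J is equivalent to maximizing the average of sensitivity and specificity, since the two quantities differ only by a constant offset and scale.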
Results
A total of 292 participants (male:female, 189:103; median age, 60 years) were enrolled, and TE, US-guided TE, B-mode US, and 2D-SWE were performed for all participants. A majority of the participants had chronic hepatitis B (199 of 292, 68.2%). The median BMI was 24.6 kg/m² (range, 16.0 to 38.2 kg/m²), and the median skin-to-liver capsule depth was 19.0 mm (range, 9.9 to 33.0 mm). The clinical characteristics, B-mode parameters, and breath-holding ability of the participants are presented in Table 1.
Clinical and Imaging Factors Associated with Conventional TE Failure
Conventional TE had a technical failure rate of 12.3% (36 of 292). Of the participants for whom conventional TE failed, 83.3% (30 of 36) exhibited moderate to severe reverberation artifacts, 41.7% (15 of 36) were overweight (BMI ≥25.0 kg/m²), and 19.4% (7 of 36) had large vessels (>3 mm) in the scan range. In univariate analysis, female sex, severe reverberation artifacts, greater skin-to-liver capsule depth, and poor breath-holding capacity were associated with conventional TE failure. Age, BMI, history of diabetes, the presence of large vessels in the scan range, and the presence of cardiac motion or pulsations were not associated with TE failure. Factors that remained significant in the multivariate analysis were sex (odds ratio [OR], 5.85; 95% CI, 1.30 to 26.40), the degree of reverberation artifacts (OR, 8.79; 95% CI, 3.93 to 19.69), and the skin-to-liver capsule depth (OR, 1.23; 95% CI, 1.09 to 1.39) (Table 2).
In the subgroup with a skin-to-liver capsule depth >21.6 mm (cut-off value derived from the receiver operating characteristic curve, n=58), 2D-SWE had a higher technical success rate (56 of 58, 96.6%) and a higher reliable measurement rate (46 of 58, 79.3%) than conventional TE (technical success rate, 63.8% [37 of 58], P<0.001; reliable measurement rate, 51.7% [30 of 58], P=0.002). Similarly, in the subgroup with moderate to severe reverberation artifacts (n=83), the technical success and reliable measurement rates of 2D-SWE (74 of 83 [89.2%] and 64 of 83 [77.1%], respectively) were significantly higher than those of conventional TE (technical success rate, 63.9% [53 of 83], P<0.001; reliable measurement rate, 49.4% [41 of 83], P<0.001).
Comparison of Technical Success Rates and Reliable Measurement Rates between TE and US-Guided TE
US-guided TE was associated with a significantly lower technical failure rate than conventional TE (11 of 292 [3.8%] vs. 36 of 292 [12.3%], P<0.001). US-guided TE (266 of 292, 91.1%) also exhibited a significantly higher reliable measurement rate than conventional TE (231 of 292, 79.1%) (P<0.001) (Table 3). Of the 36 participants for whom conventional TE failed, 72.2% (26 of 36) had US-guided TE that was technically successful (Fig. 2). In a subgroup analysis according to the fibrosis grade as classified by APRI, significant differences in the technical success rates were found between conventional TE and US-guided TE in participants with an APRI <0.5 and an APRI ≥0.5 but <2.0 (no fibrosis and significant fibrosis; P=0.002 and P=0.020, respectively), while no significant difference in technical success rates was observed in participants with APRI ≥2.0 (cirrhosis; P>0.99) (Table 4).
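The comparison of rates reported above can be reproduced approximately with a Pearson chi-square test on a 2×2 table. The following is a sketch only: it treats the two sets of examinations as independent groups and applies no continuity correction, both of which are assumptions of this illustration rather than details confirmed by the text. The function name is hypothetical.

```python
import math


def chi_square_2x2(success_a, n_a, success_b, n_b):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Returns (chi2, p) for comparing two proportions success_a/n_a and success_b/n_b.
    """
    a, b = success_a, n_a - success_a
    c, d = success_b, n_b - success_b
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the 2x2 table sums to zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


if __name__ == "__main__":
    # Technical success: US-guided TE 281 of 292 vs. conventional TE 256 of 292
    print(chi_square_2x2(281, 292, 256, 292))
    # Reliable measurements: US-guided TE 266 of 292 vs. conventional TE 231 of 292
    print(chi_square_2x2(266, 292, 231, 292))
```

Under these assumptions, both comparisons yield P-values below 0.001, in line with the reported results.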
2D-SWE in Participants with Failed/Unreliable TE Measurements
Compared with conventional TE, 2D-SWE was associated with a significantly lower rate of technical failure (6 of 292, 2.1%; P<0.001) and a significantly higher rate of reliable LS measurements (257 of 292, 88.0%; P=0.004). However, no significant difference in technical failure rate was observed between 2D-SWE and US-guided TE (P>0.99).
Among the 36 participants for whom conventional TE failed, 36.1% (13 of 36) had significant fibrosis according to the classification of fibrosis grade by APRI (≥0.5). In those 36 participants, subsequent 2D-SWE had a technical success rate of 100% (36 of 36) and a reliable measurement rate of 83.3% (30 of 36). Among the 61 participants with unreliable measurements on conventional TE examination, 2D-SWE provided high rates of technical success (59 of 61, 96.7%) and reliable measurements (49 of 61, 80.3%).
Correlation of LS Values between Conventional TE, US-Guided TE, and 2D-SWE
Among the 185 participants with reliable LS measurements on all examinations, the mean LS values obtained with conventional TE, US-guided TE, and 2D-SWE were 7.0 kPa, 6.4 kPa, and 6.8 kPa, respectively (Fig. 3). Significant differences were observed between LS values obtained with conventional TE and those obtained with US-guided TE (median difference, 0.3; 95% CI, 0.2 to 0.4; P<0.001), those obtained with conventional TE and 2D-SWE (median difference, 0.7; 95% CI, 0.4 to 0.9; P<0.001), and those obtained with US-guided TE and 2D-SWE (median difference, 1.0; 95% CI, 0.8 to 1.2; P<0.001) (Table 3). Significant positive correlations were observed between conventional TE and US-guided TE (r=0.78, P<0.001), between conventional TE and 2D-SWE (r=0.67, P<0.001), and between US-guided TE and 2D-SWE (r=0.85, P<0.001) (Fig. 4). The ICC of LS measurements with conventional TE, US-guided TE, and 2D-SWE was 0.83 (95% CI, 0.79 to 0.87), indicating good agreement.
Discussion
In our prospective study, US-guided TE provided higher rates of technical success (96.2% vs. 87.7%, P<0.001) and reliable LS measurements (91.1% vs. 79.1%, P<0.001) than conventional TE. Furthermore, 72.2% of participants for whom conventional TE failed were successfully measured with US-guided TE. Since the technical success rate of conventional TE obtained in our study is comparable with that found in the literature, we believe that the subsequent improvement may be attributable to US guidance. As we hypothesized, US guidance can be used to identify and avoid any interfering objects, including large blood vessels, lung air, rib shadows, or reverberation artifacts. For example, if an LS measurement is taken near a large blood vessel without US guidance, the LS value may be affected by stiffness or motion of the vessel wall. Although M-mode guidance for TE can provide such information, it is easier to find hepatic vessels using B-mode US than with M-mode guidance. Furthermore, B-mode US may be helpful to find a suitable sonic window even in patients with narrow intercostal spaces. In addition, as the B-mode image from which LS measurements were obtained can be saved and reviewed in a picture archiving and communication system, US-guided TE may have improved reproducibility over repeated examinations, an advantage that has also been noted for 2D-SWE [27]. Therefore, we infer that B-mode guidance is beneficial for overcoming the limitation of one-dimensional M-mode guidance for conventional TE.
Several previous studies have demonstrated that the risk factors for TE failure include high BMI, followed by old age, central obesity, and female sex [9,12-16,28,29]. Similarly, in our study, technical TE failure was found to be associated with female sex, the presence of severe reverberation artifacts, and a relatively high skin-to-liver capsule depth. However, high BMI was not found to be associated with technical TE failure in our study, likely because of the relatively low overall BMI (median, 24.6 kg/m²; range, 16.0 to 38.2 kg/m²) in the study group (which was composed of Korean patients), as well as the use of XL probes in obese participants in accordance with the recommendation made by the TE system after application of an M probe [30]. In addition, among the B-mode parameters evaluated in the US guidance process, reverberation artifacts and skin-to-liver capsule depth were found to be associated with TE failure. Reverberation artifacts are known technical limitations that hinder accurate measurement in US elastography [31,32]. A previous study showed that when either M or XL probes are used depending on the skin-to-liver capsule depth, a lower measurement failure rate is achieved than when M probes alone are used [33]. However, in the present study, despite the use of an XL probe, skin-to-liver capsule depth was found to be significantly associated with the technical failure of TE.
In this study, the median LS values of different elastography techniques significantly differed from each other. However, the difference in LS values was smaller between conventional TE and US-guided TE (median difference, 0.3) than between conventional TE and 2D-SWE or US-guided TE and 2D-SWE (median differences, 0.66 and 0.99, respectively), which can be explained by the distinct nature of LS measurement in TE and 2D-SWE. Furthermore, LS values obtained with the three methods showed excellent correlations. The correlation between the LS values obtained with TE and 2D-SWE was in agreement with previous studies [15,34]. In participants for whom TE failed, 2D-SWE showed high rates of technical success (100%) and reliable measurement (83.3%). Thus, based on the findings of our study, we believe that 2D-SWE may be applied as an alternative test when TE fails or cannot be used due to ascites or severe obesity. In clinical practice, both TE and 2D-SWE are used for the assessment of liver fibrosis in patients with chronic liver disease. However, until now, despite the high correlation between the two methods in LS measurement, there has been no strong evidence supporting the interchangeability of LS measurements obtained with these methods. Unlike in TE, cut-off values for 2D-SWE in predicting fibrosis grade have not been validated by large studies. Therefore, further large-scale studies evaluating the interchangeability between LS values obtained with TE and 2D-SWE and establishing cut-off values for 2D-SWE in predicting hepatic fibrosis may be required.
Another aspect of US-guided TE examination is the associated clinical workflow and cost-effectiveness. In the evaluation of patients with CLD, at many institutions, B-mode US examination and LS measurements via TE are performed separately in different examination rooms for different purposes. The evaluation of liver echotexture and morphological changes, assessment of complications of portal hypertension (including gastroesophageal varices and ascites), and screening for hepatocellular carcinoma are necessarily performed by B-mode US, often supplemented by portal venous Doppler US [35]. The US system used in our study provided both TE and 2D-SWE in a single unit, thereby allowing simultaneous hepatic B-mode and elastographic assessment. Recently, several studies reported that magnetic resonance elastography and 2D-SWE provide the advantage of obtaining a large sampling volume at one time, resulting in high diagnostic accuracy [11,36]. However, magnetic resonance elastography is expensive and has limited accessibility compared to US-based elastography, and several 2D-SWE techniques require further validation of their diagnostic accuracy in large studies. TE remains the most widely used elastographic technique due to its extensive validation and ease of use as a point-of-care testing method. Additionally, both the American Association for the Study of Liver Diseases and the European Association for the Study of the Liver guidelines recommend TE as an alternative to liver biopsy for chronic hepatitis B [37-39]. Considering the clinical need for US screening for hepatocellular carcinoma and hepatic fibrosis monitoring using TE or 2D-SWE, the inclusive US platform used in the current study may reduce the number of examinations and hospital visits for patients, allow a one-stop examination, and enhance the clinical usability of TE.
Our study had several limitations. First, we did not evaluate the diagnostic performance of each modality in the context of liver fibrosis staging. However, our study aimed to evaluate the added value of US guidance in TE with regard to improving the technical success rate. The added value in diagnostic performance must be evaluated in future studies. Second, we did not evaluate the intraobserver or interobserver reliability of LS measurements. However, good to excellent interobserver and intraobserver reliability have been reported for TE and 2D-SWE [40]. Third, the three examinations were performed serially in a fixed order during the same session. Therefore, the results may have been affected by bias due to possible learning effects. However, TE and 2D-SWE measurements are well established and highly standardized. Thus, bias due to the fixed order and learning effects is expected to have been minimal. Finally, we assessed the 2D-SWE technique using the LOGIQ S8 US system alone, and further studies with other imaging systems are warranted.
In conclusion, US guidance in TE improved the technical success rates and reliable measurement rates in the evaluation of LS in patients with CLD. In participants for whom TE failed, subsequent 2D-SWE successfully delivered reliable LS values.
Notes
Author Contributions
Conceptualization: Lee J, Kang HJ, Yoon JH, Lee JM. Data acquisition: Kang HJ, Yoon J, Lee JM. Data analysis or interpretation: Lee J, Lee JM. Drafting of the manuscript: Lee J, Lee JM. Critical revision of the manuscript: Lee J, Yoon J, Lee JM. Approval of the final version of the manuscript: all authors.
This study was funded by GE Healthcare (Milwaukee, WI, USA).