Intra-individual comparison of liver stiffness measurements by magnetic resonance elastography and two-dimensional shear-wave elastography in 888 patients

Article information

Ultrasonography. 2023;42(1):65-77

Publication date (electronic) : 2022 June 21

doi : https://doi.org/10.14366/usg.22052

Hideo Ichikawa ¹, Eisuke Yasuda ¹, Takashi Kumada ², Kenji Takeshima ³, Sadanobu Ogawa ³, Akikazu Tsunekawa ³, Tatsuya Goto ³, Koji Nakaya ¹, Tomoyuki Akita ⁴, Junko Tanaka ⁴

¹Department of Medical Imaging, Graduate School of Health Science, Suzuka University of Medical Science, Suzuka, Japan

²Department of Nursing, Faculty of Nursing, Gifu Kyoritsu University, Ogaki, Japan

³Department of Imaging Diagnosis, Ogaki Municipal Hospital, Ogaki, Japan

⁴Department of Epidemiology, Infectious Disease Control, and Prevention, Hiroshima University Institute of Biomedical and Health Sciences, Hiroshima, Japan

Correspondence to: Eisuke Yasuda, PhD, Department of Medical Imaging, Graduate School of Health Science, Suzuka University of Medical Science, Suzuka, Mie 510-0293, Japan. Tel. +81-059-383-8991 Fax. +81-059-383-9666 E-mail: ei.yasuda@gmail.com

Received 2022 March 28; Revised 2022 June 20; Accepted 2022 June 21.

Abstract

Purpose

Quantitative elastography methods, such as ultrasound two-dimensional shear-wave elastography (2D-SWE) and magnetic resonance elastography (MRE), are used to diagnose liver fibrosis. The present study compared liver stiffness determined by 2D-SWE and MRE within individuals and analyzed the degree of agreement between the two techniques.

Methods

In total, 888 patients who underwent 2D-SWE and MRE were analyzed. Bland-Altman analysis was performed after both types of measurements were log-transformed to a normal distribution and converted to a common set of units using linear regression analysis for differing scales. The expected limit of agreement (LoA) was defined as the square root of the sum of the squares of 2D-SWE and MRE precision. The percentage difference was expressed as (2D-SWE−MRE)/mean of the two methods×100.

Results

A Bland-Altman plot showed that the bias and upper and lower LoAs (ULoA and LLoA) were 0.0002 (95% confidence interval [CI], -0.0057 to 0.0061), 0.1747 (95% CI, 0.1646 to 0.1847), and -0.1743 (95% CI, -0.1843 to -0.1642), respectively. In terms of percentage difference, the mean, ULoA, and LLoA were -0.5944%, 19.8950%, and -21.0838%, respectively. The calculated expected LoA was 17.1178% (95% CI, 16.6353% to 17.6002%), and 789 of 888 patients (88.9%) had a percentage difference within the expected LoA. The intraclass correlation coefficient of the two methods indicated an almost perfect correlation (0.8231; 95% CI, 0.8006 to 0.8432; P<0.001).

Conclusion

Bland-Altman analysis demonstrated that 2D-SWE and MRE were interchangeable within a clinically acceptable range.

Keywords: Two-dimensional shear-wave elastography; Magnetic resonance elastography; Bland-Altman analysis; Intraclass correlation coefficient; Proton density fat fraction

Introduction

Hepatic fibrosis, a form of scarring that results from repeated liver injury, leads to the accumulation of extracellular matrix components in the liver parenchyma [1]. Fibrosis can progress to cirrhosis and is an important risk factor for hepatocellular carcinoma (HCC) and hepatic failure [2]. An accurate diagnosis of the degree of hepatic fibrosis is essential for patient management, including for predicting the prognosis and monitoring responses to fibrosis therapies.

Liver biopsy is the gold standard for staging hepatic fibrosis. However, it is an invasive procedure and has several disadvantages, including patient reluctance, pain, and hemoperitoneum, and its complications may be life-threatening [3]. These disadvantages limit the role of biopsy for serial monitoring. Furthermore, liver biopsy assesses only about 1/50,000th of the whole liver volume and is thus prone to sampling error and intra- and inter-observer variation [4].

Magnetic resonance elastography (MRE) has emerged as a highly accurate, noninvasive imaging test to measure liver stiffness (LS) and thus quantify liver fibrosis [5]. However, using MRE to test a large number of patients at risk for liver fibrosis is costly and practically difficult. There are also complaints that the MRE examination space is small, the vibrations cause feelings of sickness, and the examination time is too long, all of which limit the feasibility of magnetic resonance imaging (MRI) examinations. Ultrasound-based methods for LS quantification can also assess fibrosis and are quicker to perform. Of these techniques, transient elastography (TE) may be the most widely performed, and it has been extensively investigated [6,7]. Nonetheless, studies comparing MRE and TE have shown that MRE has superior performance [5,8,9].

Two-dimensional shear-wave elastography (2D-SWE) has been introduced as an additional approach for ultrasound-based LS measurement. Unlike TE, 2D-SWE offers real-time simultaneous B-mode visualization of the liver and incorporates flexible placement of larger regions of interest (ROIs), thereby potentially reducing technical failures and providing more robust assessment in challenging cases [10,11]. A relatively limited number of investigations have compared MRE and 2D-SWE [5,9,12], and they yielded heterogeneous results, had limited sample sizes, or did not focus on factors impacting the agreement of measurements obtained by both methods. The aim of this study was thus to perform an intra-individual comparison of LS measurements using MRE and 2D-SWE in a large patient sample, with attention to factors impacting agreement.

Materials and Methods

Compliance with Ethical Standards

This retrospective study was approved by the Institutional Review Board (20200423-5) of Ogaki Municipal Hospital and was carried out in compliance with the Helsinki Declaration. The Institutional Review Board approved this study after the examinations were completed and waived the requirement for further consent.

Study Population

All MRE and 2D-SWE examinations were performed for clinical purposes unrelated to this investigation. At the time of the MRE and 2D-SWE examinations, patients provided written informed consent to use the test results in future clinical research.

At the authors’ institution, both MRE and 2D-SWE were routinely performed in patients with chronic liver disease. Fig. 1 shows a flowchart of patient selection. A retrospective search of patients with suspected liver disease who underwent B-mode ultrasonography identified 2,710 consecutive patients with chronic liver disease who underwent both MRE and 2D-SWE between April 2015 and December 2020. Of these, 1,236 patients underwent MRE and 2D-SWE, performed in either order, within a 3-month window. Additional patients were then excluded for the following reasons: treatment for HCC before MRE or 2D-SWE examination (n=283); presence of liver metastasis (n=20); large-volume ascites (n=18); and severe jaundice (n=10) [6,13]. These exclusions resulted in a study sample of 905 patients (453 women, 452 men; median age, 67 years).

Fig. 1.

Flowchart of patient selection.

HCC, hepatocellular carcinoma.

Alanine aminotransferase (ALT), aspartate aminotransferase (AST), gamma-glutamyl transpeptidase, platelet count, albumin, and total bilirubin were measured in all patients at the time of 2D-SWE. The fibrosis-4 (FIB-4) score, a simple noninvasive serum marker, was calculated using the following formula: age (years)×AST (U/L)/[platelet count (10⁹/L)×√ALT (U/L)] [14].
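As a quick illustration of the FIB-4 calculation, the following is a minimal Python sketch. It uses the standard FIB-4 definition, with the square root of ALT in the denominator. The function name `fib4` and the example patient values are hypothetical and are not taken from the study data.

```python
import math


def fib4(age_years: float, ast_u_l: float, alt_u_l: float, platelets_1e9_l: float) -> float:
    """FIB-4 index = age x AST / (platelet count x sqrt(ALT))."""
    if alt_u_l <= 0 or platelets_1e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


# Hypothetical patient (illustrative values only):
# 67 years, AST 40 U/L, ALT 36 U/L, platelets 200 x 10^9/L
score = fib4(67, 40, 36, 200)
print(round(score, 2))
```

Because ALT enters through its square root, doubling ALT lowers the score by a factor of about 1.41 rather than 2.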

LS Measured by MRE

MRE was performed using a 3.0 T MRI system (Discovery 750, GE Healthcare, Waukesha, WI, USA) with a 32-channel phased-array coil [15]. Patients were placed in the supine position, and a cylindrical passive driver was attached to the right chest wall using a rubber belt. Axial wave images were acquired using a 2D spin-echo planar MRE sequence with the following parameters, as previously reported [16]: repetition time/echo time, 1,000/59.3-236 ms; continuous sinusoidal vibration, 60 Hz; field of view, 42 cm; matrix size, 64×64; flip angle, 90°; section thickness, 7 mm; and four evenly spaced phase offsets and four pairs of 60-Hz trapezoidal motion-encoding gradients with zeroth- and first-order moment nulling along the through-plane direction. The processing was automated and yielded confidence maps of tissue shear stiffness measured in kilopascals (kPa). To assess the proton density fat fraction (PDFF), the examinations also included a modified Dixon sequence with advanced processing (IDEAL IQ, GE Healthcare), using previously described acquisition parameters [16].

Each MRI examination was reviewed by one of two radiologists (S.O. and T.G., who specialized in hepatology and had 10 and 7 years of experience in hepatobiliary imaging, respectively) who were blinded to each patient’s clinical data, including 2D-SWE results. The first radiologist to review the examination assessed whether the case demonstrated technical failure of MRE, as defined according to Wagner et al. [13]. If it did, then the reason for failure was recorded. For the remaining cases, the radiologist placed ROIs on each slice of the MRE magnitude images, including only liver parenchyma and avoiding the liver edge and large blood vessels. The ROIs also excluded portions of the liver in which the phase signal-to-noise ratio (i.e., the ratio of wave amplitude to noise in the wave images) was less than 5. One ROI was placed on each of the four slices, and the mean value was recorded to obtain the LS according to MRE (hereafter, LS_MRE). Advanced fibrosis (≥F3) was defined as an MRE value ≥4.8 kPa [17].

The radiologist who measured LS_MRE also placed ROIs on the in-phase and out-of-phase images to obtain PDFF measures. The steatosis grade was classified as grade 0 for PDFF <5.2%, grade 1 for PDFF ≥5.2% but <11.3%, grade 2 for PDFF ≥11.3% but <17.1%, and grade 3 for PDFF ≥17.1% [17].
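The PDFF cutoffs above map directly onto a small lookup. The sketch below is only illustrative; the function name `steatosis_grade` is hypothetical, and the thresholds are the ones quoted in the preceding paragraph.

```python
def steatosis_grade(pdff_percent: float) -> int:
    """Return steatosis grade 0-3 from PDFF (%), using the cutoffs 5.2, 11.3, and 17.1."""
    if pdff_percent < 0:
        raise ValueError("PDFF cannot be negative")
    if pdff_percent < 5.2:
        return 0
    if pdff_percent < 11.3:
        return 1
    if pdff_percent < 17.1:
        return 2
    return 3


# Illustrative PDFF values (not patient data)
for pdff in (3.9, 5.2, 11.3, 17.1):
    print(pdff, steatosis_grade(pdff))
```

Each lower bound is inclusive, so a PDFF of exactly 5.2% is classified as grade 1.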

In a post hoc consensus review after the completion of the independent LS_MRE interpretations, the radiologists assessed the presence of mild to moderate ascites. LS and PDFF measurements were analyzed by the same two radiologists (S.O. and T.G.), who were blinded to each patient’s clinical data.

LS Measured by 2D-SWE

The 2D-SWE scans were performed using a LOGIQ S8, E9, or E10 ultrasound system (GE Healthcare) with a C1–6-D abdominal convex probe at a frequency of 1–4 MHz. Each scan was performed by one of three ultrasound technologists (non-authors, each with 10–15 years of experience in performing clinical abdominal ultrasound examinations and 3–5 years of experience in performing SWE examinations, including over 200 ultrasound elastography examinations). The operators were unaware of the MRE findings in each patient. Patients were required to have been fasting for at least 4 hours. During the examination, each patient lay in a supine position with the right arm in maximum abduction. Color-coded elasticity maps were generated from the right hepatic lobe using an intercostal approach. The machine automatically performed quality assessment by assessing shear wave propagation and excluding pixels on grayscale B-mode images that were judged to be of low quality due to poor probe contact, motion artifacts (secondary to breathing or heartbeats), and other artifacts (acoustic attenuation, reflection, scattering, and rib shadowing). Stiffness measurements were performed by the scanning technologist at the time of the examination; no scans were repeated specifically for the purpose of this investigation. The scanning technologist placed circular ROIs (approximately 15.0 mm in diameter) in the color box, perpendicular to the liver surface at a depth of at least 2.0 cm from the liver surface, and the value was recorded (expressed in kPa). Measurements were recorded from at least five ROIs, and the median value was calculated as the patient’s LS according to 2D-SWE (hereafter, LS_SWE) [18]. If the ratio between the interquartile range (IQR) and the median of the five (or more) stiffness measurements was higher than 30%, then the measurements were deemed unreliable, and the 2D-SWE examination was classified as a technical failure. 
Advanced fibrosis (≥F3) was defined as a 2D-SWE value ≥8.9 kPa [18].
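The per-patient 2D-SWE summary described above (median of at least five ROIs, IQR/median reliability check, and the 8.9 kPa cutoff) can be sketched as follows. This is a minimal illustration, not the scanner's or the authors' implementation: the function name `summarize_swe` is hypothetical, the ROI values are made up, and the quartile method (`inclusive`) is an assumption, since the source does not say how the IQR was computed.

```python
import statistics


def summarize_swe(roi_kpa, iqr_ratio_limit=0.30, advanced_cutoff_kpa=8.9):
    """Summarize ROI stiffness values (kPa) from one 2D-SWE examination."""
    if len(roi_kpa) < 5:
        raise ValueError("at least five ROI measurements are required")
    median = statistics.median(roi_kpa)
    # Quartile method is an assumption; other conventions give slightly different IQRs.
    q1, _, q3 = statistics.quantiles(roi_kpa, n=4, method="inclusive")
    iqr_ratio = (q3 - q1) / median
    return {
        "ls_swe_kpa": median,
        "iqr_over_median": iqr_ratio,
        # A ratio higher than 30% is treated as a technical failure.
        "reliable": iqr_ratio <= iqr_ratio_limit,
        "advanced_fibrosis": median >= advanced_cutoff_kpa,
    }


# Made-up ROI values in kPa
result = summarize_swe([5.8, 6.1, 6.2, 6.4, 7.0])
print(result)
```

Using the median rather than the mean keeps a single outlying ROI from shifting the reported LS_SWE.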

Fig. 2 shows examples of MRE and 2D-SWE images in representative patients without fibrosis (F0), with moderate fibrosis (F2), and with advanced fibrosis (F4) [19].

Fig. 2.

Two-dimensional shear-wave elastography (2D-SWE) and magnetic resonance elastography (MRE) in three representative patients.

The degree of liver stiffness determined by 2D-SWE and MRE increased with the degree of fibrosis progression, defined as the staging of liver fibrosis in chronic hepatitis C by MRE [19]. The left figure shows no fibrosis (F0, 58-year-old, female, hepatitis C virus [HCV] infection), the middle shows moderate fibrosis (F2, 82-year-old, female, HCV infection), and the right shows advanced fibrosis (F4, 82-year-old, female, HCV infection).

Statistical Analysis

Each pair of LS_MRE and LS_SWE values was compared and analyzed with the Bland-Altman method. Before starting the comparison, a normal probability plot was used to verify that the LS_MRE and LS_SWE values were normally distributed; if not, they were transformed into a normal distribution [20]. In addition, for the subsequent analysis, linear regression was used to convert LS_MRE values to the scale of the LS_SWE values, yielding modified LS_SWE values [21–23]. The agreement between LS_MRE and LS_SWE was assessed by determining bias and precision [24–28]. Bias was calculated as the mean difference between LS_MRE and LS_SWE. The precision (P) of each method was defined as two standard deviations of the difference between replicate measurements of method x (2D-SWE or MRE), and given as a percentage:

$$P = \frac{2 \times \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}}{\bar{X}} \times 100$$

where n is the number of replicated experiments, and $\bar{X}$ is the average of measurements [26]. After the precision of 2D-SWE and MRE was determined, the expected limit of agreement (LoA) was calculated [27] using the following equation:

$$\text{Expected LoA} = \sqrt{P_{\mathrm{SWE}}^2 + P_{\mathrm{MRE}}^2}$$
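The two definitions above can be sketched in a few lines of Python. The sketch takes the precision as twice the standard deviation of replicate measurements (with a 1/n divisor) relative to their mean, in percent, and the expected LoA as the square root of the sum of the squared precisions. The function names and all numbers are hypothetical; the study's own replicate data are not reproduced here.

```python
import math


def precision_percent(replicates):
    """Precision P (%) = 2 x SD of replicates / mean of replicates x 100 (SD with 1/n divisor)."""
    n = len(replicates)
    mean = sum(replicates) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in replicates) / n)
    return 2 * sd / mean * 100


def expected_loa(p_swe, p_mre):
    """Expected limit of agreement (%) from the two methods' precisions (%)."""
    return math.sqrt(p_swe ** 2 + p_mre ** 2)


# Made-up replicate readings in kPa for a single method
p_example = precision_percent([9.0, 10.0, 11.0])

# Made-up precisions (%) for 2D-SWE and MRE
loa_example = expected_loa(12.0, 5.0)
print(round(p_example, 2), loa_example)
```

Because the precisions combine in quadrature, the less precise method dominates the expected LoA.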

Linear regression and Bland-Altman analyses were used to determine the agreement between 2D-SWE and MRE. The percentage difference (% difference) was defined as:

$$\%\ \text{Difference} = \frac{LS_{\mathrm{SWE}} - LS_{\mathrm{MRE}}}{\left(LS_{\mathrm{SWE}} + LS_{\mathrm{MRE}}\right)/2} \times 100$$

The percentage error (PE) for comparing two methods x and y (in this study, 2D-SWE and MRE) was calculated similarly to the precision of replicate measurements. The equation for PE was given as:

$$PE = \frac{2 \times SD\left(X_n - Y_n\right)}{\left(\bar{X} + \bar{Y}\right)/2} \times 100$$

where n is the patient number, and $\bar{X}$ and $\bar{Y}$ are the average values obtained for methods x (2D-SWE) and y (MRE). Finally, the expected LoA was compared with the PE or percentage difference according to the criteria proposed by Critchley and Critchley [25]. The same Bland-Altman analyses were performed according to body mass index (BMI) and PDFF. In addition, the intraclass correlation coefficient (ICC) was calculated. The cutoff values used to interpret the ICC were classified as follows: <0, poor; 0–0.20, slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; and 0.81–1.00, almost perfect [29].
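The percentage difference, the percentage error, and the ICC interpretation bands can be illustrated with a short Python sketch. All function names are hypothetical and the paired values are made up. The use of the sample standard deviation for the paired differences is an assumption, as the source does not state which divisor was used.

```python
import statistics


def percentage_differences(x, y):
    """Per-pair % difference = (x - y) / mean(x, y) x 100."""
    return [(a - b) / ((a + b) / 2) * 100 for a, b in zip(x, y)]


def percentage_error(x, y):
    """PE (%) = 2 x SD of paired differences / mean of the two method means x 100."""
    diffs = [a - b for a, b in zip(x, y)]
    grand_mean = (statistics.fmean(x) + statistics.fmean(y)) / 2
    # Sample SD (n - 1 divisor) is an assumption here.
    return 2 * statistics.stdev(diffs) / grand_mean * 100


def icc_label(icc):
    """Map an ICC value to the agreement categories listed above."""
    if icc < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if icc <= upper:
            return label
    return "almost perfect"


# Made-up paired values on a common scale
x = [6.0, 8.0, 10.0]
y = [5.0, 8.0, 11.0]
pe = percentage_error(x, y)
pct = percentage_differences(x, y)
print(round(pe, 1), [round(p, 1) for p in pct], icc_label(0.8231))
```

The ICC of 0.8231 reported in this study falls in the "almost perfect" band under this classification.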

Statistical significance was defined as a P-value <0.05. All statistical analyses were performed with EZR (version 1.53, Saitama Medical Center, Jichi Medical University, Saitama, Japan), which is a graphical user interface for R (R Foundation for Statistical Computing, Vienna, Austria) [30].

Results

Patient Characteristics

Table 1 shows the baseline characteristics of the study patients. The causes of chronic liver disease were as follows: non-alcoholic fatty liver disease (n=226); alcoholic liver disease (n=57); hepatitis B virus infection (n=99); hepatitis C virus infection (n=420); autoimmune hepatitis (n=19); primary biliary cholangitis (n=13); and others (n=71). The median age was 67 years (IQR, 58 to 75 years), and the median BMI was 23.7 kg/m² (IQR, 21.4 to 26.1 kg/m²). The median LS_MRE was 2.7 kPa (IQR, 2.3 to 3.7 kPa), the median LS_SWE was 6.2 kPa (IQR, 5.0 to 8.0 kPa), and the median PDFF was 3.9% (IQR, 2.0% to 9.7%).

Table 1.

Baseline patient characteristics

The Technical Success Rates of MRE and 2D-SWE

The technical success rate of MRE was 98.9% (895/905). The reasons for MRE technical failure were as follows: no pixel values with a confidence index higher than 95% on the confidence map (n=4), poor breath-holding (n=3), and mild to moderate ascites (n=3). The technical success rate of 2D-SWE was 99.3% (898/905). The LS_SWE measurements were deemed unreliable in seven patients because the ratio between the IQR and median measurements was larger than 30%. The technical success rate was not significantly different between MRE and 2D-SWE (P=0.627). The 888 patients for whom both MRE and 2D-SWE were technically successful were included in subsequent analyses (Fig. 1).

Analysis by the Bland-Altman Method

Supplementary Fig. 1 shows the correlation between LS_MRE and LS_SWE values, indicating substantial agreement (Spearman rank correlation coefficient, 0.786; 95% confidence interval [CI], 0.761 to 0.811; P<0.001) [29]. The normality of the LS_MRE and LS_SWE values was evaluated using a normal probability plot before starting Bland-Altman analysis. Since neither LS_MRE nor LS_SWE values showed a normal distribution, both types of values were log-transformed [20]. The log LS_MRE was converted into the modified log LS_SWE by the following simple linear regression model: modified log LS_SWE=0.4176+0.8193×log LS_MRE [18,19].
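To make the conversion concrete, the sketch below applies the regression model above to a single MRE value. The base of the logarithm is not stated in the text; base 10 is assumed here because it maps the median LS_MRE (2.7 kPa) to a value close to the median LS_SWE (6.2 kPa), whereas the natural logarithm does not. The function name `modified_log_ls_swe` is hypothetical.

```python
import math

INTERCEPT = 0.4176  # regression intercept reported in the text
SLOPE = 0.8193      # regression slope reported in the text


def modified_log_ls_swe(ls_mre_kpa: float) -> float:
    """Convert an MRE stiffness (kPa) to the modified log LS_SWE scale (log10 assumed)."""
    if ls_mre_kpa <= 0:
        raise ValueError("stiffness must be positive")
    return INTERCEPT + SLOPE * math.log10(ls_mre_kpa)


# Median LS_MRE reported in the Results
modified = modified_log_ls_swe(2.7)
back_transformed_kpa = 10 ** modified
print(round(modified, 3), round(back_transformed_kpa, 1))
```

Under the base-10 assumption, the back-transformed value is roughly 5.9 kPa, which is in the same range as the reported median LS_SWE.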

Fig. 3A shows a Bland-Altman plot presenting the difference between modified log LS_SWE and log LS_SWE, along with the plots of both means. The Bland-Altman plot demonstrated that the bias, upper limit of agreement (ULoA), and lower limit of agreement (LLoA) were 0.0002 (95% CI, -0.0057 to 0.0061), 0.1747 (95% CI, 0.1646 to 0.1847), and -0.1743 (95% CI, -0.1843 to -0.1642), respectively. Fig. 3B shows the differences between modified log LS_SWE and log LS_SWE as percentages of values (percentage differences). The mean, ULoA, and LLoA were -0.5944%, 19.8950%, and -21.0838%, respectively. The calculated PE was 21.9647%, which was greater than the calculated expected LoA of 17.1177% (95% CI, 16.6353% to 17.6002%). However, 789 of the 888 patients (88.9%) had a percentage difference within 17.1177%. Fig. 3C illustrates the relationship between the modified log LS_SWE and log LS_SWE values. The ICC was 0.8231, indicating almost perfect agreement (P<0.001) [27]. The various parameters are listed in Supplementary Table 1.
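The bias and limits of agreement reported above follow the usual Bland-Altman construction. The following is a minimal Python sketch, assuming the conventional definition of the limits as the bias ± 1.96 standard deviations of the paired differences (the source does not state the multiplier) and using made-up paired values rather than study data. The function name `bland_altman` is hypothetical.

```python
import statistics


def bland_altman(a, b, z=1.96):
    """Return (bias, lower LoA, upper LoA) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    return bias, bias - z * sd, bias + z * sd


# Made-up paired values on a common (log) scale
modified_vals = [1.0, 2.0, 3.0, 4.0]
measured_vals = [1.1, 1.9, 3.2, 3.8]
bias, lloa, uloa = bland_altman(modified_vals, measured_vals)
print(round(bias, 4), round(lloa, 4), round(uloa, 4))
```

The reported limits (0.1747 and -0.1743) are nearly symmetric about the bias of 0.0002, which is what this construction produces.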

Fig. 3.

Bland-Altman analysis.

A. Bland-Altman plot where differences are presented as units. The Bland-Altman plot demonstrated that the bias, upper limit of agreement (ULoA), and lower limit of agreement (LLoA) were 0.0002 (95% confidence interval [CI], -0.0057 to 0.0061), 0.1747 (95% CI, 0.1646 to 0.1847), and -0.1743 (95% CI, -0.1843 to -0.1642), respectively. B. Bland-Altman plot showing the difference as a percentage (% difference). The figure shows the difference (% difference) between the modified log LS_SWE values and the log LS_SWE values, where the mean, ULoA, and LLoA were -0.5944%, 19.8950%, and -21.0838%, respectively. C. Intraclass correlation coefficients are as follows: r=0.8231; 95% CI, 0.8006 to 0.8432. The relationship between the modified log LS_SWE and log LS_SWE values is shown; the intraclass correlation coefficient (ICC) is 0.8231, indicating almost perfect agreement (P<0.001). MRE, magnetic resonance elastography; 2D-SWE, two-dimensional shear-wave elastography; LS_MRE, liver stiffness according to MRE; LS_SWE, liver stiffness according to 2D-SWE; modified log LS_SWE=0.4176+0.8193×log LS_MRE; % difference=[(modified log LS_SWE−log LS_SWE)/(0.5×(modified log LS_SWE+log LS_SWE))]×100%.

In 29 of the 888 patients (3.3%), the correlation between 2D-SWE and MRE fell outside the 95% CI (Supplementary Fig. 1). Twenty-three patients showed an upward divergence (upper group), six showed a downward divergence (lower group), and 859 showed no divergence (non-divergent group). The percentage of patients with poor-quality ultrasound B-mode images was non-significantly higher in the lower group than in the upper group (100.0% for the former, 43.5% for the latter, P=0.068). A color-coded, heterogeneous shear-wave velocity map overlying the grayscale B-mode image within a selected ROI was examined in 27 of these 29 patients (93.1%). The FIB-4 scores in the lower group (n=6), upper group (n=23), and non-divergent group (n=859) were 3.85 (2.20–5.60), 4.19 (2.70–5.88), and 1.95 (1.24–3.00), respectively, indicating a significant difference between the upper and non-divergent groups (Steel-Dwass test, P<0.001). As shown in Supplementary Table 2, the kappa coefficient for advanced fibrosis evaluated by 2D-SWE and MRE was 0.624 (95% CI, 0.550 to 0.702; P<0.001), indicating substantial agreement [29].

Bland-Altman Analysis According to BMI and Hepatic Steatosis

Table 2 shows the Bland-Altman analysis according to BMI. There were no discrepancies in the percentage difference according to BMI category (<25.0 kg/m², ≥25.0 kg/m² but <30.0 kg/m², or ≥30.0 kg/m²). The bias, percentage difference, and expected LoA increased significantly with BMI (Jonckheere-Terpstra test, P<0.001, P<0.001, and P<0.001, respectively). The ICCs gradually decreased as BMI increased, but indicated almost perfect agreement except for the category of BMI ≥30 kg/m² (Fig. 4).

Table 2.

Bland-Altman analysis in subgroups defined by BMI

Fig. 4.

Intraclass correlation coefficients (ICCs) according to body mass index (BMI).

The correlation between different BMI values is shown: BMI <25.0 kg/m² (n=569) (A); 25.0≤BMI<30.0 kg/m² (n=254) (B); BMI ≥30.0 kg/m² (n=65) (C). The ICCs gradually decreased as BMI increased, but indicated almost perfect agreement except for patients with a BMI ≥30 kg/m². LS_MRE, liver stiffness according to magnetic resonance elastography; LS_SWE, liver stiffness according to two-dimensional shear-wave elastography; CI, confidence interval.

Table 3 shows the Bland-Altman analysis according to hepatic steatosis grade measured by PDFF. The bias, percentage difference, and expected LoA increased significantly with the hepatic steatosis grade (Jonckheere-Terpstra test, P<0.001, P<0.001, and P=0.004, respectively). The ICCs gradually decreased as the steatosis grade increased, but indicated almost perfect agreement in patients with grade 0 and grade 1 steatosis (Fig. 5).

Table 3.

Bland-Altman analysis in subgroups defined by hepatic steatosis

Fig. 5.

Intraclass correlation coefficients (ICCs) according to steatosis grade.

The correlation between different PDFF values is shown: grade 0 (n=512), PDFF <5.2% (A); grade 1 (n=185), 5.2%≤PDFF<11.3% (B); grade 2 (n=91), 11.3%≤PDFF<17.1% (C); grade 3 (n=100), PDFF ≥17.1% (D). LS_MRE, liver stiffness according to magnetic resonance elastography; LS_SWE, liver stiffness according to two-dimensional shear-wave elastography; CI, confidence interval.

Discussion

In this study, Bland-Altman analysis revealed that LS_MRE and LS_SWE showed good agreement. Using correlation as a measure of agreement is problematic because correlation assesses the ordering of the LS_SWE and LS_MRE values and their relative spacing, rather than whether the numbers themselves agree. In this study, the ICC indicated almost perfect agreement (r=0.823). Bland-Altman plots comparing two methods are very useful for assessing agreement. However, if the numbers are incommensurate, it makes no sense to try to determine whether they agree. The two sets of measurements used in Bland-Altman analysis must be normally distributed, and to achieve this, the LS_SWE and LS_MRE values were log-transformed [20]. In addition, when both measures are continuous and are roughly normally distributed, they can be converted to standardized scores [21–23]. Many studies have demonstrated that this transformation is methodologically acceptable [21–23,26,27]. In general, elasticity properties such as the shear modulus (μ) and Young modulus (E) describe the mechanical responses of a medium under shear stress and longitudinal stress, respectively. The Young modulus is the ratio between longitudinal stress and longitudinal strain, and the shear modulus is the ratio between shear stress and shear strain. Since the Poisson ratio (γ) for most soft tissues is very close to that of an incompressible liquid (γ=0.5), the shear modulus and Young modulus in the liver differ by a scaling factor of 3: E=3μ [31]. In previous studies, LS measured by FibroScan and the Aixplorer ultrasound system showed three-fold greater values than those measured by MRE [5,31–33]. However, LS obtained with a LOGIQ ultrasound system showed lower values than those derived with FibroScan or the Aixplorer ultrasound system [34,35].
Previous reports showed that the ratio of LS-SWE to LS-MRE values ranged from 2.10 to 2.95 for FibroScan [8,9,32,36,37], from 2.54 to 6.09 for Aixplorer [5,12] and from 1.94 to 2.23 for LOGIQ [9,36]. FibroScan and 2D-SWE have different shear-wave frequencies, and their respective excitation methods are mechanical force and the acoustic radiation force impulse. In addition, 2D-SWE values differ depending on the device, but the details of device-specific measurement methods have not been published, and this information remains a black box. Therefore, Iijima et al. [38] examined the measured values among instruments from six companies and reported that the correlation between models was good compared to TE, which is generally considered the gold standard for liver fibrosis, and that the diagnosis of fibrosis was comparable regardless of which instrument was used. Furthermore, a conversion equation from 2D-SWE to TE has been developed and can be applied clinically [38].
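The factor of 3 between the Young modulus and the shear modulus discussed above comes from the general isotropic elasticity relation E = 2(1 + γ)μ evaluated at γ = 0.5. A minimal Python sketch of that arithmetic follows; the function name `youngs_from_shear` is hypothetical, and the input is the median LS_MRE from this study used purely as an example.

```python
def youngs_from_shear(mu_kpa: float, poisson_ratio: float = 0.5) -> float:
    """Young modulus E (kPa) from shear modulus mu (kPa): E = 2 x (1 + Poisson ratio) x mu."""
    return 2 * (1 + poisson_ratio) * mu_kpa


# Median LS_MRE (shear stiffness) reported in this study
e_kpa = youngs_from_shear(2.7)
print(round(e_kpa, 2))
```

The theoretical value for 2.7 kPa is 8.1 kPa, which is higher than the observed median LS_SWE of 6.2 kPa; this is consistent with the device-dependent ratios below 3 cited above for LOGIQ systems.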

The 2D-SWE assessments in the present study were performed using the LOGIQ S8, E9, or E10 system (GE Healthcare), and MRE was evaluated using a 3.0-T MRI system (GE Healthcare). This aligns with the report by Iijima et al. [38], and the authors believe that this methodology yielded acceptable results for daily clinical practice.

In the present study, the mean LS_SWE value was 2.197 (±0.479) times higher than the mean LS_MRE value. Based on this finding, the two measurements were converted to the same scale using simple linear regression, described above. Critchley and Critchley [25] reported that the precision and acceptability of one technique can then be compared with those of another, usually a reference method, such as thermodilution for cardiac output. In their article, they provided three objective criteria for deciding whether to accept or reject a new method: (1) a limit of agreement of less than 1 L/min, (2) a percentage limit of agreement of less than 20%, and (3) the finding that over 75% of readings vary from the mean by less than 20%. In this study, the calculated expected LoA was 17.118% (95% CI, 16.6353% to 17.6002%). The objective criterion of 20% provided by Critchley and Critchley was replaced with 17.1178% in the present study, and it was found that criteria 1 and 2 were not satisfied, but criterion 3 was. In practice, 789 of the 888 patients (88.9%) showed a percentage difference within 17.1178%, thus meeting objective criterion 3. Therefore, the interchangeability of 2D-SWE and MRE values was demonstrated by Bland-Altman analysis in this study.
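The check against criterion 3 is simple arithmetic on the counts reported above. A minimal sketch, with a hypothetical function name, using the 75% threshold from Critchley and Critchley as quoted in the text:

```python
def share_within_limit(n_within: int, n_total: int) -> float:
    """Percentage of patients whose % difference lies within the expected LoA."""
    return n_within / n_total * 100


# Counts reported in this study: 789 of 888 patients within the expected LoA
share = share_within_limit(789, 888)
meets_criterion_3 = share > 75.0
print(f"{share:.1f}% within the expected LoA; criterion 3 met: {meets_criterion_3}")
```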

The ICCs indicated almost perfect agreement between LS_SWE and LS_MRE in patients with BMI <30.0 kg/m² and degree of steatosis <11.3%. The ICCs in patients with BMI ≥30.0 kg/m² and the degree of steatosis ≥11.3% were lower, but still indicated substantial agreement. Liver fibrosis assessment by LS_SWE may be unreliable in obese patients, though the recent development of the XL probe for FibroScan has circumvented this problem to a certain extent [39,40]. A C1-6-D abdominal convex probe at a frequency of 1–4 MHz was used in all patients. The differences in ICCs between LS_SWE and LS_MRE values for different BMI values and degrees of hepatic steatosis were deemed to be acceptable. This procedure is likely to be less affected by obesity than other approaches.

In this study, 29 patients demonstrated a discrepancy between 2D-SWE and MRE values, specifically 23 in the upper group and six in the lower group. All six patients in the lower group had poor-quality B-mode images (6/6, 100%), while five had a mixture of red and blue colors on color-coded 2D-SWE maps (5/6, 83.3%). The SWE values were considered to be inaccurate because no problems could be found with the MRE measurements. Of the 23 patients in the upper group, 13 also had poor-quality B-mode images (13/23, 56.5%), and 22 exhibited a mixture of red and blue colors on color-coded 2D-SWE maps (22/23, 95.7%). In addition, the patients in the upper group had significantly higher FIB-4 scores than those in the non-divergent group. The MRE values most likely correctly reflected the fibrosis state in this group because 2D-SWE values are sometimes inaccurate in the presence of advanced fibrosis [41].

The present study has several limitations. First, the 2D-SWE technique is operator-dependent, and simultaneous measurements in the same patient may vary depending on the operator’s expertise. Compared with LS_SWE, LS_MRE has higher repeatability and reproducibility and thus provides more reliable LS measurements [42,43]. In this series, three experienced sonographers with at least 10 years of clinical experience in performing abdominal ultrasound examinations performed 2D-SWE, and there were few technical issues. Second, this study was retrospective in nature. However, almost all consecutive patients with chronic liver disease who visited the authors’ institution underwent MRE, and selection bias was unlikely. Third, few liver biopsies were performed in this study. However, the original aim was to directly compare 2D-SWE and MRE to determine if the two methods are interchangeable. Fourth, there are relatively few individuals in Japan with a BMI ≥30 kg/m², and this study therefore had insufficient power for analyzing this group, in which 2D-SWE is less accurate. Fifth, there were sampling differences between 2D-SWE and MRE. MRE measures a larger area of the liver than 2D-SWE, and this difference should be kept in mind when evaluating the results obtained with both methods.

In conclusion, Bland-Altman analysis demonstrated that LS_SWE and LS_MRE were interchangeable within a clinically acceptable range.

Notes

Author Contributions

Conceptualization: Ichikawa H, Kumada T, Yasuda E. Data acquisition: Ichikawa H, Yasuda E, Kumada T, Takeshima K, Ogawa S, Tsunekawa A, Goto T, Nakaya K, Akita T, Tanaka J. Data analysis or interpretation: Takeshima K, Ogawa S, Tsunekawa A, Goto T, Nakaya K. Drafting of the manuscript: Ichikawa H, Kumada T, Yasuda E. Critical revision of the manuscript: Ichikawa H, Kumada T, Yasuda E. Approval of the final version of the manuscript: all authors.

No potential conflict of interest relevant to this article was reported.

Supplementary Material

Supplementary Table 1.

The results of Bland-Altman analysis (n=888) (https://doi.org/10.14366/usg.22052).

usg-22052-supple1.pdf

Supplementary Table 2.

Correlation of advanced fibrosis defined by 2D-SWE and MRE values (https://doi.org/10.14366/usg.22052).


Supplementary Fig. 1.

Correlation between two-dimensional shear-wave elastography (2D-SWE) and magnetic resonance elastography (MRE). The liver stiffness according to MRE (LS_MRE) and liver stiffness according to 2D-SWE (LS_SWE) values were correlated, showing substantial agreement. The Spearman rank correlation coefficient is 0.786 (95% confidence interval [CI], 0.761 to 0.811; P<0.001). The dashed line shows the 95% CI. Patients with an upward divergence are classified as the upper group (n=23), those with a downward divergence as the lower group (n=6), and those within the 95% CI as the non-divergent group (n=859) (https://doi.org/10.14366/usg.22052).

usg-22052-supple1.pdf

References

1. Bataller R, Brenner DA. Liver fibrosis. J Clin Invest 2005;115:209–218.

2. Bosch FX, Ribes J, Cleries R, Diaz M. Epidemiology of hepatocellular carcinoma. Clin Liver Dis 2005;9:191–211.

3. Bravo AA, Sheth SG, Chopra S. Liver biopsy. N Engl J Med 2001;344:495–500.

4. Poynard T, Lenaour G, Vaillant JC, Capron F, Munteanu M, Eyraud D, et al. Liver biopsy analysis has a low level of performance for diagnosis of intermediate stages of fibrosis. Clin Gastroenterol Hepatol 2012;10:657–663.

5. Yoon JH, Lee JM, Joo I, Lee ES, Sohn JY, Jang SK, et al. Hepatic fibrosis: prospective comparison of MR elastography and US shear-wave elastography for evaluation. Radiology 2014;273:772–782.

6. Dietrich CF, Bamber J, Berzigotti A, Bota S, Cantisani V, Castera L, et al. EFSUMB guidelines and recommendations on the clinical use of liver ultrasound elastography, update 2017 (long version). Ultraschall Med 2017;38:e16–e47.

7. Shiina T, Nightingale KR, Palmeri ML, Hall TJ, Bamber JC, Barr RG, et al. WFUMB guidelines and recommendations for clinical use of ultrasound elastography: Part 1: basic principles and terminology. Ultrasound Med Biol 2015;41:1126–1147.

8. Hsu C, Caussy C, Imajo K, Chen J, Singh S, Kaulback K, et al. Magnetic resonance vs transient elastography analysis of patients with nonalcoholic fatty liver disease: a systematic review and pooled analysis of individual participants. Clin Gastroenterol Hepatol 2019;17:630–637.

9. Furlan A, Tublin ME, Yu L, Chopra KB, Lippello A, Behari J. Comparison of 2D shear wave elastography, transient elastography, and MR elastography for the diagnosis of fibrosis in patients with nonalcoholic fatty liver disease. AJR Am J Roentgenol 2020;214:W20–W26.

10. Osman AM, El Shimy A, Abd El Aziz MM. 2D shear wave elastography (SWE) performance versus vibration-controlled transient elastography (VCTE/fibroscan) in the assessment of liver stiffness in chronic hepatitis. Insights Imaging 2020;11:38.

11. Wei H, Jiang HY, Li M, Zhang T, Song B. Two-dimensional shear wave elastography for significant liver fibrosis in patients with chronic hepatitis B: a systematic review and meta-analysis. Eur J Radiol 2020;124:108839.

12. Yoon JH, Lee JM, Woo HS, Yu MH, Joo I, Lee ES, et al. Staging of hepatic fibrosis: comparison of magnetic resonance elastography and shear wave elastography in the same individuals. Korean J Radiol 2013;14:202–212.

13. Wagner M, Corcuera-Solano I, Lo G, Esses S, Liao J, Besa C, et al. Technical failure of MR elastography examinations of the liver: experience from a large single-center study. Radiology 2017;284:401–412.

14. Sterling RK, Lissen E, Clumeck N, Sola R, Correa MC, Montaner J, et al. Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology 2006;43:1317–1325.

15. Silva AM, Grimm RC, Glaser KJ, Fu Y, Wu T, Ehman RL, et al. Magnetic resonance elastography: evaluation of new inversion algorithm and quantitative analysis method. Abdom Imaging 2015;40:810–817.

16. Tada T, Kumada T, Toyoda H, Sone Y, Takeshima K, Ogawa S, et al. Viral eradication reduces both liver stiffness and steatosis in patients with chronic hepatitis C virus infection who received direct-acting anti-viral therapy. Aliment Pharmacol Ther 2018;47:1012–1022.

17. Imajo K, Kessoku T, Honda Y, Tomeno W, Ogawa Y, Mawatari H, et al. Magnetic resonance imaging more accurately classifies steatosis and fibrosis in patients with nonalcoholic fatty liver disease than transient elastography. Gastroenterology 2016;150:626–637.

18. Abe T, Kuroda H, Fujiwara Y, Yoshida Y, Miyasaka A, Kamiyama N, et al. Accuracy of 2D shear wave elastography in the diagnosis of liver fibrosis in patients with chronic hepatitis C. J Clin Ultrasound 2018;46:319–327.

19. Ichikawa S, Motosugi U, Ichikawa T, Sano K, Morisaka H, Enomoto N, et al. Magnetic resonance elastography for staging liver fibrosis in chronic hepatitis C. Magn Reson Med Sci 2012;11:291–297.

20. Euser AM, Dekker FW, le Cessie S. A practical approach to Bland-Altman plots and variation coefficients for log transformed variables. J Clin Epidemiol 2008;61:978–982.

21. Inayama T, Higuchi Y, Tsunoda N, Uchiyama H, Sakuma H. Associations between abdominal visceral fat and surrogate measures of obesity in Japanese men with spinal cord injury. Spinal Cord 2014;52:836–841.

22. Abu-Arafeh A, Jordan H, Drummond G. Reporting of method comparison studies: a review of advice, an assessment of current practice, and specific suggestions for future reports. Br J Anaesth 2016;117:569–575.

23. Gerke O. Reporting standards for a Bland-Altman agreement analysis: a review of methodological reviews. Diagnostics (Basel) 2020;10:334.

24. Giavarina D. Understanding Bland Altman analysis. Biochem Med (Zagreb) 2015;25:141–151.

25. Critchley LA, Critchley JA. A meta-analysis of studies using bias and precision statistics to compare cardiac output measurement techniques. J Clin Monit Comput 1999;15:85–91.

26. Brandt AH, Olesen JB, Moshavegh R, Jensen JA, Nielsen MB, Hansen KL. Common carotid artery volume flow: a comparison study between ultrasound vector flow imaging and phase contrast magnetic resonance imaging. Neurol Int 2021;13:269–278.

27. Brandt AH. Evaluation of new ultrasound techniques for clinical imaging in selected liver and vascular applications. Dan Med J 2018;65:B5455.

28. Stapel SN, Weijs PJM, Girbes ARJ, Oudemans-van Straaten HM. Indirect calorimetry in critically ill mechanically ventilated patients: Comparison of E-sCOVX with the deltatrac. Clin Nutr 2019;38:2155–2160.

29. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–174.

30. Kanda Y. Investigation of the freely available easy-to-use software 'EZR' for medical statistics. Bone Marrow Transplant 2013;48:452–458.

31. Oudry J, Chen J, Glaser KJ, Miette V, Sandrin L, Ehman RL. Cross-validation of magnetic resonance elastography and ultrasound-based transient elastography: a preliminary phantom study. J Magn Reson Imaging 2009;30:1145–1150.

32. Bensamoun SF, Wang L, Robert L, Charleux F, Latrive JP, Ho Ba Tho MC. Measurement of liver stiffness with two imaging techniques: magnetic resonance elastography and ultrasound elastometry. J Magn Reson Imaging 2008;28:1287–1292.

33. Motosugi U, Ichikawa T, Amemiya F, Sou H, Sano K, Muhi A, et al. Cross-validation of MR elastography and ultrasound transient elastography in liver stiffness measurement: discrepancy in the results of cirrhotic liver. J Magn Reson Imaging 2012;35:607–610.

34. Baldea V, Bende F, Popescu A, Sirli R, Sporea I. Comparative study between two 2D-shear waves elastography techniques for the non-invasive assessment of liver fibrosis in patients with chronic hepatitis C virus (HCV) infection. Med Ultrason 2021;23:257–264.

35. Baldea V, Sporea I, Lupusoru R, Bende F, Mare R, Popescu A, et al. Comparative study between the diagnostic performance of point and 2-D shear-wave elastography for the non-invasive assessment of liver fibrosis in patients with chronic hepatitis C using transient elastography as reference. Ultrasound Med Biol 2020;46:2979–2988.

36. Matos J, Paparo F, Bacigalupo L, Cenderello G, Mussetto I, De Cesari M, et al. Noninvasive liver fibrosis assessment in chronic viral hepatitis C: agreement among 1D transient elastography, 2D shear wave elastography, and magnetic resonance elastography. Abdom Radiol (NY) 2019;44:4011–4021.

37. Park CC, Nguyen P, Hernandez C, Bettencourt R, Ramirez K, Fortney L, et al. Magnetic resonance elastography vs transient elastography in detection of fibrosis and noninvasive measurement of steatosis in patients with biopsy-proven nonalcoholic fatty liver disease. Gastroenterology 2017;152:598–607.

38. Iijima H, Tada T, Kumada T, Kobayashi N, Yoshida M, Aoki T, et al. Comparison of liver stiffness assessment by transient elastography and shear wave elastography using six ultrasound devices. Hepatol Res 2019;49:676–686.

39. Castera L, Foucher J, Bernard PH, Carvalho F, Allaix D, Merrouche W, et al. Pitfalls of liver stiffness measurement: a 5-year prospective study of 13,369 examinations. Hepatology 2010;51:828–835.

40. de Ledinghen V, Vergniol J, Foucher J, El-Hajbi F, Merrouche W, Rigalleau V. Feasibility of liver transient elastography with FibroScan using a new probe for obese patients. Liver Int 2010;30:1043–1048.

41. Bruce M, Kolokythas O, Ferraioli G, Filice C, O'Donnell M. Limitations and artifacts in shear-wave elastography of the liver. Biomed Eng Lett 2017;7:81–89.

42. Xiao G, Zhu S, Xiao X, Yan L, Yang J, Wu G. Comparison of laboratory tests, ultrasound, or magnetic resonance elastography to detect fibrosis in patients with nonalcoholic fatty liver disease: a meta-analysis. Hepatology 2017;66:1486–1501.

43. Yin M, Venkatesh SK. Ultrasound or MR elastography of liver: which one shall I use? Abdom Radiol (NY) 2018;43:1546–1551.


This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted noncommercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Notes

Key point

Many studies have demonstrated that magnetic resonance elastography (MRE) has the same or significantly better diagnostic accuracy than two-dimensional shear-wave elastography (2D-SWE) for detecting fibrosis stages using liver biopsy as a reference. Bland-Altman analysis of 2D-SWE and MRE showed that the mean, upper limit of agreement (LoA), and lower LoA expressed in terms of the percentage difference were -0.5944%, 19.8950%, and -21.0838%, respectively. The calculated expected LoA was 17.1178%, and 789 of 888 patients (88.9%) had a percentage difference within the expected LoA. Bland-Altman analysis demonstrated that 2D-SWE and MRE were interchangeable within a clinically acceptable range.
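As an illustration of the quantities reported above, the following Python sketch computes the percentage difference, the Bland-Altman bias and limits of agreement, and the share of patients whose percentage difference falls within an expected LoA. It is a minimal sketch under stated assumptions, not the analysis code used in this study: it assumes both measurements have already been converted to a common scale, that the limits are taken as mean ± 1.96 SD, and that "within the expected LoA" means an absolute percentage difference no larger than that value. All function names, precision values, and sample values are hypothetical.

```python
import math
import statistics


def percent_differences(swe, mre):
    """Per-patient percentage difference: (SWE - MRE) / mean of the two * 100.

    Assumes both inputs are already expressed on a common scale.
    """
    return [(s - m) / ((s + m) / 2.0) * 100.0 for s, m in zip(swe, mre)]


def bland_altman(diffs):
    """Return (bias, lower LoA, upper LoA), taking the LoA as mean +/- 1.96 SD."""
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def expected_loa(precision_a, precision_b):
    """Square root of the sum of the squared precisions of the two methods."""
    return math.sqrt(precision_a ** 2 + precision_b ** 2)


def fraction_within(diffs, limit):
    """Share of differences whose absolute value is at most `limit` (assumed rule)."""
    return sum(1 for d in diffs if abs(d) <= limit) / len(diffs)


if __name__ == "__main__":
    # Hypothetical values on a common scale; not data from this study.
    swe = [1.00, 1.10, 0.95, 1.20, 1.05]
    mre = [1.02, 1.05, 1.00, 1.10, 1.08]

    diffs = percent_differences(swe, mre)
    bias, lower, upper = bland_altman(diffs)
    limit = expected_loa(12.0, 12.0)  # hypothetical precisions, in percent

    print(f"bias={bias:.2f}%  LoA=({lower:.2f}%, {upper:.2f}%)")
    print(f"expected LoA={limit:.2f}%  within={fraction_within(diffs, limit):.1%}")
```

Applied to the counts reported above, the same proportion is 789/888, or 88.9%.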

	Value
No.	905
Chronic liver disease NAFLD/ALD/HBV/HCV/AIH/PBC/Others	226/57/99/420/19/13/71
Age (year)	67 (58–75)
Sex (female:male)	453 (50.1):452 (49.9)
Body mass index (kg/m²)	23.7 (21.4–26.1)
Alcohol abuse (present:absent)	151 (16.7):754 (83.3)
Smoking (present:absent)	325 (35.9):580 (64.1)
Platelet count (×10⁴/μL)	19.7 (14.9–24.7)
AST (U/L)	27 (21–41)
ALT (U/L)	24 (15–43)
FIB-4 score	1.98 (1.28–3.12)
γ-GT (U/L)	29 (18–56)
Total bilirubin (mg/dL)	0.7 (0.8–0.9)
Albumin (g/dL)	4.4 (4.1–4.6)
MRE (kPa)	2.7 (2.3–3.7)
2D-SWE (kPa)	6.2 (5.0–8.0)
MRI-PDFF (%)	3.9 (2.0–9.7)

	BMI (kg/m²)
	<25.0 (n=569)	≥25.0 and <30.0 (n=254)	≥30.0 (n=65)
Bias^a)	−0.0078	0.0104	0.0287
ULoA	0.1530	0.1618	0.1612
LLoA	−0.1685	−0.1409	−0.1038
% Difference (mean)^a) (LLoA–ULoA)	−1.7351 (−21.9633 to 18.4931)	0.9613 (−17.8331 to 19.7558)	3.3118 (−12.9850 to 19.6086)
Expected LoA (%)^a) (95% CI)	16.3827 (15.8227 to 16.9428)	18.2328 (17.2313 to 19.2342)	19.1951 (17.2878 to 21.1024)
ICC (95% CI)	0.8385 (0.8123 to 0.8613)	0.8181 (0.7729 to 0.8551)	0.7299 (0.5212 to 0.8266)

	PDFF (%)
	Grade 0	Grade 1	Grade 2	Grade 3
	<5.2 (n=512)	≥5.2 and <11.3 (n=185)	≥11.3 and <17.1 (n=91)	≥17.1 (n=100)
Bias^a)	−0.0096	0.0030	0.0092	0.0363
ULoA	0.1525	0.1411	0.1572	0.1698
LLoA	−0.1718	−0.1350	−0.1388	−0.0973
% Difference (mean)^a) (LLoA–ULoA)	−1.8783 (−21.6484 to 17.8915)	−0.1047 (−18.2256 to 18.0162)	0.3546 (−19.0591 to 19.7683)	4.2099 (−12.8750 to 21.2948)
Expected LoA (%)^a) (95% CI)	16.7042 (16.0669 to 17.3416)	17.2076 (16.1394 to 18.2759)	17.7823 (16.2986 to 19.2660)	18.4642 (17.0260 to 19.9024)
ICC (95% CI)	0.8435 (0.8165 to 0.8668)	0.8429 (0.7953 to 0.8801)	0.7287 (0.6150 to 0.8127)	0.7138 (0.6004 to 0.7975)