Quantitative elastography methods, such as ultrasound two-dimensional shear-wave elastography (2D-SWE) and magnetic resonance elastography (MRE), are used to diagnose liver fibrosis. The present study compared liver stiffness determined by 2D-SWE and MRE within individuals and analyzed the degree of agreement between the two techniques.

In total, 888 patients who underwent 2D-SWE and MRE were analyzed. Bland-Altman analysis was performed after both types of measurements were log-transformed to approximate a normal distribution and converted to a common scale using linear regression analysis. The expected limit of agreement (LoA) was defined as the square root of the sum of the squares of the 2D-SWE and MRE precision. The percentage difference was expressed as (2D-SWE−MRE)/mean of the two methods×100.

A Bland-Altman plot showed that the bias and upper and lower LoAs (ULoA and LLoA) were 0.0002 (95% confidence interval [CI], -0.0057 to 0.0061), 0.1747 (95% CI, 0.1646 to 0.1847), and -0.1743 (95% CI, -0.1843 to -0.1642), respectively. In terms of percentage difference, the mean, ULoA, and LLoA were -0.5944%, 19.8950%, and -21.0838%, respectively. The calculated expected LoA was 17.1178% (95% CI, 16.6353% to 17.6002%), and 789 of 888 patients (88.9%) had a percentage difference within the expected LoA. The intraclass correlation coefficient of the two methods indicated an almost perfect correlation (0.8231; 95% CI, 0.8006 to 0.8432; P<0.001).

Bland-Altman analysis demonstrated that 2D-SWE and MRE were interchangeable within a clinically acceptable range.

Many studies have demonstrated that magnetic resonance elastography (MRE) has the same or significantly better diagnostic accuracy than two-dimensional shear-wave elastography (2D-SWE) for detecting fibrosis stages using liver biopsy as a reference. Bland-Altman analysis of 2D-SWE and MRE showed that the mean, upper limit of agreement (LoA), and lower LoA expressed in terms of the percentage difference were -0.5944%, 19.8950%, and -21.0838%, respectively. The calculated expected LoA was 17.1178%, and 789 of 888 patients (88.9%) had a percentage difference within the expected LoA. Bland-Altman analysis demonstrated that 2D-SWE and MRE were interchangeable within a clinically acceptable range.

Hepatic fibrosis, a form of scarring that results from repeated liver injury, leads to the accumulation of extracellular matrix components in the liver parenchyma [

Liver biopsy is the gold standard for staging hepatic fibrosis. However, it is an invasive procedure and has several disadvantages, including patient reluctance, pain, and hemoperitoneum, and its complications may be life-threatening [

Magnetic resonance elastography (MRE) has emerged as a highly accurate, noninvasive imaging test to measure liver stiffness (LS) and thus quantify liver fibrosis [

Two-dimensional shear-wave elastography (2D-SWE) has been introduced as an additional approach for ultrasound-based LS measurement. Unlike TE, 2D-SWE offers real-time simultaneous B-mode visualization of the liver and incorporates flexible placement of larger regions of interest (ROIs), thereby potentially reducing technical failures and providing more robust assessment in challenging cases [

This retrospective study was approved by the Institutional Review Board (20200423-5) of Ogaki Municipal Hospital and was carried out in compliance with the Helsinki Declaration. The Institutional Review Board approved this study after the examinations were completed and waived the requirement for further consent.

All MRE and 2D-SWE examinations were performed for clinical purposes unrelated to this investigation. At the time of the MRE and 2D-SWE examinations, patients provided written informed consent to use the test results in future clinical research.

At the authors’ institution, both MRE and 2D-SWE were routinely performed in patients with chronic liver disease.

Alanine aminotransferase (ALT), aspartate aminotransferase (AST), gamma-glutamyl transpeptidase, platelet count, albumin, and total bilirubin were measured in all patients at the time of 2D-SWE. The fibrosis-4 (FIB-4) score, as a noninvasive simple serum marker, was calculated using the following formula: age (years)×AST (U/L)/[platelet count (10^{9}/L)×
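As a minimal sketch, the FIB-4 score can be computed as below. The function assumes the commonly used FIB-4 definition, in which the denominator is the platelet count multiplied by the square root of ALT; the function name and the example inputs (cohort medians taken from the baseline table, with 19.7×10^{4}/μL converted to 197×10^{9}/L) are illustrative only.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_per_l):
    """Commonly used FIB-4 definition (assumed here):
    age x AST / (platelet count x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_per_l * math.sqrt(alt_u_l))


# Illustrative inputs only: the cohort medians for age, AST, ALT, and platelets.
example_score = fib4(age_years=67, ast_u_l=27, alt_u_l=24, platelets_10e9_per_l=197)
print(round(example_score, 2))
```

A score computed from the median inputs need not equal the median FIB-4 score of the cohort, because the median of a ratio is not the ratio of medians.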

MRE was performed using a 3.0 T MRI system (Discovery 750, GE Healthcare, Waukesha, WI, USA) with a 32-channel phased-array coil [

Each MRI examination was reviewed by one of two radiologists (S.O. and T.G., who specialized in hepatology and had 10 and 7 years of experience in hepatobiliary imaging, respectively) who were blinded to each patient’s clinical data, including 2D-SWE results. The first radiologist to review the examination assessed whether the case demonstrated technical failure of MRE, as defined according to Wagner et al. [_{MRE}). Advanced fibrosis (≥F3) was defined as an MRE value ≥4.8 kPa [

The radiologist who measured LS_{MRE} also placed ROIs on the in-phase and out-of-phase images to obtain PDFF measures. The steatosis grade was classified as grade 0 for PDFF <5.2%, grade 1 for PDFF ≥5.2% but <11.3%, grade 2 for PDFF ≥11.3% but <17.1%, and grade 3 for PDFF ≥17.1% [
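The PDFF thresholds above map directly onto a small lookup. The following is a sketch only; the function name is hypothetical, and the cutoffs are the ones stated in the text.

```python
def steatosis_grade(pdff_percent):
    """Return the steatosis grade (0-3) for a PDFF value in percent,
    using the cutoffs stated in the text (5.2%, 11.3%, 17.1%)."""
    if pdff_percent < 5.2:
        return 0
    if pdff_percent < 11.3:
        return 1
    if pdff_percent < 17.1:
        return 2
    return 3


# Each lower bound is inclusive, so 5.2% is grade 1 and 17.1% is grade 3.
print([steatosis_grade(v) for v in (3.9, 5.2, 11.3, 17.1)])
```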

In a _{MRE} interpretations, the radiologists assessed the presence of mild to moderate ascites. LS and PDFF measurements were analyzed by the same two radiologists (S.O. and T.G.), who were blinded to each patient’s clinical data.

The 2D-SWE scans were performed using a LOGIQ S8, E9, or E10 ultrasound system (GE Healthcare) with a C1–6-D abdominal convex probe at a frequency of 1–4 MHz. Each scan was performed by one of three ultrasound technologists (non-authors, each with 10–15 years of experience in performing clinical abdominal ultrasound examinations and 3–5 years of experience in performing SWE examinations, including over 200 ultrasound elastography examinations). The operators were unaware of the MRE findings in each patient. Patients were required to have been fasting for at least 4 hours. During the examination, each patient lay in a supine position with the right arm in maximum abduction. Color-coded elasticity maps were generated from the right hepatic lobe using an intercostal approach. The machine automatically performed quality assessment by assessing shear wave propagation and excluding pixels on grayscale B-mode images that were judged to be of low quality due to poor probe contact, motion artifacts (secondary to breathing or heartbeats), and other artifacts (acoustic attenuation, reflection, scattering, and rib shadowing). Stiffness measurements were performed by the scanning technologist at the time of the examination; no scans were repeated specifically for the purpose of this investigation. The scanning technologist placed circular ROIs (approximately 15.0 mm in diameter) in the color box, perpendicular to the liver surface at a depth of at least 2.0 cm from the liver surface, and the value was recorded (expressed in kPa). Measurements were recorded from at least five ROIs, and the median value was calculated as the patient’s LS according to 2D-SWE (hereafter, LS_{SWE}) [
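The per-patient summary described above (median of at least five ROI values) can be sketched together with the reliability criterion reported in the Results, where measurements were deemed unreliable if the IQR-to-median ratio exceeded 30%. The function name and the quartile method (`method="inclusive"`) are assumptions; the ultrasound system may compute quartiles differently.

```python
import statistics


def summarize_swe(roi_values_kpa):
    """Return (median kPa, IQR/median ratio, reliable flag) for one patient.

    Assumes at least five ROI values; a measurement is treated as reliable
    when IQR/median is not larger than 0.30.
    """
    if len(roi_values_kpa) < 5:
        raise ValueError("at least five ROI measurements are required")
    median = statistics.median(roi_values_kpa)
    # Quartile convention is an assumption made for this sketch.
    q1, _, q3 = statistics.quantiles(roi_values_kpa, n=4, method="inclusive")
    ratio = (q3 - q1) / median
    return median, ratio, ratio <= 0.30


# Hypothetical ROI readings for two patients.
print(summarize_swe([5.8, 6.0, 6.2, 6.5, 7.0]))
print(summarize_swe([3.0, 5.0, 6.0, 9.0, 12.0]))
```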

Each pair of LS_{MRE} and LS_{SWE} values was compared and analyzed with the Bland-Altman method. Before starting the comparison, a normal probability plot was used to verify that the LS_{MRE} and LS_{SWE} values were normally distributed; if not, they were transformed into a normal distribution [_{MRE} values to the same scale as the modified LS_{SWE} values [21–23]. The agreement between LS_{MRE} and LS_{SWE} was assessed by determining bias and precision [24–28]. Bias was calculated as the mean difference between LS_{MRE} and LS_{SWE}. The precision (

where n is the number of replicated experiments, and

Linear regression and Bland-Altman analyses were used to determine the agreement between 2D-SWE and MRE. The percentage difference (% difference) was defined as:

% difference=(modified log LS_{SWE}−log LS_{SWE})/[0.5×(modified log LS_{SWE}+log LS_{SWE})]×100%
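The Bland-Altman quantities can be sketched as follows. The limits of agreement are taken as bias ± 1.96 × SD of the paired differences, which is the conventional definition and is consistent with the bias and limits reported in this study; the function names and the example values are hypothetical.

```python
import statistics


def bland_altman(modified_log_swe, log_swe):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    The difference is modified log LS_SWE minus log LS_SWE, and the limits
    are bias +/- 1.96 x the sample SD of the differences.
    """
    diffs = [a - b for a, b in zip(modified_log_swe, log_swe)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def percent_difference(modified_log_swe, log_swe):
    """Difference of one pair as a percentage of the mean of the pair."""
    mean_of_pair = 0.5 * (modified_log_swe + log_swe)
    return (modified_log_swe - log_swe) / mean_of_pair * 100.0


# Hypothetical paired values on the log scale (not study data).
a = [0.80, 0.75, 0.90, 0.60]
b = [0.78, 0.80, 0.85, 0.62]
print(bland_altman(a, b))
print([round(percent_difference(x, y), 2) for x, y in zip(a, b)])
```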

The percentage error (PE) for comparing two methods

where

Statistical significance was defined as a P-value <0.05. All statistical analyses were performed with EZR (version 1.53, Saitama Medical Center, Jichi Medical University, Saitama, Japan), which is a graphical user interface for R (R Foundation for Statistical Computing, Vienna, Austria) [

^{2} (IQR, 21.4 to 26.1 kg/m^{2}). The median LS_{MRE} was 2.7 kPa (IQR, 2.3 to 3.7 kPa), the median LS_{SWE} was 6.2 kPa (IQR, 5.0 to 8.0 kPa), and the median PDFF was 3.9% (IQR, 2.0% to 9.7%).

The technical success rate of MRE was 98.9% (895/905). The reasons for MRE technical failure were as follows: no pixel values with a confidence index higher than 95% on the confidence map (n=4), poor breath-holding (n=3), and mild to moderate ascites (n=3). The technical success rate of 2D-SWE was 99.3% (898/905). The LS_{SWE} measurements were deemed unreliable in seven patients because the ratio between the IQR and median measurements was larger than 30%. The technical success rate was not significantly different between MRE and 2D-SWE (P=0.627). The 888 patients for whom both MRE and 2D-SWE were technically successful were included in subsequent analyses (

_{MRE} and LS_{SWE} values, indicating substantial agreement (Spearman rank correlation coefficient, 0.786; 95% confidence interval [CI], 0.761 to 0.811; P<0.001) [_{MRE} and LS_{SWE} values were evaluated using a normal probability plot before starting Bland-Altman analysis. Since neither LS_{MRE} nor LS_{SWE} values showed a normal distribution, both types of values were log-transformed [_{MRE} was converted into the modified log LS_{SWE} by the following simple linear regression model: modified log LS_{SWE}=0.4176+0.8193×log LS_{MRE} [
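Applying the reported regression model is a one-line computation. The sketch below assumes base-10 logarithms, which the text does not state explicitly; that choice is an inference from the cohort medians (2.7 kPa for MRE, 6.2 kPa for 2D-SWE) being roughly consistent with the model under log10. The function name is hypothetical.

```python
import math

# Coefficients as reported: modified log LS_SWE = 0.4176 + 0.8193 x log LS_MRE
INTERCEPT = 0.4176
SLOPE = 0.8193


def modified_log_swe(ls_mre_kpa):
    """Convert an MRE stiffness (kPa) to the log LS_SWE scale.

    Base-10 logarithm is an assumption of this sketch.
    """
    return INTERCEPT + SLOPE * math.log10(ls_mre_kpa)


# Median LS_MRE of the cohort, back-transformed to kPa for readability.
value = modified_log_swe(2.7)
print(round(value, 4), round(10 ** value, 2))
```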

_{SWE} and log LS_{SWE}, along with the plots of both means. The Bland-Altman plot demonstrated that the bias, upper limit of agreement (ULoA), and lower limit of agreement (LLoA) were 0.0002 (95% CI, -0.0057 to 0.0061), 0.1747 (95% CI, 0.1646 to 0.1847), and -0.1743 (95% CI, -0.1843 to -0.1642), respectively. _{SWE} and log LS_{SWE} as percentages of values (percentage differences). The mean, ULoA, and LLoA were -0.5944%, 19.8950%, and -21.0838%, respectively. The calculated PE was 21.9647%, which was greater than the calculated expected LoA of 17.1177% (95% CI, 16.6353% to 17.6002%). However, 789 of the 888 patients (88.9%) had a percentage difference within 17.1177%. _{SWE} and log LS_{SWE} values. The ICC was 0.8231, indicating almost perfect agreement (P<0.001) [
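The expected LoA and the share of patients falling within it can be sketched as below. The expected LoA follows the definition given in the study (square root of the sum of the squared precisions of the two methods); the function names and the precision values in the example are hypothetical, since the per-method precisions are not given in this passage.

```python
import math


def expected_loa(precision_swe_percent, precision_mre_percent):
    """Square root of the sum of the squared precisions of the two methods."""
    return math.hypot(precision_swe_percent, precision_mre_percent)


def fraction_within(percent_differences, limit_percent):
    """Share of pairs whose absolute percentage difference is within the limit."""
    inside = sum(1 for d in percent_differences if abs(d) <= limit_percent)
    return inside / len(percent_differences)


# Hypothetical precisions of 12% and 12.2% for the two methods.
print(round(expected_loa(12.0, 12.2), 2))

# The reported proportion: 789 of 888 patients within the expected LoA.
print(round(789 / 888 * 100, 1))
```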

In 29 of the 888 patients (3.3%), the correlation between 2D-SWE and MRE fell outside the 95% CI (

^{2}, ≥25.0 kg/m^{2} but <30.0 kg/m^{2}, or ≥30.0 kg/m^{2}). The bias, percentage difference, and expected LoA increased significantly with BMI (Jonckheere-Terpstra test, P<0.001, P<0.001, and P<0.001, respectively). The ICCs gradually decreased as BMI increased, but indicated almost perfect agreement except for the category of BMI ≥30 kg/m^{2} (

In this study, Bland-Altman analysis revealed good agreement between LS_{MRE} and LS_{SWE}. Using correlation alone as a measure of agreement is problematic because a correlation coefficient assesses the ordering of the LS_{SWE} and LS_{MRE} values and their relative spacing, rather than whether the numbers themselves agree. In this study, the ICC indicated almost perfect agreement (ICC, 0.823). Bland-Altman plots are very useful for assessing agreement between two methods. However, if the numbers are incommensurate, it makes no sense to try to determine whether they agree. The two sets of measurements used in Bland-Altman analysis must be normally distributed, and to achieve this, the LS_{SWE} and LS_{MRE} values were log-transformed [

The 2D-SWE assessments in the present study were performed using the LOGIQ S8, E9, or E10 system (GE Healthcare), and MRE was evaluated using a 3.0-T MRI system (GE Healthcare). This aligns with the report by Iijima et al. [

In the present study, the mean LS_{SWE} value was 2.197 (±0.479) times higher than the mean LS_{MRE} value. Based on this finding, the two measurements were converted to the same scale using simple linear regression, described above. Critchley and Critchley [

The ICCs indicated almost perfect agreement between LS_{SWE} and LS_{MRE} in patients with BMI <30.0 kg/m^{2} and degree of steatosis <11.3%. The ICCs in patients with BMI ≥30.0 kg/m^{2} and the degree of steatosis ≥11.3% were lower, but still indicated substantial agreement. Liver fibrosis assessment by LS_{SWE} may be unreliable in obese patients, though the recent development of the XL probe for FibroScan has circumvented this problem to a certain extent [_{SWE} and LS_{MRE} values for different BMI values and degrees of hepatic steatosis were deemed to be acceptable. This procedure is likely to be less affected by obesity than other approaches.

In this study, 29 patients demonstrated a discrepancy between 2D-SWE and MRE values, specifically 23 in the upper group and six in the lower group. All six patients in the lower group had poor-quality B-mode images (6/6, 100%), and five had a mixture of red and blue colors on color-coded 2D-SWE maps (5/6, 83.3%). The SWE values were considered to be inaccurate because no problems could be found with the MRE measurements. Of the 23 patients in the upper group, 13 had poor-quality B-mode images (13/23, 56.5%), and 22 exhibited a mixture of red and blue colors on color-coded 2D-SWE maps (22/23, 95.7%). In addition, the patients in the upper group had significantly higher FIB-4 scores than those in the non-divergent group. The MRE values most likely correctly reflected the fibrosis state in this group because 2D-MRE values are sometimes inaccurate in the presence of advanced fibrosis [

The present study has several limitations. First, the 2D-SWE technique is operator-dependent, and simultaneous measurements in the same patient may vary depending on the operator’s expertise. Compared with LS_{SWE}, LS_{MRE} has higher repeatability and reproducibility and thus provides more reliable LS measurements [

In conclusion, Bland-Altman analysis demonstrated that LS_{SWE} and LS_{MRE} were interchangeable within a clinically acceptable range.

Conceptualization: Ichikawa H, Kumada T, Yasuda E. Data acquisition: Ichikawa H, Yasuda E, Kumada T, Takeshima K, Ogawa S, Tsunekawa A, Goto T, Nakaya K, Akita T, Tanaka J. Data analysis or interpretation: Takeshima K, Ogawa S, Tsunekawa A, Goto T, Nakaya K. Drafting of the manuscript: Ichikawa H, Kumada T, Yasuda E. Critical revision of the manuscript: Ichikawa H, Kumada T, Yasuda E. Approval of the final version of the manuscript: all authors.

No potential conflict of interest relevant to this article was reported.

The results of Bland-Altman analysis (n=888)

Correlation of advanced fibrosis defined by 2D-SWE and MRE values

Correlation between two-dimensional shear-wave elastography (2D-SWE) and magnetic resonance elastography (MRE). The liver stiffness according to MRE (LS_{MRE}) and liver stiffness according to 2D-SWE (LS_{SWE}) values were correlated, showing substantial agreement. The Spearman rank correlation coefficient is 0.786 (95% confidence interval [CI], 0.761 to 0.811; P<0.001). The dashed line shows the 95% CI. Patients with an upward divergence are classified as the upper group (n=23), those with a downward divergence as the lower group (n=6), and those within the 95% CI as the non-divergent group (n=859).

HCC, hepatocellular carcinoma.

The degree of liver stiffness determined by 2D-SWE and MRE increased with the degree of fibrosis progression, defined as the staging of liver fibrosis in chronic hepatitis C by MRE [19]. The left figure shows no fibrosis (F0, 58-year-old female, hepatitis C virus [HCV] infection), the middle figure shows moderate fibrosis (F2, 82-year-old female, HCV infection), and the right figure shows advanced fibrosis (F4, 82-year-old female, HCV infection).

A. Bland-Altman plot in which differences are presented in units. The Bland-Altman plot demonstrated that the bias, upper limit of agreement (ULoA), and lower limit of agreement (LLoA) were 0.0002 (95% confidence interval [CI], -0.0057 to 0.0061), 0.1747 (95% CI, 0.1646 to 0.1847), and -0.1743 (95% CI, -0.1843 to -0.1642), respectively. B. Bland-Altman plot in which the difference is presented as a percentage (% difference). The figure shows the % difference between the modified log LS_{SWE} values and the log LS_{SWE} values, where the mean, ULoA, and LLoA were -0.5944%, 19.8950%, and -21.0838%, respectively. C. The correlation between the modified log LS_{SWE} and log LS_{SWE} values is shown; the intraclass correlation coefficient (ICC) is 0.8231, indicating almost perfect agreement (P<0.001). MRE, magnetic resonance elastography; 2D-SWE, two-dimensional shear-wave elastography; LS_{MRE}, liver stiffness according to MRE; LS_{SWE}, liver stiffness according to 2D-SWE; modified log LS_{SWE}=0.4176+0.8193×log LS_{MRE}; % difference=(modified log LS_{SWE}−log LS_{SWE})/[0.5×(modified log LS_{SWE}+log LS_{SWE})]×100%.

The correlation in each BMI category is shown: BMI <25.0 kg/m^{2} (n=569) (A); 25.0≤BMI<30.0 kg/m^{2} (n=254) (B); BMI ≥30.0 kg/m^{2} (n=65) (C). The ICCs gradually decreased as BMI increased, but indicated almost perfect agreement except for patients with a BMI ≥30 kg/m^{2}. LS_{MRE}, liver stiffness according to magnetic resonance elastography; LS_{SWE}, liver stiffness according to two-dimensional shear-wave elastography; CI, confidence interval.

The correlation in each PDFF grade is shown: grade 0 (n=512), PDFF <5.2% (A); grade 1 (n=185), 5.2%≤PDFF<11.3% (B); grade 2 (n=91), 11.3%≤PDFF<17.1% (C); grade 3 (n=100), PDFF ≥17.1% (D). LS_{MRE}, liver stiffness according to magnetic resonance elastography; LS_{SWE}, liver stiffness according to two-dimensional shear-wave elastography; CI, confidence interval.

Baseline patient characteristics

Characteristic | Value |
---|---|
No. of patients | 905 |
Chronic liver disease (NAFLD/ALD/HBV/HCV/AIH/PBC/others) | 226/57/99/420/19/13/71 |
Age (year) | 67 (58–75) |
Sex (female:male) | 453 (50.1):452 (49.9) |
Body mass index (kg/m^{2}) | 23.7 (21.4–26.1) |
Alcohol abuse (present:absent) | 151 (16.7):754 (83.3) |
Smoking (present:absent) | 325 (35.9):580 (64.1) |
Platelet count (×10^{4}/μL) | 19.7 (14.9–24.7) |
AST (U/L) | 27 (21–41) |
ALT (U/L) | 24 (15–43) |
FIB-4 score | 1.98 (1.28–3.12) |
γ-GT (U/L) | 29 (18–56) |
Total bilirubin (mg/dL) | 0.7 (0.8–0.9) |
Albumin (g/dL) | 4.4 (4.1–4.6) |
MRE (kPa) | 2.7 (2.3–3.7) |
2D-SWE (kPa) | 6.2 (5.0–8.0) |
MRI-PDFF (%) | 3.9 (2.0–9.7) |

Values are presented as medians (first and third quartiles) or numbers (%).

NAFLD, non-alcoholic fatty liver disease; ALD, alcoholic liver disease; HBV, hepatitis B virus; HCV, hepatitis C virus; AIH, autoimmune hepatitis; PBC, primary biliary cholangitis; AST, aspartate aminotransferase; ALT, alanine aminotransferase; FIB-4, fibrosis-4; γ-GT, gamma-glutamyl transpeptidase; MRE, magnetic resonance elastography; 2D-SWE, two-dimensional shear-wave elastography; MRI-PDFF, magnetic resonance imaging–proton density fat fraction.

Bland-Altman analysis in subgroups defined by BMI

Variable | BMI <25.0 kg/m^{2} (n=569) | BMI ≥25.0 and <30.0 kg/m^{2} (n=254) | BMI ≥30.0 kg/m^{2} (n=65) |
---|---|---|---|
Bias^{a)} | −0.0078 | 0.0104 | 0.0287 |
ULoA | 0.1530 | 0.1618 | 0.1612 |
LLoA | −0.1685 | −0.1409 | −0.1038 |
% Difference, mean^{a)} (LLoA to ULoA) | −1.7351 (−21.9633 to 18.4931) | 0.9613 (−17.8331 to 19.7558) | 3.3118 (−12.9850 to 19.6086) |
Expected LoA (%)^{a)} (95% CI) | 16.3827 (15.8227 to 16.9428) | 18.2328 (17.2313 to 19.2342) | 19.1951 (17.2878 to 21.1024) |
ICC (95% CI) | 0.8385 (0.8123 to 0.8613) | 0.8181 (0.7729 to 0.8551) | 0.7299 (0.5212 to 0.8266) |

BMI, body mass index; ULoA, upper limit of agreement; LLoA, lower limit of agreement; LoA, limit of agreement; CI, confidence interval; ICC, intraclass correlation coefficient.

^{a)}Bias, % difference, and expected LoA increased significantly according to BMI (Jonckheere–Terpstra test, P<0.001, P<0.001, and P<0.001, respectively).

Bland-Altman analysis in subgroups defined by hepatic steatosis

Variable | Grade 0, PDFF <5.2% (n=512) | Grade 1, PDFF ≥5.2% and <11.3% (n=185) | Grade 2, PDFF ≥11.3% and <17.1% (n=91) | Grade 3, PDFF ≥17.1% (n=100) |
---|---|---|---|---|
Bias^{a)} | −0.0096 | 0.0030 | 0.0092 | 0.0363 |
ULoA | 0.1525 | 0.1411 | 0.1572 | 0.1698 |
LLoA | −0.1718 | −0.1350 | −0.1388 | −0.0973 |
% Difference, mean^{a)} (LLoA to ULoA) | −1.8783 (−21.6484 to 17.8915) | −0.1047 (−18.2256 to 18.0162) | 0.3546 (−19.0591 to 19.7683) | 4.2099 (−12.8750 to 21.2948) |
Expected LoA (%)^{a)} (95% CI) | 16.7042 (16.0669 to 17.3416) | 17.2076 (16.1394 to 18.2759) | 17.7823 (16.2986 to 19.2660) | 18.4642 (17.0260 to 19.9024) |
ICC (95% CI) | 0.8435 (0.8165 to 0.8668) | 0.8429 (0.7953 to 0.8801) | 0.7287 (0.6150 to 0.8127) | 0.7138 (0.6004 to 0.7975) |

PDFF, proton density fat fraction; ULoA, upper limit of agreement; LLoA, lower limit of agreement; LoA, limit of agreement; CI, confidence interval; ICC, intraclass correlation coefficient.

^{a)}Bias, % difference, and expected LoA increased significantly according to hepatic steatosis grade (Jonckheere–Terpstra test, P<0.001, P<0.001, and P=0.004, respectively).