Author manuscript; available in PMC: 2022 Jan 1.
Published in final edited form as: Eur Radiol. 2020 Aug 12;31(1):264–275. doi: 10.1007/s00330-020-07123-x

Complex Confounder-Corrected R2* Mapping for Liver Iron Quantification with MRI

Diego Hernando 1,2, Rachel J Cook 3,4, Naila Qazi 1,5, Colin A Longhurst 6, Carol A Diamond 7, Scott B Reeder 1,2,3,8,9
PMCID: PMC7755713  NIHMSID: NIHMS1619955  PMID: 32785766

Abstract

Objectives:

MRI-based R2* mapping may enable reliable and rapid quantification of liver iron concentration (LIC). However, the performance and reproducibility of R2* across acquisition protocols remain unknown. Therefore, the objective of this work was to evaluate the performance and reproducibility of complex confounder-corrected R2* across acquisition protocols, at both 1.5T and 3.0T.

Methods:

In this prospective study, 40 patients with suspected iron overload and 10 healthy controls were recruited with IRB approval and informed written consent, and imaged at both 1.5T and 3.0T. For each subject, acquisitions included four different R2* mapping protocols at each field strength, and an FDA-approved R2-based method performed at 1.5T as a reference for LIC. R2* maps were reconstructed from the complex data acquisitions, including correction for noise effects and fat signal. For each subject, field strength, and R2* acquisition, R2* measurements were performed in each of the nine liver Couinaud segments and in the spleen. R2* measurements were compared across protocols and field strengths (1.5T and 3.0T), and R2* was calibrated to LIC for each acquisition and field strength.

Results:

R2* demonstrated high reproducibility across acquisition protocols (p>0.05 for 96/108 pairwise comparisons across 2 field strengths and 9 liver segments, ICC > 0.91 for each field strength/segment combination), and high predictive ability (AUC > 0.95 for four clinically relevant LIC thresholds). Calibration of R2* to LIC was [LIC = −0.04 + 2.62×10⁻² R2*] at 1.5T and [LIC = 0.00 + 1.41×10⁻² R2*] at 3.0T.

Conclusions:

Complex confounder-corrected R2* mapping enables LIC quantification with high reproducibility across acquisition protocols, at both 1.5T and 3.0T.

Keywords: Magnetic resonance imaging, liver, iron overload, biomarkers

Introduction

Abnormal deposition of iron is a common clinical disease state resulting from two major causes: hereditary hemochromatosis[1; 2] and transfusional hemosiderosis in patients requiring repeated blood transfusions[3–7]. Excess iron is toxic[1; 8–11], and can lead to liver damage[8; 12], cirrhosis, liver failure, and hepatocellular carcinoma, as well as many other comorbidities such as pancreatic dysfunction[13], pituitary dysfunction[14], and cardiomyopathy[15]. Current treatment for iron overload depends on its etiology, and includes phlebotomy (typically in patients with hemochromatosis)[12; 16], and chelation therapy (typically in patients with transfusional hemosiderosis)[17; 18]. A critical component of the treatment for iron overload is accurate assessment of body iron stores.

Most clinicians rely on serum ferritin for the assessment of body iron stores. Although serum ferritin concentration correlates with body iron stores[8], ferritin levels are confounded by other factors including inflammation, and may not accurately reflect body iron levels[19–22]. Liver iron concentration (LIC, mg Fe/g dry tissue) is widely considered the best biomarker to assess total body iron stores, as the liver is the only organ whose iron content is consistently increased with increased body iron[9; 20; 23; 24]. Historically, non-targeted liver biopsy with biochemical quantification of iron concentration has been used for detection and treatment monitoring. However, biopsy is expensive and invasive, and many patients have coexisting thrombocytopenia, a contraindication to biopsy, due to the risk of uncontrolled bleeding. In addition, biopsy-based LIC measurement has wide sampling variability[25; 26], which is highly undesirable for treatment monitoring.

MRI is highly sensitive to abnormal deposition of iron[27–30]. Spin echo-based R2 mapping methods have been calibrated against biopsy, and an R2-based method[31] is currently commercially available and FDA-approved. The main disadvantages of R2 relaxometry are the lengthy acquisition time (10–20 minutes), incomplete coverage of the liver, and ghosting artifacts related to the free breathing acquisition.

As an alternative, gradient echo (GRE)-based R2* methods have a linear relationship with iron concentration, and 3D GRE data can be acquired rapidly over the entire liver in a single breath hold[32; 33]. However, previous magnitude-based R2* relaxometry methods have suffered from several major confounding factors, including the presence of noise floor effects[34], fat[35; 36], and background magnetic field inhomogeneity effects near air-tissue interfaces[37; 38]. Recently developed complex chemical shift encoded (CSE)-MRI techniques that use both the magnitude and phase of the GRE signal have shown promise to address these challenges and enable confounder-corrected R2* mapping[11; 33; 34; 39; 40]. Therefore, the purpose of this work was to validate a complex-based confounder-corrected CSE-MRI method to measure R2* as a quantitative biomarker of liver iron concentration.

Materials and Methods

Study Participants

This was a prospective, HIPAA-compliant study performed after obtaining approval from the local Institutional Review Board (IRB). After obtaining informed consent, 40 study participants with known or suspected iron overload were recruited from the local hematology clinic. Inclusion criteria were known or suspected iron overload and age 10 years or older. Determination of iron overload was based on known elevated serum ferritin (>500 ng/mL), or an established diagnosis of a genetic condition that predisposes patients to iron overload, such as hereditary hemochromatosis. Exclusion criteria included contraindications to MRI. Patients undergoing phlebotomy or chelation therapy were not excluded.

In addition, 10 healthy controls with no known history of iron overload were recruited from an IRB-approved database of healthy volunteers. The same exclusion criteria applied to the control subjects. The number of subjects in this study was driven by budget and feasibility constraints.

43 of the 50 subjects recruited for this study have been previously reported for separate analyses, including validation of a method to measure magnetic susceptibility directly from B0 field maps[41], validation of an algorithm for quantitative susceptibility mapping[42], and assessment of fat quantification methods in the presence of liver iron[43]. In contrast, this study evaluates confounder-corrected R2* as a reproducible biomarker of iron overload.

Study Visit

Study visits occurred between 1/2012 and 5/2015. Each participant underwent a single visit, which included MRI at 1.5T and 3.0T, performed within approximately one hour. Imaging was performed on clinical MRI systems (1.5T: HDxt; 3.0T: Discovery MR750, GE Healthcare) using a phased array torso coil (1.5T: 8-channel; 3.0T: 32-channel). In addition, all participants underwent phlebotomy for serum ferritin levels immediately before or after MRI.

Confounder-Corrected R2* Quantification

An investigational version of a 3D multi-echo spoiled gradient echo acquisition was implemented at both 1.5T and 3.0T[44]. In order to assess reproducibility across acquisition parameters, multiple R2* mapping acquisitions with different parameters were obtained by constructing four unique protocols at both 1.5T and 3.0T (see Table 1). These protocols were chosen to obtain data with varying spatial resolution, image orientation, echo spacing and number of echoes. The goal of varying these parameters is to determine their effect on R2* bias and variability, as introduced by noise floor effects, fat signals, and macroscopic B0 field variations.

Table 1:

MRI acquisition protocols.

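| Parameter | 1.5T FerriScan | 1.5T Protocol 1 | 1.5T Protocol 2 | 1.5T Protocol 3 | 1.5T Protocol 4 | 3.0T Protocol 1 | 3.0T Protocol 2 | 3.0T Protocol 3 | 3.0T Protocol 4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Orientation | Axial | Axial | Axial | Coronal | Axial | Axial | Axial | Coronal | Axial |
| Slice thickness (mm) | 6 | 8 | 6 | 5 | 8 | 8 | 6 | 5 | 8 |
| Slices | 11 | 32 | 36 | 40 | 32 | 32 | 32 | 40 | 32 |
| FOV (cm) | 44×33 | 40×36 | 40×36 | 40×36 | 40×36 | 40×32 | 40×32 | 40×40 | 40×32 |
| Matrix size | 256×256 | 256×160 | 256×160 | 256×160 | 144×128 | 256×144 | 256×144 | 256×144 | 128×128 |
| Voxel dimensions (mm) AP×RL×SI | 1.7×1.7×6.0 | 2.5×1.6×8.0 | 2.5×1.6×6.0 | 5.0×2.5×1.6 | 3.1×2.8×8.0 | 2.8×1.6×8.0 | 2.8×1.6×6.0 | 5.0×2.8×1.6 | 3.1×3.1×8.0 |
| Flip angle (°) | 90 | 5 | 5 | 5 | 5 | 4 | 4 | 4 | 4 |
| TR (ms) | 1000 | 14.1 | 14.1 | 14.1 | 11.0 | 8.6 | 8.6 | 8.6 | 5.9 |
| Number of echoes | 5 | 6 | 6 | 6 | 12 | 6 | 6 | 6 | 8 |
| ETL (echo train length) | 1 | 6 | 6 | 6 | 6 | 3 | 3 | 3 | 4 |
| TE1 (ms) | 6 | 1.2 | 1.2 | 1.2 | 0.9 | 1.2 | 1.2 | 1.2 | 0.6 |
| ΔTE (ms) | 3 | 2.0 | 2.0 | 1.0 | 0.7 | 1.0 | 1.0 | 1.0 | 0.6 |
| Acquisition time (min:s) | 17:00 | 0:19 | 0:21 | 0:21 | 0:22 | 0:22 | 0:22 | 0:23 | 0:20 |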

For each R2* mapping acquisition, R2* maps were reconstructed using five different algorithms (Table 2). These algorithms included four combinations of: magnitude-based (using magnitude echo images while discarding the phase) versus complex-based fitting, and fat-uncorrected versus fat-corrected (including simultaneous fat-water separation and R2* fitting). Note that complex-based fitting has been shown to avoid noise floor related bias, whereas fat-correction has been shown to avoid R2* bias due to the presence of fat-related signal oscillations[34]. In order to address the instability of fat-corrected R2* fitting at very high R2* values[34], a complex-based hybrid R2* fitting algorithm was also used to return fat-corrected R2* fitting at moderate R2* values (R2* < 500 s⁻¹ at 1.5T, and R2* < 1000 s⁻¹ at 3.0T), and fat-uncorrected R2* fitting at higher R2* values (R2* ≥ 500 s⁻¹ at 1.5T, and R2* ≥ 1000 s⁻¹ at 3.0T). Given that eight different R2* mapping acquisitions were performed on each subject (two field strengths, four protocols per field strength), these five reconstructions result in a total of 40 R2* maps per subject.
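The hybrid rule described above can be sketched in a few lines. This is only an illustration of the selection logic, not the reconstruction code used in the study: the function name and dictionary are hypothetical, and the choice of the fat-uncorrected estimate as the decision variable is an assumption, since the text does not state which estimate is compared against the threshold.

```python
# Thresholds (in 1/s) quoted in the text; names and structure are illustrative.
HYBRID_THRESHOLD_PER_S = {"1.5T": 500.0, "3.0T": 1000.0}


def hybrid_r2star(r2s_fat_corrected, r2s_fat_uncorrected, field_strength):
    """Pick one R2* value (1/s) for a voxel or ROI following the hybrid rule.

    Assumption: the fat-uncorrected estimate decides which fit is returned,
    because the fat-corrected fit is reported to be unstable at very high R2*.
    """
    threshold = HYBRID_THRESHOLD_PER_S[field_strength]
    if r2s_fat_uncorrected < threshold:
        return r2s_fat_corrected
    return r2s_fat_uncorrected


# Bookkeeping from the text: 2 field strengths x 4 protocols x 5 reconstructions.
maps_per_subject = 2 * 4 * 5
print(maps_per_subject)
```

With this rule, a moderate-iron voxel keeps its fat-corrected value, while a voxel above the field-specific threshold falls back to the fat-uncorrected value.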

Table 2:

R2* measurement methods.

| | Magnitude fat-uncorrected | Magnitude fat-corrected | Complex fat-uncorrected | Complex fat-corrected | Complex hybrid |
| --- | --- | --- | --- | --- | --- |
| Signal fitting method | Least-squares | Least-squares | Least-squares | Least-squares | Least-squares |
| Magnitude vs. complex | Magnitude | Magnitude | Complex | Complex | Complex |
| Fat correction | None | Multi-peak fat model | None | Multi-peak fat model | Hybrid: multi-peak fat model up to R2* threshold |

R2-Based LIC Quantification as Reference

Each patient underwent LIC measurement using a commercially available R2-based spin echo technique (FerriScan, Resonance Health)[31] at 1.5T. Briefly, this standardized protocol consists of free breathing, 2D spin echo images obtained with 11 axial slices over the liver, using a repetition time (TR) of 1,000 ms and five distinct echo times (TE) (6, 9, 12, 15, and 18 ms).

Data Analysis

After reconstruction of R2* maps, images were loaded into an analysis software package (OsiriX, Pixmeo). Circular regions of interest (ROIs) were placed within each of the nine Couinaud segments of the liver (ie: segments I, II, III, IVa, IVb, V, VI, VII, VIII) as well as the spleen to estimate the average local R2* value. ROIs were drawn as large as possible while avoiding large vessels, bile ducts, liver lesions, and obvious image artifacts, using a standardized approach described by Campo et al[45]. For R2* maps reconstructed using the same source echo data but in a different manner (eg: magnitude vs complex, Table 2), ROIs were copied and pasted to ensure identical regions were interrogated. For R2* maps generated from different acquisitions, attempts were made to co-register the ROIs across the three axial acquisitions at each field strength, using the copy/paste function, with slight manual adjustments as needed to account for differences in breath-holding. From each ROI, the mean R2* value was recorded. Finally, the average liver R2* value across the nine Couinaud segments was also computed.

In addition, co-localized measurements of proton-density fat-fraction (PDFF) were performed from protocol 4 acquired at 1.5T using a complex fat-corrected reconstruction to determine the prevalence of hepatic steatosis in this population and the potential confounding effects of fat.

Also, 1.5T spin echo images were uploaded to an independent core lab (Resonance Health), using secure web transfer. The core lab returned a summary LIC value (mg Fe/g dry tissue), which was used as the reference value for each subject.

Statistical Analysis

To assess reproducibility and bias in measured R2* between protocols, two-tailed, paired Wilcoxon rank sum tests were performed for each protocol pair (six pairs) within each segment, reconstruction, and field strength combination.
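As a quick check on the bookkeeping behind these tests, the sketch below enumerates the protocol pairs and the liver-segment comparisons for a single reconstruction. It only counts comparisons and does not run any statistical test; the variable names are illustrative. The total of 108 matches the figure quoted in the abstract (2 field strengths, 9 liver segments, 6 protocol pairs).

```python
from itertools import combinations, product

protocols = [1, 2, 3, 4]
field_strengths = ["1.5T", "3.0T"]
liver_segments = ["I", "II", "III", "IVa", "IVb", "V", "VI", "VII", "VIII"]

# Every unordered pair of the four acquisition protocols.
protocol_pairs = list(combinations(protocols, 2))

# One paired test per (field strength, liver segment, protocol pair),
# for a single reconstruction method.
liver_comparisons = list(product(field_strengths, liver_segments, protocol_pairs))

print(len(protocol_pairs))     # number of protocol pairs
print(len(liver_comparisons))  # number of liver-segment comparisons
```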

For brevity, subsequent analyses were limited to the complex hybrid R2* reconstruction. To further assess the reproducibility across R2* acquisition protocols, intraclass correlation coefficients (ICC) were estimated from linear mixed models[46]. To estimate the correlation between liver-averaged R2* and LIC, for each protocol and field strength, Spearman’s ρ, a rank-based correlation coefficient, was calculated between R2* and LIC (along with 95% BCa-bootstrapped confidence intervals). Pearson’s correlation coefficient was estimated to assess the relationship between serum ferritin and LIC at the subject level. Corresponding calibration curves (slope and intercept) were estimated using simple linear regression. To compare protocols across liver-averaged R2* measurements (complex hybrid reconstruction only, as seen in Figure 4) at 1.5T versus 3.0T, a linear mixed model was fit to the data where individual patient was modeled as a random effect. The resulting model p-values were estimated using Satterthwaite’s approximation.

Figure 4:

Regression analysis between liver-averaged R2* measurements (complex hybrid reconstruction) at 1.5T versus 3.0T. High correlation with a slope close to 2.0 is observed for each of the acquisition protocols. The slope of protocol 3 was found to be significantly lower than protocols 1, 2 and 4 (p<.001).

To assess between-segment variability in R2* (eg: due to the presence of segment-specific macroscopic B0 field-related bias in R2*), eight linear mixed effect models (one for each protocol in each field strength) were fit to the complex hybrid reconstruction R2* measurements. Each model adjusted for segment (a nine-level factor) as a fixed effect, where R2* measurements in segment VI were used as the reference (as this segment is generally free of macroscopic B0 related R2* bias or other obvious artifacts), and the individual patient was modeled as a random effect.

The association between LIC and both R2* and ferritin was also investigated using receiver operating characteristic (ROC) curves. The LIC variable was dichotomized (above or below) according to four threshold values, 1.8, 3.2, 7 and 15 mg/g dry, which are commonly used thresholds for normal (<1.8), mild (1.8–3.2), moderate (3.2–7), severe (7–15) and extreme (>15) iron overload, respectively[31; 47; 48]. To calculate common model metrics assessing discriminatory abilities such as sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV), an optimal threshold was calculated using Youden’s index (J) which equally weights sensitivity and specificity.
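The threshold selection described above can be illustrated with a small, self-contained sketch. It is not the R code used in the study: the function is hypothetical, the data are made-up values rather than study measurements, and two conventions are assumptions (a case counts as positive when its value is at or above the threshold, and ties in J keep the lowest threshold).

```python
def youden_optimal_threshold(values, is_positive):
    """Return (threshold, J, sensitivity, specificity) maximizing Youden's J.

    A case is called test-positive when its value is >= threshold (assumption).
    """
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best = None
    for threshold in sorted(set(values)):
        tp = sum(1 for v, p in zip(values, is_positive) if p and v >= threshold)
        tn = sum(1 for v, p in zip(values, is_positive) if not p and v < threshold)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1.0  # equal weight on both terms
        if best is None or j > best[1]:
            best = (threshold, j, sensitivity, specificity)
    return best


# Made-up example (not study data): R2* in 1/s and reference LIC in mg Fe/g dry.
r2star = [40.0, 55.0, 70.0, 95.0, 120.0, 180.0, 260.0, 410.0]
lic = [0.9, 1.2, 1.6, 2.4, 1.5, 4.6, 7.1, 11.0]
above_cutoff = [value >= 1.8 for value in lic]

threshold, j, sens, spec = youden_optimal_threshold(r2star, above_cutoff)
print(threshold, j, sens, spec)
```

PPV and NPV follow from the same counts once the false positives and false negatives at the chosen threshold are tallied.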

Finally, the association between liver-averaged PDFF and liver-averaged R2* (both measured at 1.5T from protocol 4) was assessed using Spearman’s rank correlation. All statistical analyses were conducted using R (V 3.6.1, R Core Team)[49; 50].

Results

Forty patients with known or suspected iron overload (age (mean±std): 43.9±21.1, age range: 11–79, 28 men and 12 women) were successfully recruited. Eleven patients had hereditary hemochromatosis, one patient had hyperferritinemia not otherwise specified, and 28 patients had transfusional hemosiderosis from a variety of causes. Causes of transfusional hemosiderosis included myelodysplastic syndrome (7), acute myeloid leukemia (5), acute lymphoblastic leukemia (3), aplastic anemia (2), lymphoma (2), leukemia (1), sickle cell disease (1), myelofibrosis (1), and other anemias (6). Table 3 summarizes the diagnosis and/or etiology of hepatic iron overload. Further, we recruited 10 healthy control subjects (age (mean±std): 41.0±15.6, age range: 24–73, 5 men and 5 women). One patient was unable to undergo MRI due to large body habitus. For the remaining 39 patients and 10 healthy control subjects, acquisition of all R2* protocols at 1.5T and 3.0T, as well as spin echo R2-based relaxometry at 1.5T, was successful with no technical failures. The measured ROIs in the liver (across all segments) had area 3.6±1.9 cm². The sizes of the ROIs by liver segment, as well as in the spleen, are reported in Table S1 (Supplemental Material).

Table 3:

Subject characteristics (HH: hereditary hemochromatosis, TH: transfusional hemosiderosis, MDS: myelodysplastic syndrome, AA: aplastic anemia, AML: acute myeloid leukemia, ALL: acute lymphoblastic leukemia, SCD: sickle cell disease, NOS: not otherwise specified).

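| Subject # | Reason for suspected iron overload (grouped) | M/F | Age (years) | LIC (mg Fe/g dry) | Serum ferritin (ng/mL) |
| --- | --- | --- | --- | --- | --- |
| Patients | | | | | |
| 1 | TH (MDS) | F | 70 | 4.5 | 1045 |
| 2 | TH (MDS) | M | 66 | 13.8 | 4074 |
| 3 | TH (Other anemia) | M | 23 | 1.4 | 183 |
| 4 | TH (AA) | F | 19 | 2.2 | 523 |
| 5 | TH (AML) | M | 50 | 7.2 | 926 |
| 6 | TH (ALL) | F | 12 | 19.2 | 2634 |
| 7 | HH | M | 48 | 1.6 | 293 |
| 8 | HH | M | 59 | 0.9 | 1206 |
| 9 | TH (MDS) | F | 66 | 7.6 | 1022 |
| 10 | TH (AML) | M | 61 | 12.6 | 2806 |
| 11 | HH | M | 35 | 3.1 | 397 |
| 12 | TH (AML) | F | 14 | 1 | 76 |
| 13 | TH (AA) | M | 11 | 2.3 | 133 |
| 14 | TH (SCD) | F | 16 | 7.2 | 2484 |
| 15 | TH (ALL) | F | 24 | 3.4 | 1003 |
| 16 | TH (Lymphoma) | M | 79 | 1.7 | 951 |
| 17 | TH (Lymphoma) | M | 57 | 1.9 | 596 |
| 18 | TH (Other anemia) | M | 15 | 4.4 | 1697 |
| 19 | TH (AML) | F | 19 | 12.7 | 4852 |
| 20 | TH (MDS) | F | 64 | 2.1 | 4178 |
| 21 | HH | M | 33 | 7.1 | 1565 |
| 22 | HH | M | 21 | 6 | 372 |
| 23 | HH | F | 68 | 8 | 867 |
| 24 | TH (MDS) | M | 69 | 4.7 | 2144 |
| 25 | HH | M | 60 | 2 | 686 |
| 26 | TH (Leukemia) | M | 69 | 3.6 | 1165 |
| 27 | HH | M | 41 | 0.4 | 562 |
| 28 | TH | M | 57 | 10.8 | 2382 |
| 29 | TH (MDS) | M | 67 | 9.2 | 3366 |
| 30 | TH (MDS) | M | n/a: large body habitus, did not complete MRI | | |
| 31 | TH (Other anemia) | M | 35 | 2.7 | 4 |
| 32 | TH (AML) | F | 53 | 14.1 | 7427 |
| 33 | TH (Other anemia) | M | 63 | 3 | 2830 |
| 34 | HH | M | 50 | 5.7 | 2555 |
| 35 | TH (ALL) | F | 20 | 9.8 | 1865 |
| 36 | HH | M | 57 | 15.7 | 1240 |
| 37 | TH (AML) | M | 17 | 4.5 | 709 |
| 38 | TH (Other anemia) | M | 54 | 3.8 | 2094 |
| 39 | TH (Other anemia) | M | 21 | 7 | 1515 |
| 40 | Hyperferritinemia (NOS) | M | 51 | 3.3 | 531 |
| Healthy controls | | | | | |
| 1 | | M | 26 | 0.6 | 139 |
| 2 | | F | 63 | 1.4 | 39 |
| 3 | | F | 47 | 1.1 | 30 |
| 4 | | M | 25 | 1 | 118 |
| 5 | | M | 44 | 0.3 | 14 |
| 6 | | M | 29 | 1 | 38 |
| 7 | | F | 62 | 0.8 | 11 |
| 8 | | F | 57 | 1.4 | 110 |
| 9 | | M | 24 | 1.3 | 150 |
| 10 | | F | 33 | 0.6 | 23 |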

Figure 1 shows representative examples of two subjects with moderate iron overload, and severe iron overload, showing R2* maps acquired at 1.5T, 3.0T, as well as R2*-based LIC maps (produced using the R2*-LIC calibration described below). R2* maps without obvious artifacts were obtained at both field strengths over a wide range of LIC values. Further, even though R2* values are field strength-dependent, LIC maps (obtained based on the calibration described below) are field strength-independent.

Figure 1:

Representative R2* maps (top two rows) and corresponding LIC maps (bottom two rows) for a patient with moderate iron overload (left) and a patient with severe iron overload (right). The LIC maps were obtained from the R2* maps using the calibration derived in this manuscript. R2* and LIC maps are shown for both 1.5T and 3.0T acquisitions, demonstrating the field strength dependence of R2*, and the field strength independence of LIC maps. (SF: Serum ferritin concentration).

Figure 2 evaluates the reproducibility of each R2* reconstruction method across acquisition protocols by assessing pairwise protocol biases. In this figure, six pairwise comparisons between R2* measured with four different acquisition protocols are shown, for different combinations of field strength (1.5T and 3.0T), measurement location (nine Couinaud liver segments, spleen and overall liver mean), and reconstruction method. For each of these comparisons, a colored tile denotes a significant result (p<0.05) from the paired Wilcoxon rank-sum test at the corresponding field strength, reconstruction and segment combination. A significant p-value provides evidence of a possible protocol-level bias for that particular combination. For most segments and at each field strength, complex fitting based R2* mapping results in no significant difference (p>0.05) in R2* measurements between each pair of protocols, providing no evidence of a bias and demonstrating reproducibility of R2* mapping. In contrast, magnitude fitting R2* measurements, whether fat-corrected or uncorrected, result in significant differences (p<0.05) between acquisition protocols for a large number of liver segments as well as the spleen.

Figure 2:

R2* mapping with complex reconstruction is highly reproducible across acquisition protocols. The plots show p-values evaluating the presence of systematic differences in R2* for each pair of acquisition protocols. This comparison was performed for each reconstruction algorithm (magnitude fat-uncorrected, magnitude fat-corrected, complex fat-uncorrected, complex fat-corrected, and complex hybrid), for each liver segment as well as the spleen and liver mean, and for each field strength (1.5T and 3.0T). For most segments and at both field strengths, complex fitting based R2* mapping results in no significant difference (p>0.05) in R2* measurements between each pair of protocols. In contrast, magnitude fitting R2* measurements result in significant differences (p<0.05) between acquisition protocols for many field strength and segment combinations.

Subsequent analysis was focused on the complex hybrid R2* reconstruction. This reconstruction leads to R2* that is highly reproducible across acquisition protocols, with an ICC between 0.91–1.00 for each liver segment (between 0.98–1.00 for liver-averaged R2*) at both 1.5T and 3.0T. To illustrate these comparisons across protocols, Figure S1 (Supplemental Material) shows the six Bland-Altman plots comparing protocols 1, 2 and 3 to protocol 4 for the mean liver R2*, in the complex hybrid reconstruction (for both field strengths).

Figure 3 shows correlation analysis between complex hybrid fitting mean liver R2* (obtained by averaging R2* measurements across all nine Couinaud liver segments) and LIC, for each of the four R2* mapping protocols acquired at each field strength. The calibration and correlation coefficient for each of these protocols between R2* and LIC are shown in Table 4.

Figure 3:

Regression analysis and calibration between liver-averaged R2* measurements (complex hybrid reconstruction) from each of the acquisition protocols and LIC. This analysis is shown separately for 1.5T and 3.0T based R2* measurements. Similar calibrations are obtained at each field strength across acquisition protocols.

Table 4:

Regression analysis between liver-averaged R2* and LIC, for various R2* acquisition protocols, using complex hybrid reconstruction.

| Field strength (T) | Protocol | Intercept ± SE (mg/g) | Slope ± SE (×10⁻² mg/g/s⁻¹) | Correlation |
| --- | --- | --- | --- | --- |
| 1.5 | 1 | −0.19 ± 0.40 | 2.71 ± 0.16 | 0.94 |
| 1.5 | 2 | −0.20 ± 0.39 | 2.75 ± 0.16 | 0.94 |
| 1.5 | 3 | −0.26 ± 0.42 | 2.78 ± 0.17 | 0.94 |
| 1.5 | 4 | −0.04 ± 0.39 | 2.62 ± 0.15 | 0.95 |
| 3.0 | 1 | −0.16 ± 0.40 | 1.47 ± 0.09 | 0.94 |
| 3.0 | 2 | −0.12 ± 0.41 | 1.46 ± 0.09 | 0.94 |
| 3.0 | 3 | −0.29 ± 0.54 | 1.56 ± 0.13 | 0.92 |
| 3.0 | 4 | −0.00 ± 0.42 | 1.42 ± 0.09 | 0.93 |
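To make the use of these calibrations concrete, the sketch below converts a liver R2* value into an LIC estimate with the protocol 4 rows of Table 4 and then assigns the iron overload grade from the LIC thresholds listed in the Statistical Analysis section. The function and constant names are hypothetical, the input R2* values are arbitrary examples rather than study data, and assigning a value that falls exactly on a cutoff to the higher grade is an assumption.

```python
from bisect import bisect_right

# (intercept in mg/g, slope in mg/g per 1/s) for protocol 4, copied from Table 4.
CALIBRATION_PROTOCOL_4 = {
    "1.5T": (-0.04, 2.62e-2),
    "3.0T": (0.00, 1.42e-2),
}

# LIC cutoffs (mg Fe/g dry) and grade names from the Statistical Analysis section.
LIC_CUTOFFS = [1.8, 3.2, 7.0, 15.0]
LIC_GRADES = ["normal", "mild", "moderate", "severe", "extreme"]


def lic_from_r2star(r2star_per_s, field_strength):
    """Linear calibration: LIC = intercept + slope * R2*."""
    intercept, slope = CALIBRATION_PROTOCOL_4[field_strength]
    return intercept + slope * r2star_per_s


def lic_grade(lic_mg_per_g):
    # Assumption: a value exactly on a cutoff is assigned to the higher grade.
    return LIC_GRADES[bisect_right(LIC_CUTOFFS, lic_mg_per_g)]


# Arbitrary example inputs (not study data).
for field, r2s in (("1.5T", 200.0), ("3.0T", 400.0)):
    lic = lic_from_r2star(r2s, field)
    print(field, r2s, round(lic, 2), lic_grade(lic))
```

Because the slope at 3.0T is roughly half the slope at 1.5T, an R2* value that doubles between field strengths maps to a similar LIC estimate.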

Figure 4 shows correlation between mean liver R2* measured at 1.5T versus R2* measured at 3.0T, for each of the four acquisition protocols reconstructed with the complex hybrid approach. High correlation was observed for each of the protocols (1: r=0.997, 2: r=0.992, 3: r=0.983, 4: r=0.996), with slope close to 2.0, as expected[51; 52]. The slope of the acquisition protocol 3 was significantly lower than that of protocols 1, 2 and 4, as assessed by the linear mixed model (p<.001 for each comparison).

Figure 5 shows ROC analysis for the diagnostic accuracy of R2* to diagnose different thresholds of iron overload using LIC as the reference. For this plot, mean liver R2* from acquisition protocol 4 with the complex hybrid reconstruction was used. An area under the ROC curve higher than 0.95 was obtained for all LIC thresholds under consideration (1.8, 3.2, 7.0, 15.0 mg Fe/g dry), at either field strength. Detection of these four LIC thresholds can be performed from 1.5T R2* measurements with thresholds 97.2, 131.3, 273.0, and 459.1 s⁻¹, respectively. Similarly, detection of these four LIC thresholds can be performed from 3.0T R2* measurements with thresholds 152.7, 236.6, 350.6, and 839.7 s⁻¹, respectively (see Table S2 in Supplemental Material for details).

Figure 5:

ROC analysis for R2*-based detection of clinically relevant LIC thresholds (1.8, 3.2, 7.0, 15.0 mg/g dry) demonstrates excellent predictive ability for each of the LIC thresholds, using liver-average R2* measured from protocol 4 (complex hybrid reconstruction) at both 1.5T and 3.0T.

No significant segment variability was observed for R2* measurements in the four 1.5T models, or in the 3.0T models for protocols 2 and 4, when comparing to the reference R2* measurements from liver Couinaud segment VI (p>0.05 for each Couinaud segment in each model). Within the model fit to the 3.0T and protocol 3 data, segment IVa and segment VIII had significantly higher mean R2* measurements compared to segment VI (p<0.05). With regard to the model fit to the 3.0T and protocol 1 data, segment II had significantly lower mean R2* measurements compared to segment VI (p<0.01).

Figure 6 shows a scatter plot between serum ferritin concentration and LIC with the associated linear regression curve, as well as ROC plot for serum ferritin concentration to diagnose different thresholds of iron overload using LIC as the reference. Although a significant correlation exists between serum ferritin concentration and LIC (r=0.68, p<.0001), serum ferritin concentration results in a moderate area under the ROC curve for detection of higher LIC thresholds (eg: 0.76 for LIC>15 mg Fe/g dry). Detection of the four LIC thresholds considered in this work (1.8, 3.2, 7.0, and 15.0 mg Fe/g dry) can be performed from serum ferritin measurements with thresholds 372, 709, 867, and 1240 ng/mL, respectively (see Table S2 in Supplemental Material).

Figure 6:

Correlation analysis shows moderate correlation between serum ferritin concentration and LIC (r=0.68). ROC analysis for serum ferritin-based detection of clinically relevant LIC thresholds demonstrates good predictive ability at low LIC thresholds, and moderate accuracy at high thresholds (eg: 15.0 mg/g dry).

Liver PDFF measurements were 4.7±6.6% across all subjects (min PDFF: −2.3%, max PDFF: 34.4%). No significant correlation (based on Spearman’s rank correlation analysis) was found between PDFF and R2* across the reconstructed data sets (p=0.84).

Discussion

In this work we investigated the performance of a complex confounder-corrected R2* mapping method to quantify hepatic iron overload. We evaluated eight different acquisition protocols (four at 1.5T and four at 3.0T), using an FDA-approved spin echo based LIC quantification technique as the reference. A linear relationship with excellent correlation was observed between all R2* protocols at both 1.5T and 3.0T with the LIC reference. Further, R2* values across the different protocols were highly reproducible, even over a wide range of spatial resolution, echo times, number of echoes, and image plane orientation, which indicates insensitivity to the confounding effects of macroscopic susceptibility, as well as noise floor effects. Therefore, we have successfully validated confounder-corrected CSE-MRI to measure R2* as an accurate and reproducible (across protocols) quantitative biomarker of liver iron overload.

In addition, we confirmed the known relationship of R2* at 1.5T versus 3.0T, as predicted theoretically[53] and as measured in previous studies[51; 52]. Further, comparison of R2* values obtained at both 1.5T and 3.0T on the same day demonstrated very strong correlation, with regression parameters similar to those observed in previous works, eg: slope of 1.86 and intercept of −6.4 s⁻¹ in our study (protocol 4) compared to slope of 1.92 and intercept of −9.1 s⁻¹ in Meloni et al[52]. In combination with the excellent agreement of R2* measured from different acquisition parameters at a single field strength (also shown in this manuscript), this provides indirect evidence that the precision of R2*-based relaxometry may be very high. Further work with test-retest repeatability studies will be needed to confirm this speculation.
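The field strength relationship quoted above can be cross-checked against the LIC calibrations with a short sketch. The function name is hypothetical, and treating 3.0T R2* as the dependent variable of the regression is an assumption based on the slope being close to 2. If LIC estimates are field strength-independent, the ratio of the 1.5T and 3.0T calibration slopes (protocol 4, Table 4) should be close to the regression slope.

```python
# Protocol 4 regression quoted in the text: slope 1.86, intercept -6.4 1/s.
SLOPE_3T_VS_1P5T = 1.86
INTERCEPT_PER_S = -6.4


def r2star_3t_from_1p5t(r2star_1p5t_per_s):
    # Assumption: 3.0T R2* is the dependent variable of the fit.
    return SLOPE_3T_VS_1P5T * r2star_1p5t_per_s + INTERCEPT_PER_S


# Protocol 4 calibration slopes from Table 4 (mg/g per 1/s).
slope_ratio = 2.62e-2 / 1.42e-2

print(round(r2star_3t_from_1p5t(100.0), 1))
print(round(slope_ratio, 3))
```

The slope ratio comes out near 1.85, in line with the regression slope of 1.86 reported for protocol 4.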

In addition, this work demonstrated excellent diagnostic accuracy of R2* for the detection and differentiation of different stages of iron overload based on clinically accepted LIC thresholds. Comparison of serum ferritin levels obtained on the same day as MRI showed a significant but moderate correlation with LIC.

A widely recognized R2* versus LIC calibration performed by Wood et al[32] demonstrates a similar, although slightly different, calibration compared to our results. It is important to note that the technique described by Wood was markedly different, as it did not correct for the presence of fat or noise floor effects, used a large number of echo times, and used relatively large voxels (15 mm slices) with a single slice 2D acquisition. Similarly, a comparison between (1.5T) R2* and FerriScan-based LIC was also recently performed using magnitude-based R2* fitting[40], also leading to a similar calibration to the one shown in this work. Due to these technical differences in R2* mapping methods, it is difficult to make a direct comparison with the complex confounder-corrected CSE-MRI method used in this work. We do note, however, that we have successfully demonstrated excellent reproducibility across different protocols acquiring data with different echo spacing, different echo times, and different voxel sizes. Thus, the generalizability of the calibration determined in this work should be very broad. Despite the high reproducibility across acquisition protocols, R2* mapping in the presence of high iron overload will benefit from acquisitions with high SNR and short TEs. For this reason, from the acquisitions performed in this study, we expect protocol 4, which has the shortest echo times, to provide the broadest dynamic range.
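The benefit of short echo times at high R2* can be illustrated with a rough calculation. The sketch below takes TE1, ΔTE and the number of echoes for protocols 1 and 4 from Table 1 and computes how much signal remains at the first echo for an illustrative R2* of 1000 s⁻¹. It assumes evenly spaced echoes and simple mono-exponential decay, and ignores fat, noise and field inhomogeneity; the function names are hypothetical.

```python
import math

# (TE1 in ms, echo spacing in ms, number of echoes) copied from Table 1.
PROTOCOLS = {
    ("1.5T", 1): (1.2, 2.0, 6),
    ("1.5T", 4): (0.9, 0.7, 12),
    ("3.0T", 1): (1.2, 1.0, 6),
    ("3.0T", 4): (0.6, 0.6, 8),
}


def echo_times_ms(te1_ms, delta_te_ms, n_echoes):
    # Assumption: echoes are evenly spaced, TE_n = TE1 + (n - 1) * dTE.
    return [te1_ms + n * delta_te_ms for n in range(n_echoes)]


def remaining_signal_fraction(r2star_per_s, te_ms):
    # Mono-exponential decay exp(-R2* * TE), with TE converted from ms to s.
    return math.exp(-r2star_per_s * te_ms / 1000.0)


for key, (te1, dte, n) in PROTOCOLS.items():
    last_te = echo_times_ms(te1, dte, n)[-1]
    fraction = remaining_signal_fraction(1000.0, te1)
    print(key, round(last_te, 1), round(fraction, 2))
```

Under these assumptions, roughly 55% of the signal remains at the first echo of the 3.0T protocol 4 (TE1 = 0.6 ms), compared with roughly 30% for a first echo at 1.2 ms.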

There are several limitations of this study. First, detailed effects of the etiology of iron overload on the R2*-LIC calibration, as well as the effects of phlebotomy or chelation therapy, were not examined. Another limitation is the lack of biopsy data. A follow-up study including biopsy calibration would be highly informative. However, liver biopsies for LIC measurement are rarely performed at our institution, so conducting such biopsy calibration studies may be challenging. Importantly, the reference standard used in this study is FDA-approved, based on previous liver biopsy-MRI correlation studies. Thus, the LIC values used in this reference are equivalent to those obtained from liver biopsy. Finally, this prospective study reports the results from a single center study using a single MRI vendor. Future validation across multiple centers, vendors, and platforms is needed to demonstrate the widespread reproducibility of this technique.

Importantly, this study included relatively few patients with severe iron overload. Further work validating the methods in populations with higher iron burdens (e.g. thalassemia major) is needed. One additional technical limitation of the R2* mapping described in this work is that it is based on a Cartesian acquisition with an initial echo time of approximately 1 ms. As a result, accurate measurement of R2* in patients with extreme iron overload may be compromised, i.e., the dynamic range of this technique may be limited. Non-Cartesian radial ultra-short TE (UTE) techniques may enable a broader dynamic range for those patients with extreme iron overload[54–56].

In conclusion, we have successfully validated a complex confounder-corrected CSE-MRI method to quantify R2* as an accurate and reproducible biomarker of liver iron concentration in a prospective single-center study. Further, we have provided a calibration between R2* and LIC at both 1.5T and 3.0T, enabling estimation of LIC at these two important clinical field strengths using complex gradient echo methods that can be acquired in a single breath-hold. Given the widespread commercialization of 3D multi-echo gradient echo methods for R2*-corrected fat quantification, translation of the technique proposed in this work into a commercially available application on all vendor platforms should be relatively straightforward. Further work validating such methods in multi-center studies will be needed.
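As a minimal sketch of how the linear calibrations reported in the abstract might be applied, the snippet below converts an R2* value into an estimated LIC. The coefficients are those given in the abstract; the function and dictionary names are hypothetical, and the units (R2* in s⁻¹, LIC in mg Fe/g dry weight) are an assumption that should be checked against the Methods before any use.

```python
# Calibration reported in the abstract: LIC = intercept + slope * R2*.
# Assumed units (not stated in this section): R2* in 1/s, LIC in mg Fe/g dry weight.
CALIBRATION = {
    "1.5T": (-0.04, 2.62e-2),
    "3.0T": (0.00, 1.41e-2),
}

def estimate_lic(r2star: float, field_strength: str) -> float:
    """Estimate LIC from R2* with the linear calibration for the given field strength."""
    intercept, slope = CALIBRATION[field_strength]
    return intercept + slope * r2star

# Hypothetical example: the same R2* value maps to different LIC estimates
# at the two field strengths, so the field strength must always be specified.
for field in ("1.5T", "3.0T"):
    print(f"{field}: R2* = 200 1/s -> estimated LIC = {estimate_lic(200.0, field):.2f}")
```

Because the calibration differs between field strengths, the sketch requires the field strength explicitly rather than assuming a default.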

Supplementary Material

330_2020_7123_MOESM1_ESM

Key points:

  • Confounder-corrected R2* mapping of the liver provides reproducible R2* measurements across acquisition protocols, including different spatial resolutions, echo times, and slice orientations, at both 1.5T and 3.0T.

  • For all acquisition protocols, high correlation with R2-based liver iron concentration (LIC) quantification was observed.

  • The calibration between confounder-corrected R2* and LIC, at both 1.5T and 3.0T, is determined in this study.

Funding information:

This study has received funding from the WARF Accelerator Program, from the NIH (R01 DK100651, K24 DK102595, R01 DK083380, R01 DK117354), as well as from GE Healthcare, which provides research support to UW-Madison. Further, Dr. Reeder is a Romnes Faculty Fellow, and has received an award provided by the University of Wisconsin-Madison Office of the Vice Chancellor of Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation.

List of abbreviations:

  • AA: aplastic anemia
  • ALL: acute lymphoblastic leukemia
  • AML: acute myeloid leukemia
  • CSE: Chemical Shift-Encoded
  • GRE: Gradient-Recalled Echo
  • HH: hereditary hemochromatosis
  • IRB: Institutional Review Board
  • LIC: Liver Iron Concentration
  • MDS: myelodysplastic syndrome
  • NOS: not otherwise specified
  • SCD: sickle cell disease
  • TH: transfusional hemosiderosis

Footnotes

Publisher's Disclaimer: This Author Accepted Manuscript is a PDF file of an unedited peer-reviewed manuscript that has been accepted for publication but has not been copyedited or corrected. The official version of record that is published in the journal is kept up to date and so may therefore differ from this version.

Guarantor:

The scientific guarantor of this publication is Diego Hernando, PhD.

Conflict of Interest:

The authors of this manuscript declare relationships with the following companies: Dr. Hernando is a cofounder of Calimetrix. Dr. Reeder consults for ArTara Therapeutics and HeartVista. Dr. Reeder is a cofounder of Calimetrix.

Statistics and Biometry:

One of the coauthors (Colin Longhurst, MS, UW Department of Biostatistics and Medical Informatics) has significant statistical expertise.

Informed Consent:

Written informed consent was obtained from all subjects (patients) in this study.

Ethical Approval:

Institutional Review Board approval was obtained.

Study subjects or cohorts overlap:

Some study subjects or cohorts have been previously reported. Hernando et al (Magnetic resonance in medicine 70 (3), 648–656) demonstrated the feasibility of quantifying liver iron concentration from measured B0 field maps. Sharma et al (Magnetic resonance in medicine 74 (3), 673–683) validated a method for quantitative susceptibility mapping in the liver. Horng et al (Journal of Magnetic Resonance Imaging 45 (2), 428–439) evaluated the accuracy of a single-R2* signal model for fat quantification in the presence of liver iron overload.

Methodology:
  • prospective
  • cross-sectional
  • single-institution study.

References

  • 1. Pietrangelo A (2004) Hereditary hemochromatosis--a new look at an old disease. N Engl J Med 350:2383–2397
  • 2. Edwards CQ, Ajioka RS, Kushner JP (2000) Hemochromatosis: A genetic definition. In: Barton JC, Edwards CQ (eds) Hemochromatosis: Genetics, Pathophysiology, Diagnosis and Treatment. Cambridge University Press, Cambridge, UK, pp 8–11
  • 3. Angastiniotis M, Modell B (1998) Global epidemiology of hemoglobin disorders. Ann N Y Acad Sci 850:251–269
  • 4. Hassell KL (2010) Population estimates of sickle cell disease in the U.S. Am J Prev Med 38:S512–521
  • 5. Chernoff AI (1959) The distribution of the thalassemia gene: a historical review. Blood 14:899–912
  • 6. Mahesh S, Ginzburg Y, Verma A (2008) Iron overload in myelodysplastic syndromes. Leuk Lymphoma 49:427–438
  • 7. Babitt JL, Lin HY (2012) Mechanisms of anemia in CKD. J Am Soc Nephrol 23:1631–1634
  • 8. Brittenham GM, Cohen AR, McLaren CE et al. (1993) Hepatic iron stores and plasma ferritin concentration in patients with sickle cell anemia and thalassemia major. Am J Hematol 42:81–85
  • 9. Alustiza JM, Castiella A, De Juan MD, Emparanza JI, Artetxe J, Uranga M (2007) Iron overload in the liver diagnostic and quantification. Eur J Radiol 61:499–506
  • 10. Wood JC (2007) Magnetic resonance imaging measurement of iron overload. Curr Opin Hematol 14:183–190
  • 11. Sirlin CB, Reeder SB (2010) Magnetic resonance imaging quantification of liver iron. Magn Reson Imaging Clin N Am 18:359–381, ix
  • 12. Niederau C, Fischer R, Sonnenberg A, Stremmel W, Trampisch HJ, Strohmeyer G (1985) Survival and causes of death in cirrhotic and in noncirrhotic patients with primary hemochromatosis. N Engl J Med 313:1256–1262
  • 13. Au WY, Lam WW, Chu W et al. (2008) A T2* magnetic resonance imaging study of pancreatic iron overload in thalassemia major. Haematologica 93:116–119
  • 14. Wood JC, Noetzl L, Hyderi A, Joukar M, Coates T, Mittelman S (2010) Predicting pituitary iron and endocrine dysfunction. Ann N Y Acad Sci 1202:123–128
  • 15. Anderson LJ, Holden S, Davis B et al. (2001) Cardiovascular T2-star (T2*) magnetic resonance for the early diagnosis of myocardial iron overload. European Heart Journal 22:2171–2179
  • 16. Adams P, Brissot P, Powell LW (2000) EASL International Consensus Conference on Haemochromatosis. J Hepatol 33:485–504
  • 17. Porter JB (2001) Practical management of iron overload. Br J Haematol 115:239–252
  • 18. Vichinsky E (2008) Oral iron chelators and the treatment of iron overload in pediatric patients with chronic anemia. Pediatrics 121:1253–1256
  • 19. Harmatz P, Butensky E, Quirolo K et al. (2000) Severity of iron overload in patients with sickle cell disease receiving chronic red blood cell transfusion therapy. Blood 96:76–79
  • 20. Brittenham GM, Badman DG, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Workshop (2003) Noninvasive measurement of iron: report of an NIDDK workshop. Blood 101:15–19
  • 21. Karam LB, Disco D, Jackson SM et al. (2008) Liver biopsy results in patients with sickle cell disease on chronic transfusions: poor correlation with ferritin levels. Pediatr Blood Cancer 50:62–65
  • 22. Nielsen P, Gunther U, Durken M, Fischer R, Dullmann J (2000) Serum ferritin iron in iron overload and liver damage: correlation to body iron stores and diagnostic relevance. J Lab Clin Med 135:413–418
  • 23. Angelucci E, Brittenham GM, McLaren CE et al. (2000) Hepatic iron concentration and total body iron stores in thalassemia major. N Engl J Med 343:327–331
  • 24. Siegelman ES, Mitchell DG, Semelka RC (1996) Abdominal iron deposition: metabolism, MR findings, and clinical importance. Radiology 199:13–22
  • 25. Emond MJ, Bronner MP, Carlson TH, Lin M, Labbe RF, Kowdley KV (1999) Quantitative study of the variability of hepatic iron concentrations. Clin Chem 45:340–346
  • 26. Villeneuve JP, Bilodeau M, Lepage R, Cote J, Lefebvre M (1996) Variability in hepatic iron concentration measurement from needle-biopsy specimens. J Hepatol 25:172–177
  • 27. Guyader D, Gandon Y, Robert JY et al. (1992) Magnetic resonance imaging and assessment of liver iron content in genetic hemochromatosis. J Hepatol 15:304–308
  • 28. Dixon RM, Styles P, al-Refaie FN et al. (1994) Assessment of hepatic iron overload in thalassemic patients by magnetic resonance spectroscopy. Hepatology 19:904–910
  • 29. Gandon Y, Guyader D, Heautot JF et al. (1994) Hemochromatosis: diagnosis and quantification of liver iron with gradient-echo MR imaging. Radiology 193:533–538
  • 30. Henninger B, Alustiza J, Garbowski M, Gandon Y (2020) Practical guide to quantification of hepatic iron with MRI. Eur Radiol 30:383–393
  • 31. St Pierre TG, Clark PR, Chua-anusorn W et al. (2005) Noninvasive measurement and imaging of liver iron concentrations using proton magnetic resonance. Blood 105:855–861
  • 32. Wood JC, Enriquez C, Ghugre N et al. (2005) MRI R2 and R2* mapping accurately estimates hepatic iron concentration in transfusion-dependent thalassemia and sickle cell disease patients. Blood 106:1460–1465
  • 33. Vasanawala SS, Yu H, Shimakawa A, Jeng M, Brittain JH (2012) Estimation of liver T*2 in transfusion-related iron overload in patients with weighted least squares T*2 IDEAL. Magn Reson Med 67:183–190
  • 34. Hernando D, Kramer HJ, Reeder SB (2013) Multipeak Fat-Corrected Complex R2* Relaxometry: Theory, Optimization, and Clinical Validation. Magn Reson Med 70:1319–1331
  • 35. Ahmed A, Wong RJ, Harrison SA (2015) Nonalcoholic Fatty Liver Disease Review: Diagnosis, Treatment, and Outcomes. Clin Gastroenterol Hepatol 13:2062–2070
  • 36. Reeder SB, Sirlin CB (2010) Quantification of liver fat with magnetic resonance imaging. Magn Reson Imaging Clin N Am 18:337–357, ix
  • 37. Fernandez-Seara MA, Wehrli FW (2000) Postprocessing technique to correct for background gradients in image-based R*(2) measurements. Magn Reson Med 44:358–366
  • 38. Hernando D, Vigen KK, Shimakawa A, Reeder SB (2012) R2* mapping in the presence of macroscopic B(0) field variations. Magn Reson Med 68:830–840
  • 39. Yu H, Shimakawa A, McKenzie CA, Brodsky E, Brittain JH, Reeder SB (2008) Multiecho water-fat separation and simultaneous R2* estimation with multifrequency fat spectrum modeling. Magn Reson Med 60:1122–1134
  • 40. Jhaveri KS, Kannengiesser SAR, Ward R, Kuo K, Sussman MS (2019) Prospective Evaluation of an R2* Method for Assessing Liver Iron Concentration (LIC) Against FerriScan: Derivation of the Calibration Curve and Characterization of the Nature and Source of Uncertainty in the Relationship. J Magn Reson Imaging 49:1467–1474
  • 41. Hernando D, Cook RJ, Diamond C, Reeder SB (2013) Magnetic susceptibility as a B0 field strength independent MRI biomarker of liver iron overload. Magn Reson Med 70:648–656
  • 42. Sharma SD, Hernando D, Horng DE, Reeder SB (2015) Quantitative susceptibility mapping in the abdomen as an imaging biomarker of hepatic iron overload. Magn Reson Med 74:673–683
  • 43. Horng DE, Hernando D, Reeder SB (2017) Quantification of liver fat in the presence of iron overload. J Magn Reson Imaging 45:428–439
  • 44. Meisamy S, Hines CD, Hamilton G et al. (2011) Quantification of hepatic steatosis with T1-independent, T2-corrected MR imaging with spectral modeling of fat: blinded comparison with MR spectroscopy. Radiology 258:767–775
  • 45. Campo CA, Hernando D, Schubert T, Bookwalter CA, Pay AJV, Reeder SB (2017) Standardized Approach for ROI-Based Measurements of Proton Density Fat Fraction and R2* in the Liver. AJR Am J Roentgenol 209:592–603
  • 46. Bates D, Maechler M, Bolker B, Walker S (2015) Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software 67:1–48
  • 47. Bassett ML, Halliday JW, Powell LW (1986) Value of hepatic iron measurements in early hemochromatosis and determination of the critical iron level associated with fibrosis. Hepatology 6:24–29
  • 48. Olivieri NF, Brittenham GM (1997) Iron-chelating therapy and the treatment of thalassemia. Blood 89:739–761
  • 49. R Core Team (2019) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria
  • 50. Wickham H (2016) ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York
  • 51. Storey P, Thompson AA, Carqueville CL, Wood JC, de Freitas RA, Rigsby CK (2007) R2* imaging of transfusional iron burden at 3T and comparison with 1.5T. J Magn Reson Imaging 25:540–547
  • 52. Meloni A, Positano V, Keilberg P et al. (2012) Feasibility, reproducibility, and reliability for the T*2 iron evaluation at 3 T in comparison with 1.5 T. Magn Reson Med 68:543–551
  • 53. Ghugre NR, Doyle EK, Storey P, Wood JC (2015) Relaxivity-iron calibration in hepatic iron overload: Predictions of a Monte Carlo model. Magn Reson Med 74:879–883
  • 54. Krafft AJ, Loeffler RB, Song R et al. (2017) Quantitative ultrashort echo time imaging for assessment of massive iron overload at 1.5 and 3 Tesla. Magn Reson Med 78:1839–1851
  • 55. Doyle EK, Toy K, Valdez B, Chia JM, Coates T, Wood JC (2018) Ultra-short echo time images quantify high liver iron. Magn Reson Med 79:1579–1585
  • 56. Wiens C, Zhu A, Johnson K, Reeder S, Hernando D (2017) Accuracy and Reproducibility of Iron Quantification using Ultra-Short TE Imaging at 1.5T and 3.0T. Proceedings of the 25th Annual Meeting of ISMRM, Honolulu, pp 371
