Abstract
Purpose
To compare three types of MRI liver iron content (LIC) measurement performed in daily clinical routine in a single center over a 6-year period.
Methods
Patients undergoing LIC MRI-scans (1.5T) at our center between January 1, 2008 and December 31, 2013 were retrospectively included. LIC was measured routinely with signal intensity ratio (SIR) and MR-relaxometry (R 2 and R 2*) methods. Three observers placed regions-of-interest. The success rate was defined as the number of correctly acquired scans divided by the total number of scans. Interobserver agreement was assessed with intraclass correlation coefficients (ICC) and Bland–Altman analysis; correlations between LICSIR, R 2, R 2*, and serum values were assessed with Spearman’s rank correlation coefficient. The diagnostic accuracies of LICSIR, R 2, and serum transferrin, transferrin-saturation, and ferritin for detecting increased R 2* (≥44 Hz), used as the indicator of iron overload, were assessed using ROC-analysis.
Results
LIC MRI-scans were performed in 114 subjects. SIR, R 2, and R 2* data were successfully acquired in 102/114 (89%), 71/114 (62%), and 112/114 (98%) measurements, respectively, with the lowest success rate for R 2. The ICCs of SIR, R 2, and R 2* did not differ (0.998, 0.997, and 0.999, respectively). R 2 and serum ferritin had the highest diagnostic accuracies for detecting elevated R 2* as a marker of iron overload.
Conclusions
SIR and R 2* are preferable over R 2 in terms of success rates. R 2*’s shorter acquisition time and wide range of measurable LIC values favor R 2* over SIR for MRI-based LIC measurement.
Electronic supplementary material
The online version of this article (doi:10.1007/s00261-016-0831-7) contains supplementary material, which is available to authorized users.
Keywords: Magnetic resonance imaging, Iron overload, Hemochromatosis, Blood transfusion, Biomarker, Relaxometry
Various diseases are associated with increased liver iron content (LIC), which may induce or contribute to liver damage [1–3]. Serial measurement of LIC during long-term follow-up and treatment is highly desirable, but repeated invasive measurements are not recommended due to risks of complications of serial liver biopsies. Surrogate biochemical markers including serum ferritin and transferrin-saturation are widely used, but are flawed by limited specificity. Thus, accurate non-invasive MRI-based methods of LIC measurement are used in clinical practice for patients (suspected) with increased LIC [4, 5].
Several types of MRI LIC measurement have been described in the literature. Straightforward in–out phase gradient echo (GRE) shows signal loss at the later echo time (TE) but is only qualitative and easily confounded by the presence of hepatic steatosis. Quantitative approaches include (i) signal intensity ratio (SIR) measurement (e.g., the Gandon method) and (ii) MR-relaxometry. The Gandon method (henceforth referred to as “SIR”) utilizes the liver-to-muscle SIR on differently weighted MRI-scans [6]. This method allows easy and free calculation of the LICSIR by entering ROI values in an online tool [7]. Hence, assuming the acquisition and placement of regions-of-interest (ROIs) are performed correctly, the method is robust to observer influences. A major limitation is its upper limit of detection of 350 µmol/g (approximately 20 mg/g): changes above that threshold cannot be measured.
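The correspondence between the two units quoted for the SIR upper limit can be checked with a short calculation. The sketch below assumes the standard molar mass of iron (55.845 g/mol), which is not stated in the text, and the function names are illustrative only.

```python
# Molar mass of iron in g/mol (standard atomic weight; an assumption, not given in the text).
FE_MOLAR_MASS_G_PER_MOL = 55.845


def umol_per_g_to_mg_per_g(lic_umol_per_g: float) -> float:
    """Convert a liver iron content from µmol Fe per g to mg Fe per g."""
    return lic_umol_per_g * FE_MOLAR_MASS_G_PER_MOL / 1000.0


def mg_per_g_to_umol_per_g(lic_mg_per_g: float) -> float:
    """Convert a liver iron content from mg Fe per g to µmol Fe per g."""
    return lic_mg_per_g * 1000.0 / FE_MOLAR_MASS_G_PER_MOL


# Upper limit of detection of the SIR method, expressed in mg/g.
sir_upper_limit_mg = umol_per_g_to_mg_per_g(350.0)
```

Under this assumption the 350 µmol/g limit corresponds to roughly 19.5 mg/g, i.e., about 20 mg/g.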
MR-relaxometry relies on the calculation of tissue relaxation rates (R 2 and R 2*, the inverse of relaxation times T 2 and T 2*), which increase as iron accumulates and are sensitive to changes in LIC values well above the SIR-threshold. One commercialized R 2 approach using single-echo spin-echo (SE) MRI is the FDA-approved St. Pierre method [FerriScan®], performed in 10 min in free-breathing [8]. The per-scan analysis price is ~$300, on top of the costs of the MRI-scan itself. Alternative free-of-charge approaches are available for R 2 using free-breathing or respiratory triggered SE-MRI and for R 2* using single breath-hold GRE MRI [9].
Recent developments in MR-relaxometry include multipeak fat corrections and the use of complex instead of magnitude-only data fitting [10], assessment of the effect of fat suppression on R 2* [11] and the comparison of advanced data fit models [12] and analysis approaches [13].
A comparative study of LICSIR, R 2, and R 2* in 94 patients with β-thalassemia reported high correlations [14]. However, success rates, interobserver agreement, and applicability for diseases other than β-thalassemia were not investigated, nor were serum markers assessed. The latter may be useful to screen for elevated LIC (i.e., >36 µmol/g), saving expensive and limited MRI time. We hypothesize that R 2* is preferable over SIR and R 2 in terms of success rate, acquisition time, and range of detection and over serum values in terms of accuracy in detecting elevated LIC.
In our center, the clinical LIC protocol has included SIR, R 2, and R 2* since 2005, with regular weekly clinical referrals since 2008. The SIR measurement is recommended by the national guideline for hemochromatosis [15]. It is supplemented by R 2 and R 2* measurements to fill the gap caused by the SIR method’s hard cut-off at 350 µmol/g. To investigate our hypothesis, we (i) assessed SIR, R 2, and R 2* LIC measurements and their success rates and interobserver agreement; and (ii) compared the diagnostic accuracies of LICSIR, R 2, and surrogate serum markers for correctly predicting elevated LIC based on increased R 2*.
Materials and methods
Ethical
All data used for this study were acquired in a clinical setting and were anonymized prior to analysis. Informed consent was waived by the Medical Research Ethics Committee of the AMC Amsterdam.
Patients
All MRI-based LIC measurements performed between January 1, 2008 and December 31, 2013 were retrospectively included in this study. As additional measurements were added to the protocol in 2014, only measurements up to the end of 2013 were included. Clinical diagnosis and, when available, serum markers of iron metabolism (total iron, transferrin, transferrin-saturation, ferritin) were collected and subsequently anonymized by a colleague not otherwise involved in this study.
MRI
MRI-scanning was performed with the patient supine, feet first, on a 1.5T Avanto MRI-scanner (Siemens AG, Erlangen, Germany) using phased-array coils (body array and spine coil) for localizers and R 2 and R 2* measurements and the body coil for the SIR measurement [6]. Use of the body coil provided a B1 field that was as homogeneous as possible, reducing variation in SIR measurements due to variations of flip angles between patients. For R 2* and R 2, the B1 variation is eliminated via the data fit. Breath-hold imaging (localizers, SIR, and R 2*) was performed in expiration. Three 10-mm slices with a variable slice gap to cover the liver were positioned identically for all three LIC measurements. Especially for the GRE-based SIR and R 2* measurements, careful B0 shimming is important to achieve a homogeneous B0 field, ensuring correct measurements. Shimming was performed with a shim box covering the field-of-view in the feet-head direction and the contours of the abdomen (i.e., excluding the arms) in the left–right and anterior-posterior directions. The SIR measurement according to Gandon et al. requires five (T1, PD, T2, T2+, and T2++) image weightings with specific TR/TE combinations [6]. Table 1 contains an overview of the relevant scan parameters. Of note, the TE interval used for R 2* was shorter (1.41 ms) than the standard in- and out-of-phase interval (2.26 ms).
Table 1.
Parameter | SIR | R 2 | R 2* |
---|---|---|---|
Technique | GRE | SE | GRE |
TR (ms) | 120 | 3000–4000^a | 300 |
TE1 (ms) | Variable [7] | 6.2 | 0.99 |
ΔTE (ms) | n/a | 6.2 | 1.41 |
Number of echoes | 1 | 16 (multiecho) | 12 (multiecho) |
FA (°) | Variable [7] | 180 | 20 |
FOV (mm × mm) | 380 × 285 | 380 × 285 | 380 × 285 |
Acquisition matrix | 256 × 256 | 256 × 256 | 128 × 96^b |
Reconstruction matrix | 256 × 192 | 256 × 192 | 256 × 192 |
Parallel imaging | No | GRAPPA | GRAPPA |
Acceleration factor | n/a | 2 | 2 |
Bandwidth (Hz/pixel) | 140 | 465 | 1963 |
Slice thickness (mm) | 10 | 10 | 10 |
Slice gap | Variable^c | Variable^c | Variable^c |
Number of slices | 3 | 3 | 3 |
Acquisition time | 100 s (5 × 20 s) | 9–16 min^a | 20 s |
^a Depending on the patient’s respiratory frequency: one TR per respiratory cycle
^b Zero-padding was used to fit the R 2* acquisition in the breath-hold time (20 s)
^c The slice gap was adjusted per patient so as to cover the whole liver with the three slices
Data analyses
After inclusion, all measurements were checked for correct TRs, TEs, and RF coils using DICOM header information, because specific TR/TE combinations and the use of the body coil are mandatory for SIR measurements. Image quality was assessed by a research trainee (JHR, 4 years of experience) and an abdominal radiologist (JS, 20 years of experience) using a 3-point scale (good/adequate/inadequate). The type of artifact(s) was noted. Measurements with incorrect scan parameters or inadequate image quality were classified as unsuccessful.
ROI-placement
SIR, R 2, and R 2* data were processed using custom-made software that allowed ROI-placement, LICSIR calculation, and R 2 and R 2* data fitting. Three blinded observers (JHR, MAT, and EMA) with 4, 0.5, and 9 years of experience, respectively, independently placed regions-of-interest (ROIs) on three slices per scan. First, the liver parenchyma was masked on R 2* source data, excluding a rim near the liver edge (ROI-1; Fig. 1 A). Next, non-liver voxels (e.g., vessels, gall bladder) inside the liver contour were masked (ROI-2; Fig. 1 B). By subtracting ROI-2 from ROI-1, only liver parenchyma remained (Fig. 1 C). Liver ROIs were copied from the R 2* data for SIR analysis, with two additional ROIs in both paraspinal muscles, carefully avoiding areas of signal intensity loss close to the lung (Fig. 1 D). This also allowed a check to identify whether patients had moved between R 2* and SIR measurements, in which case new ROIs were placed. Ghosting artifacts caused by aortic blood flow were present in SIR measurements before November 2012 (when saturation slabs were added). Separate ROIs were placed to remove these artifacts from the liver and muscle ROIs (Fig. 1 E, F). Some reports indicate that susceptibility artifacts may affect R 2* measurements when using a single ROI in liver segments VII or VIII [16]. Due to the limited number of slices, we did not formally assess segmental variations of R 2, R 2*, or LICSIR in this study.
The respiratory triggering applied for R 2 data acquisition resulted in slight changes in slice positioning so that new ROIs were placed using R 2 source data as described above.
LICSIR
The calculations published by Gandon et al. were entered into the aforementioned program [7, 17], which automatically chooses the most reliable SIR (i.e., T1, PD, T2, T2+, or T2++) which is converted to LICSIR. The mean LICSIR of three slices was used and, when one or more values exceeded the 350 µmol/g threshold, the final value was noted as >350 µmol/g. In two subanalyses, the R 2 and R 2* values and the individual SIR ratios in patients with LICSIR >350 µmol/g were evaluated.
R2*
In magnitude images, the noise follows a non-Gaussian (Rician) distribution with a non-zero mean [18]. At high signal levels, this non-zero mean has a negligible effect on the average signal, but near the noise level it introduces a noise bias that needs to be taken into account when fitting R 2*. We explored three different fit routines: a truncated exponential fit (A) [19, 20], an exponential + constant fit (B) [9, 21], and an exponential + Rician noise fit (C).
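To illustrate the idea behind the truncated exponential fit (A), the following minimal sketch discards echoes near the noise level and fits the remaining points log-linearly. It is not the authors' software: the automatic `noise_floor` threshold is a hypothetical stand-in for manual truncation, and the synthetic signal uses a constant noise floor as a crude substitute for Rician bias.

```python
import math


def fit_r2star_truncated(te_ms, signal, noise_floor):
    """Log-linear monoexponential fit using only echoes above a noise floor.

    Returns (S0, R2* in Hz). `noise_floor` is a hypothetical user-chosen
    threshold; echoes at or below it are discarded (truncated).
    """
    points = [(te, math.log(s)) for te, s in zip(te_ms, signal) if s > noise_floor]
    if len(points) < 2:
        raise ValueError("need at least two echoes above the noise floor")
    n = len(points)
    mean_te = sum(te for te, _ in points) / n
    mean_ln = sum(ln for _, ln in points) / n
    sxx = sum((te - mean_te) ** 2 for te, _ in points)
    sxy = sum((te - mean_te) * (ln - mean_ln) for te, ln in points)
    slope = sxy / sxx                       # d ln(S) / d TE, in 1/ms
    s0 = math.exp(mean_ln - slope * mean_te)
    return s0, -slope * 1000.0              # convert 1/ms to Hz


# Synthetic decay sampled at the R2* echo times of Table 1
# (TE1 = 0.99 ms, delta TE = 1.41 ms, 12 echoes).
te = [0.99 + 1.41 * i for i in range(12)]
true_r2star_hz, noise = 300.0, 5.0
sig = [math.hypot(1000.0 * math.exp(-true_r2star_hz * t / 1000.0), noise) for t in te]
s0_fit, r2star_fit = fit_r2star_truncated(te, sig, noise_floor=4 * noise)
```

Without truncation, the late echoes that sit on the noise floor would flatten the fitted decay and bias R 2* downward, which is the effect the three fit routines address in different ways.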
The truncated exponential method A is considered the reference standard but is time-consuming, whereas methods B and C do not require further manual input. We compared methods B and C with method A as reference using Bland–Altman analysis and R 2* data from a single reader (EMA). Based on this comparison (the mean paired difference was 0.8 Hz for A–C and 33.6 Hz for A–B), we employed method C (Rician noise bias) for the remaining analyses [22, 23].
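The mean paired difference and limits of agreement used in a Bland–Altman comparison can be computed as in the sketch below. The function name and the example values are hypothetical and are not data from this study; the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (mean paired difference, lower limit, upper limit) for a - b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    d_bar = mean(diffs)
    sd = stdev(diffs)                       # sample standard deviation
    return d_bar, d_bar - 1.96 * sd, d_bar + 1.96 * sd


# Hypothetical R2* values (Hz) from two fit routines; not study data.
method_a = [45.0, 120.0, 310.0, 560.0]
method_c = [44.0, 121.0, 308.0, 559.0]
bias, loa_low, loa_high = bland_altman(method_a, method_c)
```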
R 2* calculation was thus performed with a monoexponential model (Eq. 1) with a Rician noise factor. In Eq. 1, $E_R$ describes the Rice distribution (Online Resource 1), where σ is a noise parameter and $S_0\, e^{-R_2^{*}\, TE}$ reflects the true magnitude value. Data were averaged inside the ROI before data fitting (average-then-fit).
$$S(TE) = E_R\left(S_0\, e^{-R_2^{*}\, TE},\ \sigma\right) \qquad (1)$$
The effect of intrahepatic fat on R 2* was assessed by applying a biexponential model in a subset (n = 10) with definite presence of fat, as identified by the presence of an oscillating signal intensity decay over time. R 2* values with and without correction were compared using Bland–Altman analysis. The mean paired difference was 0.1 Hz, indicating low overall fat content in this cohort, and was deemed negligible compared to the subset mean of 70 Hz. Monoexponentially fitted R 2* values were used for all comparisons.
R2
For R 2 calculation an average-then-fit routine was applied using a biexponential model as shown in Eqs. 2 and 3. In Eq. 2, S T (TE) is the signal intensity without noise at time TE, S 0 is the signal intensity at TE = 0, and R 2 is the relaxation rate. The subscripts a and b indicate fast and slow relaxation components, respectively. For R 2, Rician noise bias was approximated by the Pythagorean addition of an extra fit parameter, the noise factor ‘ν’ in Eq. 3.
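$$S_T(TE) = S_{0,a}\, e^{-R_{2,a}\, TE} + S_{0,b}\, e^{-R_{2,b}\, TE} \qquad (2)$$
$$S(TE) = \sqrt{S_T(TE)^{2} + \nu^{2}} \qquad (3)$$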
In the biexponential model, an iron-dense and an iron-sparse component are assumed, with high and low R 2 (i.e., short and long T 2), respectively. For further comparisons with LICSIR and R 2*, the bulk R 2 was calculated (Eq. 4) in accordance with the literature [8, 9, 14].
4 |
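The biexponential model with a noise factor added in quadrature, as described above, can be sketched as follows. The amplitude-weighted mean used for the bulk R 2 is an assumption made for illustration (one common way of collapsing two components into a single rate) and is not confirmed as the exact form of Eq. 4; all function names and parameter values are hypothetical.

```python
import math


def biexp_signal(te_ms, s0_a, r2_a_hz, s0_b, r2_b_hz):
    """Noise-free biexponential signal S_T(TE) with fast (a) and slow (b) components."""
    te_s = te_ms / 1000.0
    return s0_a * math.exp(-r2_a_hz * te_s) + s0_b * math.exp(-r2_b_hz * te_s)


def measured_signal(te_ms, s0_a, r2_a_hz, s0_b, r2_b_hz, nu):
    """Magnitude signal with the noise factor nu added in quadrature (Pythagorean addition)."""
    return math.hypot(biexp_signal(te_ms, s0_a, r2_a_hz, s0_b, r2_b_hz), nu)


def bulk_r2(s0_a, r2_a_hz, s0_b, r2_b_hz):
    """Amplitude-weighted mean R2 (ASSUMED weighting, see lead-in)."""
    return (s0_a * r2_a_hz + s0_b * r2_b_hz) / (s0_a + s0_b)


# Hypothetical fit parameters; echo times follow Table 1 (TE1 = delta TE = 6.2 ms, 16 echoes).
params = dict(s0_a=600.0, r2_a_hz=80.0, s0_b=400.0, r2_b_hz=20.0)
echo_times_ms = [6.2 * (i + 1) for i in range(16)]
model = [measured_signal(te, nu=10.0, **params) for te in echo_times_ms]
r2_bulk = bulk_r2(**params)
```

With these illustrative parameters the bulk value lies between the two component rates, and the modelled signal never drops below the noise factor, which is the behaviour the quadrature term is meant to capture.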
Comparison with the literature
The relations between the LICSIR, R 2, and R 2* were compared to published regression analysis results based on either biopsy-proven LIC (LICBIOPSY) [8, 9, 19–21] or LICSIR [14].
Statistical analyses
Data are described as number (%) or median (interquartile range, IQR). Results of observers were compared using a Friedman test with the Wilcoxon Signed-Rank test as post hoc test. Success rates are defined as the number of correctly acquired scans of at least “adequate” quality divided by the total number of measurements. These were compared using a McNemar test. Correlations were assessed with Spearman’s correlation coefficients (r S), and interobserver agreement with two-way random, absolute-agreement intraclass correlation coefficients (ICCs). Both were graded according to Landis et al. [24]. Bland–Altman analysis was performed to compare accuracy between the three MRI methods for a single observer and to compare the performance of the three observers [22]. In a separate analysis, the calculated R 2 and R 2* values were converted to values in μmol/g using the formulas provided by St. Pierre et al. and Garbowski et al. [8, 20], as these were established with image analysis protocols similar to ours.
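A two-way random, absolute-agreement ICC can be computed from the mean squares of a scans-by-observers table. The sketch below implements the single-measures form, often written ICC(2,1); whether single or average measures were reported is not stated in the text, so this choice is an assumption, and the example ratings are hypothetical.

```python
def icc_two_way_random_absolute(ratings):
    """Single-measures ICC(2,1): two-way random effects, absolute agreement.

    `ratings` is a list of rows, one row per scan and one column per observer.
    """
    n = len(ratings)                        # number of scans
    k = len(ratings[0])                     # number of observers
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with at least 2 rows and 2 columns")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical measurements by three observers on four scans; not study data.
example = [[10, 11, 10], [20, 21, 19], [30, 29, 31], [40, 41, 40]]
icc = icc_two_way_random_absolute(example)
icc_perfect = icc_two_way_random_absolute([[1, 1], [2, 2], [3, 3]])
```

Because the absolute-agreement form includes the between-observer mean square in the denominator, a systematic offset between observers lowers the coefficient even when their rankings agree perfectly.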
ROC-analyses were performed for LICSIR, R 2, and serum values with significant correlation with R 2* to establish their diagnostic accuracy to identify increased R 2*, i.e., ≥44 Hz [9]. R 2* was chosen as a reference value as it had the best success rate and shortest acquisition time. The optimal cut-off value for R 2 was found by optimizing the Youden index, while for LICSIR we used the established cut-off value of >36 µmol/g. P values of <0.05 were accepted as statistically significant. Statistical analyses were performed using SPSS Version 22 (IBM Corp, Armonk, NY), MedCalc Statistical Software version 16.2.0 (MedCalc Software bvba, Ostend, Belgium; https://www.medcalc.org; 2016), and GraphPad Prism 5.0 (GraphPad Software, La Jolla, CA).
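Selecting a cut-off by optimizing the Youden index amounts to scanning candidate thresholds and keeping the one that maximizes sensitivity + specificity − 1. The following is a minimal sketch of that idea, not the MedCalc implementation; the function name and example values are hypothetical.

```python
def youden_optimal_cutoff(values, is_positive):
    """Return (cutoff, sensitivity, specificity) maximising J = sens + spec - 1.

    A case is called test-positive when its value is >= cutoff.
    """
    n_pos = sum(1 for p in is_positive if p)
    n_neg = len(is_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative reference case")
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, p in zip(values, is_positive) if p and v >= cutoff)
        tn = sum(1 for v, p in zip(values, is_positive) if not p and v < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical R2 values (Hz) and reference labels (True = R2* >= 44 Hz); not study data.
r2_hz = [12.0, 14.5, 16.0, 17.9, 18.5, 22.0, 30.0, 41.0]
r2star_elevated = [False, False, False, False, True, True, True, True]
cutoff, sens, spec = youden_optimal_cutoff(r2_hz, r2star_elevated)
```

In this perfectly separable toy example the lowest positive value is returned as the cut-off, with sensitivity and specificity both equal to 1.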
Results
Patients
Between January 1, 2008 and December 31, 2013, a total of 114 patients (M/F: 74/40) underwent 144 MRI-scans for routine LIC measurement. Patient characteristics and clinical indications for LIC measurement are described in Table 2. Thirty patients had multiple measurements. To prevent a repeated measurements effect on correlation assessment between LICSIR, R 2, and R 2*, only the 114 baseline measurements were used. SIR, R 2, and R 2* data were available for 108/114 (95%), 72/114 (63%), and 113/114 (99%) baseline measurements.
Table 2.
Characteristic | Number (%) or median (IQR) |
---|---|
Patients | 114 |
Male/female | 74/40 (65/35%) |
Age (years) | 44 (28.5–58.1) |
Indications | |
Sickle cell anemia | 21/114 (19%) |
MDS^a/leukemia | 19/114 (17%) |
Thalassemia | 17/114 (15%) |
Gaucher’s disease | 16/114 (14%) |
Hemochromatosis | 14/114 (12%) |
Hemosiderosis (not specified) | 6/114 (5%) |
Other | 21/114 (18%) |
^a MDS: myelodysplastic syndrome
MRI success rates
Five SIR measurements were classified as unsuccessful because a surface coil was used, and one because of erroneous TR/TE combinations. Furthermore, image quality was inadequate (respiration artifacts) in a single patient (only R 2 and R 2* acquired). Hence, SIR was successful in 102/114 (89%), R 2 in 71/114 (62%), and R 2* in 112/114 (98%) subjects. The success rate of R 2 was lower than that of SIR and R 2* (P < 0.0001 for each). Missing datasets were presumed not to have been scanned, with time constraints and respiratory triggering problems as the major causes of the low success rate of the R 2 measurement. For subsequent analyses, only successful baseline measurements were used.
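The success rates follow directly from the number of available baseline datasets and the exclusions listed above. A small bookkeeping sketch, with variable names chosen here for illustration:

```python
TOTAL_BASELINE = 114

# Baseline datasets available per method.
available = {"SIR": 108, "R2": 72, "R2*": 113}

# Exclusions: SIR lost five scans to a surface coil and one to wrong TR/TE;
# R2 and R2* each lost the one patient with inadequate image quality.
excluded = {"SIR": 5 + 1, "R2": 1, "R2*": 1}

successful = {m: available[m] - excluded[m] for m in available}
success_rate_pct = {m: round(100.0 * n / TOTAL_BASELINE) for m, n in successful.items()}
```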
Interobserver agreement
LICSIR and R 2 values differed between observer 1 and the other observers (Table 3). However, these differences (median values: 80–85 µmol/g for LICSIR and 33–34 Hz for R 2) would be negligible in clinical practice. This was confirmed by high ICCs for SIR, R 2, and R 2* of 0.998, 0.997, and 0.999, respectively. Bland–Altman analysis between pairs of observers showed a single outlier for SIR, while R 2 and R 2* showed differences of up to 5% for higher values, reflecting the uncertainties in the data fit at very high LIC (Online Resource 1).
Table 3.
MRI method | Observer 1 | Observer 2 | Observer 3 | P value |
---|---|---|---|---|
LICSIR (µmol/g) | 84 (30–205) | 80 (25–197) | 85 (26–196) | <0.001^a |
R 2 (Hz) | 33 (23–48) | 34 (24–49) | 34 (24–49) | <0.001^a |
R 2* (Hz) | 123 (56–321) | 126 (55–326) | 123 (55–317) | 0.092 |
^a Post hoc analysis using Wilcoxon Signed-Rank tests showed that LICSIR and R 2 values of observer 1 differed significantly from those of both observer 2 and observer 3 (who did not differ from each other)
LICSIR, R2, and R2*
Median (IQR) LICSIR, R 2, and R 2* (given for observer 1 and LICSIR <350 µmol/g) were 84 (30–205) µmol/g, 33 (23–48) Hz, and 123 (56–321) Hz, respectively. LICSIR correlated positively with R 2 and R 2* with r S of 0.90 (95% confidence interval (CI) 0.84–0.94, P < 0.0001, n = 57) and 0.98 (95% CI 0.97–0.99, P < 0.0001, n = 87), respectively. R 2 correlated positively with R 2*: r S of 0.95 (95% CI 0.93–0.97, P < 0.0001, n = 71). Figure 2 A, B shows scatter plots of (SIR-based or biopsy-proven) LIC against R 2 and R 2*. Solid lines indicate regression analysis results (95% CI bands as dashed lines). In our patient cohort, R 2 increased linearly with LICSIR (Eq. 5), while R 2* appeared to have a clear non-linear relationship with LICSIR, well described by a quadratic polynomial (Eq. 6).
5 |
6 |
The LICSIR upper threshold of 350 µmol/g was reached in 15/102 (15%) measurements. In these measurements, only the T1W SIR correlated with R 2*, with r S of −0.72 (95% CI −0.9 to −0.31, P = 0.003, n = 15). Figure 3 shows the T1W SIR against R 2*, indicating that for LICSIR >350 µmol/g, the discriminatory value of the T1W SIR becomes progressively smaller.
Comparison with the literature
Figure 2 A, B also shows published regression lines between either LICSIR or LICBIOPSY and R 2 (Fig. 2 A) and R 2* (Fig. 2 B). Contrary to our finding, these lines indicate a linear increase of R 2* as LIC increases, and a non-linear increase of R 2 as LIC increases. To assess whether this is caused by LICSIR or by R 2 or R 2*, we applied established conversion formulae to convert our R 2 (Eq. 7) and R 2* (Eq. 8) values to LIC values [8, 20]. We then compared these LICR2* and LICR2 values to our LICSIR values.
7 |
8 |
These established conversion formulae show a non-linear relation between R 2 and true LIC (Eq. 7) and linear relation between R 2* and true LIC (Eq. 8). Hence, the scatter plot between LICR2* and LICSIR also revealed a quadratic relation, and that between LICSIR and LICR2 a linear one (data not shown).
Diagnostic accuracies of LICSIR, R2, and serum values
Serum total iron, transferrin, transferrin-saturation, and ferritin were available for 56, 56, 54, and 96 out of 114 measurements, respectively. All four correlated significantly with R 2*, with the best correlation for ferritin at r S = 0.80 (P < 0.0001, n = 94).
Increased R 2* (≥44 Hz) was present in 91 subjects. Of the MRI and serum methods, R 2 and ferritin had the best diagnostic accuracies to detect increased R 2* (Table 4). Figure 4 A–C shows true and false positive and negative results of R 2 (Fig. 4 A), LICSIR (Fig. 4 B), and ferritin (Fig. 4 C) for establishing increased R 2*.
Table 4.
Measure | R 2 | LICSIR | Iron | Transferrin | Transferrin-saturation | Ferritin |
---|---|---|---|---|---|---|
Cases | 64/64 | 75/80 | 18/41 | 36/41 | 20/40 | 72/80 |
Cut-off | ≥18.3 Hz | ≥36 µmol/g | ≥22.6 | ≤2.21 | ≥0.40 | ≥524 µg/L |
AUROC | 1.00 (0.95–1.0) | 0.97 (0.91–0.99) | 0.66 (0.53–0.79) | 0.84 (0.72–0.93) | 0.77 (0.64–0.87) | 0.98 (0.93–1.0) |
Sensitivity | 100% (94.4–100%) | 93.8% (86.0–97.9%) | 43.9% (28.5–60.3%) | 87.8% (73.8–95.9%) | 50.0% (33.8–66.2%) | 90.0% (81.2–95.6%) |
Specificity | 100% (59.0–100%) | 100% (83.9–100%) | 100% (76.8–100%) | 71.4% (41.9–91.6%) | 92.3% (64.0–99.8%) | 100% (76.8–100%) |
PPV | 100% (93.7–100%) | 100% (95.2–100%) | 100% (82.6–100%) | 92.4% (79.8–98.3%) | 96.3% (78.3–100%) | 100% (94.7–100%) |
NPV | 100% (77.3–100%) | 80.2% (59.7–93.2%) | 31.1% (16.7–48.7%) | 59.7% (30.3–84.7%) | 31.8% (16.4–50.9%) | 71.7% (51.0–87.3%) |
AUROC area under the ROC curve, PPV positive predictive value, NPV negative predictive value
Values in parentheses reflect the 95% confidence intervals
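The sensitivities in Table 4 can be reproduced from the “Cases” row if that row is read as test-positive cases over reference-positive cases (R 2* ≥ 44 Hz). That reading is an interpretation rather than something the table states explicitly; the sketch below simply checks the arithmetic under it.

```python
# (test-positive cases, reference-positive cases) per method, taken from the "Cases" row.
cases = {
    "R2": (64, 64),
    "LICSIR": (75, 80),
    "Iron": (18, 41),
    "Transferrin": (36, 41),
    "Transferrin-saturation": (20, 40),
    "Ferritin": (72, 80),
}

# Sensitivity in percent, rounded to one decimal as in the table.
sensitivity_pct = {name: round(100.0 * tp / n, 1) for name, (tp, n) in cases.items()}
```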
Discussion
This study shows that for routine clinical MRI-based LIC measurements SIR and R 2* are more often successful than R 2. Interobserver agreement was near perfect (ICC > 0.9) for all methods. R 2 and R 2* methods provided relaxation rates when the SIR-threshold (>350 µmol/g) was already exceeded. This gives them an advantage over SIR in subjects with transfusional hemosiderosis (at least 55% of our population), when LIC values can easily surpass 350 µmol/g. The combination of high success rate, high interobserver agreement, ability to detect changes in LIC over a wide range of LIC values, and single breath-hold acquisition favors the R 2* method for LIC measurement.
In our study, the relationship between R 2* and LICSIR was quadratic and remained quadratic when R 2* was expressed as a LIC value using a previously published (biopsy-proven) conversion formula. Other authors report linear relationships. Given the physics of the R 2*–iron relationship, which is basically linear [25], this discrepancy arises either from our R 2* acquisition and analysis or from the reference standard. To rule out the former, we compared three fit routines. The exponential + Rician noise factor fit provided nearly identical results (mean paired difference of 0.8 Hz) in a fraction of the time required by the established, widely applied, but labor-intensive method of manual truncation before exponential fitting.
With respect to reference standard, St. Pierre et al. [8], Wood et al. [9], Hankins et al. [19], Garbowski et al. [20], and Anderson et al. [21] all used biopsy-determined LICBIOPSY as reference standard, whereas we and Christoforidis et al. [14] used the LICSIR according to Gandon. Given the similarity of our MRI protocols, it is unsurprising that Christoforidis’ and our data points show considerable overlap. Arguably, their linear relation between LICSIR and R 2* could also be described by a quadratic polynomial.
Apart from the linear relationship, the other authors report a much steeper increase of R 2* as LIC increases [9, 19–21]. Anderson et al.’s very steep increase could be due to a long TE1 of 2.2 ms compared to all other studies (range of TE1: 0.8–0.99 ms), which hampers the ability to accurately estimate high R 2* values. The fact that the control values of R 2* in subjects without iron overload, both in those studies and in this paper, hover around 40 Hz is a further argument that the observed difference in the LIC–R 2* relation does not arise from the R 2* acquisition or analysis but from the reference standard.
Hence, the most likely cause of the deviating quadratic relation between R 2* and estimated LIC is the piecewise sampling of the LIC range with five differently weighted GRE-sequences for LICSIR. This has artificially imposed a quadratic behavior on the actually linear relationship between R 2* and true LICBIOPSY. If one looks at the fundamental GRE signal equation (Eq. 9), where PD is proton density and α is flip angle and applies this to the liver-to-muscle signal intensity ratio, the PD and sin(α) terms drop out. By taking the natural logarithm, we find Eqs. 10 and 11. The latter proves that the relationship between R 2* and SIR is logarithmic. Indeed, plotting Fig. 3 with a log-scale for the signal intensity ratio on the y-axis linearized the line (data not shown).
9 |
10 |
11 |
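The logarithmic dependence described above can be sketched as follows. This derivation assumes the standard spoiled-GRE steady-state expression and equal proton density in liver and muscle; the shorthand $f$ and the subscripts $L$ (liver) and $M$ (muscle) are introduced here for illustration and are not necessarily the notation of Eqs. 9–11.

```latex
% Spoiled GRE signal, with f(T_1) collecting the T1-dependent steady-state factor
S = PD \cdot \sin\alpha \cdot f(T_1) \cdot e^{-TE \cdot R_2^{*}},
\qquad
f(T_1) = \frac{1 - e^{-TR/T_1}}{1 - \cos\alpha \, e^{-TR/T_1}}

% Liver-to-muscle ratio: PD and sin(alpha) cancel
\mathrm{SIR} = \frac{S_L}{S_M}
  = \frac{f(T_{1,L})}{f(T_{1,M})} \, e^{-TE \left( R_{2,L}^{*} - R_{2,M}^{*} \right)}

% Taking the natural logarithm
\ln(\mathrm{SIR}) = \ln\frac{f(T_{1,L})}{f(T_{1,M})} - TE \left( R_{2,L}^{*} - R_{2,M}^{*} \right)

% Solving for the liver relaxation rate
R_{2,L}^{*} = R_{2,M}^{*} + \frac{1}{TE} \left[ \ln\frac{f(T_{1,L})}{f(T_{1,M})} - \ln(\mathrm{SIR}) \right]
```

Under these assumptions, liver R 2* varies linearly with ln(SIR) at a fixed TE, which is consistent with the SIR losing discriminatory value as R 2* becomes large.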
For R 2, single- and multiecho SE acquisitions are possible. Multiecho SE decreases R 2 due to residual signal of stimulated echoes at a given TE. Single-echo SE increases R 2 because long TEs cause increased sensitivity to diffusion, hence increased signal loss at a given TE. Accordingly, reported single-echo SE R 2 values [8, 9] were higher for the same estimated LIC than multiecho SE results as in this study and in [14]. In terms of R 2 data fitting, we, like many others, applied a biexponential model and did not assess non-exponential decay models such as that proposed by Jensen et al. [26].
The main limitation of our study is the lack of biopsy confirmation. In our center, liver biopsy for iron determination is seldom performed. National, European, and American guidelines all recommend restraint in performing biopsy and underline the high sensitivity of MRI [15, 27, 28]. Moreover, differing processing steps to obtain LICBIOPSY are reported, compromising generalizability. In Gandon’s method, paraffin-embedded liver biopsy specimens are dewaxed using a protocol with a triple xylene wash to remove lipid solids from the sample. This approach was shown to have an elevating effect on the dry weight liver iron calculation compared to processing fresh tissue samples [29]. Another limitation is that we did not perform multipeak fat-correction on complex data [10]. This was not feasible with only magnitude data available. Comparison to other literature is further hampered by the use of different image acquisition and postprocessing protocols, which directly influence the calibration curves between the reference standard and the index test. We have opted to compare our findings to calibration curves obtained with similar postprocessing protocols.
ROC-analyses showed that R 2 and ferritin have the highest diagnostic accuracy to identify increased R 2* (≥44 Hz). Both ferritin (≥524 µg/L) and R 2 (≥18.3 Hz) had positive predictive values of 100%, but the wide distribution of ferritin levels for R 2* ≥ 44 Hz indicates that ferritin cannot be used confidently to follow up treatment or to accurately determine the LIC. In contrast, R 2 values are closely distributed around the regression line. In addition, ferritin lacks the spatial information that MRI provides, which allows segmental LIC measurement and follow-up.
R 2 datasets were missing (i.e., not scanned) in 42/114 (37%) subjects. As R 2 is part of our routine scan protocol, this illustrates that the long and artifact-prone R 2 series is skipped first by the radiographer. This makes the R 2 series less suited as first choice for LIC measurement.
Our results favor the use of R 2* measurements for daily clinical practice with the use of an exponential + Rician noise fit method to save time in analysis. The recommendation to (only) use R 2* comes with cautions. It requires careful consideration of scan parameters which should be kept equal for all measurements. Ideally, routine quality control with phantom testing should be performed.
In conclusion, as R 2* can be obtained in a single breath-hold with excellent success rates, high interobserver agreement, and ability to detect changes over a wide range of LIC values and is available from all major vendors without additional per-scan costs, it is our first choice for LIC measurement.
Acknowledgments
The authors would like to acknowledge Paul F. Groot for anonymizing the data and Shandra Bipat for providing advice on statistical analyses.
Abbreviation
- LIC: Liver iron content
Compliance with ethical standards
Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical standard/informed consent
This was a retrospective study using data obtained in routine clinical practice that were anonymized before analysis. In light of the retrospective nature of the study, the obligation to obtain informed consent was waived by the Medical Ethical Committee of the AMC Amsterdam.
References
- 1.Tavill AS, AASLD. ACG Diagnosis and management of 2 hemochromatosis. Hepatology. 2001;33:1321–1328. doi: 10.1053/jhep.2001.24783. [DOI] [PubMed] [Google Scholar]
- 2.Pietrangelo A. Haemochromatosis. Gut. 2003;52(Suppl 2):ii23–ii30. doi: 10.1136/gut.52.suppl_2.ii23. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Queiroz-Andrade M, Blasbalg R, Ortega CD, Rodstein MA, Baroni RH, Rocha MS, Cerri GG. MR imaging findings of iron overload. Radiographics. 2009;29:1575–1589. doi: 10.1148/rg.296095511. [DOI] [PubMed] [Google Scholar]
- 4.Bravo AA, Sheth SG, Chopra S. Liver biopsy. N Engl J Med. 2001;344:495–500. doi: 10.1056/NEJM200102153440706. [DOI] [PubMed] [Google Scholar]
- 5.Sirlin CB, Reeder SB. Magnetic resonance imaging quantification of liver iron. Magn Reson Imaging Clin N Am. 2010;18:359–381. doi: 10.1016/j.mric.2010.08.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Gandon Y, Olivie D, Guyader D, Aube C, Oberti F, Sebille V, Deugnier Y. Non-invasive assessment of hepatic iron stores by MRI. Lancet. 2004;363:357–362. doi: 10.1016/S0140-6736(04)15436-6. [DOI] [PubMed] [Google Scholar]
- 7.Gandon Y. Rennes—hemochromatosis. Y Gandon, Rennes. 10-06-2001. http://www.radio.univ-rennes1.fr/Sources/EN/Hemo.html. Accessed October 16, 2015.
- 8.St Pierre TG, Clark PR, Chua-anusorn W, Fleming AJ, Jeffrey GP, Olynyk JK, Pootrakul P, Robins E, Lindeman R. Noninvasive measurement and imaging of liver iron concentrations using proton magnetic resonance. Blood. 2005;105:855–861. doi: 10.1182/blood-2004-01-0177. [DOI] [PubMed] [Google Scholar]
- 9.Wood JC, Enriquez C, Ghugre N, Tyzka JM, Carson S, Nelson MD, Coates TD. MRI R2 and R2* mapping accurately estimates hepatic iron concentration in transfusion-dependent thalassemia and sickle cell disease patients. Blood. 2005;106:1460–1465. doi: 10.1182/blood-2004-10-3982. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Hernando D, Kramer JH, Reeder SB. Multipeak fat-corrected complex R2* relaxometry: theory, optimization, and clinical validation. Magn Reson Med. 2013;70:1319–1331. doi: 10.1002/mrm.24593. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Krafft AJ, Loeffler RB, Song R, Bian X, McCarville MB, Hankins JS, Hillenbrand CM. Does fat suppression via chemically selective saturation affect R2*-MRI for transfusional iron overload assessment? A clinical evaluation at 1.5T and 3T. Magn Reson Med. 2015. doi: 10.1002/mrm.25868.
- 12.Yokoo T, Yuan Q, Senegas J, Wiethoff AJ, Pedrosa I. Quantitative R2* MRI of the liver with Rician noise models for evaluation of hepatic iron overload: simulation, phantom, and early clinical experience. J Magn Reson Imaging. 2015;42:1544–1559. doi: 10.1002/jmri.24948.
- 13.Ibrahim EH, Khalifa AM, Eldaly AK. MRI T2* imaging for assessment of liver iron overload: study of different data analysis approaches. Acta Radiol. 2016. doi: 10.1177/0284185116628337.
- 14.Christoforidis A, Perifanis V, Spanos G, Vlachaki E, Economou M, Tsatra I, Athanassiou-Metaxa M. MRI assessment of liver iron content in thalassamic patients with three different protocols: comparisons and correlations. Eur J Haematol. 2009;82:388–392. doi: 10.1111/j.1600-0609.2009.01223.x.
- 15.Swinkels DW, van Bokhoven MA, Castel A, van Deursen CTBM, Giltay JC, van Krieken JHJM, Macfarlane JD, de Man RA, Marx JJM, Pijl MEJ, Raymakers RAP, de Sterke P, de Vries RA. Richtlijn Diagnostiek en behandeling van hereditaire hemochromatose. Utrecht: Nederlandse Internisten Vereeniging en Nederlandse Vereniging voor Klinische Chemie; 2007.
- 16.Meloni A, Luciani A, Positano V, De Marchi D, Valeri G, Restaino G, Cracolici E, Caruso V, Dell’amico MC, Favilli B, Lombardi M, Pepe A. Single region of interest versus multislice T2* MRI approach for the quantification of hepatic iron overload. J Magn Reson Imaging. 2011;33(2):348–355. doi: 10.1002/jmri.22417.
- 17.Gandon Y. Gandon calculations. Y Gandon, Rennes. 10-06-2001. http://www.radio.univ-rennes1.fr/Images/Externe15.js. Accessed October 16, 2015.
- 18.Gudbjartsson H, Patz S. The Rician distribution of noisy MRI data. Magn Reson Med. 1995;34:910–914. doi: 10.1002/mrm.1910340618.
- 19.Hankins JS, McCarville MB, Loeffler RB, Smeltzer MP, Onciu M, Hoffer FA, Li CS, Wang WC, Ware RE, Hillenbrand CM. R2* magnetic resonance imaging of the liver in patients with iron overload. Blood. 2009;113:4853–4855. doi: 10.1182/blood-2008-12-191643.
- 20.Garbowski MW, Carpenter JP, Smith G, Roughton M, Alam MH, He T, Pennell DJ, Porter JB. Biopsy-based calibration of T2* magnetic resonance for estimation of liver iron concentration and comparison with R2 Ferriscan. J Cardiovasc Magn Reson. 2014. doi: 10.1186/1532-429X-16-40.
- 21.Anderson LJ, Holden S, Davis B, Prescott E, Charrier CC, Bunce NH, Firmin DN, Wonke B, Porter J, Walker JM, Pennell DJ. Cardiovascular T2-star (T2*) magnetic resonance for the early diagnosis of myocardial iron overload. Eur Heart J. 2001;22:2171–2179. doi: 10.1053/euhj.2001.2822.
- 22.Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1:307–310. doi: 10.1016/S0140-6736(86)90837-8.
- 23.Akkerman EM, Runge JH, Troelstra MA, Nederveen AJ, Stoker J (2015) Non-linear relationship between estimated liver iron concentration and R2*. ISMRM 3268
- 24.Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159–174. doi: 10.2307/2529310.
- 25.Ghugre NR, Wood JC. Relaxivity-iron calibration in hepatic iron overload: probing underlying biophysical mechanisms using a Monte Carlo model. Magn Reson Med. 2011;65:837–847. doi: 10.1002/mrm.22657.
- 26.Jensen JH, Chandra R. Theory of nonexponential NMR signal decay in liver with iron overload or superparamagnetic iron oxide particles. Magn Reson Med. 2002;47:1131–1138. doi: 10.1002/mrm.10170.
- 27.European Association for the Study of the Liver. EASL clinical practice guidelines for HFE hemochromatosis. J Hepatol. 2010;53:3–22. doi: 10.1016/j.jhep.2010.03.001.
- 28.Bacon BR, Adams PC, Kowdley KV, Powell LW, Tavill AS. Diagnosis and management of hemochromatosis: 2011 practice guideline by the American Association for the Study of Liver Diseases. Hepatology. 2011;54:328–343. doi: 10.1002/hep.24330.
- 29.Butensky E, Fischer R, Hudes M, Schumacher L, Williams R, Moyer TP, Vichinsky E, Harmatz P. Variability in hepatic iron concentration in percutaneous needle biopsy specimens from patients with transfusional hemosiderosis. Am J Clin Pathol. 2005;123:146–152. doi: 10.1309/PUUXEGXDLH26NXA2.
Associated Data