Author manuscript; available in PMC: 2020 Nov 1.
Published in final edited form as: J Magn Reson Imaging. 2019 Feb 13;50(5):1620–1632. doi: 10.1002/jmri.26682

Autoregressive Moving Average Modeling for Hepatic Iron Quantification in the Presence of Fat

Aaryani Tipirneni-Sajja 1,2, Axel J Krafft 1,3, Ralf B Loeffler 1, Ruitian Song 1, Armita Bahrami 4, Jane S Hankins 5, Claudia M Hillenbrand 1
PMCID: PMC6785364  NIHMSID: NIHMS1034460  PMID: 30761652

Abstract

BACKGROUND:

Measuring hepatic R2* by fitting a mono-exponential model to the signal decay of a multi gradient-echo (mGRE) sequence non-invasively determines hepatic iron content (HIC). Concurrent hepatic steatosis introduces signal oscillations and confounds R2* quantification with standard mono-exponential models.

PURPOSE:

To evaluate an autoregressive moving average (ARMA) model for accurate quantification of HIC in the presence of fat using biopsy as reference.

STUDY TYPE:

Phantom study and in vivo cohort.

POPULATION:

Twenty iron–fat phantoms covering clinically relevant R2* (30 – 800 s−1) and fat fraction (FF) ranges (0 – 40%), and 10 patients (4 male, 6 female, mean age 18.8 years).

FIELD STRENGTH/SEQUENCE:

2D mGRE acquisitions at 1.5T and 3T.

ASSESSMENT:

Phantoms were scanned at both field strengths. In vivo data were analyzed using the ARMA model to determine R2* and FF values, which were compared to biopsy results.

STATISTICAL TESTS:

Linear regression analysis was used to compare ARMA R2* and FF results to those obtained using a conventional mono-exponential model, complex-domain non-linear least squares (NLSQ) fat–water model, and biopsy.

RESULTS:

In phantoms and in vivo, all models produced R2* and FF values consistent with expected values in low iron and low/high fat conditions. For high iron and no fat phantoms, mono-exponential and ARMA models performed excellently (slopes: 0.89–1.07), but NLSQ overestimated R2* (slopes: 1.14 – 1.36) and produced false FFs (12 – 17%) at 1.5T; in high iron and fat phantoms, NLSQ (slopes: 1.02 – 1.16) outperformed mono-exponential and ARMA models (slopes: 1.23 – 1.88). Results with NLSQ and ARMA improved in phantoms at 3T (slopes: 0.96 – 1.04). In patients, mean R2*-HIC estimates for mono-exponential and ARMA models were close to biopsy-HIC values (slopes: 0.90–0.95), whereas NLSQ substantially overestimated HIC (slope 1.4) and produced FF values (4 – 28%) with very high SDs (15 – 222%) in patients with high iron overload.

DATA CONCLUSION:

ARMA is superior in quantifying R2* and FF under high iron and no fat conditions, whereas NLSQ is superior for high iron and concurrent fat at 1.5T. Both models give improved R2* and FF results at 3T.

Keywords: ARMA modeling, R2* quantification, hepatic iron overload, hemosiderosis, fat fraction, steatosis

INTRODUCTION

Hepatic iron overload is a serious complication of hereditary hemochromatosis or chronic blood transfusions.1 Measuring hepatic iron content (HIC) is required to guide iron removal treatment. Over recent years, magnetic resonance imaging (MRI) based on the effective transverse relaxation rate (R2*) has become widely accepted as a reliable tool to estimate HIC.2 Previous studies showed a linear correlation between R2* measurements by MRI and HIC measurements by biopsy, and thus demonstrated that HIC can be non-invasively estimated from R2* using established R2*-HIC calibration curves.3–6

Currently published R2*-HIC calibration studies have measured hepatic R2* by fitting a mono-exponential model to the signal decay obtained from a multi gradient echo (mGRE) sequence.4,5 However, in patients with hepatic steatosis (i.e., accumulation of fat in the liver), the mGRE signal does not follow a pure mono-exponential decay.
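As a minimal illustration of the mono-exponential model S(TE) = S0·exp(−R2*·TE), the sketch below estimates R2* from magnitude data by a log-linear least-squares fit. It is not the fitting procedure used in the cited calibration studies, which may handle noise differently; the function name `fit_monoexp_r2star` and the example values (S0 = 1000, R2* = 150 s−1) are assumptions chosen for illustration.

```python
import numpy as np

def fit_monoexp_r2star(te_s, magnitude):
    """Log-linear least-squares fit of S(TE) = S0 * exp(-R2* * TE).

    te_s: echo times in seconds; magnitude: positive signal magnitudes.
    Returns (R2* in 1/s, S0).
    """
    te_s = np.asarray(te_s, dtype=float)
    log_signal = np.log(np.asarray(magnitude, dtype=float))
    # ln S = ln S0 - R2* * TE, so the slope of the line is -R2*.
    slope, intercept = np.polyfit(te_s, log_signal, 1)
    return -slope, float(np.exp(intercept))

# Noise-free example: 10 echoes, TE1 = 1.2 ms, echo spacing 1.44 ms.
te = 1.2e-3 + 1.44e-3 * np.arange(10)
signal = 1000.0 * np.exp(-150.0 * te)
r2star, s0 = fit_monoexp_r2star(te, signal)
```

A log-linear fit weights late, noisy echoes heavily, so it is shown here only to make the model concrete.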

Hepatic steatosis has become a common condition, affecting about 20%−30% of the U.S. population, and is frequently linked to growing obesity and insulin resistance.7,8 Further, fatty infiltration of the liver is of increasing concern in oncology because of the side effects of certain chemotherapeutic agents.9–11 The co-occurrence of hepatic iron and fat has also recently been observed in blood transfusion dependent patients, especially cancer survivors,9,12,13 and patients with nonalcoholic fatty liver disease (NAFLD).14,15 The presence of fat introduces oscillations in the mGRE signal that a pure mono-exponential model does not account for, so that R2* measurements are confounded. The oscillations arise because fat and water exhibit slightly different resonance frequencies, with a frequency difference of 3.4 ppm between the water and main lipid peak.16

A simple solution to estimate R2* in the presence of fat can be achieved by collecting signals only at echo times (TE) when water and fat are in-phase (i.e., 4.6 ms, 9.2 ms, … at 1.5T). However, such an acquisition scheme requires a long echo spacing, which yields inaccurate R2* values due to insufficient temporal sampling of the rapid signal decay in the presence of iron.17 Another solution is to suppress the fat signal via inversion-recovery or chemically selective fat saturation. However, recent studies show that these strategies can lead to R2* underestimation in cases of iron overload, even without fat.18,19 Another limitation of both concepts is that they do not consider the spectral complexity of fat, which comprises multiple lipid peaks.20
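The in-phase echo times quoted above follow directly from the 3.4 ppm fat–water shift. The sketch below reproduces that arithmetic, assuming a proton gyromagnetic ratio of 42.577 MHz/T and a single main lipid peak; the function names are illustrative only.

```python
GAMMA_BAR_MHZ_PER_T = 42.577   # proton gyromagnetic ratio divided by 2*pi
FAT_WATER_SHIFT_PPM = 3.4      # water to main lipid peak, as stated in the text

def fat_water_shift_hz(b0_tesla):
    """Fat-water frequency difference in Hz (MHz times ppm gives Hz)."""
    return GAMMA_BAR_MHZ_PER_T * b0_tesla * FAT_WATER_SHIFT_PPM

def in_phase_times_ms(b0_tesla, count=2):
    """First `count` echo times (ms) at which water and fat are in phase."""
    period_ms = 1000.0 / fat_water_shift_hz(b0_tesla)
    return [k * period_ms for k in range(1, count + 1)]

def first_opposed_phase_ms(b0_tesla):
    """First echo time (ms) at which water and fat are in opposed phase."""
    return 0.5 * 1000.0 / fat_water_shift_hz(b0_tesla)

in_phase_15t = in_phase_times_ms(1.5)   # roughly 4.6 ms and 9.2 ms
in_phase_3t = in_phase_times_ms(3.0)    # first in-phase echo roughly 2.3 ms
```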

To overcome these limitations, multi-spectral signal modeling techniques have been proposed for simultaneous quantification of liver R2* and fat fraction, thus enabling the diagnosis of both hepatic iron deposit and steatosis. One of these techniques is based on non-linear least squares (NLSQ) fitting of the fat-water signal model either using magnitude data or complex data.21,22 This model requires a priori information about relative frequencies and amplitudes of the multiple lipid peaks. Further, although water and fat might have different T2* decays, this model fits only a single R2* value for both water and fat peaks in order to reduce model complexity.22 Although this model has been well validated for quantifying the fat fraction (FF), limited information is available on how accurately it quantifies R2* in the presence of iron and fat.23
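To make the fat–water signal model concrete, the sketch below simulates a complex mGRE signal with one water component, one fat component, a shared R2*, and an optional B0 offset. It is a deliberate simplification: the NLSQ model described above uses several lipid peaks with fixed relative amplitudes, whereas this example assumes a single lipid peak at about −217 Hz (3.4 ppm at 1.5T). All names and numerical values are illustrative assumptions.

```python
import numpy as np

def fat_water_signal(te_s, water, fat, r2star, fat_shift_hz, field_offset_hz=0.0):
    """Complex signal of a single-peak fat-water model with a common R2*.

    S(TE) = (W + F * exp(i 2 pi f_fat TE)) * exp(i 2 pi psi TE) * exp(-R2* TE)
    """
    te_s = np.asarray(te_s, dtype=float)
    fat_phase = np.exp(2j * np.pi * fat_shift_hz * te_s)
    field_phase = np.exp(2j * np.pi * field_offset_hz * te_s)
    return (water + fat * fat_phase) * field_phase * np.exp(-r2star * te_s)

# Example: 20% fat fraction, R2* = 100 1/s, echoes as in a 1.5T protocol.
te = 1.2e-3 + 1.44e-3 * np.arange(10)
magnitude = np.abs(fat_water_signal(te, water=0.8, fat=0.2,
                                    r2star=100.0, fat_shift_hz=-217.0))
```

The magnitude oscillates around the pure exponential envelope, which is the behavior that confounds a mono-exponential fit.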

Another multi-spectral signal modeling technique, based on autoregressive moving average (ARMA) modeling,24 has been presented to quantify hepatic R2* in the presence of iron and for fat–water quantification in the absence of iron.25,26 ARMA translates the temporal mGRE signal evolution into a rational polynomial in the z-domain, and can determine amplitudes, relative frequencies, and R2* rates specifically for water and multiple lipid species.24 So far, the performance of ARMA has not been systematically tested in the presence of both iron and fat.
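The idea of reading frequencies and decay rates from poles in the z-domain can be illustrated with a Prony-type linear-prediction fit, a simpler relative of ARMA estimation. The sketch below is not the authors' ARMA implementation; it assumes equally spaced echoes, and the function name `prony_peaks` and all example values are hypothetical.

```python
import numpy as np

def prony_peaks(signal, te1_s, delta_te_s, n_peaks):
    """Estimate amplitude, frequency (Hz), and R2* (1/s) of damped exponentials.

    Each component contributes a pole z = exp((-R2* + i 2 pi f) * delta_TE).
    """
    s = np.asarray(signal, dtype=complex)
    n = len(s)
    # Linear prediction: s[k] = sum_j c[j] * s[k - 1 - j].
    A = np.array([[s[k - 1 - j] for j in range(n_peaks)]
                  for k in range(n_peaks, n)])
    coeffs, *_ = np.linalg.lstsq(A, s[n_peaks:], rcond=None)
    # Poles are the roots of z^p - c[0] z^(p-1) - ... - c[p-1].
    poles = np.roots(np.concatenate(([1.0], -coeffs)))
    r2star = -np.log(np.abs(poles)) / delta_te_s
    freq_hz = np.angle(poles) / (2 * np.pi * delta_te_s)
    # Amplitudes at the first echo from a Vandermonde least-squares fit.
    V = poles[np.newaxis, :] ** np.arange(n)[:, np.newaxis]
    amp_te1, *_ = np.linalg.lstsq(V, s, rcond=None)
    # Extrapolate the amplitudes back to TE = 0.
    amp0 = amp_te1 * np.exp((r2star - 2j * np.pi * freq_hz) * te1_s)
    return amp0, freq_hz, r2star

# Noise-free two-peak example: water at 0 Hz, fat at -217 Hz, FF = 20%.
te1, dte = 1.2e-3, 1.44e-3
te = te1 + dte * np.arange(10)
sig = (0.8 + 0.2 * np.exp(2j * np.pi * -217.0 * te)) * np.exp(-100.0 * te)
amp0, freq_hz, r2star = prony_peaks(sig, te1, dte, n_peaks=2)
water_idx = int(np.argmin(np.abs(freq_hz)))   # peak closest to 0 Hz is water
fat_fraction = 1.0 - np.abs(amp0[water_idx]) / np.abs(amp0).sum()
```

With noisy data or rapid decay, the pole estimates degrade quickly, which is one reason the model order has to be chosen with care.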

Therefore, the purpose of this study was to develop and validate a robust algorithm based on ARMA modeling, and to compare its performance with two common current signal modeling techniques (pure mono-exponential and NLSQ fat-water models) for accurate R2* quantification in iron-fat phantoms and iron overload patients, using biopsy as a reference.

METHODS

Phantom Experiments

Twenty cylindrical 140-mL iron-fat phantoms were made from 2% agar-water mixtures, peanut oil, and bionized nanoferrite (BNF) iron particles.27 Phantoms with varying combinations of iron concentrations (0, 7.5, 15, 30, 60 μg/mL) and fat percentages (0%, 10%, 20%, 40%) covering clinically relevant R2* and FF values were created. All phantom bottles were stacked into a 4 × 5 rectangular array and scanned at 1.5T (Avanto, Siemens Healthineers, Malvern, PA) and 3T (Skyra, Siemens Healthineers) with an mGRE sequence acquiring all echoes with a monopolar readout gradient. The following acquisition parameters were used for the first multi-echo GRE acquisition (‘single-shot GRE’): TR = 200 ms, TE1 = 1.2 ms, ΔTE = 1.44 ms at 1.5T and 1.48 ms at 3T, flip angle = 25°, 10 echoes, matrix = 128 × 104, field of view (FOV) = 300 mm, slice thickness = 10 mm, pixel bandwidth = 1950 Hz/px. Then, a second multi-echo GRE sequence was acquired with the same acquisition parameters but with all TEs of the first acquisition incremented by 0.72 ms at 1.5T and 0.74 ms at 3T. Combining the two mGRE acquisitions (‘dual-shot GRE’) created an mGRE data set with denser temporal sampling and reduced the echo spacing to 0.72 ms and 0.74 ms at 1.5T and 3T, respectively.
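The way the two acquisitions interleave into a denser echo train can be checked with a few lines of arithmetic. The sketch below uses the 1.5T timing values given above; the function name is an illustrative assumption.

```python
import numpy as np

def dual_shot_echo_times(te1_ms, delta_te_ms, n_echoes, shift_ms):
    """Merge two echo trains, the second shifted by `shift_ms`, into one."""
    first = te1_ms + delta_te_ms * np.arange(n_echoes)
    second = first + shift_ms
    return np.sort(np.concatenate((first, second)))

# 1.5T: TE1 = 1.2 ms, echo spacing 1.44 ms, 10 echoes, shift 0.72 ms.
te_dual_15t = dual_shot_echo_times(1.2, 1.44, 10, 0.72)
effective_spacing_ms = np.diff(te_dual_15t)   # 0.72 ms throughout
```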

In Vivo Studies

The institutional review board approved this study, and informed consent was obtained from all participants prior to any research assessment. In vivo data were collected from thirteen consecutive patients (4 male, 9 female; mean age 19.3 ± 11.6 years [range 2.4–39.7 years]) who underwent MRI scans and liver biopsy from November 2014 to September 2015 for clinical monitoring of HIC. Diagnoses included sickle cell disease (n = 8), acute lymphocytic leukemia (ALL) (n = 4), and severe aplastic anemia (n = 1). Two liver biopsy specimens were obtained from each patient, the first for HIC quantitation by atomic absorption spectrophotometry (Mayo Laboratories, Rochester, MN) and the second for pathologic review. Pathologic review was performed by a single pathologist blinded to the clinical status and HIC values. The biopsies were scored for the degree of steatosis, based on the amount of surface area involved by steatosis on microscopic examination, and graded into the following categories: minimal/negligible (<5%), mild (5%−33%), moderate (>33%−66%), and severe (>66%).28

All patients underwent 1.5T (Avanto, Siemens Healthineers) MRI exams with a protocol comprising the acquisition of mGRE images in axial orientation at the location of the portal vein: TR/TE1/ΔTE = 200/1.07/1.51 ms, 10 echoes, flip angle = 45°, monopolar readout gradient, matrix = 128 × 104, slice thickness = 5 mm, bandwidth = 1950 Hz/px, and FOV = 220–350 mm, depending on the patient’s size. Images were acquired in a single breath-hold of ~21 s in patients who could hold their breath, or in free-breathing with 3–4 averages (acquisition time ~63–84 s) in sedated patients to minimize respiratory motion artifacts. One sedated patient was excluded from the in vivo analysis because of severe respiratory motion artifacts in the MR images. Two cases with severe iron overload were excluded because too few pixels could be fitted for T2* (<30% of the total number of pixels).29 Therefore, a total of 10 patients were included in the analysis.

Data Analyses

In phantoms and patients, quantitative R2* maps were calculated in three ways: (i) using a magnitude-based mono-exponential model,5,30 and using two complex-domain fat–water modeling techniques, (ii) NLSQ and (iii) ARMA (see Supplement for details), which also yielded FF maps. All models were implemented in Matlab (Mathworks, Natick, MA). The NLSQ model was implemented from the ISMRM Fat-Water Toolbox using the graph cut algorithm for B0 field estimation.31,32 This model assumes a single R2* for water and lipid peaks and uses fixed values for the relative frequencies of water and lipid peaks, and for the relative amplitudes of the lipid peaks according to Hamilton et al.20 In contrast, the ARMA model was implemented as an iterative approach starting with the maximum number of seven peaks (i.e., 1 water peak and 6 lipid peaks),20 and reducing the number of peaks iteratively until the frequencies of the detected lipid peaks fell within the range of the reported relative frequencies (± 0.5 ppm) according to Hamilton et al.20 Once the peaks had been identified, the water R2* and FF maps were derived from the ARMA fit.
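The iterative peak-reduction logic described above can be sketched as a simple loop. This is only an outline under stated assumptions, not the implementation in the Supplement: `fit_lipid_ppm` is a hypothetical callable that returns the lipid peak positions (in ppm relative to water) found by a fit with a given number of peaks, and the reference positions in the example are placeholders rather than the values reported by Hamilton et al.

```python
def lipid_peaks_match(detected_ppm, reference_ppm, tol_ppm=0.5):
    """True if every detected lipid peak lies within tol_ppm of a reference peak."""
    return all(any(abs(d - r) <= tol_ppm for r in reference_ppm)
               for d in detected_ppm)

def select_peak_count(fit_lipid_ppm, reference_ppm, max_peaks=7, tol_ppm=0.5):
    """Reduce the model order until all detected lipid peaks are plausible.

    fit_lipid_ppm(n_peaks) is a hypothetical fit returning lipid positions.
    Returns (number of peaks kept, detected lipid positions).
    """
    for n_peaks in range(max_peaks, 0, -1):
        detected = fit_lipid_ppm(n_peaks)
        if lipid_peaks_match(detected, reference_ppm, tol_ppm):
            return n_peaks, detected
    return 1, []   # fall back to a water-only (mono-exponential) model

# Placeholder reference positions and a fake fit, for illustration only.
reference = [-3.4, -2.6, 0.6]

def fake_fit(n_peaks):
    # Fits with more than 4 peaks return a spurious peak at 1.9 ppm.
    if n_peaks < 4:
        return [-3.35][: n_peaks - 1]
    return [-3.35, -2.5, 0.7] + [1.9] * (n_peaks - 4)

n_kept, lipids = select_peak_count(fake_fit, reference)
```

When no lipid peak survives, the loop ends at a single water peak, which corresponds to the mono-exponential limit of the model.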

In phantoms, performance of the mono-exponential, NLSQ, and ARMA models was evaluated using single-shot and dual-shot mGRE acquisitions. Mean R2* values calculated with the mono-exponential model in pure iron-doped phantoms for the dual-shot mGRE sequence were considered as reference values for comparison with other iron–fat phantoms and fit models, under the assumption that the R2* values are not affected by the added fat content. Mean (± standard deviation [SD]) R2* results obtained with mono-exponential, NLSQ, and ARMA models were compared with the reference R2* values in iron phantoms with varying FF values, and mean (±SD) FF results obtained with NLSQ and ARMA models were compared with the known FF values in fat phantoms with varying iron concentrations by using linear regression analysis. In patients, mean liver R2* and FF values were calculated for each model after manual selection of a region of interest encompassing the entire liver and excluding blood vessels based on histogram analysis.29 Further, mean liver R2* values were converted into HIC estimates for each model (hereafter designated as R2*-HIC) using a previously published R2*-HIC biopsy calibration,5 and compared to biopsy HIC values by linear regression analysis.
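The slope, intercept, and R2 values reported in the tables are the outputs of an ordinary least-squares line fit of measured against reference values. A minimal sketch of that computation follows; the function name and the example numbers are assumptions and do not reproduce any value from the study.

```python
import numpy as np

def linear_regression(reference, measured):
    """Ordinary least-squares fit measured = slope * reference + intercept.

    Returns (slope, intercept, coefficient of determination R^2).
    """
    x = np.asarray(reference, dtype=float)
    y = np.asarray(measured, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residual = y - (slope * x + intercept)
    ss_res = float(np.sum(residual ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic example: a model that overestimates R2* by 25% with an offset.
ref_r2star = np.array([30.0, 100.0, 200.0, 400.0, 800.0])
meas_r2star = 1.25 * ref_r2star - 30.0
slope, intercept, r_squared = linear_regression(ref_r2star, meas_r2star)
```

A slope above 1 indicates overestimation that grows with the reference value, which is how the table entries should be read.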

RESULTS

Phantoms

R2* Results

Figure 1 shows R2* maps for mono-exponential, NLSQ and ARMA models, and FF maps for NLSQ and ARMA models in phantoms at 1.5T and 3T, using single-shot and dual-shot mGRE acquisitions. With single-shot mGRE, all models yielded mean R2* values similar to reference values in phantoms with low iron concentrations (R2* < 300 s–1), irrespective of the FF value (Fig. 2). For iron phantoms with no or low FF values (0%, 10%), mono-exponential and ARMA models showed a linear relationship with reference R2* values (Fig. 2, Table 1, slopes: 0.89–1.07, R2 > 0.99) at both field strengths. In contrast, for iron phantoms with higher FF values (20%, 40%), mono-exponential and ARMA models overestimated R2* values at 1.5T (slopes: 1.24–1.99, R2 > 0.99) and underestimated R2* values at 3T (slopes: 0.44–0.77, R2 ≥ 0.96) at high iron concentrations. NLSQ, on the other hand, overestimated R2* values (slopes: 1.25–1.36, R2 > 0.99), with high SDs, for phantoms with the highest iron concentration and low FF values at 1.5T. The NLSQ R2* results, however, improved at 3T (slopes: 1.04–1.20, R2 > 0.99) for the highest iron concentration.

Figure 1.


R2* maps for mono-exponential, NLSQ and ARMA models and FF maps for NLSQ and ARMA models in iron-fat phantoms at 1.5T and 3T using single-shot and dual-shot mGRE acquisitions. Using single-shot mGRE at 3T, ARMA completely failed in fat–water separation and produced FF values close to 0% over the entire FF range (solid white arrows), whereas NLSQ overestimated FF values at the highest iron concentrations (60 μg/ml) especially for the 0% FF phantoms (dotted white arrows). Note that NLSQ produced mean FF values of 2.2±0.25% in pure agar phantoms (i.e., 0 μg/ml iron and 0% FF), indicating some bias in the FF estimation.

Figure 2.


R2* results for single-shot and dual-shot mGRE acquisitions. Mean (±SD, as error bars) R2* values obtained by mono-exponential (first column), NLSQ (second column), and ARMA (third column) models plotted against reference R2* values in iron phantoms for varying fat fractions (FFs) at 1.5T and 3T. The black dotted line indicates the line of unity. Table 1 shows the linear regression analysis (slope, intercept, and R2) for each model at different field strengths and acquisitions.

Table 1.

Linear Regression Analysis Between Measured and Reference R2* Values (in s–1) for Varying Fat Fractions (FFs, in %) Using the Mono-Exponential, NLSQ, and ARMA Models with Single-Shot and Dual-Shot mGRE Acquisitions.

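Column percentages are fat fractions (FF); "Single" and "Dual" denote single-shot and dual-shot mGRE acquisitions.

| Field Strength | Fit | Parameter | Single 0% | Single 10% | Single 20% | Single 40% | Dual 0% | Dual 10% | Dual 20% | Dual 40% |
|---|---|---|---|---|---|---|---|---|---|---|
| 1.5T | Mono-exponential | Slope | 1 | 1.02 | 1.25 | 1.72 | x | 1.02 | 1.23 | 1.88 |
| 1.5T | Mono-exponential | Intercept | –0.83 | –0.84 | –36.6 | –87.2 | x | –2.5 | –42.9 | –144 |
| 1.5T | Mono-exponential | R2 | >0.99 | >0.99 | >0.99 | 0.99 | x | >0.99 | >0.99 | 0.97 |
| 1.5T | NLSQ | Slope | 1.36 | 1.26 | 1.25 | 1.09 | 1.16 | 1.14 | 1.16 | 1.02 |
| 1.5T | NLSQ | Intercept | –55 | –37 | –42.7 | –16.4 | –23.5 | –16.8 | –27.6 | –5.1 |
| 1.5T | NLSQ | R2 | >0.99 | >0.99 | >0.99 | 0.99 | >0.99 | >0.99 | >0.99 | >0.99 |
| 1.5T | ARMA | Slope | 1.07 | 1.07 | 1.25 | 2 | 1.05 | 1.07 | 1.24 | 1.49 |
| 1.5T | ARMA | Intercept | –9.9 | –4.8 | –37.6 | –67.1 | –6.8 | –3.5 | –35.5 | –48.6 |
| 1.5T | ARMA | R2 | >0.99 | >0.99 | >0.99 | >0.99 | >0.99 | >0.99 | >0.99 | 0.98 |
| 3T | Mono-exponential | Slope | 1 | 0.89 | 0.75 | 0.44 | x | 0.9 | 0.79 | 0.51 |
| 3T | Mono-exponential | Intercept | 1.5 | 29.3 | 27.6 | 55.7 | x | 30.9 | 32.4 | 67.4 |
| 3T | Mono-exponential | R2 | >0.99 | >0.99 | >0.99 | 0.99 | x | >0.99 | >0.99 | 0.99 |
| 3T | NLSQ | Slope | 1.18 | 1.04 | 1.08 | 1.2 | 1.04 | 0.98 | 0.96 | 0.98 |
| 3T | NLSQ | Intercept | –29.0 | 21.4 | 6.9 | –8.5 | –5.4 | 29.2 | 24.4 | 23.8 |
| 3T | NLSQ | R2 | >0.99 | >0.99 | >0.99 | >0.99 | >0.99 | >0.99 | >0.99 | >0.99 |
| 3T | ARMA | Slope | 1 | 0.9 | 0.77 | 0.51 | 1.01 | 0.93 | 0.99 | 0.98 |
| 3T | ARMA | Intercept | 1.5 | 31.1 | 31.4 | 98.9 | –0.31 | 40.8 | 22.8 | 52 |
| 3T | ARMA | R2 | >0.99 | >0.99 | >0.99 | 0.96 | >0.99 | >0.99 | >0.99 | >0.99 |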

Note: R2* values calculated with the mono-exponential model in pure iron phantoms based on the dual-shot mGRE sequence were considered as reference. “x” denotes not applicable. Cells shaded in red indicate deviations from the slope of 1.0 and intercept of 0.0; lighter to darker shading corresponds to minimum and maximum deviations, respectively.

Abbreviations: NLSQ, non-linear least squares model; ARMA, autoregressive moving average model.

With dual-shot mGRE, mono-exponential fits produced R2* results similar to those of the single-shot mGRE approach at both field strengths: R2* values were overestimated at 1.5T (Table 1, slopes: 1.23–1.88, R2 ≥ 0.97) and underestimated at 3T (slopes: 0.51–0.79, R2 ≥ 0.96) for the highest iron concentrations and high FF content (20%, 40%). In comparison to the single-shot mGRE results, ARMA and NLSQ results improved at 1.5T, but ARMA still overestimated R2* values (Fig. 1, slopes: 1.24–1.49, R2 ≥ 0.98) for the highest iron concentrations and high FF content (20%, 40%). At 3T, R2* results for ARMA and NLSQ were substantially improved (slopes: 0.93–1.04, R2 > 0.99).

FF Results

Figure 3 shows mean (±SD) FF values calculated via the NLSQ and ARMA models plotted against expected FF values for varying iron concentrations, using both single-shot and dual-shot mGRE acquisitions at 1.5T and 3T. With single-shot mGRE, both NLSQ and ARMA FF values showed a linear relationship with the expected values (Table 2, slopes: 0.99–1.19, R2 ≥ 0.99) for no or low (0–15 μg/mL) iron concentrations at 1.5T. However, at high iron concentrations (30, 60 μg/mL), NLSQ generally overestimated and ARMA underestimated FF values, and both models produced high SDs. At 3T, ARMA completely failed in fat–water separation and produced FF values close to 0% over the entire FF range (Fig. 1, solid white arrows), whereas NLSQ overestimated FF values at the highest iron concentrations, especially for the 0% FF phantoms (Fig. 1, dashed white arrows).

Figure 3.


Fat fraction (FF) results for single-shot and dual-shot mGRE acquisitions. Mean (±SD, as error bars) FF values obtained using NLSQ (first column) and ARMA (second column) models plotted against true FFs in fat phantoms for varying iron concentrations at 1.5T and 3T. The black dotted line indicates the line of unity. Table 2 shows linear regression analysis (slope, intercept, and R2) for each model at different field strengths and acquisitions.

Table 2.

Linear Regression Analysis Between Measured and True FF Values for Varying Iron Concentrations (in μg/mL), Using NLSQ, and ARMA Models with Single-Shot and Dual-Shot mGRE Acquisitions.

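Column values are iron concentrations (μg/mL); "Single" and "Dual" denote single-shot and dual-shot mGRE acquisitions.

| Field Strength | Fit | Parameter | Single 0 | Single 7.5 | Single 15 | Single 30 | Single 60 | Dual 0 | Dual 7.5 | Dual 15 | Dual 30 | Dual 60 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.5T | NLSQ | Slope | 1.04 | 1.19 | 1.17 | 1.19 | 0.8 | 1.04 | 1.18 | 1.15 | 1.22 | 0.88 |
| 1.5T | NLSQ | Intercept | 1.6 | –2.4 | –0.1 | 3.2 | 11.8 | 1.6 | –2.4 | 0.31 | 1.1 | 7 |
| 1.5T | NLSQ | R2 | >0.99 | 0.99 | >0.99 | >0.99 | 0.93 | >0.99 | 0.99 | >0.99 | >0.99 | 0.94 |
| 1.5T | ARMA | Slope | 1 | 1.01 | 0.99 | 0.27 | 0.01 | 1.02 | 1.01 | 1 | 1.06 | 0.5 |
| 1.5T | ARMA | Intercept | 1.5 | –1.3 | 0.57 | 6.1 | 1.7 | 1.2 | –1.3 | 0.38 | –0.02 | –2.8 |
| 1.5T | ARMA | R2 | >0.99 | >0.99 | >0.99 | 0.51 | 0.09 | >0.99 | >0.99 | >0.99 | >0.99 | 0.92 |
| 3T | NLSQ | Slope | 1.01 | 1.11 | 1.1 | 1.11 | 1.01 | 0.98 | 1.12 | 1.09 | 1.12 | 1.03 |
| 3T | NLSQ | Intercept | 1.8 | 2.7 | 2.6 | 3.4 | 10.5 | 1.5 | 1.8 | 2.4 | 3 | 1.4 |
| 3T | NLSQ | R2 | >0.99 | >0.99 | >0.99 | 0.99 | 0.96 | >0.99 | >0.99 | >0.99 | >0.99 | 0.99 |
| 3T | ARMA | Slope | 0 | 0 | 0 | 0 | 0.1 | 1.07 | 1.07 | 1.06 | 1.12 | 1.05 |
| 3T | ARMA | Intercept | 0 | 0 | 0 | 0.1 | –0.6 | –0.11 | 1.6 | 2.3 | 1.3 | –2.5 |
| 3T | ARMA | R2 | 0 | 0 | 0 | –0.7 | 0.92 | 0.99 | >0.99 | 0.99 | >0.99 | 0.98 |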

Note: R2* values calculated with the mono-exponential model in pure iron phantoms using the dual-shot mGRE sequence were considered as reference. Cells shaded in red indicate deviations from slope of 1.0 and intercept of 0.0; lighter to darker shading corresponds to minimum and maximum deviations, respectively.

Abbreviations: NLSQ, non-linear least squares model; ARMA, autoregressive moving average model.

With dual-shot mGRE, FF results produced by the NLSQ and ARMA models were also superior to those from single-shot mGRE. However, for the iron–fat phantoms with the highest iron concentrations, NLSQ still overestimated the fat content for phantoms with 0% FF (mean estimated FF: 12.5%) and produced high SDs for phantoms with 10% and 20% FFs at 1.5T. In contrast, ARMA generally underestimated FF values, with high SDs, at the highest iron concentration. However, at 3T, FF results for NLSQ and ARMA were substantially improved (Table 2, slopes: 0.98–1.12, R2 > 0.98), although ARMA still underestimated the actual FF for the 10% FF phantom with the highest iron concentration and NLSQ slightly overestimated the FF value for the phantom with the highest iron content and 0% FF (Fig. 3).

In Vivo Studies

Figure 4 shows representative R2* and FF maps obtained with the three fitting models, along with histology slides of a patient with biopsy-confirmed mild hepatic iron overload and moderate steatosis. In histology slides, iron deposition is seen in blue by Perl’s iron staining. Clear vacuoles on both hematoxylin and eosin (H&E) and Perl’s iron staining indicate the presence of fat deposits. The mean liver R2* values obtained with the three models were in agreement and translated into R2*-HIC estimates of 2.6–2.7 mg Fe/g dry weight, which is close to the biopsy HIC of 2.6 mg Fe/g. In this patient, the mean liver FF value calculated with ARMA (16 ± 4%) was close to but slightly lower than the NLSQ FF value (18 ± 4%).

Figure 4.


R2*-magnetic resonance imaging (MRI) and histology slides obtained from a 2-year-old patient with acute lymphoblastic leukemia with biopsy-confirmed mild iron overload and steatosis. (A) Calculated R2* maps obtained with the mono-exponential, NLSQ, and ARMA models, and FF maps obtained with NLSQ and ARMA models. (B) Histology slides of the liver biopsy sample with hematoxylin and eosin (H&E) and Perl’s iron staining (magnification 20×). Clear vacuoles (i.e., white bubbles) on both stains indicate fat deposits, and the blue color on Perl’s iron staining indicates iron deposits. Mean (±SD) liver R2* values and FF values are shown for each model.

Figure 5 shows MRI and histology results for a patient with biopsy-confirmed severe hepatic iron overload of 16.5 mg Fe/g and no steatosis. Here, the ARMA model produced a homogeneous liver R2* map and the mean R2* (735 ± 162 s–1) was close to that obtained with the mono-exponential model (718 ± 132 s–1), which was considered as reference since the patient had no steatosis. In contrast, the liver R2* map calculated with the NLSQ model was not as homogeneous and overestimated R2* (988 ± 497 s–1) compared with the mono-exponential R2*. The golden-brown deposits on H&E staining and blue deposits on Perl’s iron staining indicate that the patient had severe iron overload. The mean FF value calculated with the ARMA model was ~1%, indicating negligible steatosis, whereas FF calculated with NLSQ was ~11%. The histology staining did not show any clear vacuoles which suggests the absence of fat.

Figure 5.


R2*-MRI and histology slides obtained from a 9-year-old patient with sickle cell anemia with biopsy-confirmed iron overload and no steatosis. (A) Calculated R2* maps obtained with mono-exponential, NLSQ, and ARMA models and FF maps obtained with NLSQ and ARMA models. (B) Histology slides of the liver biopsy sample with H&E and Perl’s iron staining (magnification 20×). Mean (±SD) liver R2* and FF values are shown for each model. Golden-brown deposits on H&E and blue deposits on Perl’s iron staining indicate that the patient has severe iron overload.

Table 3 summarizes mean liver R2* and FF values calculated with the three fitting models, along with the histology steatosis grading for all 10 patients. Linear regression (Fig. 6) of the derived mean R2*-HIC estimates for mono-exponential and ARMA models and the biopsy HIC results produced slopes close to unity (0.90–0.95), whereas the NLSQ model showed a trend to overestimate biopsy HIC (slope 1.4). NLSQ produced similar mean R2* and R2*-HIC values as mono-exponential and ARMA models in the 3 patients with mild (HIC < 7 mg/g Fe) and moderate iron overload (7 < HIC < 15 mg/g Fe). However, in the 7 patients with high iron overload (HIC > 15 mg/g Fe), NLSQ substantially overestimated R2*-HIC values and produced mean FFs (4 – 28%) with very high SDs (15 – 222%) when histology results confirmed no evidence of steatosis (Table 3).

Table 3.

In Vivo Mean Liver R2* and FF Values Calculated with Different R2*-MRI Fitting Models, Along with Histology Steatosis Grading for Patients

| Patient number | Mono-exponential R2* (s–1) | NLSQ R2* (s–1) | NLSQ FF (%) | ARMA R2* (s–1) | ARMA FF (%) | Histology Steatosis Grading |
|---|---|---|---|---|---|---|
| 1 | 109 ± 11 | 109 ± 12 | 18 ± 4 | 111 ± 12 | 16 ± 4 | Moderate |
| 2 | 237 ± 35 | 263 ± 82 | 3 ± 9 | 263 ± 40 | 4 ± 5 | Mild |
| 3 | 718 ± 132 | 988 ± 497 | 11 ± 15 | 735 ± 162 | 1 ± 5 | No evidence |
| 4 | 727 ± 184 | 1032 ± 500 | 7 ± 32 | 761 ± 193 | 1 ± 5 | No evidence |
| 5 | 1060 ± 337 | 1473 ± 509 | 17 ± 222 | 1096 ± 323 | 1 ± 7 | No evidence |
| 6 | 1087 ± 397 | 1609 ± 503 | 27 ± 42 | 1171 ± 421 | 3 ± 11 | No evidence |
| 7 | 997 ± 383 | 1536 ± 538 | 28 ± 43 | 1039 ± 368 | 2 ± 9 | No evidence |
| 8 | 866 ± 237 | 1366 ± 572 | 19 ± 33 | 897 ± 255 | 1 ± 7 | No evidence |
| 9 | 304 ± 41 | 318 ± 90 | 19 ± 5 | 316 ± 47 | 14 ± 8 | Moderate |
| 10 | 916 ± 283 | 1334 ± 528 | 4 ± 78 | 983 ± 279 | 2 ± 8 | No evidence |

Note: R2* and FF values are given as mean ± standard deviation.

Abbreviations: NLSQ, non-linear least squares model; ARMA, autoregressive moving average model

Figure 6.


Linear regression analysis between R2*-HIC measurements obtained with different R2*-MRI fitting models and biopsy HIC values. Solid lines represent regression lines, and the dashed line represents the line of unity. Results of the regression analysis are included beneath the plot.

DISCUSSION

The presence of fat introduces oscillations in the signal decay of multi-echo GRE acquisitions. Hence, the use of standard mono-exponential signal models for R2* quantification might be inaccurate and can confound R2* results. Likewise, the presence of iron increases R2*, which causes rapid signal decay and compromises fat quantification. Although signal modeling techniques that consider the multi-spectral nature of fat and water have been proposed to simultaneously quantify R2* and FF, proper extraction of quantitative results in situations of high iron content might be hampered because of the rapid signal decay.23,33 In this study, we evaluated the performance of an ARMA model for R2* and FF quantification, and compared the results with those obtained using a mono-exponential model and a NLSQ model available in the ISMRM Fat–Water Toolbox.31,32

The mono-exponential model was not affected by the fat-water oscillations in phantoms with low to moderate iron concentrations, irrespective of the FF values. This might be because we used equal echo spacing and a long echo train length, which might have caused the fat-water oscillations to average out over time in the mono-exponential fitting. However, these findings might change if unequal echo spacing and/or shorter echo train lengths are used. At the highest iron concentrations (R2* ~750 s–1), the mono-exponential model overestimated R2* values at 1.5T and underestimated them at 3T for higher FF content (20%, 40%), which is most likely caused by the influence of in-phase and out-of-phase effects on the fitted signal decay. Specifically, for the highest iron concentration, the rapid signal decay is mainly affected by the first out-of-phase TE at 1.5T and the first in-phase TE at 3T (both at 2.2 ms), leading to R2* overestimation or underestimation, respectively. Using dual-shot mGRE, the mono-exponential model still produced R2* results similar to those of single-shot mGRE, even at high iron concentrations, presumably because omitting fat from the signal model affects the R2* results more than the difference between the two acquisitions does. Hence, incorporating fat into the signal model is essential for accurate R2* quantification, especially at higher iron concentrations and FF values.

Both fat-water signal models, ARMA and NLSQ, produced R2* and FF values as expected for low iron concentrations (R2* ≤ 300 s–1) at 1.5T using either single-shot or dual-shot mGRE. Consistent results were also observed at 3T, except that ARMA failed to correctly estimate FF based on single-shot mGRE data because the ΔTE used (~1.44 ms) was not sufficient to fulfill the Nyquist criterion in the frequency domain, which leads to spectral aliasing and inaccurate FF extraction. As fat and water frequencies differ by ~440 Hz at 3T, a ΔTE ≤ 1.1 ms is required to meet the Nyquist criterion. Using the dual-shot mGRE acquisition (ΔTE = 0.74 ms) enabled successful fat–water separation with ARMA at 3T and substantially improved ARMA R2* and FF slopes compared to single-shot mGRE. In contrast, the NLSQ model was able to perform fat–water separation at 3T even for the single-shot mGRE because, unlike ARMA, it uses a priori information about the spectral components and imposes constraints such as spatial regularization in estimating the B0 field map, so that spectral aliasing can be avoided.31
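The Nyquist argument can be verified numerically: a frequency difference Δf is resolved without aliasing only if ΔTE ≤ 1/(2Δf). The sketch below computes that limit for ~440 Hz and shows where a 440 Hz component would appear for the single-shot (1.48 ms, from the Methods) and dual-shot (0.74 ms) echo spacings at 3T. Function names are illustrative assumptions.

```python
def max_echo_spacing_ms(freq_diff_hz):
    """Largest echo spacing (ms) that satisfies the Nyquist criterion."""
    return 1000.0 / (2.0 * freq_diff_hz)

def aliased_frequency_hz(freq_hz, delta_te_ms):
    """Apparent frequency after sampling at 1/delta_TE, wrapped to +/- fs/2."""
    fs_hz = 1000.0 / delta_te_ms
    return (freq_hz + fs_hz / 2.0) % fs_hz - fs_hz / 2.0

limit_3t_ms = max_echo_spacing_ms(440.0)              # about 1.1 ms
apparent_single = aliased_frequency_hz(440.0, 1.48)   # aliased, far from 440 Hz
apparent_dual = aliased_frequency_hz(440.0, 0.74)     # recovered at 440 Hz
```

A method without spectral priors has no way to tell the aliased frequency from a genuine peak, which is consistent with the failure described above.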

At high iron concentrations (R2* > 300 s–1), ARMA and NLSQ showed different behaviors depending on the FF content at 1.5T. When no fat was present, NLSQ overestimated R2* and produced false FFs (4 – 28%) both in iron-fat phantoms and patients, hence falsely classifying the patients as steatosis grade 1 or higher.34 This might be because the NLSQ model still uses a multi-spectral fat-water model despite no fat being present, and fits a larger number of free parameters to minimize the least-squares error. At high iron concentrations, when the signal decay is rapid, accurate estimation of this larger number of free parameters can be hampered, leading to erroneous results. In order to avoid such instability and bias of the NLSQ model at high iron concentrations, some investigators recently applied the NLSQ fit without incorporating a fat model for R2* > 500 s−1.35 In contrast, the ARMA model, using an iterative approach, is potentially reduced to a one-peak model (see Supplement) with a simple mono-exponential decay if no lipid peaks can be identified, and thus produced results similar to the magnitude-based mono-exponential fit in high iron and no fat conditions. Theoretically, the NLSQ model could also be improved using an iterative strategy, starting with all lipid components and successively reducing the number of components if the fitting improves. On the other hand, NLSQ outperformed ARMA at high iron concentrations once fat was present, as the model then correctly fitted the lipid components, whereas ARMA failed in R2* and fat quantification in high iron and high fat conditions at 1.5T. This is because in case of high iron overload (R2* > 500 s–1 or T2* < 2 ms), the signal has almost completely decayed before fat and water signals are in-phase again for the first time (4.6 ms at 1.5T) or undergo at least one fat-water oscillation (see Fig. S1 in Supplement). Due to the poor signal, the ARMA model was unable to pick up signal modulations introduced by fat at the highest iron concentrations, as it does not use any prior information about the fat spectrum, unlike NLSQ, and hence behaved like a mono-exponential model and failed in fat quantification. Using the dual-shot acquisition improved the results only slightly for both NLSQ and ARMA models compared to single-shot at 1.5T. But at 3T, as the fat and water signals are in-phase and out-of-phase at a higher frequency (~440 Hz; i.e., in-phase at 2.3 ms) and due to higher SNR compared to 1.5T, there are at least two fat-water oscillations that can be captured before the signal is lost at the highest iron concentration. Hence, both ARMA and NLSQ were able to model the signal modulations correctly for the highest iron phantoms at 3T using the dual-shot mGRE and produced R2* and FF values in accordance with the reference values.
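How little signal remains at the first in-phase echo can be estimated from the exponential decay alone. The sketch below evaluates exp(−R2*·TE) for R2* = 500 s−1 at the first in-phase echo times mentioned in the text (4.6 ms at 1.5T, 2.3 ms at 3T); it ignores SNR differences between field strengths and any field dependence of R2*, and the function name is an illustrative assumption.

```python
import math

def remaining_signal_fraction(r2star_per_s, te_ms):
    """Fraction of the initial signal left at echo time te_ms."""
    return math.exp(-r2star_per_s * te_ms / 1000.0)

left_at_15t = remaining_signal_fraction(500.0, 4.6)   # about 10% of S0
left_at_3t = remaining_signal_fraction(500.0, 2.3)    # about 32% of S0
```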

R2* and FF results obtained at high iron concentrations with ARMA or NLSQ could be improved by using optimized flip angles that increase SNR. In theory, the mGRE signal is maximal at the Ernst angle, which is about 45° assuming a TR of 200 ms and a T1 of water in the liver of 586 ms at 1.5T.36 However, flip angles as high as 45° will result in an overestimated FF because of the T1 bias introduced by the shorter T1 of fat (~350 ms at 1.5T). The T1 bias can in general be estimated and corrected from the GRE signal equation using the TR and the expected T1 values of water and fat.36 For example, a flip angle of 45° leads to a ~14% increase in the FF estimate due to T1 bias. Therefore, in our phantom study, a reduced flip angle of 25° was used, which results in only a ~5% increase in the FF estimate. However, our in vivo data were obtained with a flip angle of 45°, so FF values from both models, ARMA and NLSQ, might be systematically overestimated. Because the presence of iron also reduces the T1 of water, the T1 bias is reduced,37,38 and the overestimation of FF might not be considerable.
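The Ernst angle and the size of the T1 bias can be estimated from the steady-state spoiled gradient-echo signal equation. The sketch below is only illustrative: it uses the TR and T1 values quoted above, ignores T2* decay and noise bias, and assumes a hypothetical true FF of 40%. The relative overestimate depends on the true FF chosen, so the printed numbers should not be read as the study's own calculation.

```python
import math

TR_MS = 200.0
T1_WATER_MS = 586.0  # liver water at 1.5T, as quoted in the text
T1_FAT_MS = 350.0    # approximate fat T1 at 1.5T, as quoted in the text


def ernst_angle_deg(tr_ms, t1_ms):
    """Flip angle (degrees) that maximizes the spoiled GRE signal."""
    return math.degrees(math.acos(math.exp(-tr_ms / t1_ms)))


def spgr_weight(flip_deg, tr_ms, t1_ms):
    """Steady-state spoiled GRE signal per unit proton density (T2* ignored)."""
    e1 = math.exp(-tr_ms / t1_ms)
    alpha = math.radians(flip_deg)
    return math.sin(alpha) * (1.0 - e1) / (1.0 - e1 * math.cos(alpha))


def apparent_ff(true_ff, flip_deg):
    """Signal fat fraction seen when water and fat carry different T1 weighting."""
    water = (1.0 - true_ff) * spgr_weight(flip_deg, TR_MS, T1_WATER_MS)
    fat = true_ff * spgr_weight(flip_deg, TR_MS, T1_FAT_MS)
    return fat / (water + fat)


if __name__ == "__main__":
    print(f"Ernst angle for water: {ernst_angle_deg(TR_MS, T1_WATER_MS):.1f} deg")
    true_ff = 0.40  # hypothetical true fat fraction, chosen for illustration
    for flip in (25.0, 45.0):
        ff = apparent_ff(true_ff, flip)
        rel = 100.0 * (ff - true_ff) / true_ff
        print(f"flip = {flip:.0f} deg: apparent FF = {100 * ff:.1f}% "
              f"(relative overestimate {rel:.1f}%)")
```

With these inputs, the relative overestimates come out in the same range as the ~14% and ~5% figures quoted above; a shorter water T1, as expected with iron overload, would shrink the difference in T1 weighting and hence the bias.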

The major features of ARMA are that it (i) does not require prior information about the relative amplitudes and frequencies of lipid peaks, (ii) generates a B0 field map without making any assumptions, and (iii) provides separate R2* values for water and fat species. However, without prior knowledge, ARMA requires a sufficiently high signal to localize the individual water and fat peaks. This is why ARMA was able to detect a maximum of only 4 peaks, i.e., 1 water peak and the 3 lipid peaks with the highest amplitudes (~90% of the fat signal), both in phantoms and in vivo. Hence, the FF values calculated with ARMA might slightly underestimate the true FF values. It has been well validated that the in vivo liver fat spectrum shows limited variability regardless of the level of fat deposition,20 so we believe that fixed values can be used for the relative amplitudes and frequencies of lipid peaks, as done in the NLSQ model.39 Also, it has recently been shown that fitting a single R2* for both water and fat species is accurate when iron is present, as iron is the dominant source of T2* relaxation. Hence, we speculate that ARMA modeling might improve by incorporating some of the prior assumptions made in the NLSQ model; however, this requires further investigation.
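The size of the expected FF underestimation can be gauged with simple arithmetic. The sketch below assumes that the undetected lipid peaks are simply missing from the fat estimate (and are not attributed to water) and that the detected peaks carry 90% of the fat signal, as stated above. The function name and the example FF values are hypothetical.

```python
def apparent_ff_with_partial_fat(true_ff, captured_fat_fraction=0.9):
    """Fat fraction obtained when only part of the fat signal is detected.

    Assumes the undetected fat signal is dropped entirely rather than
    being misassigned to the water peak.
    """
    water = 1.0 - true_ff
    fat_seen = captured_fat_fraction * true_ff
    return fat_seen / (water + fat_seen)


if __name__ == "__main__":
    for true_ff in (0.10, 0.20, 0.40):
        ff = apparent_ff_with_partial_fat(true_ff)
        print(f"true FF = {100 * true_ff:.0f}% -> apparent FF = {100 * ff:.1f}%")
```

Under this simplification, the absolute underestimation stays within a few percentage points over the FF range of the phantoms, which is consistent with describing it as slight.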

In this study, we used the R2* values calculated with the mono-exponential model in pure iron-doped phantoms as the reference for comparison with the other iron–fat phantoms and fit models. Using pure iron-doped phantoms as a reference might not be the best choice, as R2* values might change slightly with fat content.25 Nevertheless, these changes are considered quite small compared to the effect of iron particles on R2*; hence, it was recently proposed to treat the R2* values of both species as a combined value and therefore to apply a single R2* fitting approach for simultaneous fat-water modeling.23

Our study has some limitations. In vivo, the R2*-HIC estimates from the mono-exponential and ARMA models were in good agreement with the biopsy HIC estimates, whereas NLSQ overestimated HIC in cases with severe iron overload (HIC > 15 mg/g Fe) and no steatosis. However, none of our study patients presented with both high iron and high fat, a regime in which NLSQ outperformed the ARMA and mono-exponential models in phantoms. Thus, a final conclusion about the best model for accurate R2* and FF estimation in vivo across the various situations of iron and fat content is not feasible at this time and requires further investigation of all possible combinations (i.e., low/high iron and/or low/high fat). Further, the mono-exponential and fat–water models were tested in only 10 patients at 1.5T using a single-shot mGRE sequence. Hence, a more systematic investigation in a larger cohort using all three models at both 1.5T and 3T and using dual-shot mGRE is warranted.

Finally, it should be noted that recent studies investigating improved fit algorithms for iron and fat assessment, including NLSQ and ARMA,26,35 usually cross-calibrate their R2* results with first-generation iron quantification methods such as Ferriscan® or with R2*-HIC values estimated from R2*-biopsy HIC calibration studies. These biopsy calibration studies were validated in patients with iron overload only, and were either not controlled for steatosis or obtained from cohorts that had no steatosis.4,5,40 Without histopathological confirmation, cross-calibrating new fitting methods to these first-generation studies may therefore introduce bias or inaccuracy and cannot guarantee the accuracy of quantitative measures.

In conclusion, our study shows that a standard mono-exponential model produced inaccurate R2* values at high iron concentrations and high FF values, but was able to accurately quantify R2* in fat phantoms with low iron concentrations with the mGRE acquisitions employed in this study. Our data suggest that ARMA yields correct R2* and FF quantification in situations of high iron and no or very little fat content, whereas NLSQ is superior under high iron and fat conditions. Hence, caution in selecting an appropriate strategy for fat-water and R2* quantification is required when using models such as NLSQ or ARMA, especially in situations of high iron with or without steatosis. We conclude that more in-depth investigations of these modern algorithms with concurrent histopathology are necessary, preferably in preclinical models of hepatosteatosis and siderosis, before these models are used clinically.

Supplementary Material

Supp info

Acknowledgements:

The authors thank Gail Fortner, RN, for patient enrollment and regulatory matters, and Chris Goode, RT, for MRI data collection. The authors also thank Vani Shanker, PhD for scientific editing.

Grant Support: Supported by grant 5 R01 DK088988 from the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health and ALSAC (the fund-raising organization of St. Jude Children’s Research Hospital).

Footnotes

Disclosure:

Part of this work was presented at the 2017 Annual Meeting of the International Society of Magnetic Resonance in Medicine in Honolulu, HI, USA.

REFERENCES

  • 1. Brittenham GM, Badman DG; National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Workshop. Noninvasive measurement of iron: report of an NIDDK workshop. Blood 2003;101(1):15–19.
  • 2. Henninger B. Demystifying liver iron concentration measurements with MRI. European radiology 2018.
  • 3. Anderson LJ, Holden S, Davis B, et al. Cardiovascular T2-star (T2*) magnetic resonance for the early diagnosis of myocardial iron overload. European heart journal 2001;22(23):2171–2179.
  • 4. Wood JC, Enriquez C, Ghugre N, et al. MRI R2 and R2* mapping accurately estimates hepatic iron concentration in transfusion-dependent thalassemia and sickle cell disease patients. Blood 2005;106(4):1460–1465.
  • 5. Hankins JS, McCarville MB, Loeffler RB, et al. R2* magnetic resonance imaging of the liver in patients with iron overload. Blood 2009;113(20):4853–4855.
  • 6. Henninger B, Zoller H, Rauch S, et al. R2* relaxometry for the quantification of hepatic iron overload: biopsy-based calibration and comparison with the literature. RoFo: Fortschritte auf dem Gebiete der Rontgenstrahlen und der Nuklearmedizin 2015;187(6):472–479.
  • 7. Szczepaniak LS, Nurenberg P, Leonard D, et al. Magnetic resonance spectroscopy to measure hepatic triglyceride content: prevalence of hepatic steatosis in the general population. American journal of physiology Endocrinology and metabolism 2005;288(2):E462–468.
  • 8. Lazo M, Hernaez R, Eberhardt MS, et al. Prevalence of nonalcoholic fatty liver disease in the United States: the Third National Health and Nutrition Examination Survey, 1988–1994. American journal of epidemiology 2013;178(1):38–45.
  • 9. Halonen P, Mattila J, Ruuska T, Salo MK, Makipernaa A. Liver histology after current intensified therapy for childhood acute lymphoblastic leukemia: microvesicular fatty change and siderosis are the main findings. Medical and pediatric oncology 2003;40(3):148–154.
  • 10. Liu Y, Fernandez CA, Smith C, et al. Genome-Wide Study Links PNPLA3 Variant With Elevated Hepatic Transaminase After Acute Lymphoblastic Leukemia Therapy. Clinical pharmacology and therapeutics 2017.
  • 11. Sahoo S, Hart J. Histopathological features of L-asparaginase-induced liver disease. Seminars in liver disease 2003;23(3):295–299.
  • 12. Janiszewski PM, Oeffinger KC, Church TS, et al. Abdominal obesity, liver fat, and muscle composition in survivors of childhood acute lymphoblastic leukemia. The Journal of clinical endocrinology and metabolism 2007;92(10):3816–3821.
  • 13. Nozaki Y, Sato N, Tajima T, et al. Usefulness of Magnetic Resonance Imaging for the Diagnosis of Hemochromatosis with Severe Hepatic Steatosis in Nonalcoholic Fatty Liver Disease. Internal medicine 2016;55(17):2413–2417.
  • 14. Nelson JE, Klintworth H, Kowdley KV. Iron metabolism in Nonalcoholic Fatty Liver Disease. Current gastroenterology reports 2012;14(1):8–16.
  • 15. Radmard AR, Poustchi H, Dadgostar M, et al. Liver enzyme levels and hepatic iron content in Fatty liver: a noninvasive assessment in general population by T2* mapping. Academic radiology 2015;22(6):714–721.
  • 16. Bley TA, Wieben O, Francois CJ, Brittain JH, Reeder SB. Fat and water magnetic resonance imaging. Journal of magnetic resonance imaging: JMRI 2010;31(1):4–18.
  • 17. Hernando D, Levin YS, Sirlin CB, Reeder SB. Quantification of liver iron with MRI: state of the art and remaining challenges. Journal of magnetic resonance imaging: JMRI 2014;40(5):1003–1021.
  • 18. Krafft AJ, Loeffler RB, Song R, et al. Does fat suppression via chemically selective saturation affect R2*-MRI for transfusional iron overload assessment? A clinical evaluation at 1.5T and 3T. Magnetic resonance in medicine 2016;76(2):591–601.
  • 19. Meloni A, Tyszka JM, Pepe A, Wood JC. Effect of inversion recovery fat suppression on hepatic R2* quantitation in transfusional siderosis. AJR American journal of roentgenology 2015;204(3):625–629.
  • 20. Hamilton G, Yokoo T, Bydder M, et al. In vivo characterization of the liver fat (1)H MR spectrum. NMR Biomed 2011;24(7):784–790.
  • 21. Bydder M, Yokoo T, Hamilton G, et al. Relaxation effects in the quantification of fat using gradient echo imaging. Magnetic resonance imaging 2008;26(3):347–359.
  • 22. Hernando D, Liang ZP, Kellman P. Chemical shift-based water/fat separation: a comparison of signal models. Magnetic resonance in medicine 2010;64(3):811–822.
  • 23. Horng DE, Hernando D, Reeder SB. Quantification of liver fat in the presence of iron overload. Journal of magnetic resonance imaging: JMRI 2017;45(2):428–439.
  • 24. Taylor BA, Hwang KP, Hazle JD, Stafford RJ. Autoregressive moving average modeling for spectral parameter estimation from a multigradient echo chemical shift acquisition. Medical physics 2009;36(3):753–764.
  • 25. Krafft AJ, Taylor BA, Lin H, Loeffler RB, Hillenbrand CM. A Systematic Evaluation of an Auto Regressive Moving Average (ARMA) Model for Fat-water Quantification and Simultaneous T2* Mapping. International Society of Magnetic Resonance in Medicine; Salt Lake City, Utah; 2013.
  • 26. Taylor BA, Loeffler RB, Song R, McCarville MB, Hankins JS, Hillenbrand CM. Simultaneous field and R2 mapping to quantify liver iron content using autoregressive moving average modeling. Journal of magnetic resonance imaging: JMRI 2012;35(5):1125–1132.
  • 27. Hines CD, Yu H, Shimakawa A, McKenzie CA, Brittain JH, Reeder SB. T1 independent, T2* corrected MRI with accurate spectral modeling for quantification of fat: validation in a fat-water-SPIO phantom. Journal of magnetic resonance imaging: JMRI 2009;30(5):1215–1222.
  • 28. Kleiner DE, Brunt EM, Van Natta M, et al. Design and validation of a histological scoring system for nonalcoholic fatty liver disease. Hepatology 2005;41(6):1313–1321.
  • 29. McCarville MB, Hillenbrand CM, Loeffler RB, et al. Comparison of whole liver and small region-of-interest measurements of MRI liver R2* in children with iron overload. Pediatric radiology 2010;40(8):1360–1367.
  • 30. Gudbjartsson H, Patz S. The Rician distribution of noisy MRI data. Magnetic resonance in medicine 1995;34(6):910–914.
  • 31. Hernando D, Kellman P, Haldar JP, Liang ZP. Robust water/fat separation in the presence of large field inhomogeneities using a graph cut algorithm. Magnetic resonance in medicine 2010;63(1):79–90.
  • 32. Hu HH, Bornert P, Hernando D, et al. ISMRM workshop on fat-water separation: insights, applications and progress in MRI. Magnetic resonance in medicine 2012;68(2):378–388.
  • 33. Hernando D. Limits of liver fat quantification in the presence of severe iron overload. ISMRM; 2014.
  • 34. Tang A, Tan J, Sun M, et al. Nonalcoholic fatty liver disease: MR imaging of liver proton density fat fraction to assess hepatic steatosis. Radiology 2013;267(2):422–431.
  • 35. Hernando D, Zhao R, Taviani V, et al. Liver R2* as a Biomarker of Liver Iron Concentration: Interim Results from a Multi-Center, Multi-Vendor Reproducibility Study at 1.5T and 3T. Proceedings 26th Scientific Meeting, International Society for Magnetic Resonance in Medicine; Paris, France; 2018.
  • 36. Liu CY, McKenzie CA, Yu H, Brittain JH, Reeder SB. Fat quantification with IDEAL gradient echo imaging: correction of bias from T(1) and noise. Magnetic resonance in medicine 2007;58(2):354–364.
  • 37. Banerjee R, Pavlides M, Tunnicliffe EM, et al. Multiparametric magnetic resonance for the non-invasive diagnosis of liver disease. Journal of hepatology 2014;60(1):69–77.
  • 38. Tipirneni-Sajja A, Kercher EM, Loeffler RB, et al. Does T1 Mapping Provide Additional Information in the Context of Hepatic Iron Overload? International Society of Magnetic Resonance in Medicine; Honolulu, Hawaii; 2017.
  • 39. Hernando D, Kramer JH, Reeder SB. Multipeak fat-corrected complex R2* relaxometry: theory, optimization, and clinical validation. Magnetic resonance in medicine 2013;70(5):1319–1331.
  • 40. St Pierre TG, Clark PR, Chua-anusorn W, et al. Noninvasive measurement and imaging of liver iron concentrations using proton magnetic resonance. Blood 2005;105(2):855–861.
