Abstract
Purpose:
To accelerate model-based iterative reconstruction (IR) methods for C-arm cone-beam CT (CBCT), thereby combining the benefits of improved image quality and/or reduced radiation dose with reconstruction times on the order of minutes rather than hours.
Methods:
The ordered-subsets, separable quadratic surrogates (OS-SQS) algorithm for solving the penalized-likelihood (PL) objective was modified to include Nesterov’s method, which utilizes “momentum” from image updates of previous iterations to better inform the current iteration and provide significantly faster convergence. Reconstruction performance was assessed with an anthropomorphic head phantom, first on a benchtop CBCT system and then on a mobile C-arm, which provided typical levels of incomplete data, including lateral truncation. Additionally, a cadaveric torso that presented realistic soft-tissue and bony anatomy was imaged on the C-arm, and different projectors were assessed for reconstruction speed.
Results:
Nesterov’s method provided equivalent image quality to OS-SQS while reducing the reconstruction time by an order of magnitude (10.0 ×) by reducing the number of iterations required for convergence. The faster projectors were shown to produce similar levels of convergence as more accurate projectors and reduced the reconstruction time by another 5.3 ×. Despite the slower convergence of IR with truncated C-arm CBCT, comparison of PL reconstruction methods implemented on graphics processing units showed that reconstruction time was reduced from 106 min for the conventional OS-SQS method to as little as 2.0 min with Nesterov’s method for a volumetric reconstruction of the head. In body imaging, reconstruction of the larger cadaveric torso was reduced from 159 min down to 3.3 min with Nesterov’s method.
Conclusions:
The acceleration achieved through Nesterov’s method combined with ordered subsets reduced IR times down to a few minutes. This improved compatibility with clinical workflow better enables broader adoption of IR in CBCT-guided procedures, with corresponding benefits in overcoming conventional limits of image quality at lower dose.
Keywords: cone-beam CT, iterative reconstruction, accelerated reconstruction, image-guided surgery, truncation, mobile C-arm
1. INTRODUCTION
Advances in model-based iterative reconstruction (IR) methods for x-ray CT and cone-beam CT (CBCT) imaging have led to numerous studies demonstrating the benefits of improved image quality and/or reduced radiation dose over conventional analytic reconstruction methods.1–6 However, the increased reconstruction time (up to several hours, even on commercial systems) is a major drawback that limits the use of IR in many applications, especially those that require timely images as a part of the clinical workflow.7–10 For example, potential applications of IR for CBCT include the use of C-arms in image-guided surgery for verifying device placement (e.g., pedicle screws in spine surgery) and providing high-quality, low-dose intraoperative checks against complications in healthy tissue (e.g., intracranial hemorrhage in neurosurgery).11–14 Such applications demand image reconstructions on the order of minutes rather than hours.
Further compounding the challenge, C-arm CBCT data are typically incomplete and missing information due to the failure to satisfy Tuy’s condition with a cone-beam acquisition and a circular orbit15–19 (which yields so-called cone-beam artifacts), longitudinal truncation (i.e., the long-object problem), an incomplete orbit (e.g., if 180°+ the fan angle is not acquired), sparse sampling (i.e., fewer projections acquired), and/or lateral truncation [due to the relatively small field of view (FOV)].20–24 Therefore, the problem is generally ill-conditioned, leading to slow convergence. In particular, for truncated acquisitions, expanding the reconstruction FOV beyond the C-arm imaging FOV to encompass the object support is important in IR methods to enforce consistency of the line integral of the reconstruction with the measurements, but convergence is slow in the expanded FOV due to the lack of data. Although image regularization can improve conditioning of the problem, truncated projections and other forms of incomplete data remain a challenge for CBCT IR convergence speed.
Accelerating IR is an active area of research and can be addressed using a number of possible solutions, including hardware and/or algorithmic improvements.25–29 A common acceleration technique divides the projections into ordered subsets (OS) to accelerate the reconstruction by a factor approximately equal to the number of subsets.28,30 The acceleration is achieved by using one subset of projections per subiteration to update the volume, which correspondingly reduces the number of forward and backprojections per update. For example, the penalized-likelihood (PL) reconstruction problem, which leverages a statistical model of the measured data combined with a priori assumptions on the image such as local smoothness, can be solved iteratively by combining OS with the separable quadratic surrogates (SQS) algorithm (previously referred to as separable paraboloidal surrogates, SPS),30 although other algorithms for maximum-likelihood (ML) and PL problems have been developed as well, including expectation maximization (EM), iterative coordinate descent, and grouped coordinate ascent.28,31–33 OS-SQS uses a highly parallelizable approach that can leverage advances in parallel computing on hardware such as general-purpose graphics processing units (GPUs). Even so, convergence can still be slow, typically requiring hundreds of iterations and upward of several hours to perform the reconstruction. This is in part due to conventional OS-SQS updating the image without any “memory” of previous updates. 
Therefore, if the algorithm can be modified to carry “momentum” from previous updates, it can better inform the current update and achieve faster convergence.34,35 Such a method, known as Nesterov acceleration or Nesterov’s method,34 was first applied to total-variation (TV)-based CT image reconstruction by Jørgensen et al.36 and Jensen et al.37 TV reconstruction is similar to PL in that accelerated convergence of the optimization problem is desired and the objective function is smooth, enabling the development of an accelerated method called unknown parameter Nesterov (UPN), which has also been applied to CBCT reconstruction by others.38,39 Kim et al. also recently demonstrated the use of Nesterov’s method,40,41 combining it with OS-SQS for penalized weighted least squares (PWLS) CT reconstructions. A number of alternative approaches have been investigated for accelerating PWLS reconstructions, including work by Ramani and Fessler29 that compared the fast iterative shrinkage-thresholding algorithm (FISTA)42 and split-Bregman43,44 methods with a splitting-based alternating direction method of multipliers (ADMM) algorithm accelerated by a preconditioning filter. Additionally, the linearized augmented Lagrangian method with ordered subsets (OS-LALM) was recently developed.45 Such methods could possibly be extended to PL reconstructions, but are not considered within the scope of this work. The work below applies Nesterov’s method with OS-SQS to acceleration of PL reconstruction similar to the approach by Kim et al.40,41 and applies the algorithm to C-arm CBCT in the context of image-guided surgery, with a particular emphasis on how acceleration can help to overcome convergence speed issues associated with truncated data. Performance is assessed relative to SQS in an anthropomorphic head phantom for truncated and untruncated data and demonstrated in a cadaveric torso emulating a scenario of CBCT-guided abdominal surgery.
2. METHODS
2.A. Statistical reconstruction algorithms
The PL framework30 enables statistical image reconstruction by first applying a basic Poisson statistics model to the data
$$y_i \sim \mathrm{Poisson}\left\{ b_i \exp(-l_i) \right\}, \tag{1}$$
where y are the noisy measured projection data, b are the number of incident photons, l = Aμ are the line integrals computed for system matrix A (forward-projection) and image volume μ, and i indexes the rays. PL then formulates the reconstruction as the solution to an optimization problem, where the objective Φ(μ; y) comprises the log-likelihood function of the data and image regularization by a roughness penalty R(μ) with strength β,
$$\hat{\mu} = \arg\max_{\mu} \Phi(\mu; y), \tag{2}$$
$$\Phi(\mu; y) = L(\mu; y) - \beta R(\mu). \tag{3}$$
Ignoring constant terms, the log-likelihood function is
$$L(\mu; y) = -\sum_i \left( b_i e^{-l_i} + y_i l_i \right), \tag{4}$$
and the image roughness penalty30 is calculated as
$$R(\mu) = \sum_j \sum_{k \in N_j} w_{jk}\, \psi(\mu_j - \mu_k), \tag{5}$$
where j indexes all voxels, k indexes the voxels in a neighborhood Nj about voxel j (first-order neighborhood in this work), wjk are the weights within a neighborhood (unity in this work), and ψ is the penalty function.
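As a rough illustration of the measurement model and objective defined above, the following sketch simulates Poisson data for a small dense system matrix and evaluates the PL objective. The toy problem size, the random matrix standing in for A, the 1D neighborhood in which each neighboring pair is counted once, and the quadratic ψ are all assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def simulate_measurements(A, mu, b, rng):
    """Draw y_i ~ Poisson(b * exp(-l_i)) with line integrals l = A @ mu."""
    return rng.poisson(b * np.exp(-(A @ mu))).astype(float)

def log_likelihood(mu, A, y, b):
    """Poisson log-likelihood with constant terms dropped."""
    l = A @ mu
    return -np.sum(b * np.exp(-l) + y * l)

def roughness(mu, psi):
    """First-order roughness of a 1D image with unit weights.
    Assumption: each neighboring pair is counted once."""
    return np.sum(psi(np.diff(mu)))

def pl_objective(mu, A, y, b, beta, psi):
    """Penalized-likelihood objective: log-likelihood minus beta * roughness."""
    return log_likelihood(mu, A, y, b) - beta * roughness(mu, psi)

# Hypothetical toy problem: 40 rays, 8 voxels, attenuation in mm^-1.
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(40, 8))
mu_true = rng.uniform(0.01, 0.03, size=8)
b = 8000.0
y = simulate_measurements(A, mu_true, b, rng)
quadratic = lambda x: 0.5 * x**2  # placeholder penalty function

print(pl_objective(mu_true, A, y, b, beta=200.0, psi=quadratic))
```

With noiseless data the log-likelihood is maximized at the true image, which is a convenient sanity check for any implementation of the likelihood term.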
The OS-SQS method30 is often employed to solve Eq. (2) and utilizes highly parallelizable voxel-wise operations on volumes. Additionally, the forward and backprojection operators are performed on the entire volume and subsets of projections, respectively, which are tasks well-suited for implementation on parallel hardware architectures such as GPUs. OS-SQS leverages a separable quadratic “surrogate” function that locally approximates the PL objective function, and the surrogate function is maximized during each update to increase the objective value of the reconstructed volume. The starting image μ(0) can be initialized by the filtered backprojection (FBP) reconstruction (although a common alternative is a zero image). When M subsets are used (providing acceleration by approximately a factor of M), the algorithm is run for N iterations as follows (Algorithm I, denoted SQS-M):
ALGORITHM I.
Initialize μ = μ(0)
Precompute γm = Am1, for m = 1, 2, 3, …, M
For n = 1, 2, 3, …, N
  For m = 1, 2, 3, …, M
    Compute the image update Δ [Eqs. (6)–(11)] and apply it to μ with the non-negativity threshold [Eq. (12)]
where Am and AmT are the forward and backprojection operators, respectively, for subset m; γm are the projections of a volume of all ones for the mth subset; Sm are the projections in subset m; i and j index detector pixels and volume voxels, respectively; $\dot{L}$ and d are the gradient and curvature of the likelihood surrogates, respectively; ⋅ denotes an element-wise product of two vectors; $\dot{\psi}$ and ωψ are the derivative and curvature of the penalty function surrogates, respectively; Δ is the image update; and $[\cdot]_+$ is a thresholding operation at 0 that enforces a non-negativity constraint. The reconstructed image after N iterations of M subsets is denoted as $\mu^{(N)}$. In this work, the edge-preserving Huber penalty was used, with
(13) |
(14) |
where δ is used to control the degree of edge preservation by controlling the width of the quadratic penalty region around x = 0.
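The OS-SQS loop described above can be sketched on a toy 1D problem as follows. Several choices here are assumptions rather than details taken from the text: the Huber normalization (x²/2δ inside the quadratic region), the use of the maximum curvature c = b for the likelihood surrogate, interleaved subsets, a 1D first-order neighborhood with each pair counted once, and a small dense matrix in place of the GPU projectors.

```python
import numpy as np

def huber(x, delta):
    """Huber penalty: quadratic for |x| <= delta, linear beyond (assumed normalization)."""
    ax = np.abs(x)
    return np.where(ax <= delta, x**2 / (2.0 * delta), ax - delta / 2.0)

def huber_dot(x, delta):
    """Derivative of the Huber penalty."""
    return np.clip(x / delta, -1.0, 1.0)

def huber_curv(x, delta):
    """Surrogate curvature, i.e., huber_dot(x) / x."""
    return 1.0 / np.maximum(np.abs(x), delta)

def objective(mu, A, y, b, beta, delta):
    l = A @ mu
    return -np.sum(b * np.exp(-l) + y * l) - beta * np.sum(huber(np.diff(mu), delta))

def image_update(mu, A_m, y_m, gamma_m, b, M, beta, delta):
    """Separable quadratic surrogate step for one subset of projections."""
    l = A_m @ mu
    h_dot = b * np.exp(-l) - y_m         # derivative of the marginal log-likelihoods
    c = np.full_like(l, b)               # maximum curvature for l >= 0 (assumed choice)
    L_dot = M * (A_m.T @ h_dot)          # likelihood surrogate gradient
    d = M * (A_m.T @ (gamma_m * c))      # likelihood surrogate curvature
    diff = np.diff(mu)
    p_dot, p_curv = huber_dot(diff, delta), huber_curv(diff, delta)
    grad_R = np.zeros_like(mu)
    curv_R = np.zeros_like(mu)
    grad_R[1:] += p_dot                  # contribution of the left neighbor
    grad_R[:-1] -= p_dot                 # contribution of the right neighbor
    curv_R[1:] += p_curv
    curv_R[:-1] += p_curv
    return (L_dot - beta * grad_R) / (d + 2.0 * beta * curv_R)

def os_sqs(A, y, b, M, N, beta, delta, mu0):
    subsets = [np.arange(m, A.shape[0], M) for m in range(M)]  # interleaved rays
    gammas = [A[s] @ np.ones(A.shape[1]) for s in subsets]     # projections of all ones
    mu = mu0.astype(float).copy()
    history = [objective(mu, A, y, b, beta, delta)]
    for _ in range(N):
        for s, gamma_m in zip(subsets, gammas):
            step = image_update(mu, A[s], y[s], gamma_m, b, M, beta, delta)
            mu = np.maximum(mu + step, 0.0)                    # non-negativity threshold
        history.append(objective(mu, A, y, b, beta, delta))
    return mu, history

rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(40, 8))
y = 8000.0 * np.exp(-(A @ rng.uniform(0.01, 0.03, size=8)))
for M in (1, 4):
    _, hist = os_sqs(A, y, 8000.0, M, 20, beta=1.0, delta=1e-3, mu0=np.zeros(8))
    print(M, hist[-1])
```

With a single subset and a curvature that majorizes the likelihood, each iteration should not decrease the objective; with M > 1 that guarantee is lost, which is the limit-cycle behavior discussed later in the paper.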
Despite the OS acceleration, SQS-M convergence can be very slow. Kim et al.40,41 demonstrated significant acceleration can be achieved by adapting Nesterov’s original method34 with a modified and improved set of momentum weights [Eq. (17), below]46 to accumulate momentum from image updates Δ [denoted in Refs. 40 and 41], where the improved weights provide faster convergence. The algorithm is run for N iterations and M subsets as follows (Algorithm II, denoted Nes-M): where z is the current image estimate; v is the cumulative momentum from all image updates; t is a scalar momentum weight that increases approximately linearly with each subiteration; and the image μ is now a state variable that linearly combines the current image estimate with a momentum-based image (the cumulative momentum added to the initial image). Note that after each iteration, the current image estimate is now z rather than μ, with this redefinition of μ enabling the same definition of image update Δ with relation to μ [Eqs. (6)–(11)].
ALGORITHM II.
The additional computational expense of Nesterov’s method is minimal: it requires just one additional volume v in memory storage (an implementation that eliminates the intermediate variable z), and the additional computation of v and μ are multiply-and-add, voxel-wise operations of volumes that can be performed in parallel. Because t ≥ 1 and is an increasing function (Fig. 1), the momentum-based term in Eq. (18) takes larger steps in the combined direction of the previous updates, but does so in a controlled manner due to the 1/t weight. The selection of momentum weights is key to the acceleration and stability of the method;34,46–48 for example, as a special case, if the weights were fixed at t = 1 for all subiterations, the Nes algorithm would be equivalent to the SQS algorithm. In the case of Nesterov’s method, the momentum weight t asymptotically approaches s/2, where s is the number of subiterations,
$$\lim_{s \to \infty} \frac{t}{s} = \frac{1}{2}. \tag{19}$$
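A minimal sketch of the momentum bookkeeping described above is given here. It assumes the weight recursion t ← (1 + √(1 + 4t²))/2 for Eq. (17) and an update order of z, v, t, μ within each subiteration; both follow the prose description and the style of Kim et al. rather than a listing in this text. To keep the example short, a unit-weight least-squares data term stands in for the PL likelihood, and the helper `make_least_squares_update` is hypothetical.

```python
import numpy as np

def momentum_weights(num_subiterations):
    """Weights t_1 = 1, t_{s+1} = (1 + sqrt(1 + 4 t_s^2)) / 2 (assumed recursion)."""
    t = [1.0]
    for _ in range(num_subiterations - 1):
        t.append(0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t[-1] ** 2)))
    return np.array(t)

def nesterov_os(image_update, mu0, M, N):
    """Ordered-subsets update with accumulated momentum.

    image_update(mu, m) must return the image update for subset m evaluated at mu.
    """
    mu = mu0.astype(float).copy()   # state variable at which updates are evaluated
    z = mu.copy()                   # current image estimate
    v = np.zeros_like(mu)           # cumulative momentum of all image updates
    t = 1.0                         # scalar momentum weight
    for _ in range(N):
        for m in range(M):
            step = image_update(mu, m)
            z = np.maximum(mu + step, 0.0)
            v = v + t * step
            t = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            # Combine the current estimate with the momentum-based image.
            mu = (1.0 - 1.0 / t) * z + (1.0 / t) * np.maximum(mu0 + v, 0.0)
    return z

def make_least_squares_update(A, ell, M):
    """Hypothetical stand-in data term: separable surrogate step for 0.5*||A mu - ell||^2."""
    subsets = [np.arange(m, A.shape[0], M) for m in range(M)]
    def image_update(mu, m):
        A_m = A[subsets[m]]
        gamma_m = A_m @ np.ones(A.shape[1])
        grad = M * (A_m.T @ (ell[subsets[m]] - A_m @ mu))
        curv = M * (A_m.T @ gamma_m)
        return grad / curv
    return image_update

t = momentum_weights(1000)
print(t[-1] / 1000.0)  # ratio t/s after 1000 subiterations

rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(40, 8))
ell = A @ rng.uniform(0.01, 0.03, size=8)
z = nesterov_os(make_least_squares_update(A, ell, 1), np.zeros(8), M=1, N=50)
print(np.linalg.norm(A @ z - ell))
```

Only one extra volume (v) is carried between subiterations, and the extra work per subiteration is a handful of voxel-wise multiply-and-add operations, in line with the cost argument above.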
2.B. Experimental setup
The performance of the SQS-M and Nes-M algorithms was compared using CBCT data acquired with an x-ray test bench and a prototype mobile C-arm capable of CBCT (modified PowerMobil, Siemens Healthcare). Studies employed an anthropomorphic head phantom containing a natural skeleton and simulated soft-tissue inserts as well as a cadaveric torso emulating an abdominal surgery scenario. The test bench incorporated a 43 × 43 cm² flat-panel detector with 0.278 × 0.278 mm² pixel size (PaxScan 4343CB, Varian Medical Systems, Palo Alto, CA) providing little or no lateral truncation of the head phantom. The C-arm employed a 30 × 30 cm² detector with 0.388 × 0.388 mm² pixel size (PaxScan 3030+, Varian Medical Systems, Palo Alto, CA) with realistic lateral truncation (Fig. 2). The acquisition technique and geometry of the test bench replicated those of the C-arm: 100 kVp tube voltage, 80 mAs total exposure (3.3 mGy head dose), 198 projections over a ∼180° orbit, 60 cm source-axis distance (SAD), and 120 cm source-detector distance (SDD). Additionally, a separate study emulating a scenario of CBCT-guided abdominal surgery was conducted on the C-arm using a fresh, unfixed cadaveric torso presenting realistic anatomical structures and an acquisition technique of 100 kVp, 120 mAs (3.1 mGy body dose).5
The SQS and Nes algorithms were implemented using custom CUDA libraries (Nvidia, Santa Clara, CA) to leverage the parallel computing capabilities of GPUs. Unless otherwise noted, the voxel-based separable footprints with trapezoidal basis functions (SF-TT) projector was used due to its greater accuracy49—the forward projector was previously defined as matrix operator A and the backprojector by AT. As described in Ref. 5, the original SF-TT approach was extended from a circular trajectory (five degrees of freedom, DOF) to handle an arbitrary 9-DOF geometry represented by projection matrices. The extension changed computation of the “amplitude” (i.e., height of the trapezoid function of the footprint of each voxel) to calculating the intersection length between the voxel and a ray connecting the source and the center of the voxel,50 followed by determining the detector pixels intersecting the trapezoidal footprint by projecting the eight vertices of the voxel to compute the trapezoid vertices. Each voxel was projected by a single thread on the GPU, enabling a highly parallelized implementation.
While the high-fidelity but relatively slow SF-TT projector was used by default, reconstructions with faster, less accurate projectors were also evaluated—in particular, the ray-driven Siddon forward projector51 and voxel-driven Peters backprojector,52 denoted SP. The use of mismatched forward and backprojectors (sometimes referred to as dual-matrix reconstruction) to reduce computation time has been previously proposed and empirically shown to accelerate the reconstruction process with little penalty on image quality, despite convergence no longer necessarily being guaranteed, even for SQS-1.53–55 We therefore assessed the impact of the faster SP projectors on both reconstruction time and convergence speed. The GPU-based implementation assigned each ray to a computational thread for the Siddon projector, while each voxel was assigned to a computational thread for the Peters backprojector, and both projectors were found to be substantially faster than the SF-TT projectors due to faster computation and efficient memory access. An additional modification was made to better match the voxel size (0.6 mm isotropic at isocenter with a magnification of ∼2.0) with the pixel size (0.388 mm at the detector): prior to each backprojection, the projection was convolved with a 3 × 3 averaging window to approximately match the pixel and (magnified) voxel size. A number of other projection methods have been developed, and the topic remains an active area of research.56–58 Comparison to other projectors, such as distance-driven,59 blobs,60 and B-splines,61 and their tradeoffs between computation time and accuracy for PL reconstruction can be considered in future work.
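The size of the smoothing window follows from the geometry quoted above: a 0.6 mm voxel magnified by about 2.0 spans roughly 1.2 mm at the detector, or about 3.1 pixels of 0.388 mm. A minimal sketch of the 3 × 3 averaging step applied to one projection before backprojection is shown below; the function name and the edge-replication boundary handling are assumptions, since the text does not specify how detector edges were treated.

```python
import numpy as np

def box3x3(projection):
    """Average each detector pixel with its 8 neighbors (edges replicated, an assumption)."""
    rows, cols = projection.shape
    padded = np.pad(projection, 1, mode="edge")
    out = np.zeros_like(projection, dtype=float)
    for dr in range(3):
        for dc in range(3):
            out += padded[dr:dr + rows, dc:dc + cols]
    return out / 9.0

# Ratio of magnified voxel size to detector pixel size, using values from the text.
print(0.6 * 2.0 / 0.388)

projection = np.zeros((5, 5))
projection[2, 2] = 9.0
print(box3x3(projection))
```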
Reconstruction parameters were set at b = 8000 quanta, β = 200 for the bench data and β = 80 for the C-arm data (to compensate for the smaller detector array, which results in a smaller log-likelihood magnitude [Eq. (4)] relative to the image roughness [Eq. (5)]), δ = 10⁻⁴ mm⁻¹, and 0.6 × 0.6 × 0.6 mm³ voxel size. The effect of M was quantified for integer divisors of the number of projections; thus, M ∈ {66, 33, …, 1}. Different reconstruction algorithms (e.g., Nes-11 vs SQS-1) were compared by assessing how many iterations nA were required for Algorithm A to achieve the same objective value as nB iterations of Algorithm B,
$$\Phi\left(\mu_A^{(n_A)}; y\right) = \Phi\left(\mu_B^{(n_B)}; y\right). \tag{20}$$
In this way, the acceleration factor (AF) of Algorithm A could be determined in relation to Algorithm B,
$$\mathrm{AF}_{A,B}(n_B) = \frac{n_B}{n_A}. \tag{21}$$
For example, the acceleration of SQS-M relative to SQS-1 is expected to produce the familiar AF ≈ M. The work below exclusively computes the AF for SQS-M and Nes-M (Algorithm A) relative to SQS-1 (Algorithm B), so the algorithm names are dropped from the AF notation for simplicity. Additionally, the root mean square difference (RMSD) between μ(n) and a “converged” reconstruction μ* was used to quantify image accuracy as a function of iteration. Conversion from reconstructed units of mm⁻¹ to Hounsfield units (HU) was approximated as 5 × 10⁴ HU/mm⁻¹.
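The two figures of merit can be computed from stored objective histories and images roughly as follows. The function names are hypothetical, and taking the first whole iteration that reaches the target objective (rather than interpolating between iterations) is an assumption of this sketch.

```python
import numpy as np

HU_PER_INV_MM = 5e4  # approximate conversion factor stated in the text

def iterations_to_reach(objective_history, target):
    """First iteration index whose objective value is at least `target`, or None."""
    hits = np.nonzero(np.asarray(objective_history) >= target)[0]
    return int(hits[0]) if hits.size else None

def acceleration_factor(hist_a, hist_b, n_b):
    """AF of Algorithm A relative to n_b iterations of Algorithm B: n_b / n_a."""
    n_a = iterations_to_reach(hist_a, hist_b[n_b])
    if n_a is None or n_a == 0:
        return None
    return n_b / n_a

def rmsd_hu(mu, mu_ref, mask=None):
    """Root mean square difference in HU, optionally restricted to a boolean mask."""
    diff = mu - mu_ref
    if mask is not None:
        diff = diff[mask]
    return HU_PER_INV_MM * np.sqrt(np.mean(diff**2))

# Synthetic objective histories: Algorithm A improves ten times faster per iteration.
hist_b = [-1.0 / (1 + n) for n in range(101)]
hist_a = [-1.0 / (1 + 10 * n) for n in range(101)]
print(acceleration_factor(hist_a, hist_b, n_b=50))
print(rmsd_hu(np.array([0.02, 0.02]), np.array([0.02, 0.0202])))
```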
3. RESULTS
3.A. Untruncated reconstructions of the head
Test bench images of the head phantom were reconstructed using the SQS-M algorithm run for 1000 iterations for each M, and the acceleration factors relative to SQS-1 demonstrated the speedups associated with SQS-M (Fig. 3). As expected, SQS-M achieved AF up to M (although the AF tends to fall off due to suboptimal limit cycles,28,62 as seen with SQS-66). The Nes-M algorithm was run for 100 iterations each and demonstrated much greater AF than SQS-M. For example, Nes-11 exhibited AF = 357 at 10⁴ equivalent SQS-1 iterations and the AF continued to monotonically increase with more iterations, suggesting a faster rate of convergence than SQS and increasingly more benefit from additional iterations. The AF also appeared to be proportional to M (for M ≤ 11). For M > 11, the reconstruction may not converge due to limit-cycle issues or instability, since momentum from each subset only contains information from a few projections, which may lead the reconstruction to false local optima.
A converged reference volume μ* was achieved with 3000 iterations of Nes-1 followed by 3000 iterations of convergent SQS-1 and was confirmed to have greater objective value than any of the other reconstructions. Although the Nes-1 iterations appeared to be convergent, the behavior of the Nes algorithm has not been thoroughly investigated. The 3000 iterations of Nes-1 produced an objective value that appeared to be within the numerical accuracy limits of the computations (performed in single-precision floating-point format), with numerical errors potentially accumulating in the momentum term. Therefore, the additional 3000 iterations of SQS-1 were performed to verify the Nes-1 solution and found to slightly increase the objective value. The RMSD before and after the additional SQS-1 iterations was only 0.0025 HU, suggesting that the Nes-1 solution was already very close to μ*. Alternatively, SQS-1 alone could have been run for tens of thousands of iterations to provide μ*, requiring an extremely long run time (weeks).
The convergence speed of SQS-M and Nes-M was assessed by the RMSD with μ* in a region encompassing soft-tissue simulating inserts (Fig. 4). For SQS reconstructions, SQS-66 most rapidly reduced RMSD in early iterations but quickly leveled out after achieving 2.0 HU accuracy in 140 iterations, while SQS-33 was capable of RMSD = 1.0 HU after 260 iterations. On the other hand, Nes-18 only required 15 iterations to achieve RMSD = 2.0 HU (9.3 × acceleration vs SQS-66) and continued to reduce the error, achieving RMSD = 1.0 HU in 28 iterations (9.3 × acceleration vs SQS-33). Although the difference images contain some dissimilarities due to the different algorithms and subsets, they have the same RMSD and illustrate the most challenging aspect of convergence in the central axial slice—high frequency structure at edges and residual streaks at the posterior of the skull arising from the incomplete orbit (180°). Both SQS-M and Nes-M are capable of reducing the residual errors and streak artifacts beyond what is shown in Fig. 4 through a combination of lower M and more iterations, although this comes at the expense of higher run times, as indicated by the RMSD plots. For SQS-M, this requires using fewer subsets, since SQS-66 reaches a limit-cycle and converges to a RMSD = 1.9 HU. On the other hand, more iterations could be applied to Nes-18 (eventually achieving RMSD = 0.9 HU), or Nes-11 could be used to achieve even lower RMSD.
3.B. Truncated reconstructions of the head
For the truncated C-arm projections, the AF followed a similar trend as the untruncated bench data, with Nes-11 providing a stable, monotonic increase in AF up to 345 × for 10⁴ equivalent SQS-1 iterations. However, analysis of RMSD illustrated the challenge of truncated projections, particularly due to missing data outside the C-arm FOV and the slow convergence in those regions (Fig. 5). Both SQS and Nes were unable to achieve RMSD as low as their counterparts in the untruncated data, in large part due to influence from the large errors outside the C-arm FOV (RMSD > 180 HU outside the C-arm FOV). The algorithms therefore require more iterations even for a higher RMSD than in untruncated data, with SQS-33 achieving 4.0 HU RMSD in 197 iterations and Nes-11 in 21 iterations. Even so, Nes-11 provided a 9.4 × reduction in the number of iterations required with SQS-33. Additionally, large M (e.g., SQS-66, Nes-18) resulted in unstable reconstructions and was susceptible to divergence from the optimal solution with too many iterations, possibly due to the missing data in truncated projections. For example, SQS-66 only utilizes three truncated projections per subset and errors from each update begin to accumulate, especially in regions of the reconstructed volume outside the C-arm FOV. Similarly, errors can accumulate in the cumulative momentum term for Nes-M; even moderate values of M for Nes exhibited some degree of divergent behavior. The stability of both SQS-M and Nes-M as a function of the number of subsets is the subject of ongoing work by others47 and can continue to be investigated further.
Reconstruction with the modified, faster SP projectors was also able to yield RMSD = 4.0 HU compared to the SF-TT-based μ* but required a greater number of iterations [32 iterations for Nes-11-SP, Fig. 5(f)]. It should be noted that Nes-11-SP barely achieved RMSD = 4.0 HU, so the stopping criterion is essential to any potential advantage of using the SP projectors. For example, if a lower RMSD were desired, fewer subsets would have to be used (e.g., Nes-6-SP), resulting in slower convergence speed. Much of the increased RMSD is attributable to inherent differences between the SP and SF-TT projectors, with a converged, SP-based μ* having a RMSD = 3.7 HU relative to the SF-TT-based μ*. Although SF-TT has been shown to be more accurate for projecting voxels,49 neither set of projectors produces the “true” image. Stability remains a concern for Nes-M-SP since the error begins increasing shortly after reaching a minimum. In the case of Nes-11-SP, the target RMSD was reached in 32 iterations and the minimum was soon reached at 38 iterations before increasing again, demonstrating the need for a carefully chosen stopping criterion, whereas Nes-11 reached the target RMSD in 21 iterations and continued to lower the error until RMSD = 1.3 HU at 70 iterations. Nonetheless, despite the mismatched projectors, the RMSD = 4.0 HU error level was achievable without introducing noticeable artifact due to the addition of the smoothing step (convolving the projection with a 3 × 3 averaging window) prior to backprojection as well as the image regularization innate to PL. Therefore, the SP projectors may be useful if the benefit of increased speed outweighs the cost of additional iterations—e.g., in near-real-time CBCT for image-guided surgery.
3.C. Reconstruction time with GPU implementation
Reconstruction times for C-arm CBCT (768 × 768 × 198 data) were measured for the full head volume (300 × 360 × 300 voxels) on a PC workstation with a single GPU (GeForce GTX Titan Black, Nvidia, Santa Clara, CA). The vast majority of the reconstruction time is spent in the forward and backprojection operations (Fig. 6). Relative to the SQS-11 time per iteration, SQS-33 added an additional 7.94% computational time cost (primarily due to regularizing and updating the volume for each subset), while Nes-11 only added 1.34% cost [Fig. 6(a)]. Conversely, the faster SP projectors dramatically reduced the time per iteration by almost a factor of 8 since they are particularly well-suited for efficient parallel implementation. When the time per iteration is multiplied by the number of iterations required, a RMSD = 4 HU could be accomplished in ∼11 min for Nes-11 SF-TT (cf. 106 min for SQS-33), while the faster SP projectors allowed reconstruction in just over 2 min (121 s). Therefore, Nesterov’s method (Nes-11) alone reduced reconstruction time by 10.0 × over SQS (SQS-33), and faster projectors enabled an additional speedup of 5.3 × over SF-TT with the same RMSD.
3.D. Truncated reconstructions of the abdomen
The same analysis of SQS and Nes algorithms was performed in reconstructions of fully truncated C-arm projections of a cadaveric torso (Fig. 7). Because of the larger object size, the reconstructed volume was increased to 500 × 350 × 330 voxels and a balance between reconstruction time and RMSD was found with SQS-33 providing 6.0 HU RMSD in 205 iterations (9522 s = 2.6 h), while Nes-11-SP was able to do so in only 38 iterations (197 s = 3.3 min). Compared to the head reconstruction, the abdomen reconstruction required more iterations even for a higher RMSD due to the greater degree of truncation and missing data. However, the acceleration of Nes-M-SP relative to SQS-M was just as pronounced, with a 48.3 × reduction in reconstruction time, demonstrating the applicability of the algorithm to objects with even more severe truncation.
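The overall speedups can be checked directly from the timings reported in Secs. 3.C and 3.D. The short calculation below uses only numbers quoted in the text; the variable names are illustrative.

```python
# Reported reconstruction times (seconds) for the same target RMSD.
head_sqs33_s = 106 * 60      # head, SQS-33 with SF-TT projectors (106 min)
head_nes11_sp_s = 121        # head, Nes-11 with SP projectors
torso_sqs33_s = 9522         # torso, SQS-33 (205 iterations)
torso_nes11_sp_s = 197       # torso, Nes-11-SP (38 iterations)

head_speedup = head_sqs33_s / head_nes11_sp_s
torso_speedup = torso_sqs33_s / torso_nes11_sp_s

print(f"head: {head_speedup:.1f}x, torso: {torso_speedup:.1f}x")
print(f"torso time per iteration: {torso_sqs33_s / 205:.1f} s vs {torso_nes11_sp_s / 38:.1f} s")
```

The head value of roughly 53 × agrees with the product of the two separately reported factors (10.0 × from Nesterov's method and 5.3 × from the faster projectors), and the torso value reproduces the 48.3 × quoted above.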
4. DISCUSSION AND CONCLUSIONS
Nesterov’s method offers dramatic reduction in reconstruction time by accelerating convergence of the conventional ordered-subsets SQS algorithm by an order of magnitude (∼10 ×) for typical reconstructions. With faster Siddon-Peters type projectors, a GPU implementation of Nesterov-accelerated SQS was capable of providing a volumetric reconstruction of the head in ∼2 min, despite the challenges of fully truncated C-arm CBCT projections and/or other forms of incomplete data that lead to ill-conditioning and slower convergence. Of course, conventional FBP reconstruction is still faster than iterative reconstruction since by definition it requires only a single backprojection (cf. multiple forward/backprojections for IR). For example, FBP reconstruction of the head volume only required 19.3 s [18.0 s for filtering (implemented on GPU, but not yet fully optimized for run time) and 1.3 s for backprojection]. Nonetheless, the reconstruction speeds accomplished by Nesterov acceleration further facilitate incorporation of IR methods in image-guided interventions, with corresponding benefits to image quality and reduced radiation dose.
Ongoing work includes integration of other methods for addressing lateral truncation, e.g., a fit of projection data to an elliptical model of the volume.63,64 Using coarser voxels outside the C-arm FOV (i.e., a multiresolution volume) could provide further acceleration, since accuracy outside the C-arm FOV is not as critical despite comprising up to 68% of the total voxels in the reconstructed FOV. Simultaneous use of multiple GPUs has been investigated to reduce forward and backprojection time for the SF-TT projector by distributing the projections within each subset among the GPUs, whereas the already fast SP projectors were unable to take advantage of multiple GPUs due to the overhead cost of transferring data between GPUs. For example, a workstation with three GTX Titan GPUs reduced the time per iteration of the head volume to 15.52 s [2.1 × reduction, cf. Fig. 6(a)] for Nes-11, but increased the time to 6.84 s (1.8 × increase) for Nes-11-SP. Further acceleration is also possible by using precomputed likelihood curvatures to eliminate one backprojection per iteration,30 applying spatially nonuniform updates,65 and applying different momentum techniques that increase stability or optimize convergence speed.41,47,48 The impact of these modifications on convergence properties deserves further study, including whether the precomputed likelihood curvatures converge at the same rate and whether the curvatures should be updated every few iterations. Future work includes determining a method for selecting M a priori to maximize acceleration and minimize instability for a given level of RMSD as well as incorporation of a convergence criterion for terminating the reconstruction at an appropriate number of iterations. Additionally, comparison to other Nesterov-accelerated methods, such as the TV-based UPN method36,37 applied to accelerated barrier optimization compressed sensing (ABOCS),39 could be made.
Future work includes comparison of the growing number of such momentum-based methods, not only in terms of reconstruction performance but also in obtaining the converged reference, which requires numerical stability and can benefit from a better understanding of convergence behavior. In conclusion, the increased compatibility of an accelerated reconstruction with the clinical workflow has the potential to increase adoption and the routine use of IR methods for C-arm CBCT.
ACKNOWLEDGMENTS
This research is supported by a 2013 AAPM Research Seed Funding grant, NIH fellowship F32EB017571, and academic-industry partnership with Siemens Healthcare (XP Division, Erlangen, Germany). The authors would like to thank Ronn Wade (University of Maryland Anatomy Board) for assistance with cadaver specimens and Joshua Levy (The Phantom Laboratory, Greenwich, NY) for assistance with phantom development and construction.
REFERENCES
- 1. Hara A. K., Paden R. G., Silva A. C., Kujak J. L., Lawder H. J., and Pavlicek W., “Iterative reconstruction technique for reducing body radiation dose at CT: Feasibility study,” Am. J. Roentgenol. 193(3), 764–771 (2009). doi:10.2214/ajr.09.2397
- 2. Sagara Y., Hara A. K., Pavlicek W., Silva A. C., Paden R. G., and Wu Q., “Abdominal CT: Comparison of low-dose CT with adaptive statistical iterative reconstruction and routine-dose CT with filtered back projection in 53 patients,” Am. J. Roentgenol. 195(3), 713–719 (2010). doi:10.2214/ajr.09.2989
- 3. Silva A. C., Lawder H. J., Hara A., Kujak J., and Pavlicek W., “Innovations in CT dose reduction strategy: Application of the adaptive statistical iterative reconstruction algorithm,” Am. J. Roentgenol. 194(1), 191–199 (2010). doi:10.2214/ajr.09.2953
- 4. Marin D. et al., “Low-tube-voltage, high-tube-current multidetector abdominal CT: Improved image quality and decreased radiation dose with adaptive statistical iterative reconstruction algorithm–Initial clinical experience,” Radiology 254(1), 145–153 (2010). doi:10.1148/radiol.09090094
- 5. Wang A. S. et al., “Soft-tissue imaging with C-arm cone-beam CT using statistical reconstruction,” Phys. Med. Biol. 59(4), 1005–1029 (2014). doi:10.1088/0031-9155/59/4/1005
- 6. Yamada Y. et al., “Dose reduction in chest CT: Comparison of the adaptive iterative dose reduction 3D, adaptive iterative dose reduction, and filtered back projection reconstruction techniques,” Eur. J. Radiol. 81(12), 4185–4195 (2012). doi:10.1016/j.ejrad.2012.07.013
- 7. Pickhardt P. J. et al., “Abdominal CT with model-based iterative reconstruction (MBIR): Initial results of a prospective trial comparing ultralow-dose with standard-dose imaging,” Am. J. Roentgenol. 199(6), 1266–1274 (2012). doi:10.2214/ajr.12.9382
- 8. Miéville F. A., Gudinchet F., Brunelle F., Bochud F. O., and Verdun F. R., “Iterative reconstruction methods in two different MDCT scanners: Physical metrics and 4-alternative forced-choice detectability experiments–A phantom approach,” Phys. Med. 29(1), 99–110 (2013). doi:10.1016/j.ejmp.2011.12.004
- 9. Nelson R. C., Feuerlein S., and Boll D. T., “New iterative reconstruction techniques for cardiovascular computed tomography: How do they work, and what are the advantages and disadvantages?,” J. Cardiovasc. Comput. Tomogr. 5(5), 286–292 (2011). doi:10.1016/j.jcct.2011.07.001
- 10. Miéville F. A. et al., “Model-based iterative reconstruction in pediatric chest CT: Assessment of image quality in a prospective study of children with cystic fibrosis,” Pediatr. Radiol. 43(5), 558–567 (2013). doi:10.1007/s00247-012-2554-4
- 11. Siewerdsen J. H. et al., “Volume CT with a flat-panel detector on a mobile, isocentric C-arm: Pre-clinical investigation in guidance of minimally invasive surgery,” Med. Phys. 32(1), 241–254 (2005). doi:10.1118/1.1836331
- 12. Qureshi A. I., Mendelow A. D., and Hanley D. F., “Intracerebral haemorrhage,” Lancet 373(9675), 1632–1644 (2009). doi:10.1016/s0140-6736(09)60371-8
- 13. Schafer S. et al., “Mobile C-arm cone-beam CT for guidance of spine surgery: Image quality, radiation dose, and integration with interventional guidance,” Med. Phys. 38, 4563–4574 (2011). doi:10.1118/1.3597566
- 14. Zrinzo L., Foltynie T., Limousin P., and Hariz M. I., “Reducing hemorrhagic complications in functional neurosurgery: A large case series and systematic literature review,” J. Neurosurg. 116(1), 84–94 (2012). doi:10.3171/2011.8.jns101407
- 15. Tuy H. K., “An inversion formula for cone-beam reconstruction,” SIAM J. Appl. Math. 43(3), 546–552 (1983). doi:10.1137/0143035
- 16. Tang X., Hsieh J., Hagiwara A., Nilsen R. A., Thibault J.-B., and Drapkin E., “A three-dimensional weighted cone beam filtered backprojection (CB-FBP) algorithm for image reconstruction in volumetric CT under a circular source trajectory,” Phys. Med. Biol. 50(16), 3889–3905 (2005). doi:10.1088/0031-9155/50/16/016
- 17. Sidky E. Y. and Pan X., “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Phys. Med. Biol. 53(17), 4777–4807 (2008). doi:10.1088/0031-9155/53/17/021
- 18. Zhuang T., Zambelli J., Nett B., Leng S., and Chen G.-H., “Exact and approximate cone-beam reconstruction algorithms for C-arm based cone-beam CT using a two-concentric-arc source trajectory,” Proc. SPIE 6913, 691321 (2008). doi:10.1117/12.772390
- 19. Bartolac S., Clackdoyle R., Noo F., Siewerdsen J., Moseley D., and Jaffray D., “A local shift-variant Fourier model and experimental validation of circular cone-beam computed tomography artifacts,” Med. Phys. 36(2), 500–512 (2009). doi:10.1118/1.3062875
- 20. Defrise M., Noo F., and Kudo H., “A solution to the long-object problem in helical cone-beam tomography,” Phys. Med. Biol. 45, 623–643 (2000). doi:10.1088/0031-9155/45/3/305
- 21. Zou Y. and Pan X., “Exact image reconstruction on PI-lines from minimum data in helical cone-beam CT,” Phys. Med. Biol. 49(6), 941–959 (2004). doi:10.1088/0031-9155/49/6/006
- 22. Courdurier M., Noo F., Defrise M., and Kudo H., “Solving the interior problem of computed tomography using a priori knowledge,” Inverse Probl. 24(6), 065001 (2008). doi:10.1088/0266-5611/24/6/065001
- 23. Kudo H., Courdurier M., Noo F., and Defrise M., “Tiny a priori knowledge solves the interior problem in computed tomography,” Phys. Med. Biol. 53(9), 2207–2231 (2008). doi:10.1088/0031-9155/53/9/001
- 24. Bian J. et al., “Evaluation of sparse-view reconstruction from flat-panel-detector cone-beam CT,” Phys. Med. Biol. 55(22), 6575–6599 (2010). doi:10.1088/0031-9155/55/22/001
- 25. Xu F. and Mueller K., “Accelerating popular tomographic reconstruction algorithms on commodity PC graphics hardware,” IEEE Trans. Nucl. Sci. 52(3), 654–663 (2005). doi:10.1109/tns.2005.851398
- 26. Kole J. S. and Beekman F. J., “Evaluation of accelerated iterative x-ray CT image reconstruction using floating point graphics hardware,” Phys. Med. Biol. 51(4), 875–889 (2006). doi:10.1088/0031-9155/51/4/008
- 27. Jia X., Dong B., Lou Y., and Jiang S. B., “GPU-based iterative cone-beam CT reconstruction using tight frame regularization,” Phys. Med. Biol. 56(13), 3787–3807 (2011). doi:10.1088/0031-9155/56/13/004
- 28. Hudson H. M. and Larkin R. S., “Accelerated image reconstruction using ordered subsets of projection data,” IEEE Trans. Med. Imaging 13(4), 601–609 (1994). doi:10.1109/42.363108
- 29. Ramani S. and Fessler J. A., “A splitting-based iterative algorithm for accelerated statistical x-ray CT reconstruction,” IEEE Trans. Med. Imaging 31(3), 677–688 (2012). doi:10.1109/tmi.2011.2175233
- 30. Erdogan H. and Fessler J. A., “Ordered subsets algorithms for transmission tomography,” Phys. Med. Biol. 44(11), 2835–2851 (1999). doi:10.1088/0031-9155/44/11/311
- 31. Lange K. and Carson R., “EM reconstruction algorithms for emission and transmission tomography,” J. Comput. Assist. Tomogr. 8(2), 306–316 (1984).
- 32. Bouman C. and Sauer K., “A unified approach to statistical tomography using coordinate descent optimization,” IEEE Trans. Image Process. 5(3), 480–492 (1996). doi:10.1109/83.491321
- 33. Fessler J. A., Ficaro E. P., Clinthorne N. H., and Lange K., “Grouped-coordinate ascent algorithms for penalized-likelihood transmission image reconstruction,” IEEE Trans. Med. Imaging 16(2), 166–175 (1997). doi:10.1109/42.563662
- 34. Nesterov Y., “Smooth minimization of non-smooth functions,” Math. Program. 103, 127–152 (2005). doi:10.1007/s10107-004-0552-5
- 35. Nesterov Y., Introductory Lectures on Convex Optimization: A Basic Course (Kluwer Academic Publishers, Norwell, Massachusetts, 2004).
- 36. Jørgensen J. et al., “Accelerated gradient methods for total-variation-based CT image reconstruction,” in 11th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully 3D, Potsdam, Germany, 2011), pp. 435–438.
- 37. Jensen T. L., Jørgensen J. H., Hansen P. C., and Jensen S. H., “Implementation of an optimal first-order method for strongly convex total variation regularization,” BIT Numer. Math. 52(2), 329–356 (2011). doi:10.1007/s10543-011-0359-8
- 38. Li T., Li X., Yang Y., Zhang Y., Heron D. E., and Huq M. S., “Simultaneous reduction of radiation dose and scatter for CBCT by using collimators,” Med. Phys. 40(12), 121913 (10pp.) (2013). doi:10.1118/1.4831970
- 39. Niu T., Ye X., Fruhauf Q., Petrongolo M., and Zhu L., “Accelerated barrier optimization compressed sensing (ABOCS) for CT reconstruction with improved convergence,” Phys. Med. Biol. 59(7), 1801–1814 (2014). doi:10.1088/0031-9155/59/7/1801
- 40. Kim D., Ramani S., and Fessler J. A., “Accelerating x-ray CT ordered subsets image reconstruction with Nesterov’s first-order methods,” in 12th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully 3D, Lake Tahoe, CA, 2013), pp. 22–25.
- 41. Kim D., Ramani S., and Fessler J. A., “Combining ordered subsets and momentum for accelerated x-ray CT image reconstruction,” IEEE Trans. Med. Imaging 34(1), 167–178 (2015). doi:10.1109/tmi.2014.2350962
- 42. Beck A. and Teboulle M., “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci. 2(1), 183–202 (2009). doi:10.1137/080716542
- 43. Goldstein T. and Osher S., “The split Bregman method for L1-regularized problems,” SIAM J. Imaging Sci. 2(2), 323–343 (2009). doi:10.1137/080725891
- 44. Vandeghinste B. et al., “Split-Bregman-based sparse-view CT reconstruction,” in 11th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine (Fully 3D, Potsdam, Germany, 2011), pp. 431–434.
- 45. Nien H. and Fessler J. A., “Fast splitting-based ordered-subsets x-ray CT image reconstruction,” in The Third International Conference on Image Formation in X-ray Computed Tomography (CT Meeting, Salt Lake City, UT, 2014), pp. 291–294.
- 46. Tseng P., “Approximation accuracy gradient methods and error bound for structured convex optimization,” Math. Program. 125, 263–295 (2010). doi:10.1007/s10107-010-0394-2
- 47. Kim D. and Fessler J. A., “Ordered subsets acceleration using relaxed momentum for x-ray CT image reconstruction,” in IEEE Nuclear Science Symposium and Medical Imaging Conference (IEEE, New York, NY, 2013), pp. 1–5.
- 48. Kim D. and Fessler J. A., “Optimized momentum steps for accelerating x-ray CT ordered subsets image reconstruction,” in Third International Conference on Image Formation in X-ray Computed Tomography (CT Meeting, Salt Lake City, UT, 2014), pp. 103–106.
- 49. Long Y., Fessler J. A., and Balter J. M., “3D forward and back-projection for x-ray CT using separable footprints,” IEEE Trans. Med. Imaging 29(11), 1839–1850 (2010). doi:10.1109/tmi.2010.2050898
- 50. Kay T. L. and Kajiya J. T., “Ray tracing complex scenes,” in ACM SIGGRAPH Computer Graphics (ACM, New York, NY, 1986), pp. 269–278.
- 51. Siddon R. L., “Prism representation: A 3D ray-tracing algorithm for radiotherapy applications,” Phys. Med. Biol. 30(8), 817–824 (1985). doi:10.1088/0031-9155/30/8/005
- 52. Peters T. M., “Algorithms for fast back- and re-projection in computed tomography,” IEEE Trans. Nucl. Sci. 28(4), 3641–3647 (1981). doi:10.1109/tns.1981.4331812
- 53. Kamphuis C. and Beekman F., “Dual matrix ordered subsets reconstruction for accelerated 3D scatter compensation in single-photon emission tomography,” Eur. J. Nucl. Med. 25(1), 8–18 (1998). doi:10.1007/s002590050188
- 54. Glick S. J. and Soares E. J., “Noise characteristics of SPECT iterative reconstruction with a mis-matched projector-backprojector pair,” IEEE Trans. Nucl. Sci. 45(4), 2183–2188 (1998). doi:10.1109/23.708339
- 55. Zeng G. and Gullberg G., “Unmatched projector/backprojector pairs in an iterative reconstruction algorithm,” IEEE Trans. Med. Imaging 19(5), 548–555 (2000). doi:10.1109/42.870265
- 56. Momey F., Denis L., Mennessier C., Thiebaut E., Becker J., and Desbat L., “A B-spline based and computationally performant projector for iterative reconstruction in tomography: Application to dynamic x-ray gated CT,” in Second International Conference on Image Formation in X-ray Computed Tomography (CT Meeting, Salt Lake City, UT, 2012), pp. 157–160.
- 57. Schmitt K., Schondube H., Stierstorfer K., Hornegger J., and Noo F., “Analysis of bias induced by various forward projection models in iterative reconstruction,” in Second International Conference on Image Formation in X-ray Computed Tomography (CT Meeting, Salt Lake City, UT, 2012), pp. 288–292.
- 58. Schmitt K., Schondube H., Stierstorfer K., Hornegger J., and Noo F., “Task-based comparison of linear forward projection models in iterative CT reconstruction,” in Third International Conference on Image Formation in X-ray Computed Tomography (CT Meeting, Salt Lake City, UT, 2014), pp. 56–59.
- 59. De Man B. and Basu S., “Distance-driven projection and backprojection in three dimensions,” Phys. Med. Biol. 49(11), 2463–2475 (2004). doi:10.1088/0031-9155/49/11/024
- 60. Ziegler A., Kohler T., Nielsen T., and Proksa R., “Efficient projection and backprojection scheme for spherically symmetric basis functions in divergent beam geometry,” Med. Phys. 33(12), 4653–4663 (2006). doi:10.1118/1.2388570
- 61. Horbelt S., Liebling M., and Unser M., “Discretization of the Radon transform and of its inverse by spline convolutions,” IEEE Trans. Med. Imaging 21(4), 363–376 (2002). doi:10.1109/tmi.2002.1000260
- 62. Beekman F. J. and Kamphuis C., “Ordered subset reconstruction for x-ray CT,” Phys. Med. Biol. 46(7), 1835–1844 (2001). doi:10.1088/0031-9155/46/7/307
- 63. Kolditz D., Meyer M., Kyriakou Y., and Kalender W. A., “Comparison of extended field-of-view reconstructions in C-arm flat-detector CT using patient size, shape or attenuation information,” Phys. Med. Biol. 56(1), 39–56 (2011). doi:10.1088/0031-9155/56/1/003
- 64. Lauzier P. T., Tang J., and Chen G.-H., “Time-resolved cardiac interventional cone-beam CT reconstruction from fully truncated projections using the prior image constrained compressed sensing (PICCS) algorithm,” Phys. Med. Biol. 57(9), 2461–2476 (2012). doi:10.1088/0031-9155/57/9/2461
- 65. Kim D., Pal D., Thibault J., and Fessler J. A., “Accelerating ordered subsets image reconstruction for x-ray CT using spatially nonuniform optimization transfer,” IEEE Trans. Med. Imaging 32(11), 1965–1978 (2013). doi:10.1109/tmi.2013.2266898