Evaluation of Fluence Reduction versus Sparsity for Diffusion Posterior Sampling Reconstruction in Low-Dose CT

Zimo Liu; Xin Wang; Xiao Jiang; Altea Lorenzon; Grace J Gang; J Webster Stayman

doi:10.1117/12.3087836

Author manuscript; available in PMC: 2026 Apr 21.

Published in final edited form as: Proc SPIE Int Soc Opt Eng. 2026 Apr 2;13924:139241J. doi: 10.1117/12.3087836

PMCID: PMC13095152 NIHMSID: NIHMS2165009 PMID: 42017038

Abstract

Low-dose computed tomography (CT) remains an active research topic, with a growing number of algorithmic solutions for controlling noise. One such approach is Diffusion Posterior Sampling (DPS), which enforces data consistency through a model-based data likelihood term while also including a deep learning generative prior. This technique is formulated within a probabilistic framework and is capable of generating high-quality reconstructions under noisy and/or undersampled conditions. However, one major unanswered question is, given the opportunity to design a low-dose protocol, how low dose should be achieved: through sparse sampling or through reduced fluence per projection. In this work, we conducted a simulation study and systematically investigated the impact of two acquisition parameters, the number of views $(n_{view})$ and the incident photons per view $(I_{0})$, on DPS-based CT reconstruction. We performed a 2D sweep over combinations of $n_{view}$ and $I_{0}$ and compared reconstructions with equivalent total incident photons (TIP). Reconstruction quality was evaluated in terms of peak signal-to-noise ratio (PSNR), bias, and posterior sample variability. We found that the number of views had a strong influence on image quality and that most performance curves showed a transition below which too few views had a large negative impact on performance. We observed that there is an advantage to be gained by jointly optimizing both the fluence per view and the number of views, with a trend toward more views being required at higher total incident fluence. These findings provide a strategy for optimizing CT acquisition protocols that adapt both fluence per view and sparsity to maintain image quality at reduced radiation doses.

1. INTRODUCTION

Computed tomography (CT) plays an important role in clinical diagnostics and research, but the trade-off between dose and image quality remains a challenge.¹ Generally speaking, low-dose CT can be achieved either by lowering the number of projections or by lowering the x-ray tube fluence per projection (e.g. by reducing the tube current).² While most x-ray CT devices allow for customization of the x-ray technique for different tube currents, the ability to perform sparse data acquisition is more common on cone-beam CT (CBCT) systems that are in use in dedicated scanners (breast, extremities, head), in on-board imaging for radiation therapy, and interventional imaging devices (C-arm systems, etc.). The ability to perform sparse acquisitions on traditional multi-row diagnostic CT remains largely a research topic. However, it is possible that new reconstruction algorithms would favor sparse data acquisition in terms of performance, and that could drive clinical access to such protocols.

Traditional model-based iterative reconstruction (MBIR), with its ability to weigh statistical data fidelity of different measurements and integrate sophisticated measurement models, has demonstrated significant improvements over analytic methods.³ However, such methods are often limited by the hand-crafted image priors that are adopted (e.g. simple roughness penalties or related priors based on Markov random fields). Deep learning approaches have found widespread interest with their ability to capture sophisticated prior information about human anatomy - resulting in impressive imaging performance improvements in low-dose CT data. Unfortunately, many of those deep learning approaches do not enforce consistency with the measurement data, raising concerns about interpretability, potential for hallucination, etc. Recently, diffusion posterior sampling (DPS) CT reconstruction⁴ has been introduced that (through a one-time training) combines a generative prior of human anatomy with a classic likelihood measurement model. This approach has the potential to deliver the advantages of both MBIR and deep learning. In this work, we performed a simulation study to explore the relationship between number of views $(n_{view})$ and incident photons per view $(I_{0})$ in DPS reconstruction. We considered the relative performance of different protocols with the same total number of incident photons (TIP), which is used as a surrogate for dose.

2. METHODS

2.1. Diffusion posterior sampling (DPS) with a nonlinear physical model

The DPS approach requires a forward model for the mean measurements acquired by the CT system. We adopt a typical monoenergetic model⁵ based on Beer’s law:

$$\bar{y} = B \exp\{- A x\}, \tag{1}$$

where $x$ denotes the attenuation volume representing the object, $B$ incorporates scan-related factors including photon fluence, detector gain, etc., and $A$ is the system matrix representing the linear projection operation. We focus on protocols with varied fluence and sparsity, which are modeled in $B = I_{0}$ (uniform bare-beam fluence with magnitude $I_{0}$) and in $A$, where a variable number of projection views are acquired. All projection and backprojection operations are computed using the CTorch toolbox,⁶ a PyTorch-compatible, GPU-accelerated library.
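As a minimal sketch of the forward model in (1), the following pure-Python example evaluates the mean measurements for a toy system. The 3 × 2 matrix `A`, the attenuation values in `x`, and the helper name `mean_measurements` are illustrative assumptions; the study itself uses CTorch projectors, which are not reproduced here.

```python
import math

def mean_measurements(A, x, I0):
    """Mean measurements y_bar = I0 * exp(-A x) for a uniform bare-beam fluence I0."""
    line_integrals = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]
    return [I0 * math.exp(-l) for l in line_integrals]

# Toy system (assumed values): 3 rays through 2 pixels, path lengths in mm.
A = [[100.0, 0.0],
     [0.0, 100.0],
     [100.0, 100.0]]
x = [0.02, 0.01]   # attenuation in mm^-1 (0.02 mm^-1 is the water value used in the paper)
I0 = 150000        # largest fluence per view listed in Table 1

y_bar = mean_measurements(A, x, I0)
for row, y in zip(A, y_bar):
    print(f"path lengths {row} mm -> mean counts {y:.1f}")
```

Rays with larger line integrals return fewer mean counts, which is why the same $I_{0}$ yields noisier measurements through thicker or denser anatomy.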

We summarize the DPS framework for the nonlinear forward model above, as proposed in Ref.⁷ Construction of the anatomical prior is based on a forward stochastic process that corrupts a clean image $x_{0}$ into a noisy version $x_{t}$ by adding Gaussian noise:

$$x_{t} = \sqrt{\overline{α}_{t}} \, x_{0} + \sqrt{1 - \overline{α}_{t}} \, ϵ, \quad ϵ \sim 𝒩(0, I), \tag{2}$$

where $α_{t} = 1 - β_{t}, {\overline{α}}_{t} = \prod_{s = 1}^{t} α_{s}$ , and $β_{t}$ is a predefined noise addition schedule. At each step of the reverse process, a trained DDPM⁸ predicts the noise $ϵ_{θ} (x_{t}, t)$ , from which an estimate of the clean image is obtained as

$$\hat{x}_{0}(x_{t}, t) = \frac{x_{t} - \sqrt{1 - \overline{α}_{t}} \, ϵ_{θ}(x_{t}, t)}{\sqrt{\overline{α}_{t}}}. \tag{3}$$

This estimate ${\hat{x}}_{0}$ is then combined with the current noisy sample $x_{t}$ to generate the next state $x_{t - 1}^{'}$ :

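$$x_{t - 1}^{'} = \frac{\sqrt{α_{t}} \, (1 - \overline{α}_{t - 1})}{1 - \overline{α}_{t}} x_{t} + \frac{\sqrt{\overline{α}_{t - 1}} \, β_{t}}{1 - \overline{α}_{t}} \hat{x}_{0} + σ_{t} z, \tag{4}$$

where $z \sim 𝒩(0, I)$ and $σ_{t}$ is the standard deviation of the noise added in the reverse step.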
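The steps in (2) to (4) can be sketched in pure Python as follows. This is a toy illustration under stated assumptions: the true noise stands in for the trained network output $ϵ_{θ}$, the reverse-step noise level is taken as $σ_{t}^{2} = β_{t}$ (one of the choices in Ref. 8), the coefficients are written in the standard DDPM posterior-mean form of Ref. 8, and the four-element "image" and all function names are hypothetical. The linear schedule uses the values given in Section 2.2.

```python
import math
import random

T = 1000
# Linear variance schedule: beta_1 = 1e-4 ... beta_T = 0.02 (index 0 is a placeholder).
beta = [0.0] + [1e-4 + (0.02 - 1e-4) * (t - 1) / (T - 1) for t in range(1, T + 1)]
alpha = [1.0 - b for b in beta]
alpha_bar = [1.0]                      # alpha_bar[0] = 1 (empty product)
for t in range(1, T + 1):
    alpha_bar.append(alpha_bar[-1] * alpha[t])

def forward_diffuse(x0, t, eps):
    """Eq. (2): corrupt a clean image x0 to time step t with noise eps."""
    a = alpha_bar[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(x0, eps)]

def estimate_x0(xt, t, eps_pred):
    """Eq. (3): clean-image estimate from the predicted noise."""
    a = alpha_bar[t]
    return [(v - math.sqrt(1.0 - a) * e) / math.sqrt(a) for v, e in zip(xt, eps_pred)]

def reverse_step(xt, x0_hat, t, z):
    """Eq. (4): DDPM posterior mean plus noise, assuming sigma_t^2 = beta_t."""
    c_xt = math.sqrt(alpha[t]) * (1.0 - alpha_bar[t - 1]) / (1.0 - alpha_bar[t])
    c_x0 = math.sqrt(alpha_bar[t - 1]) * beta[t] / (1.0 - alpha_bar[t])
    sigma = math.sqrt(beta[t])
    return [c_xt * a + c_x0 * b + sigma * n for a, b, n in zip(xt, x0_hat, z)]

random.seed(0)
x0 = [0.020, 0.018, 0.000, 0.025]      # toy "image" of attenuation values (mm^-1)
t = 100
eps = [random.gauss(0.0, 1.0) for _ in x0]
xt = forward_diffuse(x0, t, eps)
x0_hat = estimate_x0(xt, t, eps)       # oracle noise stands in for the trained network
z = [random.gauss(0.0, 1.0) for _ in x0]
x_prev = reverse_step(xt, x0_hat, t, z)
print("alpha_bar at T:", alpha_bar[T])
print("recovered x0:", x0_hat)
```

With a perfect noise prediction, (3) recovers $x_{0}$ exactly; in practice the network prediction is imperfect, which is what the data-consistency update is meant to correct.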

To enforce data consistency, ${\hat{x}}_{0}$ is refined by performing an image update that minimizes the discrepancy between the simulated measurement $y$ and the forward projection $B exp (- A {\hat{x}}_{0})$ . In this work, the update is performed based on a Gaussian likelihood objective using the Adam optimizer such that:

$$\hat{x}_{0}^{'} = \mathrm{Adam}\left(\hat{x}_{0}, \nabla_{\hat{x}_{0}} \left\| B \exp(- A \hat{x}_{0}) - y \right\|_{K^{- 1}}^{2}\right). \tag{5}$$

The final $x_{t - 1}$ can be expressed as:

$$x_{t - 1} = x_{t - 1}^{'} - \hat{x}_{0} + \hat{x}_{0}^{'}, \tag{6}$$

which preserves the stochasticity of the diffusion sampler while steering the samples toward higher data consistency. To accelerate reconstruction and improve stability, we applied a jumpstart strategy⁹ by initializing the reverse sampling from $x_{T^{'}}$ obtained from a Filtered Back Projection (FBP) reconstruction instead of random Gaussian noise.
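The data-consistency update in (5) and the correction in (6) can be illustrated on a toy problem. This sketch makes several assumptions that are not taken from the study: plain gradient descent replaces the Adam optimizer, $K$ is treated as a diagonal matrix whose entries equal the measured counts, the data are noiseless, and the system, units, step size, and function names are hypothetical.

```python
import math

def objective_and_grad(x, A, y, B, k_diag):
    """Weighted objective ||B exp(-A x) - y||^2_{K^-1} and its gradient (K diagonal)."""
    pred = [B * math.exp(-sum(a * v for a, v in zip(row, x))) for row in A]
    resid = [p - yi for p, yi in zip(pred, y)]
    f = sum(r * r / k for r, k in zip(resid, k_diag))
    grad = [0.0] * len(x)
    for row, p, r, k in zip(A, pred, resid, k_diag):
        for j, a in enumerate(row):
            # d/dx_j of r_i^2 / k_i, using d pred_i / d x_j = -pred_i * A_ij
            grad[j] += 2.0 * r / k * (-p) * a
    return f, grad

def refine_x0(x0_hat, A, y, B, k_diag, lr=1e-3, eta=8):
    """eta likelihood updates per diffusion step (plain gradient descent, not Adam)."""
    x = list(x0_hat)
    for _ in range(eta):
        _, g = objective_and_grad(x, A, y, B, k_diag)
        x = [v - lr * gj for v, gj in zip(x, g)]
    return x

# Toy problem in arbitrary units (assumed values): 3 rays, 2 pixels, noiseless data.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = 100.0
x_true = [0.5, 1.0]
y = [B * math.exp(-sum(a * v for a, v in zip(row, x_true))) for row in A]
k_diag = list(y)                 # assumed: variance equal to the measured counts

x0_hat = [0.6, 0.8]              # stand-in for the network's clean-image estimate
x_prev_prime = [0.55, 0.9]       # stand-in for x'_{t-1} from Eq. (4)

x0_refined = refine_x0(x0_hat, A, y, B, k_diag)
# Eq. (6): shift the diffusion sample by the data-consistency correction.
x_prev = [xp - a + b for xp, a, b in zip(x_prev_prime, x0_hat, x0_refined)]

f_before, _ = objective_and_grad(x0_hat, A, y, B, k_diag)
f_after, _ = objective_and_grad(x0_refined, A, y, B, k_diag)
print(f"objective: {f_before:.3f} -> {f_after:.3f}")
```

Because (6) adds only the difference between the refined and original estimates, the noise injected in (4) is left in the sample.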

2.2. Diffusion model training

The Lung Image Database Consortium Image Collection (LIDC-IDRI)¹⁰ was used to train the prior model. A total of 8693 slices with a size of 512 × 512 were extracted from 50 patient scans and split into training and validation sets with a 4:1 ratio. Hounsfield Units (HU) were converted to attenuation coefficients assuming $μ_{water} = 0.02 {mm}^{- 1}$ . The diffusion prior was trained using the DDPM framework, with $T = 1000$ discretization steps and a variance schedule linearly increasing from $β_{1} = 10^{- 4}$ to $β_{1000} = 0.02$ . The network was implemented in PyTorch, trained with a batch size of 8 for 200 epochs on a single NVIDIA RTX A6000 GPU.
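The HU-to-attenuation conversion can be sketched as below. The formula is the usual HU definition with the attenuation of air neglected, which is an assumption here since the text gives only $μ_{water}$; the function names are hypothetical.

```python
MU_WATER = 0.02  # mm^-1, value stated in the text

def hu_to_mu(hu, mu_water=MU_WATER):
    """Convert Hounsfield Units to linear attenuation (mm^-1), neglecting air attenuation."""
    return mu_water * (1.0 + hu / 1000.0)

def mu_to_hu(mu, mu_water=MU_WATER):
    """Inverse mapping from attenuation (mm^-1) back to Hounsfield Units."""
    return 1000.0 * (mu / mu_water - 1.0)

for hu in (-1000, 0, 1000):
    print(hu, "HU ->", hu_to_mu(hu), "mm^-1")

# The display window [0.015, 0.025] mm^-1 used for Fig. 4, expressed in HU:
print(mu_to_hu(0.015), mu_to_hu(0.025))
```

Under this assumption, the Fig. 4 display window of [0.015, 0.025] mm⁻¹ corresponds to roughly −250 to 250 HU.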

2.3. Data Simulation and Reconstruction

We emulated noisy CT projection data using a patient slice also taken from the LIDC-IDRI dataset, but from a patient not included in the training set. The 2D slice has a size of 512 × 512 with a pixel spacing of 0.8 mm. Projections were simulated according to the forward model in (1). The system geometry was configured with a source-to-detector distance of 1000 mm and a source-to-axis distance of 500 mm. The detector was modeled with 1000 pixels at a pitch of 1.0 mm. The protocol was varied by adjusting the number of views $(n_{view})$, equally spaced over 360°, and the incident photon count $(I_{0})$. We presume that quantum noise dominates the measurements, so Poisson noise is added to the simulated projections prior to reconstruction. We investigate protocols defined by all pairs of $n_{view}$ and $I_{0}$ in Table 1. In total, this yields 12 × 12 = 144 unique $(I_{0}, n_{view})$ settings for evaluation.
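A quick arithmetic check of the stated geometry is sketched below. It assumes a flat detector centered on the central ray and a first view at 0°, neither of which is specified in the text; the variable and function names are hypothetical.

```python
import math

SDD = 1000.0                  # source-to-detector distance (mm)
SAD = 500.0                   # source-to-axis distance (mm)
N_DET, PITCH = 1000, 1.0      # detector pixels and pitch (mm)
N_PIX, SPACING = 512, 0.8     # image size and pixel spacing (mm)

def view_angles(n_view):
    """Angles in degrees of n_view equally spaced views over 360 degrees."""
    return [360.0 * k / n_view for k in range(n_view)]

magnification = SDD / SAD
# Assumed flat detector centered on the central ray.
half_fan = math.atan((N_DET * PITCH / 2.0) / SDD)
fov_radius = SAD * math.sin(half_fan)
image_half_width = N_PIX * SPACING / 2.0

print(f"magnification: {magnification:.1f}")
print(f"half fan angle: {math.degrees(half_fan):.2f} deg")
print(f"FOV radius: {fov_radius:.1f} mm, image half-width: {image_half_width:.1f} mm")
for n in (30, 360, 1080):
    print(n, "views ->", 360.0 / n, "deg spacing")
```

Under these assumptions the angular spacing ranges from 12° at 30 views to one third of a degree at 1080 views.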

2.4. Reconstruction Parameter Selection

Our implementation of DPS has three main parameters that influence the output image properties. These are the learning rate $(lr)$ for the Adam optimizer, the number of likelihood updates per diffusion time step $(η)$, and the jumpstart time $(T^{'})$. We have previously observed that these parameters can be tuned to the specific protocol for maximum performance.

Thus, to provide optimized results, a parameter sweep was conducted for each protocol to identify the optimal hyperparameters for the DPS framework. For every $(I_{0}, n_{view})$ pair, four independent reconstructions (different noise realizations) were performed for each combination of $(T^{'}, η, lr)$, and the combination yielding the highest average PSNR was selected. The parameter ranges investigated were $lr \in \{5 \times 10^{- 4}, 1 \times 10^{- 3}, 5 \times 10^{- 3}, 1 \times 10^{- 2}, 5 \times 10^{- 2}, 1 \times 10^{- 1}\}$, $η \in \{1, 2, 3, 4, 5, 6, 7, 8\}$, and $T^{'} \in \{30, 50, 100, 150\}$.
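The sweep described above can be sketched as a grid search. The parameter lists are those given in the text; `toy_psnr` is a hypothetical stand-in for running a DPS reconstruction and measuring its PSNR, and its optimum is arbitrary rather than a result of the study.

```python
from itertools import product
from statistics import mean

LR = [5e-4, 1e-3, 5e-3, 1e-2, 5e-2, 1e-1]
ETA = [1, 2, 3, 4, 5, 6, 7, 8]
T_PRIME = [30, 50, 100, 150]
N_REPEATS = 4          # independent noise realizations per setting

def select_best(psnr_fn):
    """Return the (T', eta, lr) triple with the highest mean PSNR over N_REPEATS runs."""
    grid = list(product(T_PRIME, ETA, LR))
    scores = {p: mean(psnr_fn(*p, seed) for seed in range(N_REPEATS)) for p in grid}
    return max(scores, key=scores.get), len(grid)

def toy_psnr(t_prime, eta, lr, seed):
    """Hypothetical stand-in for 'reconstruct with DPS and measure PSNR'."""
    return (30.0 - abs(t_prime - 100) / 50 - abs(eta - 4) / 4
            - abs(lr - 1e-2) * 10 + 0.01 * seed)

best, n_combos = select_best(toy_psnr)
print("best (T', eta, lr):", best)
print("combinations per protocol:", n_combos)
print("reconstructions over 144 protocols:", n_combos * N_REPEATS * 144)
```

The grid holds 4 × 8 × 6 = 192 combinations, so four repeats over 144 protocols amounts to 110,592 reconstructions for the sweep alone.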

2.5. Evaluation

In our experiments, we consider the total number of incident photons to be a proxy for the radiation dose. This number is proportional to the product of the number of incident photons per detector pixel $(I_{0})$ and the total number of projection views $(n_{view})$: Total Incident Photons $\propto I_{0} \times n_{view}$. (The actual total is additionally multiplied by the number of detector pixels.)
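As a worked example of the TIP definition, the sketch below enumerates the protocol grid from Table 1 and lists the settings that share a given TIP. The 2% tolerance follows the iso-TIP lines described in Section 3.2; treating it as a relative tolerance on the product, and the function name, are assumptions.

```python
N_VIEW = [30, 60, 90, 120, 150, 180, 270, 360, 540, 720, 900, 1080]
I0 = [4166, 8333, 12500, 16666, 20833, 25000, 37500, 50000,
      75000, 100000, 125000, 150000]

def iso_tip_protocols(target, tol=0.02):
    """All (n_view, I0) pairs whose product is within a relative tolerance of target."""
    return [(n, i) for n in N_VIEW for i in I0
            if abs(n * i - target) <= tol * target]

protocols = [(n, i) for n in N_VIEW for i in I0]
matched = iso_tip_protocols(4.5e6)

print(len(protocols), "protocols in total")
for n, i in matched:
    print(f"n_view={n:5d}  I0={i:7d}  TIP={n * i:.3e}")
```

For TIP = 4.5 × 10⁶ this selects nine of the 144 protocols, from 30 views at $I_{0}$ = 150000 down to 1080 views at $I_{0}$ = 4166.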

For each $(I_{0}, n_{view})$ combination, the average PSNR of 10 independent reconstructions (different Poisson noise realizations) is used to assess the image quality. Bias and standard deviation (STD) were also computed to isolate systematic error and variability. The metrics were calculated as:

$$\mathrm{PSNR}(\hat{x}, x) = 10 \cdot \log_{10}\left(\frac{\mathrm{MAX}^{2}}{\mathrm{MSE}(\hat{x}, x)}\right), \quad \mathrm{Bias}(\hat{x}, x) = \left\| E\{\hat{x}\} - x \right\|, \quad \mathrm{STD}(\hat{x}) = \sqrt{E\{(\hat{x} - E\{\hat{x}\})^{2}\}}. \tag{7}$$

Here, $\hat{x}$ denotes the reconstructed image and $x$ denotes the ground truth. For each evaluation metric, we visualized the results as 2D heatmaps overlaid with iso-Total-Incident-Photon (iso-TIP) lines to identify protocols with matched exposure. To further examine the influence of acquisition settings on DPS reconstruction, we analyzed representative reconstructions along several iso-TIP lines, providing insight into trade-offs between $I_{0}$ and $n_{view}$.
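The metrics in (7) can be computed over a set of repeated reconstructions as sketched below. Several details are assumptions because the text does not fix them: MAX is taken as the largest ground-truth value, the bias norm is the L2 norm, the expectation is replaced by the sample mean with a population (1/N) variance, and STD is returned as a per-pixel map. The toy two-pixel data and function names are hypothetical.

```python
import math

def psnr(recon, truth):
    """PSNR in dB, taking MAX as the largest ground-truth value (an assumption)."""
    mse = sum((r - t) ** 2 for r, t in zip(recon, truth)) / len(truth)
    return 10.0 * math.log10(max(truth) ** 2 / mse)

def bias_and_std(recons, truth):
    """L2 norm of (mean reconstruction - truth) and the per-pixel standard deviation."""
    n = len(recons)
    mean_img = [sum(col) / n for col in zip(*recons)]
    bias = math.sqrt(sum((m - t) ** 2 for m, t in zip(mean_img, truth)))
    std_map = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
               for col, m in zip(zip(*recons), mean_img)]
    return bias, std_map

truth = [1.0, 2.0]                      # toy flattened ground truth
recons = [[1.0, 3.0], [3.0, 3.0]]       # two toy reconstructions (noise realizations)

mean_psnr = sum(psnr(r, truth) for r in recons) / len(recons)
bias, std_map = bias_and_std(recons, truth)
print("mean PSNR (dB):", mean_psnr)
print("bias:", bias)
print("per-pixel STD:", std_map)
```

Averaging PSNR over realizations, rather than computing PSNR of the averaged image, matches the description of the average PSNR of independent reconstructions.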

3. RESULTS

3.1. Reconstruction Parameter Selection

Fig. 1 shows the optimal parameters as a function of Total Incident Photons, with protocols additionally color-coded by $n_{view}$. We see that the learning rate was largely constant across protocols except for some of the lowest TIP and $n_{view}$ settings. The optimal number of likelihood iterations shows an increasing trend as the total incident photons increase, suggesting that larger TIP permits reconstructions that are more highly constrained by the measurement data. The optimal number of jumpstart steps drops at first as TIP increases, then increases to 100 at higher TIP. This trend is more difficult to explain but is likely related to the quality of the initial FBP reconstruction used for the jumpstart procedure. Overall, the trends are relatively smooth and predictable, suggesting an ability to tune parameter settings based on the acquisition protocol.

Figure 1. The optimal jumpstart steps $T^{'}$, the number of likelihood iterations $η$, and the learning rate $lr$ for different Total Incident Photons. The color of each scatter point represents the number of views, as shown in the colorbar.

3.2. Imaging Performance as a Function of Protocol

Fig. 2 shows the distribution of image quality metrics across different combinations of $I_{0}$ and $n_{view}$. The heatmaps show PSNR, bias, and STD computed from 10 independent samples for each protocol. Additionally, the dashed curves represent iso-TIP lines with 2% tolerance. The expected general trends, with all image metrics improving for a higher number of views and higher $I_{0}$, are evident.

Figure 2. Summary of performance across the $(n_{view}, I_{0})$ protocol space. Three image quality metrics are shown: PSNR, bias, and STD. Additionally, iso-TIP contours are shown, illustrating approximately dose-matched protocols.

Trends for protocols of fixed Total Incident Photons are somewhat difficult to assess from the heatmaps. To show the relative trade-off between $I_{0}$ and $n_{view}$ more directly, we plot MSE, bias, and STD at different Total Incident Photons in Fig. 3. For MSE at fixed dose, increasing $n_{view}$ reduces error significantly at first, then causes a slight increase at higher $n_{view}$. As the Total Incident Photons increase, the $n_{view}$ at which MSE is minimized also increases. This suggests that for a given fluence level there is a minimum number of views required before performance largely levels off; however, there is some additional performance to be gained by not spreading the “photon budget” over too many views. The bias plots show similar trends to MSE, where the bias drops at first and then slightly increases. The minimum-bias point also moves to the right as the Total Incident Photons increase. Again, the optimal acquisition parameters remain in the medium to higher range for $n_{view}$ and the medium to lower range for $I_{0}$. STD shows slightly irregular trends, particularly for the very low TIP cases, likely due to the limited number of independent reconstructions, but suggests similar optimality trends.

Figure 3. (a) MSE of a single sample as a function of $n_{view}$ along iso-TIP lines. The minimum is circled for each iso-TIP line. (b) Bias between the average of 10 independent reconstructions and the ground truth. The minimum is circled for each iso-TIP line. (c) STD across 10 independent reconstructions.

To illustrate qualitatively the difference in image quality for different protocols, Fig. 4 shows reconstructions at a fixed Total Incident Photons (TIP = 4.5 × 10⁶) for different combinations of incident photons per view $(I_{0})$ and number of views $(n_{view})$. The corresponding error maps are shown under each reconstruction to further illustrate the differences in image quality. From the reconstructions and error maps, image quality clearly improves as the number of views increases from 30 to 270 and then drops slightly. The slight increase in error for more than 270 views is difficult to see visually, though it is captured in the MSE trend.

Figure 4. Reconstructions and error maps (Total Incident Photons = 4.5 × 10⁶). Window: [0.015, 0.025] mm⁻¹.

4. DISCUSSION AND CONCLUSION

In this study, we explored the trade-off between different combinations of number of views $(n_{view})$ and incident photons per view $(I_{0})$ using DPS reconstruction. These preliminary findings suggest that there is indeed a trade-off that can be optimized, rather than strictly maximizing the number of views. A low number of views can impact image quality more significantly than lower incident photons per view, especially when the total incident photons are low. However, a combination of a medium or higher number of views $(n_{view})$ and medium or lower incident photons per view $(I_{0})$ usually yields the best image quality. Qualitatively, this improvement can be difficult to see since the quantitative image quality advantage is not especially large in the scenario we explored.

These preliminary investigations are limited in several ways. Ongoing work seeks to perform simulations across more imaging scenarios (additional patient anatomies, anatomical sites, etc.) and with more independent reconstructions.

Table 1. Protocols investigated

Parameter	Values
$n_{view}$	30, 60, 90, 120, 150, 180, 270, 360, 540, 720, 900, 1080
$I_{0}$	4166, 8333, 12500, 16666, 20833, 25000, 37500, 50000, 75000, 100000, 125000, 150000


ACKNOWLEDGMENTS

This work was supported, in part, by NIH grants R01CA249538 and R01EB035908.

REFERENCES

[1] Kalra MK, Maher MM, Toth TL, Hamberg LM, Blake MA, Shepard J-A, and Saini S, “Strategies for CT radiation dose optimization,” Radiology 230(3), 619–628 (2004).
[2] Yan H, Cervino L, Jia X, and Jiang SB, “A comprehensive study on the relationship between the image quality and imaging dose in low-dose cone beam CT,” Physics in Medicine & Biology 57(7), 2063 (2012).
[3] Beister M, Kolditz D, and Kalender WA, “Iterative reconstruction methods in x-ray CT,” Physica Medica 28(2), 94–108 (2012).
[4] Li S, Jiang X, Tivnan M, Gang GJ, Shen Y, and Stayman JW, “CT reconstruction using diffusion posterior sampling conditioned on a nonlinear measurement model,” Journal of Medical Imaging 11(4), 043504 (2024).
[5] Tilley S, Jacobson M, Cao Q, Brehler M, Sisniega A, Zbijewski W, and Stayman JW, “Penalized-likelihood reconstruction with high-fidelity measurement models for high-resolution cone-beam imaging,” IEEE Transactions on Medical Imaging 37(4), 988–999 (2017).
[6] Jiang X, Gang GJ, and Stayman JW, “CTorch: PyTorch-compatible GPU-accelerated auto-differentiable projector toolbox for computed tomography,” arXiv preprint arXiv:2503.16741 (2025).
[7] Li S, Tivnan M, and Stayman JW, “Diffusion posterior sampling for nonlinear CT reconstruction,” in [Medical Imaging 2024: Physics of Medical Imaging], 12925, 196–200, SPIE (2024).
[8] Ho J, Jain A, and Abbeel P, “Denoising diffusion probabilistic models,” Advances in Neural Information Processing Systems 33, 6840–6851 (2020).
[9] Jiang X, Li S, Teng P, Gang G, and Stayman JW, “Strategies for CT reconstruction using diffusion posterior sampling with a nonlinear model,” ArXiv, arXiv-2407 (2024).
[10] Armato SG, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, Zhao B, Aberle DR, Henschke CI, Hoffman EA, et al., “The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans,” Medical Physics 38(2), 915–931 (2011).
