Abstract
Many useful image quality metrics for evaluating linear image reconstruction techniques do not apply to, or are difficult to interpret for, nonlinear image reconstruction. The vast majority of metrics employed for evaluating nonlinear image reconstruction are based on some form of global image fidelity, such as image root mean square error (RMSE). Use of such metrics can lead to overregularization in the sense that they can favor removal of subtle details in the image. To address this shortcoming, we develop an image quality metric based on signal detection that serves as a surrogate for the qualitative loss of fine image details. The metric is demonstrated in the context of a breast CT simulation, where different equal-dose configurations are considered. The configurations differ in the number of projections acquired. Image reconstruction is performed with a nonlinear algorithm based on total variation constrained least-squares (TV-LSQ). The resulting images are studied as a function of three parameters: number of views acquired, total variation constraint value, and number of iterations. The images are evaluated visually, with image RMSE, and with the proposed signal-detection-based metric. The latter uses a small signal and computes detectability in the sinogram and in the reconstructed image. Loss of signal detectability through the image reconstruction process is taken as a quantitative measure of the loss of fine details in the image. Loss of signal detectability is seen to correlate well with the blocky or patchy appearance due to overregularization with TV-LSQ, and this trend runs counter to the image RMSE metric, which tends to favor the overregularized images. The proposed signal-detection-based metric provides an image quality assessment that is complementary to that of image RMSE. Using the two metrics in concert may yield a useful prescription for determining CT algorithm and configuration parameters when nonlinear image reconstruction is used.
Keywords: CT image quality, image reconstruction, model observers, total-variation
1. INTRODUCTION
The effort in developing nonlinear image reconstruction algorithms for x-ray computed tomography (CT) has been steadily increasing over the past couple of decades. The nonlinearity arises from incorporation of some form of prior information or physics modeling in the reconstruction process. For example, edge-preserving regularization and spectral response modeling both yield image reconstruction algorithms whose images depend nonlinearly on the CT data.1,2 Exploitation of sparsity or transform sparsity also involves nonlinear image reconstruction.3–6 Most recently, deep-learning-based data processing is being investigated for generating tomographic images directly from CT projection data using convolutional neural networks (CNNs).7,8 Such CNNs also process the tomographic data in a nonlinear fashion.
While nonlinear image reconstruction may allow for accurate image reconstruction in CT systems involving low-dose illumination or sparse sampling, the resulting image characteristics can depend strongly on the scanned object. This object dependence presents a difficult challenge for developing meaningful image quality metrics needed to guide algorithm parameter selection in a nonsubjective fashion. As a result, much of the work on nonlinear image reconstruction presents images from algorithms whose parameters are tuned by eye. Such an approach may be fine for the initial introduction of a new image reconstruction algorithm, or when the CT system/reconstruction parameter space is limited enough that tuning by eye is feasible. The tune-by-eye method, however, blunts the impact of advanced image reconstruction, because such techniques themselves involve numerous parameters and aim to broaden the scope of possible CT system configurations, enlarging the parameter space of CT hardware. Attempting to compare different nonlinear image reconstruction algorithms only complicates the matter further. With a large parameter space, the tune-by-eye method becomes impractical.
To avoid the subjective tune-by-eye method, many researchers in advanced CT image reconstruction turn to one of three image fidelity metrics in their simulations: root mean square error (RMSE), peak signal-to-noise ratio (PSNR), or the structural similarity index (SSIM). These metrics are useful in a simulation setting because they provide a measure of how close a reconstructed image is to a ground truth image, which in turn is useful for investigating the underlying inverse problem. When considering clinical imaging tasks that rely on viewing subtle image features, however, optimizing system and reconstruction parameters on these global image fidelity metrics can lead to significantly overregularized images.
One problem is that these image fidelity metrics do not provide a sense of image resolution, noise level, or noise quality. PSNR, from its name, would seem to provide information on the image noise level, but what is called “noise” in PSNR is actually the difference between the reconstructed and truth images, and this difference includes both image noise and deterministic artifacts arising from unmodeled non-stochastic physics or insufficient sampling. For nonlinear image reconstruction algorithms, concepts such as the point-spread function and the noise power spectrum do not have the simple and direct interpretation that they do in linear systems theory. For example, with nonlinear image reconstruction, the resulting image cannot be interpreted as a convolution of a reconstructed point-like object with the underlying true object function. As a result, these concepts are rarely used in the evaluation of nonlinear image reconstruction.
In order to prevent the over-smoothing that results from optimizing nonlinear image reconstruction solely on image RMSE, an image quality metric is needed that is sensitive to subtle features in the image and that is easy to interpret. To develop such a metric, we turn to signal detection theory and investigate the use of the ideal observer for a simple signal-known-exactly/background-known-exactly (SKE/BKE) detection task.9 Signal detection theory has been investigated previously in the context of evaluating image reconstruction algorithms.10–18 For the present work, the signal is chosen to be a point-like object, and its amplitude is set so that it is at the limit of detectability in the CT data space. It is known that image reconstruction, or any other image processing operation, cannot increase signal detectability (see pages 829–830 in Barrett and Myers9), but image reconstruction can reduce signal detectability. Quantifying this loss of detectability is precisely what we propose as a measure of overregularization. Having such a measure allows optimization of image RMSE under the condition that signal detection remains at or above a desired level, thus preventing overregularization.
The setting for developing this metric is a dedicated breast CT simulation where image reconstruction is performed by total variation constrained least-squares optimization (TV-LSQ). The TV-LSQ algorithm is nonlinear, and it allows for accurate image reconstruction from sparse-view CT data under ideal noiseless conditions. When TV-LSQ is employed for noisy, realistic data, it is often reported that the images are patchy or blocky, and one solution for avoiding this subjective quality is to generalize the TV-norm.19,20 For the present work, however, we argue that the patchiness resulting from use of TV regularization can also be a side effect of overregularization due to parameter optimization with image RMSE. The proposed signal detectability metric can help to disallow parameter settings that cause overregularization and, specifically, the patchy appearance associated with the TV-norm.
We point out that the patchy appearance for overregularization with the TV-norm is a somewhat subjective assessment, and therefore the claim that the proposed metric characterizes patchiness quantitatively is also subjective and cannot be proven mathematically. We do attempt to design the simulation so that the subjectivity is limited as much as possible, but in the end the utility of the proposed metric can only be demonstrated by showing metric correspondences with images and it is left to the observer to decide whether this correspondence is useful or not.
In Section 2, we present the parameters of the breast CT simulation, the details of the TV-LSQ algorithm, and the channelized Hotelling observer (CHO) for the SKE/BKE detection task. The results, presented in Section 3, demonstrate the correspondences between the proposed signal detection metric and reconstructed images for select parameter settings of the breast CT simulation and TV-LSQ algorithm. The results are discussed in Section 4, and finally, we conclude the paper in Section 5.
2. MATERIALS AND METHODS
2.A. Breast CT simulation
For the studies presented here, we consider a fixed-dose simulation, where the number of projections is varied while keeping the total patient exposure constant. The configuration is two-dimensional circular, fan-beam scanning and is representative of the mid-plane slice of a three-dimensional circular cone-beam scan. The mean continuous data function, g, is modeled as the x-ray transform of the object function

$$ g(\theta, \xi) = \mathcal{P}f(\theta, \xi) = \int_0^\infty f\!\left( \vec{r}_0(\theta) + t\, \hat{u}(\theta, \xi) \right) dt, \tag{1} $$

where f represents the continuous object function; Pf is the continuous x-ray transform of f; θ indicates the view angle of the x-ray source; ξ is the detector bin location on a linear detector; r⃗0(θ) indicates the x-ray source position; and the unit vector û(θ, ξ) points from the x-ray source to the detector bin indicated by ξ and is accordingly a function of θ and ξ. The data function is sampled at a variable number of views Nviews and 512 detector bins. The noise level in the measured transmission data is specified by fixing the total number of incident photons for the scan.
In the simulations, we consider varying Nviews between 128 and 512; at the maximum end of this range the number of incident photons along each measured ray is approximately 38,000, which is on the low end of actual breast CT systems.21,16 To model noise due to the detection of finite numbers of quanta, a Poisson distribution is assumed in the x-ray transmission measurements. Accounting for the logarithm processing needed to arrive at the line-integration model, Eq. (1), we model the noisy discrete data with a Gaussian distribution with mean

$$ \bar{g}_\ell = (\mathcal{P} f)_\ell \tag{2} $$

and variance

$$ \sigma_\ell^2 = \frac{\exp(\bar{g}_\ell)}{N_\ell}, \tag{3} $$

where ℓ is an index for each of the transmission rays in the projection data and N_ℓ is the number of photons incident along ray ℓ. Because the total number of incident photons is fixed, N_ℓ is inversely proportional to Nviews; it is then clear from Eq. (3) that the noise variance decreases with decreasing numbers of views, and there is a tradeoff between Nviews and the signal-to-noise ratio of each projection.
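As a concrete sketch of sampling this noise model (Python/NumPy; function and variable names are illustrative, not taken from the original implementation):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def noisy_sinogram(line_integrals, n_incident_per_ray):
    # Gaussian approximation to post-log Poisson transmission noise:
    # mean = (Pf)_l  [Eq. (2)],  variance = exp((Pf)_l) / N_l  [Eq. (3)].
    sigma = np.sqrt(np.exp(line_integrals) / n_incident_per_ray)
    return line_integrals + sigma * rng.standard_normal(line_integrals.shape)
```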
2.B. TV-LSQ image reconstruction
In order to formulate the TV-LSQ optimization, the continuous data model in Eq. (1) is discretized, taking the form of a large linear system

$$ g = X f, $$

where the pixelized 512 × 512 image is represented by f; the x-ray projection operation becomes the matrix X; and the Nviews × 512 sinogram data are denoted by g. Because we consider Nviews ≤ 512, this linear system can be under-determined. The TV-LSQ optimization problem is formulated as

$$ f^\star = \operatorname*{arg\,min}_f \; \tfrac{1}{2}\, \| X f - g \|_2^2 \quad \text{such that} \quad \big\|\, |Df|_{\text{mag}} \big\|_1 \le \gamma, \tag{4} $$

where D is the finite differencing approximation to the image gradient; |·|mag is the pixelwise magnitude of the spatial gradient vector Df; ∥|Df|mag∥1 is the image total variation (TV); and γ is the TV constraint value. When the data g are generated from a test image ftest with no noise added, the test image can be recovered with TV-LSQ by choosing γ = ∥|Dftest|mag∥1 for sparse-view sampling with Nviews < 512. The degree of under-sampling permitted depends on the sparsity of the gradient magnitude image (GMI) |Dftest|mag.22 This possibility of accurate image reconstruction for sparse-view CT motivates the configurations described in the breast CT simulation.
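Because γ is referenced to the image TV throughout, a minimal sketch of computing ∥|Df|mag∥1 for a pixelized image may be useful (Python/NumPy; the forward-difference convention at the image border is an assumption, as the text does not specify it):

```python
import numpy as np

def image_tv(f):
    # l1 norm of the pixelwise gradient magnitudes, using forward
    # differences with the last row/column replicated (zero gradient
    # at the border).
    dx = np.diff(f, axis=0, append=f[-1:, :])
    dy = np.diff(f, axis=1, append=f[:, -1:])
    return np.sum(np.sqrt(dx**2 + dy**2))
```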
The TV-LSQ optimization problem can be efficiently solved by the Chambolle–Pock primal-dual (CPPD) algorithm.23–25 For completeness, we provide the pseudocode for this algorithm in the Appendix. We do consider early stopping of the CPPD algorithm and allow the total number of iterations, Niter, to vary from 10 to 500. At Niter = 500 the TV-LSQ problem is solved to a high degree of numerical accuracy for all scan configurations considered in this work.
In total, three parameters are varied in the breast CT simulation: Nviews, Niter, and the TV constraint γ. Even in this restricted simulation, with three parameters specifying the image, it is difficult to tune by eye: not only is the parameter space large, but the image qualities are difficult to compare. As will be seen, quantitative image fidelity metrics such as image RMSE alone may not provide a reasonable objective means of image comparison and optimization, particularly when small, subtle signals are the features of interest.
2.C. SKE/BKE signal detection model
To provide an objective metric that characterizes the preservation of subtle details in the TV-LSQ reconstructed images, signal detection theory is employed to measure the loss of signal detectability for an ideal observer model. The design of the detectability metric involves the following steps: select the signal properties so that the signal is on the border of detectability in the sinogram data domain; generate multiple realizations of signal-present and signal-absent sinograms; perform TV-LSQ reconstruction of all data realizations; divide the resulting image set into training and testing data; train the signal-present/signal-absent classifier; and finally, measure the image-domain detectability with the testing images. The data model and data-domain signal detection task are set up so that the ideal observer performance can be computed analytically. In this way, the data-domain detectability serves as a precisely known upper bound on the image-domain detectability. The loss in detectability in passing through image reconstruction provides a quantitative measure that indicates loss of fine details in the image and may reflect the subjective property of image patchiness.
The images in Fig. 1 illustrate the detection task employed in this work. The background disk attenuation is representative of fat tissue and is set to 0.194 cm−1. The ring at the edge represents the skin line with attenuation 0.233 cm−1. The phantom is defined on a 2048 × 2048 grid and is 18 × 18 cm2 in physical dimensions. The pixel size is chosen much smaller than the detector resolution so that the phantom can be regarded as quasi-continuous. Projection of this background image yields the mean background sinogram. The signal is defined as a Gaussian function with full-width-half-maximum of 100 microns (the reconstructed image grid uses a pixel size of 350 microns) and amplitude of 0.04 cm−1. Projection of the background plus signal yields the mean signal-present sinogram. To appreciate the difficulty of the detection task, we also show in Fig. 1 the mean difference image of both hypotheses over 200 noise realizations, reconstructed by FBP for Nviews = 512. The reconstruction grid is a 512 × 512 pixel array. It is apparent that the speckle noise is still visible even after averaging over 200 realizations; the signal would not be visible in the reconstructed image from a single noise realization.
Fig. 1.

(Left) Background image used for the signal-known-exactly/background-known-exactly detection task. The gray scale window is [0.174,0.253] cm−1. (Right) Central 128 × 128 ROI of the mean difference of 200 filtered back-projection (FBP) reconstructed noise realizations from the signal-present and signal-absent sinograms. The size of this central ROI is indicated with the yellow box on the background image. The gray scale window is [−0.0075,0.02] cm−1.
The data domain ideal observer detectability is computed as a signal-to-noise ratio (SNR) for detection (see Sec. 13.2.8 in Barrett and Myers9), which is straightforward for the data model specified in Eqs. (2) and (3). For additive Gaussian noise, under the small signal approximation, the ideal observer and the ideal linear observer are equivalent. The ideal linear observer performance is computed by first solving for the Hotelling template wdata

$$ K_g\, w_{\text{data}} = \Delta\bar{g}, $$

where $K_g$ is the data covariance matrix, which is diagonal with entries $\sigma_\ell^2$ from Eq. (3); $\Delta\bar{g}$ is the difference between the mean signal-present and signal-absent sinograms; and the small signal approximation is assuming that the covariance is the same under both hypotheses. The SNR for detection in the data domain is computed from the dot product of the Hotelling template and the signal projection data

$$ \text{SNR}_{\text{data}}^2 = \Delta\bar{g}^{\top} w_{\text{data}} = \sum_\ell \frac{(\Delta\bar{g}_\ell)^2}{\sigma_\ell^2}. $$

The SNR metric can be converted to a receiver operating characteristic (ROC) area-under-the-curve (AUC), or equivalently a percent correct (PC) on a two-alternative-forced-choice (2-AFC) observer experiment (page 823 in Barrett and Myers9)

$$ \text{PC} = \text{AUC} = \frac{1}{2}\left[ 1 + \text{erf}\!\left( \frac{\text{SNR}}{2} \right) \right]. $$

For the equal-dose breast CT simulation at the specified noise level and the given signal properties, the signal detectability in the data domain corresponds to

$$ \text{PC}_{\text{data}} = 86.57\%, $$

where the range of possible performance values runs from 50%, corresponding to guessing on the 2-AFC experiment, to 100%, a 2-AFC perfect score. That the ideal observer performance is significantly less than 100% in the data domain is intended by design. This design requirement is why it is necessary to use the subtle signal shown in Fig. 1.
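A numerical sketch of the data-domain computation (illustrative names; dg is the mean signal-present minus signal-absent sinogram, var the per-ray variances of Eq. (3), both flattened to vectors):

```python
import numpy as np
from math import erf

def data_domain_pc(dg, var):
    # With a diagonal covariance, the Hotelling template is dg / var, and
    # SNR^2 = dg^T K^{-1} dg reduces to a weighted sum of squares.
    snr = np.sqrt(np.sum(dg**2 / var))
    # Convert detection SNR to ROC AUC = 2-AFC percent correct.
    return 0.5 * (1.0 + erf(snr / 2.0))
```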
As pointed out in Sec. 13.2.6 of Barrett and Myers,9 image reconstruction can only maintain or lose signal detectability with the ideal observer, and as a result the ideal observer is not commonly used for assessing tomographic images after reconstruction. Essentially, from the ideal observer perspective, image reconstruction should not be performed at all. Constrained by the fact that human observers can interpret reconstructed images much more easily than sinograms, there is still potentially useful knowledge to be gained from the ideal observer in assessing the efficiency of the image reconstruction algorithm; namely, it can address the question of how well the separability between signal-present and signal-absent hypotheses is preserved in passing through image reconstruction. In other words, does the image reconstruction algorithm wipe out the signal in the detection task? This is a particularly relevant question for recent efforts in nonlinear image reconstruction where strong assumptions are being exploited to obtain tomographic images for sparse sampling conditions or low-dose scanning. The image-domain ideal observer performance is also useful in that it provides a theoretical upper bound on human observer performance, and no amount of postprocessing will allow this bound to be exceeded.
For computing the image-domain detectability, we employ the 2-AFC PC figure-of-merit for the ideal observer in the image domain because it is easy to interpret; the 2-AFC test intuitively connects image ensemble properties with single image noise realizations; and we have a hard theoretical upper bound that it cannot exceed, PCdata = 86.57%. This last property, namely that

$$ \text{PC}_{\text{image}} \le \text{PC}_{\text{data}}, $$

also naturally provides a measure for the loss of signal detectability in passing through image reconstruction. To provide an accurate and precise estimate of PCimage, 4000 noisy data realizations of both the signal-present and signal-absent hypotheses are generated. All of the data realizations are reconstructed with the TV-LSQ algorithm. Half of the resulting images under each hypothesis are used to train an ideal-observer classifier, and the remaining half are used to generate the PCimage metric and its error bars. (Because PCimage is estimated from noise realizations, it carries inherent statistical uncertainty, which is why it is necessary to work with a small signal: if the data domain PC were close to 100%, the drop in going to the image domain PC might be too small to be significant.) The large number of image realizations leads to high precision, and the accuracy results from surveying a number of classifiers, including both ideal linear observer and ideal observer estimators. For the ideal linear observer, we have investigated the CHO26 with different channel formulations and a single-layer neural network (SLNN).27 For the ideal observer, several implementations of a convolutional neural network (CNN)27 have been explored. We have found that a hybrid-CHO yields PCimage equal, within error bars, to the results from the NN classifiers over the range of simulation parameters investigated. We present the hybrid-CHO because of its relative simplicity, but its equivalence with the SLNN and CNN is significant because the hybrid-CHO exploits approximate rotational symmetry in the detection task while the SLNN and CNN do not. This approximate symmetry allows for a reduction in the number of channels needed for the hybrid-CHO, and the equivalence with the NN-based observers indicates that the reduced channel set does not compromise the performance of the hybrid-CHO as an observer model.
2.C.1. Hybrid-CHO
The theory for estimation of the CHO PCimage and its variance is covered in Gallas and Barrett26 and Chen et al.28 The hybrid-CHO developed here exploits approximate rotational symmetry that results from use of a small rotationally symmetric signal and uniform angular sampling in the sinogram. Because the detection task design is approximately rotationally symmetric, it lends itself well to the use of standard Laguerre–Gauss channels,26 which are circularly symmetric. The Laguerre–Gauss channels on their own, however, do not provide an optimal basis because of the small size of the signal in combination with the fact that the image is discretized on a Cartesian grid. To account for both of these aspects of the CT imaging setup, we propose a hybrid channel set composed of Laguerre–Gauss channels combined with single-pixel channels at the location of the signal. The observer model is referred to as a hybrid-CHO, reflecting this hybrid channel set.
The data for computing the hybrid-CHO performance consist of the central 128 × 128 region of pixels from each of the 512 × 512 image realizations; thus there are a total of 4,000 signal-present and signal-absent 128 × 128 ROIs for training and testing the hybrid-CHO. The continuous definition of the Laguerre–Gauss channels is

$$ u_n(r\,|\,a) = \frac{\sqrt{2}}{a}\, \exp\!\left( -\frac{\pi r^2}{a^2} \right) L_n\!\left( \frac{2\pi r^2}{a^2} \right), \tag{5} $$

where Ln is the Laguerre polynomial of order n; the radius r is defined by r² = x² + y²; x, y indicate location on the 128 × 128 ROI; and the units of x and y are scaled so that (x, y) = (0, 0) is the center of the ROI and (x, y) = (1, 1) is the upper right corner of the ROI. The parameters of the Laguerre–Gauss channels are the order n and the Gaussian radial decay parameter a, which is specified in the same scaled units as r. The discrete representation of the Laguerre–Gauss channels is obtained by evaluating un(r|a) at the center of each pixel in the ROI. The single-pixel channels are defined as

$$ u^{(s,t)}(i,j) = \begin{cases} 1, & (i,j) = (s,t) \\ 0, & \text{otherwise,} \end{cases} \tag{6} $$

where (i, j) are the integer coordinates of the pixels in the discrete channel function; (s, t) is the location of the unit impulse; and the origin of the integer coordinates (0, 0) is at the lower left corner of the ROI.

The specific channel set employed for the breast CT simulation consists of fourteen channels. The first ten are the discrete Laguerre–Gauss channels un(r|a) with n ∈ [0, 9] and a = 0.5, and the remaining four are the single-pixel channels {u^(63,63), u^(63,64), u^(64,63), u^(64,64)}. Considering the channel functions as column vectors of length 128 × 128, with the pixel elements in lexicographical order, the 14 channels form a channelization matrix U of size 16,384 × 14 (16,384 = 128·128).
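A sketch of assembling the 14-column channelization matrix U (Python, using scipy.special.eval_laguerre for Ln; the raster-order convention in the single-pixel channels is an assumption and must match however the ROI images are vectorized):

```python
import numpy as np
from scipy.special import eval_laguerre

def laguerre_gauss_channels(roi_size=128, n_orders=10, a=0.5):
    # Discrete Laguerre-Gauss channels, Eq. (5), sampled at pixel centers.
    # Coordinates scaled so the ROI center is (0, 0) and corner is (1, 1).
    coords = (np.arange(roi_size) - (roi_size - 1) / 2) / (roi_size / 2)
    x, y = np.meshgrid(coords, coords)
    r2 = x**2 + y**2
    gauss = np.sqrt(2.0) / a * np.exp(-np.pi * r2 / a**2)
    chans = [gauss * eval_laguerre(n, 2 * np.pi * r2 / a**2)
             for n in range(n_orders)]
    return np.stack([c.ravel() for c in chans], axis=1)

def single_pixel_channels(roi_size, locations):
    # Unit-impulse channels, Eq. (6); flat index assumes row-major
    # (lexicographic) ordering matching the ROI vectorization.
    chans = np.zeros((roi_size * roi_size, len(locations)))
    for k, (s, t) in enumerate(locations):
        chans[t * roi_size + s, k] = 1.0
    return chans

# Hybrid channel set: 10 LG channels plus 4 single-pixel channels.
U = np.hstack([
    laguerre_gauss_channels(),
    single_pixel_channels(128, [(63, 63), (63, 64), (64, 63), (64, 64)]),
])  # shape (16384, 14)
```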
The channelized linear classifier is computed by estimating the mean channelized signal and the channelized image covariance. To compute these quantities, the channelized images are first obtained from the reconstructed training images by

$$ v_i^{(1)} = U^{\top} f_i^{(1)}, \qquad v_i^{(0)} = U^{\top} f_i^{(0)}, $$

where i is the realization index, which runs from 1 to Nreal = 4000; and $f_i^{(1)}$ ($f_i^{(0)}$) is a column vector with pixel values from the central 128 × 128 ROI extracted from the reconstruction of signal-present (signal-absent) data. The first i = 1 through Ntrain = 2000 realizations are assigned to the training set, and the remaining realizations i = Ntrain + 1 through Ntrain + Ntest are assigned to the testing set. The mean channelized signal is

$$ \Delta\bar{v} = \bar{v}^{(1)} - \bar{v}^{(0)}. $$

Using the small signal approximation, the training images under both hypotheses can be combined to provide a covariance estimate

$$ K_v = \frac{1}{2(N_{\text{train}} - 1)} \sum_{h \in \{0,1\}} \sum_{i=1}^{N_{\text{train}}} \left( v_i^{(h)} - \bar{v}^{(h)} \right) \left( v_i^{(h)} - \bar{v}^{(h)} \right)^{\!\top}, $$

where the barred variables indicate the mean over the ensemble of corresponding realizations. The channelized Hotelling template is computed as

$$ w_v = K_v^{-1}\, \Delta\bar{v}, $$

and the ROI Hotelling template can be reconstituted by matrix-vector multiplication

$$ w_{\text{image}} = U w_v. $$

Dotting a test image with the Hotelling template wimage provides the test statistic, which can be compared with a set threshold to make the classification into either the signal-present or signal-absent hypothesis.
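A compact sketch of this training computation (illustrative names; imgs_sp and imgs_sa hold the vectorized training ROIs under each hypothesis, one realization per row):

```python
import numpy as np

def train_hybrid_cho(U, imgs_sp, imgs_sa):
    # Channelize the training ROIs: v = U^T f for each realization.
    v_sp, v_sa = imgs_sp @ U, imgs_sa @ U
    dv = v_sp.mean(axis=0) - v_sa.mean(axis=0)   # mean channelized signal
    # Pooled covariance estimate under the small signal approximation.
    resid = np.vstack([v_sp - v_sp.mean(axis=0), v_sa - v_sa.mean(axis=0)])
    K_v = resid.T @ resid / (resid.shape[0] - 2)
    w_v = np.linalg.solve(K_v, dv)               # channelized Hotelling template
    return U @ w_v                               # ROI-domain template w_image
```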
The detectability metric in the image domain is estimated by running a 2-AFC experiment with the hybrid-CHO for every possible combination of signal-present and signal-absent test images

$$ \text{PC}_{\text{image}} = \frac{1}{N_{\text{test}}^2} \sum_{i=1}^{N_{\text{test}}} \sum_{j=1}^{N_{\text{test}}} c\!\left( w_{\text{image}}^{\top} f_i^{(1)} ;\; w_{\text{image}}^{\top} f_j^{(0)} \right), $$

where the two-sample kernel function c(a; b) is defined

$$ c(a; b) = \begin{cases} 1, & a > b \\ \tfrac{1}{2}, & a = b \\ 0, & a < b. \end{cases} $$

In the 2-AFC experiment, the Hotelling template is dotted with a pair of test images, where one is drawn from the signal-present realizations and the other from the signal-absent realizations. Whichever dot product yields the higher value, the hybrid-CHO classifies the corresponding image as the signal-present image. The summation over the two-sample kernel function counts all of the times that the hybrid-CHO identifies the signal-present image correctly.
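The corresponding percent-correct computation over all test pairs, with ties counted as half correct, can be sketched as follows (test_sp and test_sa hold the vectorized test ROIs, one per row):

```python
import numpy as np

def pc_2afc(w_image, test_sp, test_sa):
    # Test statistics for every signal-present and signal-absent test image.
    t_sp = test_sp @ w_image
    t_sa = test_sa @ w_image
    # Average of the two-sample kernel c over all (i, j) pairs.
    wins = (t_sp[:, None] > t_sa[None, :]).mean()
    ties = (t_sp[:, None] == t_sa[None, :]).mean()
    return wins + 0.5 * ties
```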
Once PCimage is computed it can be compared with PCdata to provide a measure of loss of signal detectability. The quantity PCdata is known analytically so the corresponding value does not have error bars. The value PCimage, on the other hand, is estimated from realizations, and thus it has variability due to the randomness of the testing set. There is also variability in the training of the hybrid-CHO because it is computed from the random training images. To account for both sources of variability, we employ the level 2 variability estimation from Chen et al.,28 and the 95% confidence intervals are reported.
2.D. Test phantom for visual correspondence
In order to illustrate the correspondence between visual image quality and the image quality metrics, the same simulation parameters and scan configurations are investigated using a test phantom with a structured fibroglandular tissue model,29 shown in Fig. 2. This breast phantom is composed of a 16-cm disk containing background fat tissue at attenuation 0.194 cm−1, with a skin line and randomly generated fibroglandular tissue at attenuation 0.233 cm−1. These components of the phantom are defined on a 2048 × 2048 grid of dimensions 18 × 18 cm2. The structured background allows for visualization of fine details. In order to have a more direct comparison with a signal detection task, a contrast-detail (CD) insert is included in the phantom, consisting of an 8 × 8 grid of point-like signals. The signals are defined as analytic disks so that the line integrals through the signals can be computed exactly and their contribution to the projection data is not subject to pixelization of the test phantom image. The disk contrast in the CD insert increases linearly from 0.01 cm−1 to 0.05 cm−1 going from left to right, and the disk radius increases linearly from 200 microns to 500 microns going from top to bottom. For reference, the reconstruction grid's image pixel width is 350 microns. To appreciate the noise level of the breast CT simulation, ROIs are shown of images reconstructed by FBP using a ramp filter and by FBP followed by Gaussian blurring. For the FBP reconstructions, the Nviews = 512 scan configuration is used. Due to the high level of speckle noise in the unregularized FBP image, it is difficult to see even the most conspicuous of the signals in the CD insert. With regularization, the larger, higher-contrast corner of the CD insert becomes visible.
Fig. 2.

Computerized breast phantom with a contrast-detail (CD) insert. The displayed images are the image of the phantom (top, left), the ROI focused on the CD insert (top, right), an unregularized FBP reconstruction (bottom, left), and a regularized FBP image (bottom, right). For reference, the RMSE values of the unregularized and regularized FBP images are 0.0198 and 0.01155 cm−1, respectively. The gray scale window for all panels is [0.174, 0.253] cm−1.
3. RESULTS
The hybrid-CHO signal detection figure-of-merit and image RMSE are computed alongside TV-LSQ reconstructed images of the breast phantom, exploring the three parameters of the CT simulation: Niter, Nviews, and the TV constraint parameter γ. The TV constraint is reported as a fraction of the TV of the ground truth image.
3.A. Signal detectability as a function of iteration number
The first set of results focuses on Nviews = 128 and γ = 1.0; that is, the TV constraint is equal to the ground truth phantom TV. A series of ROI images is shown in Fig. 3 as a function of iteration number for the TV-LSQ reconstruction of the breast phantom. From the perspective of accurate recovery of the phantom, the gray level estimation appears to improve with increasing iteration number as a general trend, which is to be expected because the TV constraint is selected to be the TV of the test phantom. From the perspective of visualizing the fine details in the image, the trend with iteration is more complex. The structural detail in the fibroglandular tissue and many of the signals are visible already at 20 iterations, where it is clear from the overall gray value that the image is far from the solution to the TV-LSQ problem. As the iterations progress, the larger signals of the CD insert become more conspicuous as the speckle noise amplitude is reduced. On the other hand, some of the more subtle features in the image appear to become distorted as the iterations progress, and the numerically converged image has the classic patchy look where it is difficult to distinguish noise from real structures.
Fig. 3.

(Top row) Images reconstructed by TV-LSQ for Nviews = 128 and γ = 1.0 with iteration number increasing from left to right. The iteration number is indicated in each panel of the figure. The gray scale window is [0.174,0.253] cm−1. (Bottom, left) Plot of the corresponding image RMSE values. For reference, the RMSE of the FBP and regularized FBP images from Fig. 2 are 0.0198 and 0.01155 cm−1, respectively. Both FBP values are indicated in the plot with dashed lines in red and black, respectively. (Bottom, right) Plot of the corresponding signal detectability metric, percent correct for an ideal-observer 2-AFC experiment. The dashed line indicates the theoretical maximum PC performance inherent in the data domain; it does not depend on iteration number and is indicated for reference.
Corresponding to the image series in Fig. 3, quantitative image quality metrics are also plotted, showing image RMSE and signal-detection PCimage. As expected, the RMSE trend shows improvement with iteration number, and the RMSE converges to a value well below that of the FBP reference images in Fig. 2. Again, γ is set to the truth value and the test phantom has a high degree of gradient sparsity; thus, the solution of the TV-LSQ optimization problem is expected to be mathematically accurate, and this is reflected in the low RMSE values and in the fact that the RMSE steadily improves as the TV-LSQ algorithm progresses toward the solution. This trend coincides with the visual gray-level accuracy seen in the series of images. It is interesting to note that the RMSE at Niter = 500 is substantially below the value of 0.01155 cm−1 corresponding to the regularized FBP image in Fig. 2.
The iteration number trend for PCimage, however, runs opposite to the image RMSE. There is a clear decline in the signal detectability at early iterations, and as convergence is achieved this metric plateaus to a value well below the data domain signal detectability. The trend in image detectability coincides with the visual appearance of the increasing patchiness of the images shown in Fig. 3.
The main point of the PCimage metric is that it should reflect the disappearance of small subtle details in the image, and in this example, we see correspondence between this metric and the overall patchiness of the images. Thus, the quantitative PCimage metric appears to capture the desired image properties, providing a quantitative measure of overregularization. How to use this information to determine algorithm parameters depends on the goal of the CT system design. Clearly, the results of the iteration number study indicate that PCimage cannot be used alone to determine the optimal iteration number, because it has the largest value with one iteration. As an aside, we note that a similar behavior was observed for the maximum likelihood expectation maximization (MLEM) algorithm using a ROI-observer,10 and we take up a comparison of these experiments in Section 4.
Using PCimage in concert with image RMSE, which has the opposite trend, provides complementary information. As an example of how it can be used, the desired image could be specified by minimizing RMSE under the constraint that the loss in signal detectability is bounded by a parameter ε, that is, PCimage/PCdata ≥ ε. Subjectively, the first frame at 20 iterations, in the series of images shown in Fig. 3, has the best visibility for the signals in the CD insert and the most realistic image texture. The next image at 50 iterations already has a patchy appearance. Visualization of the intermediate frames (not shown) suggests that a value of ε = 97.0% provides a useful bound for this particular example. However, the details of how ε is chosen and how the detection task is designed depend on the desired imaging goal. Here, we only aim to establish correspondence between PCimage and the subjective image quality of patchiness or overregularization with nonlinear image reconstruction.
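The following fragment illustrates the selection prescription just described, as a sketch (the dictionary layout and function name are illustrative; the PCdata value follows the text):

```python
def select_parameters(results, eps=0.97, pc_data=0.8657):
    # results: iterable of dicts, e.g.
    #   {"params": (n_views, n_iter, gamma), "rmse": ..., "pc_image": ...}
    # Minimize image RMSE subject to PC_image / PC_data >= eps.
    feasible = [r for r in results if r["pc_image"] / pc_data >= eps]
    if not feasible:
        raise ValueError("no parameter setting meets the detectability bound")
    return min(feasible, key=lambda r: r["rmse"])["params"]
```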
3.B. Signal detectability as a function of Nviews and γ
For the next set of results, we fix Niter = 100 and vary the other two parameters of the breast CT simulation. In Fig. 4, a grid of images is shown with each row and column corresponding to fixed Nviews and γ, respectively. As a general trend, the lower γ values reduce the speckle noise and streaks in the image; however, it is also clear that the heavy regularization imposed by γ = 0.75 effectively renders the borderline signals in the CD insert invisible. In terms of conspicuity of the signals in the CD insert, the images for γ = 1.5 and above appear to have similar numbers of signals visible.
Fig. 4.

(Top) Images reconstructed by TV-LSQ for Niter = 100, varying Nviews from top to bottom and varying γ from left to right. These parameters are indicated in the figure panels. The gray scale window is [0.174, 0.253] cm−1. (Bottom, left) Plot of the corresponding image RMSE values. For reference, the RMSE of the FBP and regularized FBP images from Fig. 2 are 0.0198 and 0.01155 cm−1, respectively, and the latter value is indicated in the plot with a dashed line. (Bottom, right) Plot of the corresponding signal detectability metric, percent correct for an ideal-observer 2-AFC experiment. The dashed line indicates the theoretical maximum PC performance inherent in the data domain.
The trend in Nviews is more difficult to discern because the scan configurations are set up to be equal dose. For the larger γ values, the Nviews = 128 images appear to have streak artifacts in addition to the speckle noise. In general, the various equal-dose scan configurations produce different noise textures.
The corresponding image RMSE and PCimage IQ metrics are also plotted in Fig. 4. The image RMSE favors γ = 1.0, the ground truth TV value, although the RMSE for γ = 0.75 is only slightly larger. Also, the RMSE values decrease weakly with increasing Nviews. The PCimage values show the opposite trend, with signal detectability increasing with γ. Interestingly, among the different Nviews configurations, the intermediate value Nviews = 256 is slightly favored, although the values for 256 and 512 have overlapping error bars.
Again, we point out that the metrics are complementary. Going by PCimage alone, the TV constraint would be abandoned. Going by image RMSE alone, however, can lead to an equally pathological situation where the image is overregularized. Using PCimage in concert with RMSE yields a more useful picture. We observe that, while PCimage is monotonically increasing with γ, there are clearly diminishing returns for γ ≥ 1.5, where this metric appears to plateau. The RMSE, on the other hand, favors the lower γ values on the PCimage plateau. Thus a prescription that combines the two metrics could reasonably select an intermediate value such as γ = 1.5, where again ε = PCimage/PCdata ≥ 97%. At this setting, we observe that the TV-LSQ reconstructed images in Fig. 4 do not have the patchy appearance of overregularization with TV. Also, compared with the FBP images, the image RMSE is lower and more CD insert signals are visible for TV-LSQ at γ = 1.5.
3.C. Estimation of subject TV and its impact on IQ metric trends
The dependence of the simulation results on γ is referred throughout to the ground truth TV value, which is object dependent. Applying the simulation-based IQ metrics to an actual scanning situation, where the ground truth is unknown, thus raises two important questions: (a) how can the subject TV be determined, and (b) does referencing γ to the subject TV yield a universal dependence of the IQ metrics on γ? Two simulations are performed to address these questions.
To estimate the subject TV, γ0, we have successfully applied a validation technique30 in which image reconstruction is performed with a fraction of the available data and the remaining test data are compared with the projection of the estimated image. The constraint value is estimated as the value that yields the smallest discrepancy between the test data and the corresponding estimated data. We perform this computation in the context of the present breast CT simulation for Nviews = 128 and Niter = 500. Image reconstruction is performed with 90% of the available line-integration data, chosen from the sinogram at random, leaving 10% of the data for independent testing. The resulting reconstructed image is projected, and the RMSEs for the reconstruction and testing data are plotted in Fig. 5 as a function of γ. From Fig. 5, we observe that the reconstruction data RMSE decreases monotonically as a function of γ, but the data RMSE of the testing set shows a minimum at γ = 0.9, which is close to the true value of γ = 1.0. This result demonstrates that the validation technique can estimate the subject TV to within 10 percent.
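A sketch of this cross-validation procedure (names are illustrative; the solver is passed in as a callable, since any TV-LSQ implementation of Eq. (4) can be used; X is the system matrix with one row per ray and g the vectorized sinogram):

```python
import numpy as np

def validation_rmse(X, g, gamma_values, reconstruct, test_frac=0.1, seed=0):
    # Hold out a random fraction of rays for validation; reconstruct from
    # the remaining rays at each candidate gamma; score each reconstruction
    # by the RMSE of its reprojection onto the held-out rays.
    rng = np.random.default_rng(seed)
    test = rng.random(g.size) < test_frac
    scores = []
    for gamma in gamma_values:
        f_hat = reconstruct(X[~test], g[~test], gamma)
        scores.append(np.sqrt(np.mean((X[test] @ f_hat - g[test]) ** 2)))
    return scores  # the gamma minimizing validation RMSE estimates gamma_0
```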
Fig. 5.

Plot of RMSE for the sinogram data used in image reconstruction (blue) and the RMSE on the left-out testing sinogram data used for validation (red) set as a function of the TV constraint parameter. The validation RMSE has a minimum at γ = 0.9 in units scaled to the ground truth image TV.
To address the universality question, the uniform background used in the process of estimating PCimage is changed to the nonuniform, but known, background of the breast phantom. This modification alters γ0 dramatically; thus, it is of interest to compare the resulting PCimage curves as a function of γ. In Fig. 6, this metric is plotted for Nviews = 128 and Niter = 100. From the graph, it is clear that there is some discrepancy between the numerical values of PCimage at the same value of the scaled parameter γ; however, the trends of this metric as a function of γ match fairly well. That there is discrepancy in the absolute numerical values is perhaps not too surprising considering the large difference in background structure. The similarity in trends is further evidence of the potential utility of the proposed IQ metric.
Fig. 6.

Detectability metrics using different background images. The label “uniform” refers to the use of the background image shown in Fig. 1, and “structured” refers to using the breast phantom in Fig. 2 as the background image. Note that the data domain percent correct is lower for the structured background because it is more attenuating.
4. DISCUSSION
The proposed signal detectability index for nonlinear image reconstruction bears some similarity to the signal detectability studies of MLEM iteration number presented by Abbey et al.10 In particular, the ROI observer from that investigation showed a steadily decreasing trend with iteration number. The two detectability indices, however, are different and need to be interpreted differently. The detection task considered in Abbey et al.10 was meant to have direct relevance to a clinical detection task, and furthermore the authors were seeking correspondence between model and human observers on signal detection. For the detectability metric presented here, the signal size and amplitude are chosen so that the signal is on the edge of detectability by the ideal observer in the data space. This signal is much too small to be detected by a human observer; thus, the detection task design itself is not directly relevant to a clinical detection task. The detection task is instead designed to serve as a surrogate for the subjective image property of patchiness specific to overregularization in TV-LSQ reconstructed images.
The reduction of PCimage relative to PCdata represents an irretrievable loss of information in distinguishing the signal-present and signal-absent hypotheses; no postprocessing operation can improve on PCimage. This metric, however, only captures loss of detectability due to non-invertibility of the image reconstruction algorithm. It does not necessarily reflect distortion of the signal. For example, regularizing FBP images with moderate blurring, such as that seen in Fig. 2, is invertible and does not cause a reduction in PC even though the signal itself is broadened by the blurring operation. Reconstructing FBP images onto a grid of large pixels, on the other hand, is non-invertible and does cause a loss in detectability.14
The SKE/BKE detection task paradigm with a small rotationally symmetric signal and uniform projection-angle sampling allows for the hybrid-CHO to accurately represent the ideal linear observer with a relatively small set of channels, because the detection task is well-suited to the rotationally symmetric LG channels. Considering non-rotationally symmetric signals or scanning angular ranges less than 2π breaks this symmetry. In such cases, a different channel representation and possibly a larger channel set would need to be developed in order for the hybrid-CHO to represent the ideal linear observer. The SKE/BKE detection task also considers the signal at one location in the image. For the presented nonlinear image reconstruction algorithm, this limitation does not impact the utility of the metric because the TV regularization is applied isotropically over the image and results are not expected to change appreciably for different signal locations. Regularization techniques that involve spatially varying weighting need to consider either multiple SKE/BKE detection tasks with signals at different locations or a signal-known-statistically (SKS) detection task where the signal location is drawn from a spatially uniform probability distribution.
5. CONCLUSION
We have developed and presented an image quality metric that is sensitive to the removal of subtle details in the image and that can be applied to the nonlinear TV-LSQ image reconstruction algorithm. The metric is based on the detection of a small signal at the border of detectability by the ideal observer, and this metric is hypothesized to quantify the subjective visual removal of subtle image details. The design of the proposed detection task, the use of the ideal observer, and the connection with the 2-AFC experiment make the metric easy to interpret. The detectability index, which is an estimate of a property of an ensemble of reconstructed images, is connected to single image realizations through its interpretation as a PC on a 2-AFC experiment. Loss of detectability through the image reconstruction process, i.e., PCimage < PCdata, unambiguously represents a quantitative decrease in the ability to distinguish signal-present and signal-absent images. The bounds on this metric are clear: 0.5 ≤ PCimage ≤ PCdata, where the lower limit of 0.5 represents guessing on the 2-AFC experiment and the upper limit is the analytically known PCdata.
Correspondence between PCimage and visual assessment of the reconstructed images in the breast CT simulation shows that this metric may serve to quantify TV-LSQ overregularization. A decrease in this metric is seen to coincide with loss of the borderline signals in the CD insert and with patchiness in the appearance of the images. The metric is seen to be complementary to widely used image fidelity metrics such as image RMSE, and it may help to provide an objective means of establishing useful tomographic system parameter settings. The presented methodology may also prove useful for quantifying overregularization with other nonlinear image reconstruction techniques.
ACKNOWLEDGMENT
This work is supported in part by NIH Grant Nos. R01-EB026282, R01-EB023968, R01-EB020604, R01-EB028652 and The University of Chicago Women’s Board. The computational resources for this work are funded in part by the NIH S10-OD025081, S10-RR021039, and P30-CA14599 awards. The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health.
A. APPENDIX
A.1. The CPPD algorithm for TV-LSQ
The CPPD algorithm23,24 can be used to efficiently solve non-smooth convex optimization problems for CT image reconstruction.25 We provide the pseudocode for CPPD-TV-LSQ in Algorithm 1.

[Algorithm 1: pseudocode for the CPPD solver of the TV-LSQ problem, Eq. (4).]
In the CPPD-TV-LSQ algorithm, the parameters νs and νg normalize the linear transforms X and D:

$$ \nu_s = \frac{1}{\| X \|_2}, \qquad \nu_g = \frac{1}{\| D \|_2}, $$

where the ℓ2-norm of a matrix is its largest singular value. This scaling is performed so that algorithm efficiency is optimized and so that results are independent of the physical units used in implementing X and D. Note that the sinogram g in line 5 and the TV constraint parameter γ in lines 8 and 9 must also be multiplied by νs and νg, respectively. The step-size parameters σ and τ must satisfy the inequality

$$ \sigma \tau L^2 \le 1, \tag{7} $$

where L is the matrix norm of A, which is constructed by stacking νsX on νgD:

$$ A = \begin{pmatrix} \nu_s X \\ \nu_g D \end{pmatrix}, \qquad L = \| A \|_2. $$

Due to the normalization of X and D and the fact that X and D approximately commute, L should be close to 1. The step-size inequality, Eq. (7), is satisfied with equality by setting

$$ \sigma = \frac{\rho}{L}, \qquad \tau = \frac{1}{\rho L}, $$

where the step-size ratio ρ is a free parameter that must be tuned because it can strongly impact CPPD convergence behavior. For all the simulations presented in this work, the step-size ratio is set to ρ = 1.
The “solve” function at line 9 returns the value of β that solves the equation written in its second argument, and “shrink” is defined component-wise as

$$ \big[\text{shrink}(p, \beta)\big]_i = \max(p_i - \beta,\, 0), $$

where i is an index for the components of p. Solution of the equation at line 9 can be implemented by bisection, because ∥shrink(p, β)∥1 decreases monotonically as β increases and the root of the equation is bracketed in the interval [0, max(p)], where max(p) is the largest component of p.
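A minimal Python sketch of shrink and the bisection solve (p is taken to be component-wise non-negative, consistent with the bracketing interval [0, max(p)]; names are illustrative):

```python
import numpy as np

def shrink(p, beta):
    # Component-wise soft-thresholding of the non-negative vector p.
    return np.maximum(p - beta, 0.0)

def solve_beta(p, gamma, tol=1e-10, max_iter=100):
    # Bisection for beta such that ||shrink(p, beta)||_1 = gamma.
    # The l1 norm decreases monotonically from ||p||_1 (at beta = 0)
    # to 0 (at beta = max(p)), so the root is bracketed.
    if np.sum(p) <= gamma:
        return 0.0  # constraint already satisfied; no shrinkage needed
    lo, hi = 0.0, float(np.max(p))
    for _ in range(max_iter):
        beta = 0.5 * (lo + hi)
        if np.sum(shrink(p, beta)) > gamma:
            lo = beta
        else:
            hi = beta
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```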
Footnotes
CONFLICTS OF INTEREST
The authors have no conflicts to disclose.
Contributor Information
Emil Y. Sidky, Department of Radiology, The University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637, USA.
John Paul Phillips, Department of Radiology, The University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637, USA.
Weimin Zhou, Department of Bioengineering, University of Illinois at Urbana-Champaign, 1406 W. Green St., Urbana, IL 61801, USA.
Greg Ongie, Department of Mathematical and Statistical Sciences, Marquette University, 1313 W. Wisconsin Ave., Milwaukee, WI 53233, USA.
Juan P. Cruz-Bastida, Department of Radiology, The University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637, USA
Ingrid S. Reiser, Department of Radiology, The University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637, USA
Mark A. Anastasio, Department of Bioengineering, University of Illinois at Urbana-Champaign, 1406 W. Green St., Urbana, IL 61801, USA
Xiaochuan Pan, Department of Radiology, The University of Chicago, 5841 S. Maryland Ave., Chicago, IL 60637, USA.
REFERENCES
1. Elbakri IA, Fessler JA. Statistical image reconstruction for polyenergetic x-ray computed tomography. IEEE Trans Med Imaging. 2002;21:89–99.
2. McCollough CH, Primak AN, Braun N, Kofler J, Yu L, Christner J. Strategies for reducing radiation dose in CT. Radiol Clinics. 2009;47:27–40.
3. Sidky EY, Pan X. Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization. Phys Med Biol. 2008;53:4777–4807.
4. Chen G-H, Tang J, Leng S. Prior image constrained compressed sensing (PICCS): a method to accurately reconstruct dynamic CT images from highly undersampled projection data sets. Med Phys. 2008;35:660–663.
5. Ritschl L, Bergner F, Fleischmann C, Kachelriess M. Improved total variation-based CT image reconstruction applied to clinical data. Phys Med Biol. 2011;56:1545.
6. Batenburg KJ, Sijbers J. DART: a practical reconstruction algorithm for discrete tomography. IEEE Trans Image Process. 2011;20:2542–2553.
7. Gupta H, Jin KH, Nguyen HQ, McCann MT, Unser M. CNN-based projected gradient descent for consistent CT image reconstruction. IEEE Trans Med Imaging. 2018;37:1440–1453.
8. Adler J, Öktem O. Learned primal-dual reconstruction. IEEE Trans Med Imaging. 2018;37:1322–1332.
9. Barrett HH, Myers KJ. Foundations of Image Science. Hoboken, NJ: Wiley; 2004.
10. Abbey CK, Barrett HH, Wilson DW. Observer signal-to-noise ratios for the ML-EM algorithm. In: Kundel HL, ed. Medical Imaging 1996: Image Perception. Vol. 2712. Proc SPIE; 1996:47–58.
11. Abbey CK, Barrett HH. Human- and model-observer performance in ramp-spectrum noise: effects of regularization and object variability. JOSA A. 2001;18:473–488.
12. Wunderlich A, Noo F. Image covariance and lesion detectability in direct fan-beam x-ray computed tomography. Phys Med Biol. 2008;53:2471–2493.
13. Das M, Gifford HC, O'Connor JM, Glick SJ. Penalized maximum likelihood reconstruction for improved microcalcification detection in breast tomosynthesis. IEEE Trans Med Imaging. 2010;30:904–914.
14. Sanchez AA, Sidky EY, Reiser I, Pan X. Comparison of human and Hotelling observer performance for a fan-beam CT signal detection task. Med Phys. 2013;40:031104.
15. Gang GJ, Stayman JW, Zbijewski W, Siewerdsen JH. Task-based detectability in CT image reconstruction by filtered backprojection and penalized likelihood estimation. Med Phys. 2014;41:081902.
16. Sanchez AA, Sidky EY, Pan X. Task-based optimization of dedicated breast CT via Hotelling observer metrics. Med Phys. 2014;41:101917.
17. Xu J, Fuld MK, Fung GSK, Tsui BMW. Task-based image quality evaluation of iterative reconstruction methods for low dose CT using computer simulations. Phys Med Biol. 2015;60:2881–2901.
18. Rose SD, Sanchez AA, Sidky EY, Pan X. Investigating simulation-based metrics for characterizing linear iterative reconstruction in digital breast tomosynthesis. Med Phys. 2017;44:e279–e296.
19. Liu Y, Liang Z, Ma J, et al. Total variation-Stokes strategy for sparse-view x-ray CT image reconstruction. IEEE Trans Med Imaging. 2013;33:749–763.
20. Niu S, Gao Y, Bian Z, et al. Sparse-view x-ray CT reconstruction via total generalized variation regularization. Phys Med Biol. 2014;59:2997–3017.
21. Boone JM, Kwan ALC, Seibert JA, Shah N, Lindfors KK, Nelson TR. Technique factors and their relationship to radiation dose in pendant geometry breast CT. Med Phys. 2005;32:3767–3776.
22. Jørgensen JS, Sidky EY. How little data is enough? Phase-diagram analysis of sparsity-regularized x-ray computed tomography. Philos Trans R Soc A. 2015;373:20140387.
23. Chambolle A, Pock T. A first-order primal-dual algorithm for convex problems with applications to imaging. J Math Imaging Vis. 2011;40:120–145.
24. Pock T, Chambolle A. Diagonal preconditioning for first order primal-dual algorithms in convex optimization. In: International Conference on Computer Vision (ICCV 2011); 2011:1762–1769.
25. Sidky EY, Jørgensen JH, Pan X. Convex optimization problem prototyping for image reconstruction in computed tomography with the Chambolle-Pock algorithm. Phys Med Biol. 2012;57:3065–3091.
26. Gallas BD, Barrett HH. Validating the use of channels to estimate the ideal linear observer. JOSA A. 2003;20:1725–1738.
27. Zhou W, Li H, Anastasio MA. Approximating the ideal observer and Hotelling observer for binary signal detection tasks by use of supervised learning methods. IEEE Trans Med Imaging. 2019;38:2456–2468.
28. Chen W, Gallas BD, Yousef WA. Classifier variability: accounting for training and testing. Pattern Recognit. 2012;45:2661–2671.
29. Reiser I, Nishikawa RM. Task-based assessment of breast tomosynthesis: effect of acquisition parameters and quantum noise. Med Phys. 2010;37:1591–1600.
30. Schmidt TG, Barber RF, Sidky EY. A spectral CT method to directly estimate basis material maps from experimental photon-counting data. IEEE Trans Med Imaging. 2017;36:1808–1819.
