Published in final edited form as: IEEE Trans Med Imaging. 2019 Apr 11;38(10):2469–2481. doi: 10.1109/TMI.2019.2910760

Learning to Reconstruct Computed Tomography (CT) Images Directly from Sinogram Data under A Variety of Data Acquisition Conditions

Yinsheng Li 1, Ke Li 1,2, Chengzhu Zhang 1, Juan Montoya 1, Guang-Hong Chen 1,2

Abstract

Computed tomography (CT) is widely used in medical diagnosis and non-destructive detection. Image reconstruction in CT aims to accurately recover pixel values from measured line integrals, i.e., the summed pixel values along straight lines. Provided the acquired data satisfy the data sufficiency condition as well as other conditions regarding the view angle sampling interval and the severity of transverse data truncation, researchers have discovered many solutions to accurately reconstruct the image. However, if these conditions are violated, accurate image reconstruction from line integrals remains an intellectual challenge. In this paper, a deep learning method with a common network architecture, termed iCT-Net, was developed and trained to reconstruct images with high quantitative accuracy for both previously solved and unsolved CT reconstruction problems. In particular, accurate reconstructions were achieved when the sparse view reconstruction problem (i.e., the compressed sensing problem) is entangled with the classical interior tomographic problem.

Keywords: Image reconstruction, Deep learning, Sparse-view, Interior tomography

I. INTRODUCTION

The reconstruction of a function in N-dimensional space from its integral values over a K-dimensional hyperplane (1 ≤ K < N) is a central topic in integral geometry [1], [2]. The importance of integral geometry in our daily life can be appreciated by noting that the data acquired in x-ray medical computed tomography (CT) are essentially line integrals through the human body. These line integral data (i.e., integral values for K = 1) are acquired at different view angles as the tube-detector assembly rotates from one angular position to another. Image reconstruction from line integrals is also central to other imaging modalities [3], [4] such as Single Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET).

In an ideal scenario, when acquired line integral data can be converted to properly fill the corresponding Fourier space of the image function, the modern filtered back projection (FBP) [5] solution can be readily derived using the inverse Fourier transform; it is essentially equivalent to the solution discovered by Radon [6], [7] in 1917. However, the Fourier transform related FBP reconstruction method is rather restrictive [8]. Due to the quasi-local nature of the information encoding process (i.e., the acquisition of line integral data only involves the function values along a straight line) as well as the use of divergent beam acquisition geometry in CT, there are many other solutions [9], [10] that exactly reconstruct the image function. Interestingly, these solutions are not mathematically equivalent to one another, and they even enable one to accurately reconstruct a region of interest (ROI) inside the scan field of view (FOV) [11]–[15] under much more relaxed data acquisition conditions, e.g., the super-short scan problem. In this case, it is important to note that data are missing in Fourier space and thus the Fourier based FBP methods fail to accurately reconstruct the image. Furthermore, if all of the acquired line integral data are truncated, the intrinsic connection with the Fourier transform of the image object fails completely. For this so-called interior problem [3], [4], it has been mathematically proven [16]–[18] that a stable solution does exist under certain conditions, although no analytical inversion formula has yet been discovered for this case.

The reconstruction problem with line integral data becomes even more difficult when the data acquisition view angles are sparse. Although the so-called compressed sensing (CS) theory [19], [20] provides a mathematical foundation for this sparse view reconstruction problem, when the super-short scan and interior problems in CT are combined with sparse view acquisitions, it remains unknown whether it is possible to accurately reconstruct either the entire image or local ROIs within the FOV. Additionally, the inevitable noise contamination in data acquisition further complicates image reconstruction from line integral data.

Inspired by the breakthroughs of deep learning [21]–[24] in computer vision and natural language processing, its success in computer games [25]–[27], physics [28], [29], and chemistry [30], and its recent application to tomographic image reconstruction in MRI, CT, and other modalities [31], [32], one may wonder whether deep learning can be employed to accurately reconstruct images not only for those line integral reconstruction problems that have already been solved through human knowledge, but also for those that have not yet been solved, such as the interior tomographic problem with sparse view angles. In this work, we developed a deep neural network, referred to as the intelligent CT network (iCT-Net), and demonstrated that iCT-Net can be trained to reconstruct images with high quantitative accuracy from either complete or incomplete line integral data, including problems that have not been solved, or have not been satisfactorily solved, by human knowledge.

II. NETWORK ARCHITECTURE AND TRAINING STRATEGIES

A. Deep Learning Neural Network Architecture

When x-ray photons interact with an image object to encode the structural information of that object into measured line integral data, quantum noise caused by the intrinsic photon number fluctuations is inherent in the measured data. Therefore, uncertainty is inevitable in the acquired line integral data in x-ray CT, and thus it is natural to use a statistical framework to address the image reconstruction problem. In this framework, an image estimate x̂ is defined as the image that maximizes the posterior conditional probability P(x|y) given the measured line integral data y ∈ Y, where y denotes an individual line integral datum in sinogram space, denoted Y. This is accomplished via Bayesian inference by solving the optimization problem:

$$\hat{x} = \underset{x}{\arg\max}\; P(x \mid y) = \underset{x}{\arg\max}\; P(y \mid x)\,P(x) \qquad (1)$$

This method requires an explicit assumption about the a priori distribution P(x). In statistical machine learning, instead of using an explicit assumption on the prior P(x), the posterior distribution P(x|y) is learned directly from the training data via a supervised learning process [33]. In this process, a sample xi is drawn from the output training image data set and a sample yi is drawn from the input training line integral data set. The data pairs (yi, xi) are used to train the iCT-Net in this work to learn a map f: Y → X (X denotes image space), i.e., a map directly from sinogram space to image space, such that the learned model distribution Q(x|y; f) best approximates the underlying posterior distribution P(x|y). Once the map f: Y → X is learned, it is applied to predict an image output from input projection data not used in the training process.

The design of our iCT-Net was inspired by the current FBP based CT imaging pipeline, which consists of three major cascaded steps: the first step corrects the measured signals to account for erroneous detector counts caused by a variety of physical factors such as excessive noise and beam hardening; the second step filters the corrected data with an apodized ramp filter; and the third step backprojects the filtered data to accomplish the domain transform from line integral space to tomographic image space. In the iCT-Net architecture, multi-channel convolutional neural layers were designed not only to maintain the primary functionality of each of these three steps, but also to enable iCT-Net to address difficult image reconstruction problems such as view angle truncation, view angle undersampling, and interior problems using the same architecture. Specifically, iCT-Net consists of four major cascaded components, as shown in Figure 1: (1) Convolutional layers (L1–L5) suppress excessive noise in the line integral data and convert a sparse view sinogram into a dense view sinogram. These layers accomplish a manifold learning process, i.e., they learn a noise-reduced and complete data manifold from a noise contaminated and sparse view data manifold. This component is analogous to the signal correction step in the conventional FBP based CT imaging pipeline. (2) Convolutional layers (L6–L9) learn high level feature representations from the output of the L5 layer. This component is analogous to the filtering step. (3) A fully connected layer, L10, performs a domain transform from the extracted data feature space to image space. (4) Layers L11 and L12 learn a combination of the partial images from each view angle to generate a final image. These final two components are analogous to the backprojection and summation steps in the conventional pipeline, but with learnable summation weights to account for potential data redundancy and for differences caused by the completely different filtering strategies used in iCT-Net. Parameters in all layers are learned directly from the input data and training images in the training data set. The iCT-Net architecture enables reconstruction of images with a 512 × 512 matrix since the number of parameters is on the order of O(N² × Nc), in contrast to O(N⁴) in other architectures [32]. Here, N denotes the image matrix size and Nc denotes the number of detector elements.
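
To make component (1) concrete, the following is a minimal sketch of the L1–L5 layers in the Keras functional API. Kernel shapes and threshold values follow the text; the permutation trick that treats view angles as channels for L4–L5, and all variable names, are our illustrative assumptions rather than the authors' implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

Nc, Nv = 888, 644  # detector elements and view angles (values from the text)

def hard_shrink(lam):
    # S_lambda from Eq. (2): keep values whose magnitude exceeds lambda.
    return lambda t: tf.where(tf.abs(t) > lam, t, tf.zeros_like(t))

sino = layers.Input(shape=(Nc, Nv, 1))  # acquired sinogram

# L1-L3: convolutions along the detector axis (noise suppression).
h1 = layers.Conv2D(64, (3, 1), padding='same', activation=hard_shrink(1e-5))(sino)
h2 = layers.Conv2D(64, (3, 1), padding='same', activation=hard_shrink(1e-5))(h1)
cat = layers.Concatenate()([sino, h1, h2])  # 1 + 64 + 64 = 129 input channels
h3 = layers.Conv2D(1, (3, 1), padding='same', activation=hard_shrink(1e-5))(cat)

# L4-L5: 1x1 convolutions that mix view angles (views moved to the channel
# axis), which is where a sparse view sinogram becomes a dense view one.
v = layers.Permute((1, 3, 2))(h3)  # (Nc, 1, Nv)
v = layers.Conv2D(Nv, (1, 1), activation=hard_shrink(1e-8))(v)  # L4, alpha1 = 1
v = layers.Conv2D(Nv, (1, 1), activation=hard_shrink(1e-8))(v)  # L5, alpha2 = 1
dense_sino = layers.Permute((1, 3, 2))(v)  # back to (Nc, Nv, 1)

# L6-L12 (filtering, domain transform, rotation, summation) follow here;
# they are omitted to keep this sketch short.
component1 = Model(sino, dense_sino)
```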

Fig. 1.

Architecture of iCT-Net. The proposed deep neural network consists of a total of 12 layers (L1–L12). The L11 layer is a frozen layer, which means that parameters in this layer are not updated in the training process. Both linear and nonlinear activations are used as indicated in the graphics. Sλ is a hard thresholding activation function defined in Eq. (2).

As shown in Figure 1, iCT-Net takes an acquired sinogram with dimensions Nc × Nv and generates a CT image with a matrix size of N × N (N = 512) via a twelve-layer deep neural network. Here Nv denotes the number of view angles. Specifics of each of the twelve layers in iCT-Net are described as follows.

L1–L5 are five convolutional layers. L1–L3 operate along the dimension of detector elements while L4 and L5 operate along the dimension of view angles. The L1 layer has 64 convolutional kernels, each with a dimension of 3 × 1 × 1, followed by a hard shrinkage operator (Sλ) as the activation function, which is defined as:

$$S_\lambda(\text{output}) = \begin{cases} \text{output}, & |\text{output}| > \lambda \\ 0, & |\text{output}| \le \lambda, \end{cases} \qquad (2)$$

where λ is the threshold value. The L2 layer has another 64 convolution kernels, each with a dimension of 3 × 1 × 64, followed by Sλ as the activation. In order to learn new features from the outputs of the L1 and L2 layers, the original input and the feature outputs of the first two layers were concatenated to form the input to the L3 layer. The L3 layer has a single-channel convolution kernel with a dimension of 3 × 1 × 129, followed by Sλ as the activation. The hyperparameter was empirically selected to be λ = 1 × 10−5 for the L1–L3 layers. In the L4 layer, there are α1Nv convolutional kernels with a dimension of 1 × 1 × Nv, followed by an Sλ activation. In the L5 layer, there are α2Nv convolutional kernels with a dimension of 1 × 1 × α1Nv, followed by another Sλ activation. A hyperparameter value of λ = 1 × 10−8 was used in the L4 and L5 layers; α1 = α2 = 1 was selected for the dense view reconstruction problem, while α1 = 2 and α2 = 4 were empirically selected for the sparse view reconstruction problem with a factor of four view angle undersampling.
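
As a quick numerical illustration of Eq. (2), the hedged NumPy snippet below shows the hard shrinkage behavior on a few sample values (the function name is ours):

```python
import numpy as np

def hard_shrink(x, lam):
    # Eq. (2): pass values whose magnitude exceeds lambda, zero the rest
    # (unlike soft thresholding, surviving values are not shrunk).
    return np.where(np.abs(x) > lam, x, 0.0)

x = np.array([-3e-5, -8e-6, 0.0, 2e-6, 4e-5])
print(hard_shrink(x, lam=1e-5))  # -> [-3.e-05  0.  0.  0.  4.e-05]
```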

L6–L10 are another five convolutional layers. In the L6 layer, there is one kernel with a dimension of Nc × α2Nv × 1, followed by a linear activation. In the L7 layer, there are sixteen kernels with a dimension of β × 1 × 1, each followed by a hyperbolic tangent (tanh) activation. In the L8 layer, there is one kernel with dimensions of β × 1 × 16, followed by a tanh activation. In the L9 layer, there are Nc kernels with dimensions of 1 × 1 × Nc, followed by a tanh activation. Finally, in the L10 layer, there are N² kernels with dimensions of 1 × 1 × Nc, followed by a linear activation. Hyperparameters N = 512 and Nc = 888 were selected for the non-interior reconstruction problems, while Nc = 222 was selected for the interior problem with a ∅ = 12.5 cm FOV.

Kernels with stride one were used for all convolutional layers. All layers were designed with bias terms except for the L6, L10, and L12 layers. Convolution operations in all convolutional layers were performed with padding to maintain the dimensionality before and after the convolution operations.

The L11 and L12 layers generate the final image. The output of the L10 layer has dimensions α2Nv × N². For each of the α2Nv channels, the N² values were reshaped into a matrix with a size of N × N. The matrix was then rotated around its center by an angle ϕi = (α2Nv − i)Δϕ, (i = 1, 2, ⋯, α2Nv), followed by a bilinear interpolation to ensure that the rotated matrix stays on a Cartesian grid. The hyperparameter Δϕ = π/492 was selected in this work. The rotated matrix was then reshaped back into a column vector with dimension N². The L12 layer combines the contributions from each of the α2Nv channels via a convolution kernel with dimension 1 × 1 × α2Nv, followed by a linear activation, to generate the final image with size N². Note that the introduction of a separate rotation layer (L11) reduces the number of learnable parameters in L10 from α2Nv × Nc × N² to Nc × N² and makes L10 trainable within the limited GPU memory available on personal computers.
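
The sketch below illustrates the L11 rotation step under stated assumptions: the partial-image stack layout is assumed, and SciPy's bilinear rotation stands in for the TensorFlow-based rotation that the paper describes later in Section II-C.

```python
import numpy as np
from scipy.ndimage import rotate

def rotate_partial_images(partial, dphi=np.pi / 492):
    # partial: stack of shape (n_channels, N, N); channel i holds the
    # partial image for view angle phi_i = (n_channels - i) * dphi.
    n = partial.shape[0]
    out = np.empty_like(partial)
    for i in range(1, n + 1):
        phi_deg = np.degrees((n - i) * dphi)
        # reshape=False keeps the N x N Cartesian grid; order=1 is bilinear.
        out[i - 1] = rotate(partial[i - 1], phi_deg, reshape=False, order=1)
    return out

# L12 then combines the rotated channels with learned 1x1 weights; with all
# weights equal to one this reduces to a plain sum: rotated.sum(axis=0).
```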

To help keep track of the number of training parameters and the dimension of each layer, these parameters are summarized in Figure 2. In each entry of this table, the first number denotes the number of kernels and the tuple following the comma denotes the dimension of the kernel used in that layer. For example, (64, 3 × 1 × 1) in the L1 layer means that there are 64 kernels with dimensions 3 × 1 × 1.

Fig. 2.

Number of kernels and kernel dimensions as well as the corresponding output in all twelve iCT-Net layers.

B. Training strategies

To maximize the potential generalizability of the trained iCT-Net, training datasets should be maximally expanded to include a wide variety of human anatomy at a wide variety of x-ray exposure levels. Although it is possible to access anonymized clinical CT image data with a variety of human and animal anatomy, it is very difficult to obtain data at a wide variety of radiation dose levels. Additionally, the quality of training data acquired from real CT scanners may be compromised by physical confounding factors such as beam hardening, scatter, the x-ray tube heel effect, and the limited dynamic range of x-ray detectors. To minimize the impact of these confounding factors without compromising the applicability of the trained iCT-Net in experimental evaluations, a two-stage training strategy was used in this study. The first training stage was performed using numerical simulation data and the second training stage was performed using experimental data acquired from a 64-slice MDCT scanner (Discovery CT750 HD, GE Healthcare, Waukesha, WI).

1). Stage-1 Training:

This stage includes a segment-by-segment pre-training phase followed by an end-to-end training phase. The pre-training for the L1–L3 segment was performed using paired training data with low dose (high noise) projection data as input and high dose (low noise) projection data as output. The L4–L5 segment was pre-trained using sinograms with sparse view angles as input and sinograms with dense view angles as output. The L7–L9 segment was pre-trained using sinogram data with dense view angles as input and the corresponding sinograms filtered with a conventional Ram-Lak filter as output. Note that for the interior problem, the input sinogram data are truncated, but the output data used in pre-training are the correspondingly truncated portion of the filtered data generated by applying the Ram-Lak filter to the non-truncated data. In the segment-by-segment pre-training phase, the weights were initialized with Glorot uniform [34] random numbers and the biases were initialized to zero. The batch size was fixed at 100 in each segment-by-segment pre-training phase and 100 epochs were used as the empirical stopping criterion. The loss function in all training stages is the correspondingly defined mean squared error.

The number of training samples for the pre-training of each segment was 3,747,072 for the L1–L3 segment, 3,381,504 for the L4–L5 segment, 3,747,072 for the L7–L9 segment, and 3,747,072 for L10.

After pre-training all segments, input sinogram data and output reconstructed images were used to perform the end-to-end training of the iCT-Net using simulated projection data.
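
A hedged sketch of one segment-by-segment pre-training step (the L1–L3 denoising segment) is given below. The stand-in model and the random data arrays are illustrative only, while the MSE loss, Glorot-uniform weight initialization (the Keras default), zero biases, and batch size follow the text:

```python
import numpy as np
from tensorflow.keras import layers, Model

# Stand-in L1-L3 denoising segment (detector-direction convolutions only).
inp = layers.Input(shape=(888, 644, 1))
x = layers.Conv2D(64, (3, 1), padding='same')(inp)
x = layers.Conv2D(64, (3, 1), padding='same')(x)
x = layers.Conv2D(1, (3, 1), padding='same')(x)
segment = Model(inp, x)

# MSE loss as stated; Glorot-uniform weights and zero biases are the
# Keras Conv2D defaults, matching the described initialization.
segment.compile(optimizer='sgd', loss='mse')

# Paired low-dose (input) -> high-dose (target) sinograms; random stand-ins.
low = np.random.rand(100, 888, 644, 1).astype('float32')
high = np.random.rand(100, 888, 644, 1).astype('float32')
segment.fit(low, high, batch_size=100, epochs=1)  # text: batch 100, 100 epochs
```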

2). Stage-2 Training:

This stage is an end-to-end training step using experimental phantom data and human subject data acquired from a 64-slice MDCT scanner (Discovery CT750 HD, GE Healthcare, Waukesha, WI). Projection data of an anthropomorphic abdominal phantom (CIRS, Norfolk, Virginia) were acquired at different radiation dose levels and projection data from 58 human subject cases were used to perform the Stage-2 training for the Stage-1 trained iCT-Net.

The loss function was minimized using the standard stochastic gradient descent technique with a learning rate of 1 × 10−3 and a decay factor of 1 × 10−6. Thirty epochs were used as an empirical stopping criterion, the batch size was fixed at 3, and the training samples were randomly shuffled. The change of the loss function per epoch for both training and validation was carefully monitored to ensure that there was no overfitting in the entire training process.

Training was performed using Keras [35] with the TensorFlow [36] backend on a single Graphics Processing Unit (GPU) (NVIDIA Quadro P6000, Santa Clara, CA). After the iCT-Net was trained, it took 0.14 seconds (averaged over all tested data conditions) to reconstruct a single image slice.
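
For concreteness, the stated Stage-2 optimizer settings translate into Keras roughly as sketched below. The dummy model and random arrays merely stand in for the trained iCT-Net and the experimental data; note that `decay` is the legacy Keras learning-rate decay argument, which newer Keras versions replace with a learning-rate schedule.

```python
import numpy as np
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import SGD

# A dummy single-layer model stands in for the trained iCT-Net here.
inp = layers.Input(shape=(888, 644, 1))
model = Model(inp, layers.Conv2D(1, (3, 3), padding='same')(inp))

# Settings stated above: SGD, learning rate 1e-3, decay 1e-6, batch size 3,
# 30 epochs, shuffled samples.
model.compile(optimizer=SGD(learning_rate=1e-3, decay=1e-6), loss='mse')
sinos = np.random.rand(6, 888, 644, 1).astype('float32')
imgs = np.random.rand(6, 888, 644, 1).astype('float32')
model.fit(sinos, imgs, batch_size=3, epochs=30, shuffle=True)
```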

C. Backpropagation in iCT-Net

In this work, a target function f that generates an image vector $x \in \mathbb{R}^{N^2}$ from an input data vector $y \in \mathbb{R}^M$ is approximated by a feedforward deep network, i.e., a multi-layer composition of nonlinear mappings, $x = f(y) \approx \hat{f}(y) = h^{(L)} \circ h^{(L-1)} \circ \cdots \circ h^{(l)} \circ \cdots \circ h^{(1)}(y)$, where l ∈ {1, 2, …, L} denotes the layer index and L denotes the total number of layers. The output of each layer is the input to the next:

$$y_{c_l}^{(l)} = \zeta^{(l)}\!\left( \sum_{c_{l-1}} W_{c_{l-1}, c_l}^{(l)}\, y_{c_{l-1}}^{(l-1)} + b_{c_l}^{(l)} \right), \qquad (3)$$

where $\zeta^{(l)}$ denotes the activation function for the l-th layer, $c_l \in \{1, 2, \ldots, C_l\}$ denotes the feature channel index in the l-th layer, and $C_l$ denotes the total number of features in the l-th layer. $y_{c_l}^{(l)}$ denotes the $c_l$-th feature in the l-th layer, $W_{c_{l-1}, c_l}^{(l)}$ denotes the l-th layer linear mapping that transforms the $c_{l-1}$-th feature of the previous layer to the $c_l$-th feature of the current layer, and $b_{c_l}^{(l)}$ denotes the bias in the l-th layer. To simplify the notation and help avoid confusion, a compact form without subscript indices is introduced to denote the input-output relationship at the l-th layer: $y^{(l)} = h^{(l-1)}(y^{(l-1)})$.

Using the above notation, the output image $\hat{x} := y^{(L)} = \hat{f}(y)$ is parameterized by the weights $\{W_{c_{l-1}, c_l}^{(l)}\}$ and biases $\{b_{c_l}^{(l)}\}$. Using the mean squared error as the goodness metric, a loss function is defined and the unknown weight and bias parameters are optimized by solving the following problem:

$$\{\hat{W}_{c_{l-1}, c_l}^{(l)}, \hat{b}_{c_l}^{(l)}\} = \arg\min \frac{1}{2N_s} \sum_i \left\| \hat{f}(y_i) - x_i \right\|_2^2, \qquad (4)$$

where i ∈ {1, 2, …, Ns} denotes the index of the training sample, and Ns denotes the total number of samples.

To perform the backpropagation procedure for the proposed iCT-Net framework, the gradients in each layer need to be calculated. Most of the gradient computations are similar to those in other well-known convolutional neural network (CNN) models, except that some extra care is needed for the layer with rotation operations (the L11 layer). Nevertheless, the calculations are purely algebraic. Let L denote the loss function and $\Theta^{(l)}$ the unknowns to be learned at the l-th layer. The associated gradient $\partial L / \partial \Theta^{(l)}$ can be obtained through backpropagation as:

$$\frac{\partial L}{\partial \Theta^{(l)}} = \frac{\partial y^{(l+1)}}{\partial \Theta^{(l)}} \cdot \frac{\partial y^{(l+2)}}{\partial y^{(l+1)}} \cdots \frac{\partial y^{(L)}}{\partial y^{(L-1)}} \cdot \frac{\partial L}{\partial y^{(L)}}. \qquad (5)$$

Here we need to calculate four types of gradients:

$$\frac{\partial y^{(l+1)}}{\partial W^{(l)}} = \frac{\partial h^{(l)}(y^{(l)})}{\partial W^{(l)}}; \qquad (6)$$
$$\frac{\partial y^{(l+1)}}{\partial b^{(l)}} = \frac{\partial h^{(l)}(y^{(l)})}{\partial b^{(l)}}; \qquad (7)$$
$$\frac{\partial y^{(l+2)}}{\partial y^{(l+1)}} = \frac{\partial h^{(l+1)}(y^{(l+1)})}{\partial y^{(l+1)}}; \qquad (8)$$
$$\frac{\partial L}{\partial y^{(L-2)}} = R(-\phi_i)\, \frac{\partial L}{\partial y^{(L-1)}}. \qquad (9)$$

The first three gradients were calculated using the numerical routines provided by TensorFlow. In the feedforward path, the rotation operator R(ϕi) rotates the i-th channel of the L10 output by the angle ϕi = (α2Nv − i)Δϕ, (i = 1, 2, …, α2Nv); correspondingly, in the backpropagation path, the operation R(−ϕi) rotates ∂L/∂y^(L−1) by the angle −ϕi to form the gradient at the L11 layer, as in Eq. (9). Image rotation and resampling were implemented using TensorFlow operations and incorporated as a layer in Keras to numerically rotate the image matrix by ϕi (in the feedforward path) or −ϕi (in the backpropagation path) and to resample the rotated matrix using bilinear interpolation.
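
The forward/backward behavior of this frozen rotation layer can be sketched with a custom gradient, as below; wrapping scipy.ndimage.rotate in tf.numpy_function is our stand-in for the TensorFlow resampling ops described above, not the authors' implementation.

```python
import numpy as np
import tensorflow as tf
from scipy.ndimage import rotate as nd_rotate

def _rot(img, deg):
    # Bilinear (order=1) rotation about the image center on a fixed grid.
    return nd_rotate(img, float(deg), reshape=False, order=1).astype(np.float32)

@tf.custom_gradient
def rotation_layer(img, deg):
    out = tf.numpy_function(_rot, [img, deg], tf.float32)
    def grad(dy):
        # Backward pass: rotate the incoming gradient by the opposite angle,
        # mirroring Eq. (9); the fixed angle itself receives no gradient.
        return tf.numpy_function(_rot, [dy, -deg], tf.float32), None
    return out, grad

img = tf.random.normal((64, 64))
with tf.GradientTape() as tape:
    tape.watch(img)
    y = tf.reduce_sum(rotation_layer(img, tf.constant(30.0)))
g = tape.gradient(y, img)  # an all-ones image rotated by -30 degrees
```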

III. MATERIALS AND METHODS

A. Training Data Preparation

To train the proposed iCT-Net, three types of training data were prepared: numerical simulation data, experimental phantom data, and clinical human subject data. The preparation of these training data is described in the following subsections.

1). Numerical Simulation Training Data Acquisitions:

Twenty clinical CT image volumes, each containing 150–250 image slices, were used to generate simulation training data using a standard ray-driven numerical forward projection procedure [37] in a fan-beam geometry. The parameters of the fan-beam acquisition geometry are the same as those used in the 64-slice MDCT scanner. To generate projection data at a variety of noise levels, Poisson noise was added to each simulated projection datum. The mean entrance photon number at the reference (100%) dose level was set to I0 = 1 × 10⁶ per ray. Other reduced-dose datasets were generated with entrance photon fluences of 50%, 25%, 10%, and 5% of I0.

To incorporate the effect of electronic noise, which may be significant at low exposure levels, zero-mean Gaussian distributed random values were added to the projection data before the log-transform used to generate the line integral data. The added Gaussian electronic noise has a noise-equivalent quanta of 10 photons per ray, which is consistent with the typical electronic noise level of the MDCT scanner used in this work.
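
A minimal sketch of this noise model follows, assuming the photon-count interpretation of the stated noise-equivalent quanta (a Gaussian standard deviation of 10 photons) and a simple clip to guard the log transform:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_line_integrals(p, dose_fraction=1.0, I0=1e6, sigma_e=10.0):
    # p: ideal line integrals. Poisson noise models photon statistics at the
    # chosen dose fraction (100%, 50%, 25%, 10%, or 5% of I0); zero-mean
    # Gaussian noise with a standard deviation of sigma_e photons models the
    # electronic noise, added to the counts before the log transform.
    I = dose_fraction * I0
    counts = rng.poisson(I * np.exp(-p)).astype(float)
    counts += rng.normal(0.0, sigma_e, size=p.shape)
    counts = np.clip(counts, 1.0, None)  # guard the log against counts <= 0
    return -np.log(counts / I)

p_noisy = noisy_line_integrals(np.full((888, 644), 2.0), dose_fraction=0.25)
```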

2). Experimental Phantom Training Data Acquisitions:

Although our numerical simulations modeled the geometry and physics of the data generation process in a physical CT scanner, no specific effort was made to simulate the tube physics, detector physics, and electronics of physical CT scanners. Therefore, it was important to acquire experimental data from physical scanners to fine-tune the iCT-Net parameters so that the trained network produces the desired reconstruction results for a specific CT scanner. This training stage is referred to as the scanner-specific fine-tuning training process in this paper. Given that the majority of the training tasks were sufficiently completed using the large numerical simulation data set, only the anthropomorphic abdominal phantom (CIRS, Norfolk, Virginia) was scanned with the 64-slice MDCT scanner in our scanner-specific training process. Specifically, the anthropomorphic abdominal phantom was scanned at six mAs levels (6, 14, 28, 56, 112, and 220 mAs) with a tube potential of 120 kV using a clinical abdominal CT scan protocol. The acquired CIRS phantom data were used to further train the entire iCT-Net in an end-to-end manner. Raw sinogram data were generated using proprietary software provided by the manufacturer and then retrospectively sorted into different groups (short-scan, super-short scan, and interior tomography, each with both dense view and sparse view conditions) to train the corresponding iCT-Net parameters in an end-to-end training session.

3). Clinical Human Subject Training Data Acquisitions:

With HIPAA compliance and IRB approval, similar to the generation of experimental phantom data, raw sinogram datasets of 118 human subjects scanned with a coronary CT angiography protocol were retrospectively retrieved. A routine dose CT scan was prescribed to each subject with clinical indications. Among the 118 subjects, 58 subjects were randomly selected, and their projection data corresponding to the central detector row and the FBP reconstructed images were used to train the iCT-Net during the fine-tuning phase. The remaining 60 subjects were used as part of human subject data in our generalizability test.

B. Testing Data Preparation

To test the generalizability of the trained iCT-Net for each reconstruction task, it is critically important to test the reconstruction performance on phantoms and data acquisition conditions different from those used in the training processes. Therefore, in this work, the testing data sets consist of the remaining 60 exams of the available 118-exam coronary CT angiography cohort (58 exams were used in the training stage), as well as 5 additional abdominal CT exams that were never seen by iCT-Net during training. Additionally, an anthropomorphic head phantom was scanned at four available tube potentials to generate data to test the generalizability of iCT-Net to data acquired under different x-ray spectral conditions.

1). Experimental Testing Phantom Data Acquisitions:

To deviate significantly from the anatomical conditions of the abdominal phantom used in the training stage, the generalizability test of iCT-Net was performed using an anthropomorphic head phantom (PH-3 ACS, Kyoto Kagaku, Kyoto, Japan). The head phantom was scanned at each of the four available tube potentials (80, 100, 120 and 140 kV) using the same 64-slice MDCT scanner. The tube current-exposure time product for these testing data acquisitions was 500 mAs.

2). Human Subject Testing Data Acquisitions:

Besides the remaining 60 of the 118 coronary CT angiography exams, which were never used in training, tests were performed, with HIPAA compliance and IRB approval, using 5 additional retrospectively collected abdominal CT exams scanned at 120 kV and 500 mAs to demonstrate the performance generalizability of iCT-Net.

C. Quantitative Accuracy Analysis

Reconstruction accuracy was quantified using two standard metrics, the relative root mean square error (rRMSE) and the structural similarity index metric (SSIM), defined as follows:

$$\text{rRMSE} = \frac{\| x - x_0 \|_2}{\| x_0 \|_2} \times 100\%, \qquad (10)$$

where x denotes the reconstructed image and x0 denotes the corresponding reference image.

$$\text{SSIM}(x, x_0) = \frac{(2\mu_x \mu_{x_0} + a_1)(2\sigma_{x,x_0} + a_2)}{(\mu_x^2 + \mu_{x_0}^2 + a_1)(\sigma_x^2 + \sigma_{x_0}^2 + a_2)}, \qquad (11)$$

where $\mu_x$ denotes the mean value of x and $\sigma_x^2$ denotes the variance of x, with the corresponding quantities defined for the reference image x0. In Eq. (11), $\sigma_{x,x_0}$ is the covariance of x and x0, and a1 = 10−6 and a2 = 3 × 10−6 are two constants used to stabilize the division when the denominator is weak. The size of each ROI used to calculate SSIM is 30 mm × 30 mm.
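
Both metrics are straightforward to compute; a minimal NumPy implementation of Eqs. (10) and (11), evaluated over a given ROI, might look like:

```python
import numpy as np

def rrmse(x, x0):
    # Eq. (10): relative root mean square error, in percent.
    return 100.0 * np.linalg.norm(x - x0) / np.linalg.norm(x0)

def ssim(x, x0, a1=1e-6, a2=3e-6):
    # Eq. (11), evaluated globally over the supplied ROI (30 mm x 30 mm
    # patches in this paper).
    mx, m0 = x.mean(), x0.mean()
    cov = ((x - mx) * (x0 - m0)).mean()
    num = (2 * mx * m0 + a1) * (2 * cov + a2)
    den = (mx**2 + m0**2 + a1) * (x.var() + x0.var() + a2)
    return num / den
```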

In addition to these metrics, line profiles across the images were also used to demonstrate reconstruction accuracy.

IV. RESULTS

Results are presented for the following three CT reconstruction problems of increasing difficulty: (1) the standard short-scan acquisition with and without sparse view angle sampling; (2) the super-short scan acquisition with and without sparse view angle sampling; and (3) the interior tomography reconstruction problem with and without sparse view angle sampling.

A. Reconstruction results for short-scan acquisition mode: dense view and sparse view reconstruction

We first demonstrate that iCT-Net can be trained to accurately reconstruct images from data acquired in a short-scan acquisition mode for both dense view angle sampling (Figure 3) and sparse view angle sampling (Figure 4) conditions. For comparison, reference images were generated using the standard FBP reconstruction with a Ram-Lak filter under the dense view sampling condition. To benchmark the iCT-Net reconstruction performance, an iterative reconstruction technique, referred to as the compressed sensing (CS) reconstruction in this paper, was implemented using total variation regularization. The pseudocode of our implementation can be found in one of our previous publications [38] (see Table 1 of that paper). The corresponding reconstruction parameters are as follows: μ = 1.5 × 10−5 for the dense-view cases and μ = 2 × 10−5 for the sparse-view cases. For all cases, λ = 0.1, s = 0.2, NIter = 15, and Ndenoising = 100 were used. These parameters were empirically optimized to give the most appealing CS reconstruction performance in this paper.

Fig. 3.

The iCT-Net short-scan reconstruction results for the dense view reconstruction problem. Reference images (1st column) were generated with a standard FBP reconstruction with a Ram-Lak filter using data from 644 view angles densely sampled across a short-scan angular range. Both the CS method (2nd column) and the trained iCT-Net (3rd column) were applied to the same input data to reconstruct the corresponding images. The difference images were generated by subtracting the CS and iCT-Net images from the reference image. Images shown in the first row were generated from numerical simulation data without added noise to demonstrate reconstruction accuracy. Images shown in the third row were generated from real human subject data to demonstrate the generalizability of iCT-Net to experimental data. W/L=1000/100 HU for reconstructed images in simulation, W/L=1000/0 HU for reconstructed images in experiment, and 300/0 HU for difference images. Zoom-in figures are presented in Figure 1 (simulation) and Figure 2 (experiment) in the on-line supplemental materials. ROIs inside the images show where the SSIM calculations were performed.

Fig. 4.

The iCT-Net short-scan reconstruction results for the sparse view reconstruction problem. Reference images (1st column) were generated with a standard FBP reconstruction with a Ram-Lak filter using data from 644 view angles densely sampled across a short-scan angular range. FBP images (2nd column) were generated with a standard FBP reconstruction with a Ram-Lak filter using data from 161 view angles sparsely sampled across a short-scan angular range. Both the CS method (3rd column) and the trained iCT-Net (4th column) were applied to the same input data as in the FBP reconstruction to reconstruct the corresponding images. The difference images were generated by subtracting the FBP, CS and iCT-Net images from the reference image. Images shown in the first row were generated from numerical simulation data without added noise to demonstrate reconstruction accuracy. Images shown in the third row were generated from real human subject data to demonstrate the generalizability of iCT-Net to experimental data. W/L=1000/100 HU for reconstructed images in simulation, W/L=1000/0 HU for reconstructed images in experiment, and 300/0 HU for difference images. Zoom-in figures are presented in Figure 3 (simulation) and Figure 4 (experiment) in the on-line supplemental materials. ROIs inside the images show where the SSIM calculations were performed.

Difference images were generated by subtracting the iCT-Net reconstruction results from the corresponding reference image and the rRMSE was calculated using Eq. (10) to assess reconstruction accuracy.

As the first test of the generalizability of iCT-Net with quantifiable reconstruction accuracy, numerical simulation data without added noise were generated from human CT images, and these sinogram data were used directly as input to the trained iCT-Net to reconstruct images. As shown by the reconstructed and difference images presented in Figure 3 and Figure 4, iCT-Net accurately reconstructs images, with lower overall rRMSE values and higher SSIM values than the corresponding CS and FBP reconstructions.

To demonstrate that the trained iCT-Net is able to reconstruct images directly from experimental data, sinogram data from human subject cases were used as input to reconstruct images. As shown in Figure 3 and Figure 4, iCT-Net accurately reconstructs images for the sparse view problem, achieving higher SSIM and lower rRMSE than the corresponding FBP and CS reconstructions.

B. Reconstruction results for super-short-scan acquisition mode: dense view and sparse view reconstruction

We next demonstrate that iCT-Net can be trained to accurately reconstruct images for the super-short scan acquisition with a 180° angular range. In this case, there are missing data in the corresponding Fourier space. Therefore, one cannot expect the conventional short-scan FBP reconstruction to accurately reconstruct the image for either the dense view or sparse view problem. After the training data sets were used to train the same iCT-Net, the trained iCT-Net is able to accurately reconstruct the image content in the same upper half of the FOV, as shown in the images and line profiles (Figure 5 and Figure 6). Note that, according to modern analytical reconstruction theories [8], it is possible to accurately reconstruct image content in half of the FOV for a view angle range of 180°, provided that the view angles are not sparse. The results from one such method [39] (termed LCFBP) are shown in Figure 5 and Figure 6. LCFBP was chosen for comparison because the derivative operations have been totally eliminated in it, so that the sparse view angle scenario does not unfairly penalize the performance of these modern super-short scan reconstruction algorithms.

Fig. 5.

The iCT-Net super-short scan reconstruction results for the dense view reconstruction problem. Reference images (1st column) were generated with a standard FBP method with a Ram-Lak filter and data from 644 view angles densely sampled across a short-scan angular range. Given the 180° view angle range, modern reconstruction theory (LCFBP, shown in the 2nd column) declares that the image content in half of the FOV can be accurately reconstructed if view angles are densely sampled. The CS method (3rd column) and the trained iCT-Net (4th column) were applied to the same input data as in the LCFBP reconstruction to reconstruct the images shown in the same row. Reconstruction accuracy (2nd and 4th rows) is shown by comparing plots of iCT-Net reconstructed image values along a vertical line (Line 1) and a horizontal line (Line 2) crossing the FOV to the corresponding plots from the reference image. The generalizability of iCT-Net to real human subject data is shown in the third row. W/L=1000/100 HU for reconstructed images. Zoom-in figures are presented in Figure 5 (simulation) and Figure 6 (experiment) in the on-line supplemental materials. ROIs inside the images show where the SSIM calculations were performed.

Fig. 6.

The iCT-Net super-short scan reconstruction results for the sparse view reconstruction problem. Reference images (1st column) were generated with a standard FBP method with a Ram-Lak filter and data from 644 view angles densely sampled across a short-scan angular range. Given the 180° view angle range, modern reconstruction theory (LCFBP, shown in the 2nd column) declares that the image content in half of the FOV can be accurately reconstructed if view angles are densely sampled. The CS method (3rd column) and the trained iCT-Net (4th column) were applied to the same input data as in the LCFBP reconstruction to reconstruct the images shown in the same row. Reconstruction accuracy (2nd and 4th rows) is shown by comparing plots of iCT-Net reconstructed image values along a vertical line (Line 1) and a horizontal line (Line 2) crossing the FOV to the corresponding plots from the reference image. The generalizability of iCT-Net to real human subject data is shown in the third row. W/L=1000/100 HU for reconstructed images. Zoom-in figures are presented in Figure 7 (simulation) and Figure 8 (experiment) in the on-line supplemental materials. ROIs inside the images show where the SSIM calculations were performed.

When view angles are sparse, as shown in Figure 6, iCT-Net is able to accurately reconstruct images for the sparse view reconstruction problem. In contrast, strong aliasing artifacts appeared in the LCFBP reconstruction for the sparse view super-short scan reconstruction problem.

Although the CS method may not be strictly applicable to the super-short scan reconstruction problem with dense view or sparse view sampling, out of curiosity, the CS method was blindly applied to the super-short scan input data to reconstruct images as shown in the third column in Figure 5 and Figure 6.

C. Reconstruction results for interior tomography problem: dense view and sparse view reconstruction

Figure 7 and Figure 8 show the iCT-Net reconstruction performance for the interior problem without and with sparse view acquisitions. For the interior problem, mathematical proofs are available [16]–[18] showing that a stable solution exists for an accurate reconstruction of the interior region under the condition that either the function values are known for some interior subregion or the function is known a priori to be piece-wise constant. However, in either case, iterative reconstruction schemes must be employed to incorporate these additional mathematical constraints and regularize the reconstruction. It is important to note that the available mathematical solvability proofs of the interior problem rely on the concept of analytic continuation in complex analysis, which is incompatible with the sparse view condition.

Fig. 7.

The iCT-Net reconstruction results for the dense view interior tomographic reconstruction problem. Reference images (1st column) were generated by first applying a standard FBP method with a Ram-Lak filter at full FOV (∅ = 50 cm) using data from 644 view angles densely sampled across a short-scan angular range. After reconstruction, the central portion of the image corresponding to a truncated FOV (∅ = 12.5 cm) was cropped to generate the reference image for this interior problem. For comparison, the standard FBP method (2nd column) and the CS method (3rd column) were applied to the extrapolated data to reconstruct the images shown in the same row, and the trained iCT-Net (4th column) was applied to the truncated input data to reconstruct images. In addition to the rRMSE and SSIM values, the reconstruction accuracy is also shown by comparing the reconstructed image values along the two straight lines to those of the corresponding reference values. Images shown in the first row were generated from numerical simulation data without added noise to demonstrate reconstruction accuracy. Images shown in the third row were generated from the anthropomorphic chest phantom data to demonstrate the generalizability of iCT-Net to experimental data. W/L=1000/0 HU for reconstructed images. ROIs inside the images show where the SSIM calculations were performed.

Fig. 8.

The iCT-Net reconstruction results for the sparse view interior tomographic reconstruction problem. Reference images (1st column) were generated by first applying a standard FBP method with a Ram-Lak filter at full FOV (∅ = 50 cm) using data from 644 view angles densely sampled across a short-scan angular range. After reconstruction, the central portion of the image corresponding to a truncated FOV (∅ = 12.5 cm) was cropped to generate the reference image for this interior problem. For comparison, the standard FBP method (2nd column) and the CS method (3rd column) were applied to the extrapolated data to reconstruct the images shown in the same row, and the trained iCT-Net (4th column) was applied to the truncated input data to reconstruct images. In addition to the rRMSE and SSIM values, the reconstruction accuracy is also shown by comparing the reconstructed image values along the two straight lines to those of the corresponding reference values. Images shown in the first row were generated from numerical simulation data without added noise to demonstrate reconstruction accuracy. Images shown in the third row were generated from the anthropomorphic chest phantom data to demonstrate the generalizability of iCT-Net to experimental data. W/L=1000/0 HU for reconstructed images. ROIs inside the images show where the SSIM calculations were performed.

Our results in Figure 7 (dense view angle sampling) and Figure 8 (sparse view angle sampling) show that the same iCT-Net can be trained to accurately reconstruct images for the interior problem without explicit use of the aforementioned solvability conditions, and that it accurately reconstructs images for both the dense view and even the sparse view interior problems down to an FOV of diameter ∅ = 12.5 cm, a severe truncation situation.

To benchmark the performance of iCT-Net, extra efforts have been taken to help the FBP and CS reconstructions to perform better. Specifically, the values at the edges of the measured sinogram data were extrapolated to fill the truncated area on a view-by-view basis. Values at the truncated area were estimated by assuming an elliptical curve such that the extrapolated value smoothly drops to zero. Both standard FBP and CS methods were applied to the extrapolated sinogram and results are presented in Figures 7 and 8. Note that iCT-Net was directly applied to the truncated sinogram, and no data extrapolation was performed.
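
A sketch of this extrapolation scheme is given below. The exact elliptical falloff is not specified beyond the description above, so the quarter-ellipse profile and the padding width are illustrative choices:

```python
import numpy as np

def extrapolate_view(row, n_pad):
    # Pad one truncated projection with a smooth falloff: the padded values
    # trace a quarter-ellipse from each edge value down to zero over n_pad
    # detector cells.
    t = np.linspace(0.0, 1.0, n_pad)
    falloff = np.sqrt(np.clip(1.0 - t**2, 0.0, None))  # 1 -> 0, elliptical
    left = row[0] * falloff[::-1]   # rises from ~0 up to the left edge value
    right = row[-1] * falloff       # decays from the right edge value to 0
    return np.concatenate([left, row, right])

def extrapolate_sinogram(sino, n_pad):
    # Apply view by view; sino has shape (n_detectors, n_views).
    return np.stack([extrapolate_view(sino[:, v], n_pad)
                     for v in range(sino.shape[1])], axis=1)
```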

D. Generalizability of iCT-Net to other data conditions

The training of iCT-Net was performed using coronary CT angiography data sets. To test whether iCT-Net truly learned to reconstruct CT images under generic data conditions, the sinogram data acquired from the anthropomorphic head phantom at 80, 100, 120 and 140 kV tube potentials were directly reconstructed by the trained iCT-Net under the short-scan condition. As shown in Figure 9, the trained iCT-Net accurately reconstructs images directly from experimental data acquired at all four tube potentials.

Fig. 9.

The iCT-Net reconstruction results of an anthropomorphic head phantom acquired at four different x-ray tube potentials with a short-scan angular range. Reference images (1st row) were generated with a standard FBP reconstruction with a Ram-Lak filter using data from 644 view angles densely sampled across a short-scan angular range. The trained iCT-Net (2nd row) was applied to the same input data to reconstruct the corresponding images. The difference images (3rd row) were generated by subtracting the iCT-Net image from the reference image. W/L=1000/0 HU for reconstructed images and 300/0 HU for difference images. ROIs inside the images show where the SSIM calculations were performed.

In addition to the reconstruction results for chest CT protocols, the iCT-Nets trained for the short-scan, super-short-scan, and interior problems were directly used to reconstruct images from abdominal CT data; results are presented in Figure 10. These results demonstrate that iCT-Net accurately reconstructs images for a variety of anatomy types for short-scan, super-short scan, and interior problems, with and without sparse view sampling. It is also interesting to point out that this human subject has only one kidney, instead of the normal two.

Fig. 10.

The iCT-Net reconstruction results of real human subject data acquired in an abdomen-pelvis scan protocol with the short-scan angular range, super-short scan angular range and interior problem. Dense view reconstruction results are presented in the 2nd column and sparse view reconstruction results are presented in the 3rd column. The corresponding reference images were generated by applying a standard FBP method with a Ram-Lak filter at full FOV (∅ = 50 cm) from 644 view angles densely sampled across a short-scan angular range. Note that the central portion of the FBP reconstruction without truncation was cropped to generate the reference image for the interior problem with the truncated FOV (∅ = 12.5 cm). The trained iCT-Net was applied to reconstruct images for each of the corresponding data acquisition conditions. W/L=400/0 HU for reconstructed images. ROIs inside the images show where the SSIM calculations were performed.

E. Intermediate iCT-Net Outputs

To gain some intuitive understanding of how iCT-Net addresses difficult reconstruction problems, some of the intermediate outputs of the trained iCT-Net are shown in Figure 11. The vanishingly small difference between the intermediate output of the L5 layer and the corresponding reference dense view sinogram indicates that iCT-Net addresses the sparse view reconstruction problem by transforming it into a dense view reconstruction problem. In other words, iCT-Net learns to complete the missing line integral data at view angles for which no data acquisition was performed.

Fig. 11.

With a sparse view sinogram as input, intermediate outputs of the L5 layer of iCT-Net are shown in the upper half of the figure. Differences between the intermediate output of the L5 layer and the corresponding dense view sinogram are presented for both simulation data and real human subject data. To demonstrate the quality of the sinogram completion in the L5 layer of iCT-Net, the FBP reconstructions of the L5 output are presented and compared with the final iCT-Net reconstruction results. W/L=1000/100 HU for reconstructed images and 300/0 HU for difference images. ROIs inside the images show where the SSIM calculations were performed.

To further substantiate the above claim, the output of the L5 layer with a sparse view input sinogram was directly reconstructed using the conventional short-scan FBP method, and the results are shown in Figure 11. The reconstruction results do indicate that the missing data were completed by the trained iCT-Net. Compared with the full iCT-Net reconstruction, however, the direct FBP reconstruction of the L5 output generates a decent image but with higher residual errors and thus lower reconstruction accuracy.

V. DISCUSSION AND CONCLUSION

Although the short-scan problem with dense view sampling has been completely solved through human efforts, the data redundancy in divergent beam data acquisitions and the resulting choices of redundant data weighting strategies mean that there are theoretically infinitely many possible reconstruction solutions. A practical challenge is how to choose a reconstruction scheme that best fits the data quality of a given CT acquisition system. Indeed, these different solutions may perform differently depending on how the acquired data are used [9], [10]. The data redundancy problem also exists in the iCT-Net reconstruction strategy. Since iCT-Net can be considered a network representation of one of the many available solutions, the particular solution is effectively selected by the provided training data set; namely, it is the training data set that helps iCT-Net pick the suitable solution.

It is intriguing to note that a long-standing puzzle in deep learning methodology is its mechanism for selecting a solution among the many local minima of the loss function. It has been suggested [40] that these local minima might represent different solutions to the same problem. When the input data are ideal, e.g., noise free, all of these local minima may yield equivalent solutions. For real data with nonidealities such as noise and bias, the learning process seeks a local minimum that is consistent with the noise distribution present in the training data set. The presented iCT-Net reconstruction results for the short-scan problem provide concrete examples supporting this argument: the local minima of the iCT-Net loss function correspond to network representations of the many available analytical solutions to the same short-scan reconstruction problem, and the training process helps select the solution that best fits the training data.

For the interior problem with sparse view sampling, it remains unknown whether a stable solution even exists, yet iCT-Net manages to accurately reconstruct images from fully truncated sparse view data through deep learning. This success seems to imply that the interior problem with sparse view sampling might be meaningful and theoretically solvable, provided that some appropriate constraint conditions can be explicitly formulated.

It is also important to emphasize that, to address the sparse view reconstruction problem, the conventional CS method explicitly incorporates sparsity information into a nonlinear iterative reconstruction procedure to obtain a sparse solution. In contrast, iCT-Net offers an alternative strategy: it transforms the problem into a dense view reconstruction problem (Figure 11) and then learns a network approximation of the solution from the training data. The presented results also suggest a possibility for future work. For ordinary CT reconstruction problems without the severe transverse data truncation of the interior tomography problem, it might be feasible to combine the first five layers of iCT-Net with the conventional FBP method and perform end-to-end training to enable FBP to reconstruct sparse view angle data sets. This may open up a new opportunity to extend the current work to the cone beam CT reconstruction problem.

Given the success in training iCT-Net to accurately reconstruct images for the super-short scan reconstruction problem, it would be an interesting future task to train iCT-Net to accurately reconstruct images for the intrinsic limited-view angle reconstruction problem, i.e., the intrinsic tomosynthesis reconstruction problem, which is beyond the reach of the super-short-scan reconstruction algorithms.

It is also important to acknowledge that there have been many other intriguing applications of machine learning methods in x-ray CT. In these applications, it has been demonstrated that machine learning methods can be used to (1) learn patient cohort adaptive regularizers in iterative reconstruction [41]–[48]; (2) reduce noise [49]–[56]; (3) remove artifacts [57]–[63] after the standard FBP reconstruction is applied to accomplish the domain transform from sinogram space to image space; (4) learn adaptive filtering kernels [64] and data redundancy weightings [65] in FBP reconstruction; (5) learn to optimize regularization strength in iterative image reconstruction methods [66]; (6) learn to perform projection data interpolation/correction before FBP is used for image reconstruction [67]–[69]; or (7) learn to perform image deconvolution after direct backprojection is used for the domain transform [70], [71]. It is important to emphasize that the iCT-Net strategy is fundamentally different from these available deep learning methods in CT: iCT-Net learns the necessary domain transform on its own to accomplish high quality image reconstruction directly from noise contaminated and incomplete sinogram data via end-to-end training. In future studies, it would be interesting and important for the entire community to join the effort to establish new benchmarks for performance comparison, instead of the current common practice of comparing against the corresponding FBP and iterative image reconstruction methods.

One limitation of the current work is that it did not address the generalization of iCT-Net to the cone beam CT reconstruction problem. The direct application of iCT-Net to cone beam reconstruction encounters a computer memory problem. Although efforts were made in the iCT-Net architecture to reduce the number of parameters to the order of O(N² × Nc) so that a currently available single GPU can handle the computations, extension to a cone beam CT reconstruction problem increases the number of network parameters to as much as O(N³ × Nc²), due to the increase of image dimensionality by another factor of N and of detector channels by another factor of Nc. Training this many parameters is beyond the current capacity of a single GPU. However, this challenge may be addressed by slightly modifying the current iCT-Net architecture. For example, instead of requiring more powerful GPUs, one alternative strategy is to replace the L10–L12 layers with the conventional backprojection operation that has been widely used in iterative CT reconstruction over the past decades.
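
For a back-of-envelope sense of scale, the parameter counts implied by these complexity orders can be evaluated with the paper's 2-D values (the cone-beam figure is an order-of-magnitude estimate only):

```python
# Parameter counts implied by the stated complexity orders, using the
# paper's 2-D values N = 512 and Nc = 888.
N, Nc = 512, 888
fan_beam = N**2 * Nc      # O(N^2 * Nc): ~2.3e8 weights, feasible on one GPU
cone_beam = N**3 * Nc**2  # O(N^3 * Nc^2): ~1.1e17, far beyond a single GPU
print(f"fan-beam: {fan_beam:.2e}, cone-beam: {cone_beam:.2e}")
```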

In this paper, the training of iCT-Net was performed with respect to a specific FBP reconstruction kernel (the Ram-Lak kernel). In conventional FBP reconstruction, different reconstruction kernels have been developed to trade off noise and spatial resolution to meet the specific needs of a variety of clinical applications. The proposed iCT-Net can be trained to accomplish this objective as well. Since the change from one reconstruction kernel to another is not drastic, it is anticipated that fine-tuning of the trained iCT-Net would be sufficient, rather than the full-scale re-training detailed in the methods section of this paper.

To conclude, deep learning using iCT-Net provides a new paradigm to reconstruct CT images for a variety of reconstruction problems under very different conditions within a unified framework. The method shows the capability to accurately reconstruct images for those reconstruction problems that have already been completely solved by human efforts, problems that have been solved only partially by human efforts, and problems that have not been successfully addressed in any meaningful way using human knowledge.

Supplementary Material


ACKNOWLEDGMENT

The authors thank John Hayes, Dalton Griner, and John Garrett for their meticulous editorial assistance and many stimulating discussions.

REFERENCES

  • [1] Gelfand IM, Gindikin SG, and Graev MI, Selected Topics in Integral Geometry, ser. Translations of Mathematical Monographs. Providence, RI: American Mathematical Society, 2003.
  • [2] Helgason S, Integral Geometry and Radon Transforms. New York: Springer, 2010.
  • [3] Natterer F, The Mathematics of Computerized Tomography, ser. Classics in Applied Mathematics. Philadelphia: Society for Industrial and Applied Mathematics, 2001.
  • [4] Natterer F and Wübbeling F, Mathematical Methods in Image Reconstruction, ser. SIAM Monographs on Mathematical Modeling and Computation. Philadelphia: Society for Industrial and Applied Mathematics, 2001.
  • [5] Kak AC and Slaney M, Principles of Computerized Tomographic Imaging. IEEE Press, 1988.
  • [6] Radon J, "Über die Bestimmung von Funktionen durch ihre Integralwerte längs gewisser Mannigfaltigkeiten [On the determination of functions from their integral values along certain manifolds]," Ber. Verh. Sächs. Akad. Wiss. Leipzig, Math.-Phys., vol. 69, pp. 262–277, 1917.
  • [7] ——, "On the determination of functions from their integral values along certain manifolds," IEEE Transactions on Medical Imaging, vol. 5, no. 4, pp. 170–176, 1986.
  • [8] Clackdoyle R and Defrise M, "Tomographic reconstruction in the 21st century," IEEE Signal Processing Magazine, vol. 27, no. 4, pp. 60–80, 2010.
  • [9] Clackdoyle R and Noo F, "A large class of inversion formulae for the 2D Radon transform of functions of compact support," Inverse Problems, vol. 20, no. 4, p. 1281, 2004. [Online]. Available: http://stacks.iop.org/0266-5611/20/i=4/a=016
  • [10] Zhuang T and Chen G, "New families of exact fan-beam and cone-beam image reconstruction formulae via filtering the backprojection image of differentiated projection data along singly measured lines," Inverse Problems, vol. 22, no. 3, p. 991, 2006. [Online]. Available: http://stacks.iop.org/0266-5611/22/i=3/a=016
  • [11] Noo F, Defrise M, Clackdoyle R, and Kudo H, "Image reconstruction from fan-beam projections on less than a short scan," Physics in Medicine and Biology, vol. 47, no. 14, pp. 2525–2546, 2002.
  • [12] Chen GH, "A new framework of image reconstruction from fan beam projections," Medical Physics, vol. 30, no. 6, pp. 1151–1161, 2003.
  • [13] Noo F, Clackdoyle R, and Pack JD, "A two-step Hilbert transform method for 2D image reconstruction," Physics in Medicine and Biology, vol. 49, no. 17, pp. 3903–3923, 2004.
  • [14] Zhuang T, Leng S, Nett BE, and Chen GH, "Fan-beam and cone-beam image reconstruction via filtering the backprojection image of differentiated projection data," Physics in Medicine and Biology, vol. 49, no. 24, pp. 5489–5503, 2004.
  • [15] Zou Y, Pan X, and Sidky EY, "Image reconstruction in regions-of-interest from truncated projections in a reduced fan-beam scan," Physics in Medicine and Biology, vol. 50, no. 1, pp. 13–27, 2005.
  • [16] Ye Y, Yu H, Wei Y, and Wang G, "A general local reconstruction approach based on a truncated Hilbert transform," International Journal of Biomedical Imaging, vol. 2007, p. 63634, 2007.
  • [17] Kudo H, Courdurier M, Noo F, and Defrise M, "Tiny a priori knowledge solves the interior problem in computed tomography," Physics in Medicine and Biology, vol. 53, no. 9, pp. 2207–2231, 2008.
  • [18] Yu H and Wang G, "Compressed sensing based interior tomography," Physics in Medicine and Biology, vol. 54, no. 9, pp. 2791–2805, 2009.
  • [19] Candès EJ, Romberg J, and Tao T, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489–509, 2006.
  • [20] Donoho DL, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
  • [21] Goodfellow I, Bengio Y, and Courville A, Deep Learning. The MIT Press, 2016.
  • [22] Schmidhuber J, "Deep learning in neural networks: An overview," Neural Networks, vol. 61, pp. 85–117, 2015.
  • [23] Jordan MI and Mitchell TM, "Machine learning: Trends, perspectives, and prospects," Science, vol. 349, no. 6245, pp. 255–260, 2015.
  • [24] LeCun Y, Bengio Y, and Hinton G, "Deep learning," Nature, vol. 521, no. 7553, pp. 436–444, 2015.
  • [25] Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, Graves A, Riedmiller M, Fidjeland AK, Ostrovski G, Petersen S, Beattie C, Sadik A, Antonoglou I, King H, Kumaran D, Wierstra D, Legg S, and Hassabis D, "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, 2015.
  • [26] Silver D, Huang A, Maddison CJ, Guez A, Sifre L, van den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S, Grewe D, Nham J, Kalchbrenner N, Sutskever I, Lillicrap T, Leach M, Kavukcuoglu K, Graepel T, and Hassabis D, "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, pp. 484–489, 2016.
  • [27] Silver D, Schrittwieser J, Simonyan K, Antonoglou I, Huang A, Guez A, Hubert T, Baker L, Lai M, Bolton A, Chen Y, Lillicrap T, Hui F, Sifre L, van den Driessche G, Graepel T, and Hassabis D, "Mastering the game of Go without human knowledge," Nature, vol. 550, p. 354, 2017. https://doi.org/10.1038/nature24270
  • [28] Schoenholz SS, Cubuk ED, Sussman DM, Kaxiras E, and Liu AJ, "A structural approach to relaxation in glassy liquids," Nature Physics, vol. 12, p. 469, 2016. https://doi.org/10.1038/nphys3644
  • [29] van Nieuwenburg EPL, Liu Y-H, and Huber SD, "Learning phase transitions by confusion," Nature Physics, vol. 13, p. 435, 2017. https://doi.org/10.1038/nphys4037
  • [30] Segler MHS, Preuss M, and Waller MP, "Planning chemical syntheses with deep neural networks and symbolic AI," Nature, vol. 555, no. 7698, pp. 604–610, 2018.
  • [31] Wang G, Ye JC, Mueller K, and Fessler JA, "Image reconstruction is a new frontier of machine learning," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1289–1296, 2018.
  • [32] Zhu B, Liu JZ, Cauley SF, Rosen BR, and Rosen MS, "Image reconstruction by domain-transform manifold learning," Nature, vol. 555, p. 487, 2018. https://doi.org/10.1038/nature25988
  • [33] Ghahramani Z, "Probabilistic machine learning and artificial intelligence," Nature, vol. 521, p. 452, 2015. https://doi.org/10.1038/nature14541
  • [34] Glorot X and Bengio Y, "Understanding the difficulty of training deep feedforward neural networks," Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, vol. 9, pp. 249–256, 2010. [Online]. Available: http://proceedings.mlr.press
  • [35] Chollet F, "Keras," 2015. [Online]. Available: https://keras.io
  • [36] Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS, Davis A, Dean J, Devin M, Ghemawat S, Goodfellow I, Harp A, Irving G, Isard M, Jia Y, Jozefowicz R, Kaiser L, Kudlur M, Levenberg J, Mané D, Monga R, Moore S, Murray D, Olah C, Schuster M, Shlens J, Steiner B, Sutskever I, Talwar K, Tucker P, Vanhoucke V, Vasudevan V, Viégas F, Vinyals O, Warden P, Wattenberg M, Wicke M, Yu Y, and Zheng X, "TensorFlow: Large-scale machine learning on heterogeneous systems," 2015. [Online]. Available: http://tensorflow.org/
  • [37] Siddon RL, "Fast calculation of the exact radiological path for a three-dimensional CT array," Medical Physics, vol. 12, no. 2, pp. 252–255, 1985.
  • [38] Garrett JW, Li Y, Li K, and Chen G-H, "Reduced anatomical clutter in digital breast tomosynthesis with statistical iterative reconstruction," Medical Physics, vol. 45, no. 5, pp. 2009–2022, 2018.
  • [39] Chen G, Tokalkanahalli R, Zhuang T, Nett BE, and Hsieh J, "Development and evaluation of an exact fan-beam reconstruction algorithm using an equal weighting scheme via locally compensated filtered backprojection (LCFBP)," Medical Physics, vol. 33, no. 2, pp. 475–481, 2006.
  • [40] Bengio Y, "Machines who learn," Scientific American, vol. 314, no. 6, pp. 46–51, 2016.
  • [41] Wu D, Kim K, El Fakhri G, and Li Q, "Iterative low-dose CT reconstruction with priors trained by artificial neural network," IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2479–2486, 2017.
  • [42] Zhang Y, Rong J, Lu H, Xing Y, and Meng J, "Low-dose lung CT image restoration using adaptive prior features from full-dose training database," IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2510–2523, 2017.
  • [43] Kelly B, Matthews TP, and Anastasio MA, "Deep learning-guided image reconstruction from incomplete data," arXiv preprint arXiv:1709.00584, 2017.
  • [44] Zheng X, Ravishankar S, Long Y, and Fessler JA, "PWLS-ULTRA: An efficient clustering and learning-based approach for low-dose 3D CT image reconstruction," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1498–1510, 2018.
  • [45] Adler J and Öktem O, "Learned primal-dual reconstruction," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1322–1332, 2018.
  • [46] Chen B, Xiang K, Gong Z, Wang J, and Tan S, "Statistical iterative CBCT reconstruction based on neural network," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1511–1521, 2018.
  • [47] Gupta H, Jin KH, Nguyen HQ, McCann MT, and Unser M, "CNN-based projected gradient descent for consistent CT image reconstruction," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1440–1453, 2018.
  • [48] Chen H, Zhang Y, Chen Y, Zhang J, Zhang W, Sun H, Lv Y, Liao P, Zhou J, and Wang G, "LEARN: Learned experts' assessment-based reconstruction network for sparse-data CT," IEEE Transactions on Medical Imaging, 2018.
  • [49] Chen H, Zhang Y, Kalra MK, Lin F, Chen Y, Liao P, Zhou J, and Wang G, "Low-dose CT with a residual encoder-decoder convolutional neural network," IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2524–2535, 2017.
  • [50] Kang E, Min J, and Ye JC, "A deep convolutional neural network using directional wavelets for low-dose x-ray CT reconstruction," Medical Physics, vol. 44, no. 10, 2017.
  • [51] Kang E, Chang W, Yoo J, and Ye JC, "Deep convolutional framelet denoising for low-dose CT via wavelet residual network," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1358–1369, 2018.
  • [52] Wolterink JM, Leiner T, Viergever MA, and Išgum I, "Generative adversarial networks for noise reduction in low-dose CT," IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2536–2545, 2017.
  • [53] Yi X and Babyn P, "Sharpness-aware low-dose CT denoising using conditional generative adversarial network," Journal of Digital Imaging, pp. 1–15, 2018.
  • [54] Shan H, Zhang Y, Yang Q, Kruger U, Kalra MK, Sun L, Cong W, and Wang G, "3-D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2-D trained network," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1522–1534, 2018.
  • [55] Yang Q, Yan P, Zhang Y, Yu H, Shi Y, Mou X, Kalra MK, Zhang Y, Sun L, and Wang G, "Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1348–1357, 2018.
  • [56] Kang E, Koo HJ, Yang DH, Seo JB, and Ye JC, "Cycle-consistent adversarial denoising network for multiphase coronary CT angiography," Medical Physics, 2018.
  • [57] Han YS, Yoo J, and Ye JC, "Deep residual learning for compressed sensing CT reconstruction via persistent homology analysis," arXiv preprint arXiv:1611.06391, 2016.
  • [58] Jin KH, McCann MT, Froustey E, and Unser M, "Deep convolutional neural network for inverse problems in imaging," IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
  • [59] Han Y and Ye JC, "Framing U-Net via deep convolutional framelets: Application to sparse-view CT," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1418–1429, 2018.
  • [60] Xie S, Zheng X, Chen Y, Xie L, Liu J, Zhang Y, Yan J, Zhu H, and Hu Y, "Artifact removal using improved GoogLeNet for sparse-view CT reconstruction," Scientific Reports, vol. 8, 2018.
  • [61] Zhang Z, Liang X, Dong X, Xie Y, and Cao G, "A sparse-view CT reconstruction method based on combination of DenseNet and deconvolution," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1407–1417, 2018.
  • [62] Zhang Y and Yu H, "Convolutional neural network based metal artifact reduction in x-ray computed tomography," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1370–1381, 2018.
  • [63] Han Y, Gu J, and Ye JC, "Deep learning interior tomography for region-of-interest reconstruction," arXiv preprint arXiv:1712.10248, 2017.
  • [64] Syben C, Stimpel B, Breininger K, Würfl T, Fahrig R, Dörfler A, and Maier A, "A deep learning approach for reconstruction filter kernel discretization," arXiv preprint arXiv:1710.06287, 2017.
  • [65] Würfl T, Hoffmann M, Christlein V, Breininger K, Huang Y, Unberath M, and Maier AK, "Deep learning computed tomography: Learning projection-domain weights from image domain in limited angle problems," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1454–1463, 2018.
  • [66] Shen C, Gonzalez Y, Chen L, Jiang SB, and Jia X, "Intelligent parameter tuning in optimization-based iterative CT reconstruction via deep reinforcement learning," IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1430–1439, 2018.
  • [67] Lee H, Lee J, Kim H, Cho B, and Cho S, "Deep-neural-network based sinogram synthesis for sparse-view CT image reconstruction," arXiv preprint arXiv:1803.00694, 2018.
  • [68] Han Y, Kang J, and Ye JC, "Deep learning reconstruction for 9-view dual energy CT baggage scanner," arXiv preprint arXiv:1801.01258, 2018.
  • [69] Park HS, Lee SM, Kim HP, Seo JK, and Chung YE, "CT sinogram-consistency learning for metal-induced beam hardening correction," Medical Physics, vol. 45, no. 12, pp. 5376–5384, 2018.
  • [70] Ye DH, Buzzard GT, Ruby M, and Bouman CA, "Deep back projection for sparse-view CT reconstruction," arXiv preprint arXiv:1807.02370, 2018.
  • [71] Ge Y, Zhang Q, Hu Z, Chen J, Shi W, Zheng H, and Liang D, "Deconvolution-based backproject-filter (BPF) computed tomography image reconstruction method using deep learning technique," arXiv preprint arXiv:1807.01833, 2018.
