The British Journal of Radiology. 2017 Oct 16;90(1079):20160519. doi: 10.1259/bjr.20160519

Assessment of structural similarity in CT using filtered backprojection and iterative reconstruction: a phantom study with 3D printed lung vessels

Raoul M S Joemai 1, Jacob Geleijns 1
PMCID: PMC5963378  PMID: 28830200

Abstract

Objective:

To compare the performance of three generations of CT reconstruction techniques using structural similarity (SSIM) as a measure of image quality for CT scans of a chest phantom with 3D printed lung vessels.

Methods:

CT images of the chest phantom were acquired at seven dose levels by changing the tube current while keeping all other acquisition parameters constant. Three CT reconstruction techniques were applied to each acquisition. The first technique was filtered backprojection (FBP), the second was FBP with iterative filtering (adaptive iteration dose reduction in 3 dimensions (AIDR 3D)) and the third was model-based iterative reconstruction (Forward projected model-based Iterative Reconstruction SoluTion (FIRST)). Image quality of the CT data was quantified in terms of SSIM. The SSIM index was used to compare image quality between dose levels and reconstruction techniques. The SSIM index takes a value between 0 and 1, with 0 indicating the lowest image quality and 1 indicating excellent image quality.

Results:

The lowest SSIM index was observed for FBP at all dose levels. The reconstruction technique with the highest SSIM depends on the dose level. For tube currents higher than 80 mA, AIDR 3D showed the highest SSIM index, and for tube currents lower than or equal to 80 mA, FIRST showed the highest SSIM index.

Conclusion:

The SSIM index is a robust quantity and is correlated with image quality as perceived by humans. Advanced CT reconstruction techniques provide better image quality than FBP in all conditions.

Advances in knowledge:

SSIM is a robust measure to compare CT image quality for advanced reconstruction techniques relative to a reference.

3D printing technology is a useful method for the development of dedicated phantoms for CT image quality evaluation.

INTRODUCTION

An ongoing challenge in CT is to obtain diagnostic image quality at a dose level that is as low as possible. An essential element in this challenge is the optimization of reconstruction techniques. The focus on optimizing reconstruction techniques in recent years has led to a considerable improvement in CT reconstructions. In addition to standard filtered backprojection (FBP), major CT manufacturers offer improved FBP techniques with noise and artefact reduction applied in the raw projection data, in the reconstructed image space or in both. Most recently, statistical model-based iterative reconstruction techniques have been developed for CT, with the potential for further optimized image quality and dose reduction.1

Image quality assessment for these advanced reconstruction techniques is more complicated than for plain FBP because of non-linearity: the spatial resolution and noise can depend on the dose, the contrast and even the shape of the object. Therefore, conventional image quality measures such as contrast, noise and contrast-to-noise ratio are often too limited for proper image quality assessment. As an alternative, noise power spectra and modulation transfer functions (MTF) at different contrast levels could be calculated using a quality assurance phantom such as Catphan 500 (The Phantom Laboratory, Greenwich, NY) or Gammex 464 (Gammex, Middleton, WI). A drawback of such quality assurance phantoms is that they do not represent a clinical condition. The non-linearity of advanced reconstruction techniques could result in different performance in a quality assurance phantom compared to an anthropomorphic phantom. Furthermore, concepts such as noise power spectra and MTF are particularly difficult to assess in non-linear systems, and they are difficult to interpret in terms of clinical performance.2

In this study, we focussed on the evaluation of image quality using an anthropomorphic chest phantom with 3D printed lung vessels. A mathematical method, referred to as the structural similarity (SSIM) index, was used to compare the image quality of CT reconstructions with a reference image. The SSIM is based on the assumption that the human visual system is adapted to extract structural information from images. Therefore, SSIM provides a good approximation of perceived image quality.3–5 Validation of image quality metrics like SSIM has been performed using the LIVE image QA database.6 The SSIM index quantifies distortions in the image relative to the reference image, with a higher SSIM index associated with fewer distortions and better image quality.

The purpose of this study was to assess CT image quality for 3D printed lung vessels in terms of SSIM for three CT reconstruction techniques at different dose levels. The 3D digital model of the lung vessels is used as the reference standard for calculation of SSIM.

METHODS AND MATERIALS

3D digital lung vessel model

The digital lung vessel model that was used in this study7 is an anthropomorphic version of a comparable lung phantom that was described in another publication.2 The vessels were formed in an oval-shaped shell with dimensions 150 × 104 × 29 mm (height × width × thickness). The vessel diameters range from 0.25 to 10 mm. The digital lung vessel model was stored as an image volume with voxel sizes of 0.25 mm³. The pixel values of the vessels and air were 120 Hounsfield Units (HU) and −1000 HU, respectively. This model was used as a reference in the image quality assessment.

Chest phantom with 3D printed lung vessels

The digital lung vessel model was converted into a physical model using a ProJet 3D printer (3D Systems, Rock Hill, SC). The 3D print was created using the material VisiJet EX200 (3D Systems, Rock Hill, SC) with a density of 1.02 g/cm³ at 80°C. The manufacturer specified the density only at this temperature. This material was chosen based on an experimental study in which the HU of different materials were measured. For this material, the grey value was 120 HU at a tube voltage of 120 kV. Printing was done at the ultra high definition setting, resulting in a layer thickness of 0.032 mm.

The lung vessel insert was placed in a polymethylmethacrylate chest phantom (115 HU @ 120 kV). The dimension of the chest phantom was 300 × 200 × 29 mm (height × width × thickness). There are two spaces simulating the lungs with the exact same size as the lung vessel insert (Figure 1). The spine was simulated by a 34-mm Teflon insert (1022 HU @ 120 kV).

Figure 1.


The 3D printed lung vessel insert (a) contains vessels with a diameter ranging from 0.25 to 10 mm. This insert was placed into a polymethylmethacrylate chest phantom (b) to mimic the anatomy of the patient’s chest.

Imaging protocol

Imaging was performed on an Aquilion ONE/GENESIS Edition CT scanner working with software version 7.30ER001 (Toshiba Medical Systems, Otawara, Japan). The following configurations were used for imaging the chest phantom with the lung vessel insert: 80 × 0.5 mm (number of active detector rows × detector row width), rotation time 0.5 s, bow tie filter L and pitch factor 0.813. Acquisitions were performed at a tube voltage of 120 kV and tube currents of 600, 300, 150, 80, 40, 20 and 10 mA, respectively. The focal spot size is automatically determined by the system according to the combination of acquisition parameters. In this protocol, the scanner uses a large focal spot size (1.6 × 1.5 mm) at a tube current of 600 mA while acquisitions at other tube currents were performed with a small focal spot size (0.9 × 0.8 mm).

Clinical chest CT on our CT scanner is routinely performed with tube current modulation. Applied to this particular phantom, tube current modulation yields tube currents ranging from 20 to 70 mA, depending on the type of chest CT.

Reconstruction

Three types of CT reconstruction techniques were applied to each acquisition. The first technique was plain FBP without additional noise or artefact reduction, the second technique was Adaptive Iteration Dose Reduction in 3 Dimensions (AIDR 3D, Toshiba Medical Systems, Otawara, Japan) and the third technique was Forward projected model-based Iterative Reconstruction SoluTion (FIRST, Toshiba Medical Systems, Otawara, Japan).

CT images were reconstructed with 0.5 mm slice thickness, 0.5 mm slice interval and a field of view of 160 mm. The reconstructed field of view was centred around the lung vessel insert. The reconstruction kernel for FBP and AIDR 3D was the standard lung kernel, i.e. kernel FC51. FIRST reconstruction was performed with a lung setting at a standard level, according to the recommendation of the manufacturer. Consequently, there were three reconstructed volumes for each acquisition. The number of acquisitions was 7 resulting in 21 reconstructed volumes.

Image quality evaluation

Simple image quality metrics, such as noise (expressed as SD) and contrast-to-noise ratio, are not acceptable metrics of image quality in CT.8 Noise in a model-based statistical iterative reconstruction is significantly different from noise in FBP reconstructions.9 Consequently, assessment of contrast-to-noise ratios overestimates the performance of model-based statistical iterative reconstructions compared to FBP reconstructions.10 Fourier-based image quality metrics, such as the noise power spectrum and transfer functions like MTF and PSF, are useful quantities for linear, shift-invariant systems and have been used for quantifying image quality in FBP reconstructions in CT. However, none of the aforementioned metrics can be applied to non-linear model-based statistical iterative reconstruction algorithms, since these do not fulfil the condition of a linear, shift-invariant imaging system.

Human observers can be used in task-based performance assessment, e.g. in forced-choice and rating-scale experiments. Signal-known-exactly tasks as well as search and free-response tasks can be applied. In practice, such studies have major limitations: they are time consuming, and they have limited accuracy due to inter- and intraobserver variability, observer learning and fatigue.8 Human model observers may solve some of the drawbacks of human observers, but such studies are complex and laborious and require dedicated phantoms, lesions and algorithms.11

We explored some metrics for the quantification of image quality that were developed primarily for compression of digital images and videos. The mean square error and the peak signal-to-noise ratio compare the reference (original) and distorted (compressed) image on a point-to-point basis. However, the mean square error and peak signal-to-noise ratio provide error metrics that are poorly correlated with the human visual system and perceived quality. The SSIM metric was developed to overcome this; it is now widely applied for quantifying loss of image quality in the video industry, and it also has applications in still photography. SSIM was developed to predict image quality as perceived by humans, taking into account characteristics of the human visual system.4 The SSIM index has not yet been applied to assess generic image quality in diagnostic radiology but was regarded as appropriate for the purpose of our study, i.e. to assess the quality of images from different reconstruction algorithms acquired at different dose levels against a well-defined reference standard.
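For comparison with SSIM, the two point-to-point metrics named above can be written down in a few lines. This is a minimal sketch using the textbook definitions; the function names and the `dynamic_range` argument are illustrative assumptions and are not taken from the study.

```python
import math
import numpy as np

def mse(reference, distorted):
    """Mean square error between two images of equal shape."""
    x = np.asarray(reference, dtype=np.float64)
    y = np.asarray(distorted, dtype=np.float64)
    return float(np.mean((x - y) ** 2))

def psnr(reference, distorted, dynamic_range):
    """Peak signal-to-noise ratio in dB for a given dynamic range."""
    err = mse(reference, distorted)
    if err == 0.0:
        return math.inf  # identical images: no error to measure
    return 10.0 * math.log10(dynamic_range ** 2 / err)

# Hypothetical example: a uniform offset of 10 grey values.
reference = np.zeros((8, 8))
distorted = reference + 10.0
offset_mse = mse(reference, distorted)
offset_psnr = psnr(reference, distorted, dynamic_range=255)
```

Both metrics depend only on pixel-wise differences, so a uniform offset and a structured artefact with the same squared error score identically, which is the limitation that motivates SSIM.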

The SSIM metric is based on three properties that have an effect on the human visual system. SSIM compares three aspects in a pair of images to derive a measure relating to the differences human observers would observe: luminance (differences in brightness), contrast (differences in contrast) and structure (correlation measured as covariance). SSIM is particularly sensitive to degradation of image quality due to blur, artefacts and noise.3 SSIM compares windows within the reference image and the distorted image; it is thus applied locally instead of globally. Because SSIM compares against a reference, any deviation from the reference image is regarded as a distortion. In this study, image quality of the CT reconstructions of the lung vessel insert was quantified in terms of the SSIM index, with the digital lung vessel model as the reference and the CT reconstructions as the distorted images. The method as defined by Wang et al4 was applied with the default parameters. This means that the regularization constants are as follows: $C_1 = (0.01 \times 2^{16})^2$, $C_2 = (0.03 \times 2^{16})^2$, $C_3 = C_2/2$. Furthermore, the exponents for the three terms (luminance, contrast and structure) are all equal to 1. In this default method, an 11 × 11 circular-symmetric Gaussian weighting function with an SD of 1.5 samples was used to calculate the local statistics. The SSIM index is calculated for image quality comparison between different dose levels and different reconstruction techniques.

Image quality assessment was performed after a 3D registration of the digital lung vessel model, which served as the reference, with the reconstructed images of the lung vessel insert. Registration was performed in two steps. First, a rigid registration, containing translation and rotation, was performed with 100 iterations using a Mattes mutual information metric.12 The resulting registration was used to roughly identify the position of the lung vessel insert. The second registration was limited to the identified location of the lung vessel insert and used a mean squares metric with a maximum of 1000 iterations as the stopping condition. Scaling was performed according to the known pixel sizes in mm. In the first registration, nearest neighbour interpolation was used; in the second, linear interpolation was applied.
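The SSIM calculation with the default parameters described above can be sketched as follows. This is an illustrative NumPy implementation for a single 2D slice, not the code used in the study: the function names are hypothetical, the dynamic range of 2^16 follows the regularization constants quoted in the text, and border pixels without a full 11 × 11 window are simply dropped, which is an assumption. With C3 = C2/2 and all exponents equal to 1, the contrast and structure terms combine, so the local index reduces to a product of two factors.

```python
import numpy as np

def gaussian_kernel(size=11, sigma=1.5):
    """1D Gaussian weights; applied along rows and columns this gives
    the 11 x 11 circular-symmetric window with SD of 1.5 samples."""
    offsets = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def local_mean(image, kernel):
    """Gaussian-weighted local mean, keeping only fully covered pixels."""
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, image)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def ssim_map(reference, distorted, dynamic_range=2 ** 16):
    """Local SSIM index for every window position (the 'error map')."""
    x = np.asarray(reference, dtype=np.float64)
    y = np.asarray(distorted, dtype=np.float64)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    k = gaussian_kernel()
    mu_x, mu_y = local_mean(x, k), local_mean(y, k)
    var_x = local_mean(x * x, k) - mu_x ** 2
    var_y = local_mean(y * y, k) - mu_y ** 2
    cov_xy = local_mean(x * y, k) - mu_x * mu_y
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure

# Hypothetical slice: one 120 HU "vessel" in -1000 HU air, plus noise.
rng = np.random.default_rng(0)
reference = np.full((32, 32), -1000.0)
reference[12:20, 12:20] = 120.0
noisy = reference + rng.normal(0.0, 50.0, reference.shape)

ssim_index = ssim_map(reference, noisy).mean()  # mean of the local indices
```

The mean of the local map gives the single SSIM index per image; the map itself shows where the reconstruction deviates from the reference.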

Experimental findings revealed that performing multiple registrations per acquisition leads to an additional error because of the differences in image quality between the reconstruction techniques. Therefore, registration was performed once per acquisition (dose level). This was done by registering the digital lung vessel model to the AIDR 3D reconstruction of the lung vessel insert. The resulting transformation was stored, and the other reconstructions from the same acquisition were registered using the stored transformation. Each registration was visually inspected for errors using the calculated difference between the reference volume and the CT reconstruction. An ideal reconstruction and registration would have a difference of zero. However, because of blurring, noise and possible artefacts, differences will be noticeable. In general, small registration errors appear as asymmetrical blurring around small structures. Visual inspection for symmetrical blurring around small structures was used as the quality measure of the registration. This was applied throughout the whole volume of the phantom.
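The idea of registering once per acquisition and reusing the stored transformation can be illustrated with a deliberately simplified stand-in. The sketch below replaces the rigid 3D registration of the study with a brute-force search over integer 2D translations that minimizes the mean squared difference; the function names, the search range and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def estimate_shift(reference, image, max_shift=3):
    """Find the integer (row, column) shift of the reference that
    minimizes the mean squared difference with the image."""
    best_error, best_shift = None, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            moved = np.roll(reference, (dy, dx), axis=(0, 1))
            error = np.mean((moved - image) ** 2)
            if best_error is None or error < best_error:
                best_error, best_shift = error, (dy, dx)
    return best_shift

def apply_shift(reference, shift):
    """Apply a stored shift to the reference model."""
    return np.roll(reference, shift, axis=(0, 1))

# Synthetic reference model and three "reconstructions" of one acquisition.
rng = np.random.default_rng(1)
reference = np.full((32, 32), -1000.0)
reference[10:16, 8:20] = 120.0
true_position = np.roll(reference, (2, -1), axis=(0, 1))
recons = {
    "FBP": true_position + rng.normal(0.0, 60.0, reference.shape),
    "AIDR 3D": true_position + rng.normal(0.0, 30.0, reference.shape),
    "FIRST": true_position + rng.normal(0.0, 30.0, reference.shape),
}

# Register once, on the AIDR 3D reconstruction, and store the result.
stored_shift = estimate_shift(reference, recons["AIDR 3D"])

# Reuse the stored shift for every reconstruction of the same acquisition.
registered = apply_shift(reference, stored_shift)
differences = {name: recon - registered for name, recon in recons.items()}
```

Because every reconstruction is compared against the same registered reference, differences between the difference images reflect the reconstruction techniques and not variable registration errors.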

An SSIM error map was generated. It visually highlights the areas of an image that are associated with errors. The pixel values in the SSIM error map represent the local SSIM index. The SSIM error map was used in combination with the CT data to investigate the location and severity of the errors in the CT data. Box and whisker plots were used to visualize the variations in the SSIM error map over the entire volume of the lung vessel insert. These plots are characterized by five parameters. The central line in the box represents the median value, the edges of the box represent the 25th and 75th percentiles, and the maximum whisker length is equal to 1.5 × the interquartile range (IQR). The whiskers extend to the smallest and largest values excluding outliers. Outliers are visualized as dots and represent values that lie more than 1.5 × IQR beyond the edges of the box.
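The box-and-whisker parameters described above can be computed directly from the values of an error map. The following is a minimal sketch; the function name is hypothetical, and the percentile convention (NumPy's default linear interpolation) is an assumption, since plotting packages differ slightly on this point.

```python
import numpy as np

def box_stats(values, whisker=1.5):
    """Median, quartiles, whisker ends and outliers for a set of values."""
    values = np.asarray(values, dtype=np.float64).ravel()
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = values[(values >= low_fence) & (values <= high_fence)]
    outliers = values[(values < low_fence) | (values > high_fence)]
    return {
        "median": float(median),
        "q1": float(q1),
        "q3": float(q3),
        "whisker_low": float(inside.min()),    # smallest non-outlier
        "whisker_high": float(inside.max()),   # largest non-outlier
        "outliers": outliers.tolist(),
    }

# Hypothetical sample with one clear outlier.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In practice `values` would be the local SSIM indices of one reconstructed volume, restricted to the lung vessel insert.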

RESULTS

Registration

The difference images for the registered CT reconstructions of the lung vessel insert and the digital lung vessel model were calculated and visually inspected. Large and small vessels with diameters equal to the pixel size (0.31 mm) were used to evaluate the performance of the registration. The difference image showed that the amount of blurring around the vessels was symmetrical. This was considered as an indication of good performance of the registration.

SSIM index at decreasing dose

SSIM index was calculated for a region of interest covering the lung vessel insert. SSIM indices can be compared between different dose levels and reconstruction techniques. The SSIM confirms that image quality decreases at decreasing dose for all reconstruction techniques with one exception being the acquisition at a tube current of 600 mA (Figure 2). The acquisition at 600 mA is the only acquisition with a large focal spot size. The large focal spot size adds additional unsharpness to the image. Visual inspection of the images revealed that this blurring was easy to notice when comparing the acquisition at 300 mA with the acquisition at 600 mA.

Figure 2.


SSIM index for three reconstruction techniques at seven dose levels. AIDR 3D, adaptive iteration dose reduction in 3 dimensions; FBP, filtered back projection; FIRST, Forward projected model-based Iterative Reconstruction SoluTion.

The lowest SSIM index was observed for FBP at all dose levels. The reconstruction technique with the highest SSIM depends on the dose level. For tube currents higher than 80 mA, AIDR 3D showed the highest SSIM, and for tube currents lower than or equal to 80 mA, FIRST showed better image quality.

Error map

The error map visualizes the local SSIM indices or local errors. The distribution of the errors was visualized in a box plot (Figure 3). Unlike the SSIM index, which is the mean of the entire error map, the box plot gives more insight into the distribution of the errors. It can be seen that the error distributions of FBP and FIRST at 600 mA are similar, whereas AIDR 3D has a narrower distribution and a higher median value. Whisker lengths for acquisitions at 80 to 300 mA are almost equal for all reconstructions, but AIDR 3D and FIRST show higher, and thus better, median values. Compared to FBP and AIDR 3D, the FIRST reconstructions show higher SSIM, and thus better image quality, for acquisitions at tube currents lower than 80 mA, with the smallest IQR, smallest whisker length and highest median value.

Figure 3.


Boxplots of the local SSIM indices from the error maps for seven dose levels and three reconstruction techniques. AIDR 3D, adaptive iteration dose reduction in 3 dimensions; FBP, filtered backprojection; FIRST, Forward projected model-based Iterative Reconstruction SoluTion.

Figure 4 shows one axial slice of the error maps together with the reference model and CT reconstructions at the lowest dose level. The error maps are depicted with a very narrow window setting to enhance the differences between the reconstruction techniques. The lighter the error map, the better the image quality. The error map of FBP shows relatively low values because the high amount of noise changes the image texture compared to the reference. This hampers localization of the lung vessels in the error map and was only observed for FBP reconstructions at tube currents of 10 and 20 mA. The largest errors were found at the location of the smallest vessels. For larger vessels, blurring at the edges can be noted. In the FBP reconstruction, streak artefacts caused by the Teflon cylinder can be noticed in the lower part of the image. AIDR 3D and FIRST did not produce noticeable artefacts.

Figure 4.


Example of one slice of the lung vessel insert acquired at the lowest dose level (10 mA). From top to bottom, the reference, FBP, AIDR 3D and FIRST are shown. The same slice is visualized in lung and soft tissue window settings. In the right column, the error maps of the three reconstruction techniques are shown. The entire chest phantom was scanned, but the error map was only calculated for the lung vessel phantom. The darker the pixel values in the error map, the larger the error. AIDR 3D, adaptive iteration dose reduction in 3 dimensions; FBP, filtered back projection; FIRST, Forward projected model-based Iterative Reconstruction SoluTion.

The minimum intensity projection (MINIP) of the error maps gives, in one plane, an indication of the errors in the entire image volume. MINIP is a volume rendering technique that projects, for each position, the voxel with the lowest value across all slices of the volume onto a two-dimensional image. The MINIP error maps are shown in Figure 5 for two acquisitions at clinically relevant dose levels (tube currents of 80 and 20 mA). Some dark spots were noted in each error map. These dark spots are most likely related to errors in 3D printing, since some vessels were present in the reference model but not in any CT reconstruction. These errors are constant and will, therefore, not obstruct comparisons of reconstruction techniques. The diagonal line pattern is caused by the linear interpolation in the registration process. Nearest neighbour interpolation does not show a line pattern; however, it was not used because it resulted in larger local errors, leading to a lower SSIM. FBP clearly shows the darkest image compared to the other reconstruction techniques, which translates into inferior image quality. In these relatively low dose acquisitions, FIRST shows the best image quality.
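A minimum intensity projection of an error-map volume amounts to taking the minimum along the slice direction. The sketch below illustrates this with NumPy; the function name, the axis convention (slices along axis 0) and the toy volume are assumptions.

```python
import numpy as np

def minip(error_volume, axis=0):
    """Project the lowest local SSIM value along one axis onto a 2D image."""
    return np.min(error_volume, axis=axis)

# Toy error-map volume: perfect agreement except for one voxel.
error_volume = np.ones((4, 5, 6))
error_volume[1, 2, 3] = 0.97

projection = minip(error_volume)        # 2D image of shape (5, 6)
ssim_index = float(error_volume.mean())  # overall index: mean of the map
```

A single poorly reproduced voxel is visible in the projection regardless of the slice it lies in, whereas it barely changes the mean SSIM index.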

Figure 5.


Minimum intensity projections of the error maps for two clinically relevant dose levels. Window setting: 0.030 window width, 0.985 window level. The entire chest phantom was scanned, but the minimum intensity projection was only calculated for the lung vessel phantom. The darker the pixel values, the larger the error. AIDR 3D, adaptive iteration dose reduction in 3 dimensions; FBP, filtered back projection; FIRST, Forward projected model-based Iterative Reconstruction SoluTion.

DISCUSSION

This study involved the use of a 3D printed phantom for image quality evaluation in CT. In our study, we focussed on the application of SSIM, which is a measure for perceived image quality, to a 3D printed lung vessel insert embedded in a chest phantom. The application of SSIM as a measure of CT image quality has been reported in another study.13 These authors focussed on sampling properties of the detector plane in experimental, sparse array, setups. Our study focussed on applying SSIM in clinically available CT reconstruction techniques for a clinically relevant application, i.e. the visualization of lung vessels.

AIDR 3D was introduced in 2011 as an improved reconstruction technique that reduces noise and (streak) artefacts compared to FBP.14 Several studies showed dose reduction and improved image quality using AIDR 3D.68 AIDR 3D is an algorithm that incorporates noise optimization in both raw data and image space. During the reconstruction, a scanner model and a statistical noise model are used to minimize the effects of electronic noise and statistical noise in the raw data (sinogram space), while an image-based denoising technique is applied using an anatomical model. There are four levels available for AIDR 3D: mild, standard, strong and enhanced. The enhanced level was introduced later; it has been available from software version 7.0 and has since been the recommended setting for clinical practice, leading to improved noise texture. The enhanced level was used in this study.

FIRST was presented in 2015 as a model-based iterative reconstruction technique for Toshiba CT scanners. Model-based iterative reconstruction techniques may perform better compared to FBP, e.g. with regard to spatial resolution, low contrast resolution and noise reduction. However, no studies have been performed with FIRST iterative reconstruction yet. The reconstruction setting of FIRST is specific for the anatomical region, i.e. head, body, cardiac, cardiac sharp or lung. Each of these categories contains three levels: mild, standard and strong. An important difference between FIRST and FBP based reconstruction techniques is that reconstruction kernels are not used in iterative reconstructions like FIRST.

SSIM indices in our study were relatively high (>0.97 on a scale from 0 to 1), even for low dose acquisitions with high noise levels. The low variation of SSIM was caused by the type of images that were used in this study. CT images are often 16-bit integer images, which allows for windowing in different clinically relevant window settings. These types of images have a dynamic range of 65,535. The regularization constants in the formula of the SSIM take the dynamic range into account. The larger the dynamic range, the larger the regularization constants and also the SSIM index. Therefore, a small absolute increase in SSIM index might already indicate a clinically relevant difference.
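The effect of the dynamic range on the SSIM index can be illustrated numerically. The sketch below evaluates the SSIM formula over a single global window (a simplification of the local, Gaussian-weighted calculation) for the same image pair with an 8-bit and a 16-bit dynamic range; the function name and the synthetic data are assumptions.

```python
import numpy as np

def global_ssim(reference, distorted, dynamic_range):
    """SSIM over one global window, with C3 = C2/2 and unit exponents."""
    x = np.asarray(reference, dtype=np.float64)
    y = np.asarray(distorted, dtype=np.float64)
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (x.var() + y.var() + c2)
    return float(luminance * contrast_structure)

# Same synthetic image pair, evaluated with two dynamic ranges.
rng = np.random.default_rng(0)
reference = np.full((64, 64), -1000.0)
reference[24:40, 24:40] = 120.0
noisy = reference + rng.normal(0.0, 100.0, reference.shape)

ssim_8bit = global_ssim(reference, noisy, dynamic_range=255)
ssim_16bit = global_ssim(reference, noisy, dynamic_range=65535)
```

With the larger dynamic range the regularization constants dominate both the numerator and the denominator, which pushes the index towards 1 and compresses the differences between images into a narrow band of values.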

Registration was performed once per acquisition (dose level), and the transformation data were saved and applied to the other reconstructions from the same acquisition. Using the same transformation on volumes of the same acquisition ensures a good comparison without the uncertainty of variable registration errors. This was confirmed by experimental findings. Visual inspection revealed a slightly worse registration result when registration was done for each reconstruction technique individually. In general, we subjectively found that the registration result improved with the level of image quality of the reconstruction technique. In contrast to this finding, the experiment revealed that one transformation determined at a high dose level did not match exactly at a subpixel level at lower dose levels. This might be caused by hysteresis in the position of the CT table or minor movements during acquisition.

Only one acquisition was performed for each dose level, while the whole lung vessel insert was acquired and used for analysis. There were 6,079,853 voxels analysed for a single image volume; in the digital lung vessel model, 1,316,200 of these voxels correspond to lung vessels or the surrounding shell and 4,763,653 of these voxels correspond to air within the lung vessel model. The large number of voxels leads to accurate calculations with high reproducibility. This was confirmed by an experiment in which the phantom was scanned 10 times at a tube current of 300 mA and reconstructed using FBP, AIDR 3D and FIRST. The 30 volumes were analysed with a registration performed for each acquisition separately. Although the accuracy of the SSIM measurement and that of the registration were combined in this experiment, a high reproducibility was found for the SSIM index, with a mean ± standard deviation of 0.9965 ± <0.00005, 0.9960 ± <0.00005 and 0.9964 ± <0.00005 for FBP, AIDR 3D and FIRST, respectively.

Since the introduction of SSIM, many publications have confirmed a good agreement between SSIM and qualitative assessment by human observers. In this study, SSIM was not compared with human observers; however, the trends of the SSIM are in agreement with expectations after visual inspection of each result. No indication was found that SSIM behaves differently from human observers in this study. The counterintuitive decrease of SSIM for the highest dose level, compared to SSIM at half of the dose, was easy to recognize after visual inspection. This combination of higher dose but lower image quality was directly related to the larger focal spot size at the highest tube current, which causes relatively more blurring.

There were several limitations in this study. First, the image quality evaluation was limited to the lung vessels using a chest phantom with a simplified representation of the human body. The non-linearity of advanced reconstruction techniques prevents a general image quality evaluation using dedicated image quality phantoms like the Catphan or ACR phantom. It is, therefore, recommended to evaluate image quality using clinical data or anthropomorphic phantoms for various indications to gain thorough insight into the performance of these advanced reconstruction techniques. This study only gives insight into the performance of CT reconstruction techniques for a model of the lung vessels. Second, although the SSIM index gives a good estimation of human preference, it is not a perfect measure. Human preference is a subjective score and therefore shows a relatively large variation. The SSIM gives an approximation of the mean human preference. A human observer study could be more precise than SSIM; however, this is time consuming, as the number of readings and readers should be substantial to achieve an acceptable accuracy. Finally, this study is the first in which SSIM is used for assessment of image quality in diagnostic radiology; further exploration of its possibilities and pitfalls is desired.

CONCLUSIONS

In this study, a 3D digital anthropomorphic lung vessel model was used for image quality assessment in CT. The lung vessel model itself and the corresponding 3D printed lung vessel module were used for the evaluation of image quality of advanced CT reconstruction techniques using SSIM. The SSIM index is a robust quantity. It is relatively easy to measure, and only one acquisition per condition is sufficient. The SSIM is correlated to the image quality as perceived by humans. When a digital reference image is available, this measure can be applied to other reconstruction techniques and other imaging modalities. The study showed that advanced CT reconstruction techniques provide better image quality in all conditions compared to FBP. Relatively good performance of a model-based statistical reconstruction technique was observed at low doses.

FUNDING

This study was funded by Toshiba Medical Systems Japan. J.M. den Harder is the designer of the chest and lung vessel phantom including the design of the algorithm for the digital model as well as the physical design. The phantom was developed in a project funded by the technology foundation STW (project CLUES, number 13592).

Contributor Information

Raoul M. S. Joemai, Email: r.m.s.joemai@lumc.nl.

Jacob Geleijns, Email: j.geleijns@lumc.nl.

REFERENCES


Articles from The British Journal of Radiology are provided here courtesy of Oxford University Press
