Abstract
Aim
To create a new, automated method of evaluating the quality of optical coherence tomography (OCT) images and to compare its image quality discriminating ability with the quality assessment parameters signal to noise ratio (SNR) and signal strength (SS).
Methods
A new OCT image quality assessment parameter, quality index (QI), was created. OCT images (linear macular scan, peripapillary circular scan, and optic nerve head scan) were analysed using the latest StratusOCT system. SNR and SS were collected for each image. QI was calculated based on image histogram information using a software program of our own design. To evaluate the performance of these parameters, the results were compared with subjective three level grading (excellent, acceptable, and poor) performed by three OCT experts.
Results
63 images of 21 subjects (seven each for normal, early/moderate, and advanced glaucoma) were enrolled in this study. Subjects were selected in a consecutive and retrospective fashion from our OCT imaging database. There were significant differences in SNR, SS, and QI between excellent and poor images (p = 0.04, p = 0.002, and p<0.001, respectively, Wilcoxon test) and between acceptable and poor images (p = 0.02, p<0.001, and p<0.001, respectively). Only QI showed a significant difference between excellent and acceptable images (p = 0.001). Areas under the receiver operating characteristics (ROC) curve for discrimination of poor from excellent/acceptable images were 0.68 (SNR), 0.89 (SS), and 0.99 (QI).
Conclusion
A quality index such as QI may permit automated objective and quantitative assessment of OCT image quality that performs similarly to an expert human observer.
Keywords: optical coherence tomography, signal to noise ratio
Optical coherence tomography (OCT) is a powerful ocular imaging tool for visualising cross sectional retinal images in vivo.1,2,3 Quantitative assessments of the retinal nerve fibre layer (RNFL), the optic nerve head (ONH), and the macular region in patients with known or suspected glaucoma are some of the major applications of OCT technology.4,5,6 OCT is also increasingly being used to assess a variety of other ocular pathologies, including retinal pathologies in the macular region such as macular hole.7,8,9,10 As OCT measurements are made directly from the acquired image, the quality of the scan is an important factor to consider when evaluating the results. At this point there are very limited means, other than subjective analysis, to evaluate OCT scan quality.
OCT quality assessment can be divided into two components: image quality and analysis quality. Image quality refers to the quality of the acquired signal itself (fig 1), while analysis quality refers to the quality of post‐processing and image analysis performed by the OCT system software. Since analysis quality usually depends upon the image quality, controlling the image quality at acquisition time is required.

Figure 1 Examples of acceptable optical coherence tomography image quality (top) and poor image quality (bottom). Quality parameters for the top image: SNR = 40, SS = 8, IR = 116.32, TSR = 0.344, and QI = 40.0, for the bottom image: SNR = 21, SS = 2, IR = 69.44, TSR = 0.094, and QI = 6.5.
In previous versions of OCT software the signal to noise ratio (SNR) was the only parameter that was available and widely used to objectively evaluate the quality of acquired images. SNR gives a general indication of the strength of the acquired signal within a given scan; however, its utility is limited as it takes into account only the single a‐scan that demonstrates the strongest signal and does not account for the distribution of this signal strength throughout the scan image. In the most recent version of the Stratus OCT software, a new image quality parameter has been introduced: signal strength (SS). Little is known at this time about the clinical utility of this new parameter. The establishment of an image quality parameter that can evaluate image quality in a manner similar to that of an OCT expert could augment the ability of all OCT users to acquire high quality images and obtain accurate and reliable clinical measurements.
The purpose of this study was to develop a new method to evaluate the quality of OCT images in a quantitative and objective fashion. This new method was then compared with currently available parameters (SNR and SS) in its ability to agree with experts' assessments of OCT image quality.
Methods
An equal number of normal subjects, early/moderate glaucoma and advanced glaucoma subjects were enrolled retrospectively according to the date of visit, starting with the first day of initiating this study. Subjects were then collected consecutively until we reached seven from each group. All subjects signed an informed consent, and all research activities were approved by the institutional review board of the New England Medical Center. Subjects underwent the following standard procedures: medical and ophthalmic history, complete ophthalmic examination including visual acuity (VA) testing, tonometry, slit lamp biomicroscopy, monochromic automated perimetry, and OCT imaging (Stratus OCT, Carl Zeiss Meditec, Dublin, CA, USA; software version 2.1). One eye was randomly chosen if both were eligible for the study.
Subjects
Inclusion criteria for the normal eyes were as follows: no history or evidence of intraocular surgery or laser, no history or evidence of retinal pathology or glaucoma, best corrected VA of 20/40 or better, normal Swedish interactive thresholding algorithm (SITA) standard Humphrey 24–2 visual field (HVF, Carl Zeiss Meditec, Dublin, CA, USA) with glaucoma hemifield test (GHT) within normal limits and no clusters of three adjacent points depressed more than 5 dB or two points depressed more than 10 dB in the pattern deviation plot, intraocular pressure (IOP) 21 mm Hg or lower, and normal appearing optic nerve head.
Inclusion criteria for the glaucomatous eyes were Humphrey visual field (HVF) glaucoma hemifield test (GHT) outside normal limits, with corresponding abnormality of the optic nerve head including optic nerve head (ONH) rim notching, cup asymmetry, or large cupping (vertical cup disc ratio >0.7), and/or nerve fibre layer defect on stereobiomicroscopy. We set a cut off at HVF mean deviation (MD) –9 dB to distinguish early/moderate (better than −9 dB) from advanced glaucoma (−9 dB or worse).
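The staging rule above can be written as a one-line check. This is a minimal sketch: the function name `glaucoma_stage` is hypothetical, and it assumes the eye has already met the glaucoma inclusion criteria.

```python
def glaucoma_stage(mean_deviation_db):
    """Stage a glaucomatous eye from its HVF mean deviation (MD, in dB).

    Assumes the eye already satisfies the glaucoma inclusion criteria;
    the function name is hypothetical and used for illustration only.
    """
    # MD better (less negative) than -9 dB: early/moderate.
    # MD of -9 dB or worse: advanced.
    return "early/moderate" if mean_deviation_db > -9.0 else "advanced"


print(glaucoma_stage(-8.0))   # group mean MD reported for early/moderate eyes
print(glaucoma_stage(-19.5))  # group mean MD reported for advanced eyes
```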
Development of a new quality parameter
Several steps were taken to define a new parameter to evaluate overall image quality. Firstly, four new parameters were created based on reflectivity values of the pixels that comprise each image (fig 2).

Figure 2 Signal intensity histogram of an OCT scan. Intensity values up to the 75th percentile (“Noise”) were considered extraneous signal from sources other than the retina. Values ranging from the 75th to the 99th percentile were fit to a pseudocolor scale to represent the various signal intensity levels of the retinal tissues. The value “low” is the lowest, 1st percentile signal intensity, “middle” is the mathematical mean of the 75th and 99th percentile values, and “saturation” is the 99th percentile intensity value.
Low: the first (lowest) percentile of all recorded reflectivity values in a given image.
Noise: the reflectivity value that corresponds to the 75th percentile of all recorded reflectivity values in a given image. Signal values up to this point in the scan were considered extraneous signal from sources other than the retina.
Saturation: the reflectivity value representing the 99th percentile of all recorded reflectivity values in a given image.
Middle: the mean value of noise and saturation.
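The four histogram landmarks defined above can be sketched as follows. This is an illustration only, not the software used in the study: the function name `histogram_landmarks`, the use of NumPy, and NumPy's default linear interpolation between ranked values are all assumptions.

```python
import numpy as np


def histogram_landmarks(image):
    """Return the Low, Noise, Saturation, and Middle intensity values.

    `image` is any array of reflectivity values (hypothetical input format).
    """
    pixels = np.asarray(image, dtype=float).ravel()
    # 1st, 75th, and 99th percentiles of all reflectivity values in the image.
    low, noise, saturation = np.percentile(pixels, [1, 75, 99])
    # Middle is the mean of the Noise and Saturation values.
    middle = (noise + saturation) / 2.0
    return {"low": low, "noise": noise,
            "saturation": saturation, "middle": middle}


# Synthetic example (not study data): one pixel at each intensity 0..100.
landmarks = histogram_landmarks(np.arange(101))
print(landmarks)
```

With real scans the percentile method matters little because each image contains many thousands of pixels; the choice of interpolation is only visible on small synthetic inputs like this one.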
We then defined a new quality assessment parameter, quality index (QI), which consisted of two components:
(1) Intensity ratio (IR):
[Equation for IR not available in this copy.]
IR is analogous to signal to noise ratio (SNR) provided by the manufacturer, except for one aspect. The manufacturer's SNR is the maximum SNR value among all a‐scans, whereas IR is calculated based on a histogram that takes the entire image into account. SNR calculation requires preprocessed signal data that are proprietary and not available to OCT users. We therefore calculated an SNR equivalent, IR, using OCT raw data that any user can export.
(2) Tissue signal ratio (TSR):
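TSR = (number of pixels between “Middle” and “Saturation”) / (number of pixels between “Noise” and “Middle”)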
This calculates the ratio of the number of highly reflective pixels versus those with lower reflectivity. The numerator is the number of pixels between the “Middle” and “Saturation” intensity values. The denominator is the number of pixels between the “Noise” and “Middle” intensity values.
Our new parameter, QI, was then constructed as the product of the IR and the TSR described above:
QI = IR×TSR
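A minimal sketch of the TSR pixel count and the QI product, under stated assumptions. The function names are hypothetical, and the handling of pixels that fall exactly on a boundary (here "Middle" is counted in the upper bin only) is an assumption that the text does not specify. IR is taken as an input value rather than computed. The check at the end uses the IR and TSR values reported in the caption of figure 1.

```python
import numpy as np


def tissue_signal_ratio(image, noise, middle, saturation):
    """Ratio of highly reflective pixels to less reflective tissue pixels."""
    pixels = np.asarray(image, dtype=float).ravel()
    # Assumption: a pixel exactly at Middle is counted in the upper bin only,
    # so that no pixel is counted twice.
    n_high = np.count_nonzero((pixels >= middle) & (pixels <= saturation))
    n_low = np.count_nonzero((pixels >= noise) & (pixels < middle))
    if n_low == 0:
        raise ValueError("no pixels between Noise and Middle")
    return n_high / n_low


def quality_index(ir, tsr):
    """QI is the product of the intensity ratio and the tissue signal ratio."""
    return ir * tsr


# Values reported for the two example images in figure 1.
qi_top = quality_index(116.32, 0.344)     # reported QI = 40.0
qi_bottom = quality_index(69.44, 0.094)   # reported QI = 6.5
print(round(qi_top, 1), round(qi_bottom, 1))
```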
Data collection
Stratus OCT images of differing scan types—macular scan (using “fast macula” protocol), peripapillary circular scan (“fast RNFL” protocol), and optic nerve head scan (“fast ONH” protocol)—were collected for each subject. One image of each scan type was randomly selected for analysis. No set of image quality criteria were used to screen these images, in order to allow for a full spectrum of image quality to be represented in this study. Further, in order to maximise the variability in image quality the scans used were those acquired in the first weeks after instalment of the Stratus OCT at New England Eye Center. The raw data and images of each scan were exported from the device onto a PC for further analysis.
Three different types of image quality assessment were then collected for each scan as follows:
Manufacturer provided image assessment parameters: SNR and SS were collected from the exported data.
Custom image assessment parameter – QI: Using software of our own design, the raw data from each OCT scan collected were analysed and numerical IR, TSR, and QI values were calculated as outlined above for each image.
OCT expert subjective assessment: OCT images were presented in a randomised, masked fashion to three OCT experts who were independently asked to subjectively categorise the scans into a three level grading system (excellent = 3; acceptable = 2; poor = 1). Each observer used his own, independent criteria for quality assessment, without instruction or guidelines for the definition of each rating category. The observers were masked to the manufacturer provided image quality parameters, SNR and SS. The outcome scores of the observers were summed and the images were categorised into one of the three groups based on the total score (excellent = 8–9; acceptable = 5–7; poor = 3–4).
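The summed expert score maps onto the three categories as sketched below. The function name `categorise_image` is hypothetical; the grade values and cut-offs are those given in the text.

```python
def categorise_image(grades):
    """Combine three expert grades (excellent = 3, acceptable = 2, poor = 1)."""
    if len(grades) != 3 or any(g not in (1, 2, 3) for g in grades):
        raise ValueError("expected three grades, each 1, 2, or 3")
    total = sum(grades)          # possible range: 3 to 9
    if total >= 8:               # 8-9
        return "excellent"
    if total >= 5:               # 5-7
        return "acceptable"
    return "poor"                # 3-4


print(categorise_image([3, 3, 2]))  # total 8
print(categorise_image([2, 2, 1]))  # total 5
print(categorise_image([1, 1, 2]))  # total 4
```

Because the cut-offs are applied to the sum, a single dissenting grade in an adjacent category does not necessarily change the final category.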
Statistical analysis
Descriptive statistics were calculated on the study population. Agreement between the experts was assessed by calculating a kappa value for each pair of experts; the three pairwise kappa values were summarised by their arithmetic mean. Correlation analysis was also conducted on the expert assessments. Using the OCT expert assessment as the “gold standard” for image quality, we determined each image quality parameter's ability to discriminate acceptable from unacceptable images by calculating the area under the receiver operating characteristic (ROC) curve. Both excellent and acceptable grades were treated as acceptable for this analysis.
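The averaging of pairwise kappa values can be sketched as follows. The text does not state which kappa variant was used, so unweighted Cohen's kappa is an assumption here; the function names and the example gradings are hypothetical and are not study data.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters' gradings of the same images."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(count_a[k] * count_b[k]
                   for k in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(ratings):
    """Arithmetic mean of kappa over every pair of raters."""
    kappas = [cohen_kappa(a, b) for a, b in combinations(ratings, 2)]
    return sum(kappas) / len(kappas)


# Hypothetical gradings of six images by three raters (3/2/1 scale).
rater_1 = [3, 2, 2, 1, 1, 2]
rater_2 = [3, 2, 2, 1, 2, 2]
rater_3 = [2, 2, 2, 1, 1, 2]
print(mean_pairwise_kappa([rater_1, rater_2, rater_3]))
```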
Results
In all, 63 images of 21 eyes were enrolled in this study (seven normal, seven early/moderate, and seven advanced glaucoma). The mean age of the normal group was 58.3 (12.2) years (mean (SD)), of the early/moderate group 62.6 (11.0) years, and of the advanced glaucoma group 66.3 (16.1) years. No significant difference in age was found among the groups (p = 0.54, ANOVA). The SITA MD for the normal group was −1.8 (2.3) dB (mean (SD)), for the early/moderate group −8.0 (1.4) dB, and for the advanced glaucoma group −19.5 (7.3) dB (p<0.001, ANOVA).
Expert assessments showed high agreement (mean kappa = 0.72) and significant correlations (mean r = 0.84, p<0.0001, Spearman's rank correlation). A histogram of the distribution of the expert subjective assessments is shown in figure 3. No image was graded excellent by one expert and poor by another; differing assessments always fell in adjacent categories. Using the overall expert assessment score, nine images (14%) were assessed as excellent, 32 images (51%) as acceptable, and 22 images (35%) as poor. By diagnosis, 57% of images from eyes with advanced glaucoma were of poor quality, compared with only about 14% of images from normal eyes (table 1).
Figure 3 Distribution of the expert assessment scores.
Table 1 Expert assessment by diagnosis.
| | Excellent | Acceptable | Poor |
|---|---|---|---|
| Normal | 7 (33.3%) | 11 (52.4%) | 3 (14.3%) |
| Early glaucoma | 0 (0.0%) | 14 (66.7%) | 7 (33.3%) |
| Advanced glaucoma | 2 (9.5%) | 7 (33.3%) | 12 (57.1%) |
By OCT scan type, the linear macular scan had more poor quality scans than nerve fibre layer or optic nerve head scans (table 2). There was no significant correlation between expert score and subject age (Spearman's rank correlation, r = 0.13, p = 0.32). All scans with normal eyes were either excellent or acceptable for RNFL scan type, while there was no excellent scan with early or advanced glaucoma.
Table 2 Expert assessment by OCT scan type with diagnosis breakdown.
| Scan type | Excellent, total | Excellent, N | Excellent, E | Excellent, A | Acceptable, total | Acceptable, N | Acceptable, E | Acceptable, A | Poor, total | Poor, N | Poor, E | Poor, A |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Circular (RNFL) | 4 (19.0%) | 4 | 0 | 0 | 11 (52.4%) | 3 | 5 | 3 | 6 (28.6%) | 0 | 3 | 3 |
| Macula | 0 (0.0%) | 0 | 0 | 0 | 11 (52.4%) | 5 | 3 | 3 | 10 (47.6%) | 2 | 5 | 3 |
| Optic nerve head | 5 (23.8%) | 3 | 0 | 2 | 10 (47.6%) | 3 | 6 | 1 | 6 (28.6%) | 1 | 2 | 3 |
N, number of normal; E, number of early glaucoma; A, number of advanced glaucoma.
QI was successfully collected for all images (n = 63). Owing to an unexpected corruption of archived data, SS values were obtainable for 45/63 images. In addition, the built‐in OCT software did not provide SNR values for ONH scans. SNR values were therefore obtainable for 36/63 images. There were significant differences in SNR, SS, and QI between excellent and poor images and between acceptable and poor images (table 3).
Table 3 Quality parameters v expert assessment (p values, Wilcoxon test comparing with “poor” scan values).
| | Excellent | Acceptable | Poor |
|---|---|---|---|
| SNR (dB) (n = 36) | 40.2 (9.0), p = 0.04, n = 4 | 35.4 (5.5), p = 0.02, n = 19 | 30.0 (6.1), n = 13 |
| SS (n = 45) | 6.11 (1.54), p = 0.002, n = 7 | 5.53 (1.31), p<0.001, n = 24 | 3.21 (1.63), n = 14 |
| QI (n = 63) | 34.6 (6.9), p<0.001, n = 9 | 24.5 (6.4), p<0.001, n = 32 | 11.9 (2.9), n = 22 |
Only QI showed a significant difference between excellent and acceptable images (p = 0.001).
Areas under the ROC curve (AROC) for discrimination of poor from excellent/acceptable images were 0.68 (SNR), 0.89 (SS), and 0.99 (QI) (fig 4). All AROCs were significantly different from each other (p⩽0.001).11
Figure 4 Receiver operating characteristic curves, with areas under the curve (AROC), for discriminating poor from excellent/acceptable images using signal to noise ratio (SNR) (left), signal strength (SS) (middle), and quality index (QI) (right) (top row). Diamond plots of each parameter are shown in the bottom row. The line across each diamond represents the group mean. The vertical span of each diamond represents the 95% confidence interval for each group.
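For readers who wish to reproduce an AROC from exported parameter values, the area under the empirical ROC curve equals the proportion of (acceptable, poor) image pairs in which the acceptable image has the higher parameter value, with ties counted as one half. The sketch below uses that identity. It is not necessarily how the AROCs in this study were computed, and the function name and example values are hypothetical.

```python
def auc_pairwise(acceptable_values, poor_values):
    """Area under the empirical ROC curve via pairwise comparison.

    Higher parameter values are assumed to indicate better image quality.
    """
    if not acceptable_values or not poor_values:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for a in acceptable_values:
        for p in poor_values:
            if a > p:
                wins += 1.0      # correctly ordered pair
            elif a == p:
                wins += 0.5      # tie counts as half
    return wins / (len(acceptable_values) * len(poor_values))


# Hypothetical parameter values (not study data).
print(auc_pairwise([30.1, 25.4, 19.8], [12.0, 21.5]))
```

An AROC of 1.0 means every acceptable image scored higher than every poor image; 0.5 means the parameter orders the two groups no better than chance.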
Eyes with advanced glaucoma had more poor images than eyes in the other diagnostic groups. We found a significant difference in the mean MD between acceptable and unacceptable images (p = 0.004, Wilcoxon).
Using the 63 images collected (three from each eye, one each of scan types macula, nerve fibre layer (NFL), and ONH), the standard deviation of the measured QIs was 11.2. The power to detect a difference in QI of 10.1 (the smallest observed difference, excellent versus acceptable) was 0.99, assuming total independence of all images. The powers to detect the smallest observed differences in SNR and SS were 0.84 and 0.50, respectively.
Discussion
This study aimed to develop a new, automated method of assessing the quality of optical coherence tomography (OCT) images that agreed with expert assessment in order to ensure high quality images and therefore accurate clinical measurements. We used the previously available parameters for this task, SNR and SS, and compared them with our new parameter, QI. While all quality parameters showed significant differences between excellent or acceptable versus poor images, the best separation between groups was found when using QI. In fact, QI was the only parameter that showed a significant difference between excellent and acceptable images. This implies that QI, an independently developed automated image quality parameter, correlates more closely with human expert assessment. However, since sample sizes for SS and SNR were smaller, we did not have enough power to detect statistical significance for these parameters with the present data set. Moreover, we treated all images as completely independent data, so our power calculation may have overestimated the true power. Further investigation with a larger sample size is required.
The concept of QI calculation was conceived before collecting the present data so that the equation was not optimised for this particular data set. Also, since we did not optimise the equation for maximum performance in any way, our results were not conclusive. However, an important finding here was that the StratusOCT built‐in quality assessment function had some room for improvement. Further investigation is required to develop a robust quality assessment method.
SS outperformed SNR in terms of poor image discrimination. SS is a combination of image quality (SNR) and uniformity of signal strength within a scan (further detail is not available from the manufacturer because of its proprietary nature). Perhaps this provides insight into how experts subjectively assess OCT images. It suggests that some of the factors taken into account by human experts might be signal balance or uniformity within a scan.
Poor images were more frequently acquired from advanced glaucomatous eyes than from the other groups. This was likely because RNFL was thinner and had lower internal reflectivity as a result of glaucomatous damage in these eyes. Again, this stresses the need for a robust measure of image quality, especially in the diseased eyes that tended to yield poorer quality images. It also raises the question of whether or not these images are indeed of “poor quality,” or if these are the best possible quality images that can be acquired in an eye with advanced damage.
It was also noted that scans of the macular region were more often of poor quality than those of the ONH and peripapillary areas. There are two possible explanations for this finding. Firstly, scans in the macular region usually do not have thick, highly reflective layers (such as the NFL), so the overall appearance of the macular region is less reflective. Secondly, scans of the macular region tend to show finer layer structures than NFL or ONH scans, so it is likely that the subjective assessment of these scans is more stringent because of higher expectations. This finding suggests that a future study should perhaps set different QI criteria for macular scans, since QI values for macular scans tend to be smaller than those for NFL and ONH scans.
The percentage of poor images in this study was relatively high (34.9%). This may be because images were collected for assessment shortly after acquiring the new StratusOCT. We revisited this issue (after our operators had approximately one year of StratusOCT experience) and the data suggest that the poor image ratio has significantly decreased. This stresses the role of operator experience in image quality, and underscores the need for an accurate, automated way to evaluate acquired images in order to ensure the collection of high quality data.
In summary, our results suggest that QI and SS successfully improve poor image discrimination compared to using the basic SNR. A quality index such as the one described in this paper (QI) may permit automated objective and quantitative assessment of OCT image quality similar to that of an expert human observer.
Acknowledgement
We were supported by NIH grants R01‐EY013178‐6, R01‐EY11289‐21, and P30‐EY008098, Research to Prevent Blindness, and the Eye and Ear Foundation (Pittsburgh).
Abbreviations
AROC - areas under the ROC curve
GHT - glaucoma hemifield test
HVF - Humphrey visual field
NFL - nerve fibre layer
OCT - optical coherence tomography
ONH - optic nerve head
QI - quality index
RNFL - retinal nerve fibre layer
ROC - receiver operating characteristics
SNR - signal to noise ratio
SS - signal strength
TSR - tissue signal ratio
VA - visual acuity
Footnotes
Competing interests: JGF and JSS receive royalties for intellectual property licensed by Massachusetts Institute of Technology to Carl Zeiss Meditec, Inc.
References
- 1. Huang D, Swanson E A, Lin C P, et al. Optical coherence tomography. Science 1991;254:1178–1181.
- 2. Fujimoto J G, Brezinski M E, Tearney G J, et al. Optical biopsy and imaging using optical coherence tomography. Nat Med 1995;1:970–972.
- 3. Hee M R, Izatt J A, Swanson E A, et al. Optical coherence tomography of the human retina. Arch Ophthalmol 1995;113:325–332.
- 4. Schuman J S, Hee M R, Arya A V, et al. Optical coherence tomography: a new tool for glaucoma diagnosis. Curr Opin Ophthalmol 1995;6:89–95.
- 5. Paunescu L A, Schuman J S, Price L L, et al. Reproducibility of nerve fiber thickness, macular thickness, and optic nerve head measurements using StratusOCT. Invest Ophthalmol Vis Sci 2004;45:1716–1724.
- 6. Guedes V, Schuman J S, Hertzmark E, et al. Optical coherence tomography measurement of macular and nerve fiber layer thickness in normal and glaucomatous human eyes. Ophthalmology 2003;110:177–189.
- 7. Tanner V, Chauhan D S, Jackson T L, et al. Optical coherence tomography of the vitreoretinal interface in macular hole formation. Br J Ophthalmol 2001;85:1092–1097.
- 8. Hee M R, Puliafito C A, Wong C, et al. Optical coherence tomography of macular holes. Ophthalmology 1995;102:748–756.
- 9. Drexler W, Sattmann H, Hermann B, et al. Enhanced visualization of macular pathology with the use of ultrahigh‐resolution optical coherence tomography. Arch Ophthalmol 2003;121:695–706.
- 10. Hee M R, Puliafito C A, Duker J S, et al. Topography of diabetic macular edema with optical coherence tomography. Ophthalmology 1998;105:360–370.
- 11. Zhou X H, Gatsonis C A. A simple method for comparing correlated ROC curves using incomplete data. Stat Med 1996;15:1687–1693.




