Abstract
Loss of cone photoreceptor neurons is a leading cause of many blinding retinal diseases. Direct visualization of these cells in the living human eye is now feasible using adaptive optics scanning light ophthalmoscopy (AOSLO). However, it remains challenging to monitor the state of specific cells across multiple visits, due to inherent eye-motion distortions that arise during data acquisition, artifacts introduced when overlapping images are montaged, and substantial variability in the data itself. This paper presents an accurate graph matching framework that integrates (1) robust local intensity order patterns (LIOP) to describe neuron regions under the illumination variation encountered across different visits; (2) a sparse-coding based voting process to measure visual similarities of neuron pairs using LIOP descriptors; and (3) a graph matching model that combines both visual similarity and geometrical cone packing information to determine the correspondence of repeatedly imaged cone photoreceptor neurons across longitudinal AOSLO datasets. The matching framework was evaluated on imaging data from ten subjects using a validation dataset created by removing approximately 15% of the neurons from 713 neuron correspondences across image pairs. An overall matching accuracy of 98% was achieved. The framework was robust to differences in the amount of overlap between image pairs. Evaluation on a test dataset showed that the matching accuracy remained at 98% on approximately 3400 neuron correspondences, despite image quality degradation, illumination variation, large image deformation, and edge artifacts. These experimental results show that our graph matching approach can accurately identify cone photoreceptor neuron correspondences on longitudinal AOSLO images.
Keywords: Adaptive optics, Split detection, Graph matching, Sparse coding, Cone photoreceptor neurons
1 Introduction
Adaptive optics scanning light ophthalmoscopy (AOSLO) [2,7] provides microscopic access to individual neurons of the retina directly in the living human eye. Critical to the phenomenon of human vision are specialized neurons called cone photoreceptors. These neurons can be noninvasively imaged using AOSLO (protrusions in Fig. 1). The loss of cone photoreceptors is a critical feature of many blinding retinal diseases. Therefore, longitudinal monitoring of these neurons can provide important information related to the onset, status, and progression of blindness.
Fig. 1.
Framework for neuron correspondence matching on longitudinal AOSLO images of the human eye, taken two months apart. In each panel, a portion of the image from the first visit is overlaid in the bottom left corner (solid rectangle) of the second visit image. Its corresponding location in the second visit is indicated by the dashed rectangles. (A) Identification of neurons (+’s) and convex hull regions (orange curves). (B) For each neuron from the first visit (e.g. blue dot), the LIOP feature descriptor and sparse coding are used to determine candidate image points on the second visit (black +’s). (C) Based on the voting response at each candidate image point (i.e. visual similarity), candidate neurons for pairing are assigned, each with a visual similarity score (cyan and yellow dots). (D) Graph matching is used to determine correspondences based on both visual similarity (dashed green lines) and the arrangement of neighboring neurons (white lines). Scale bar = 10 μm.
Currently, longitudinal monitoring of individual neurons within AOSLO images across different visits has only been attempted manually, which is not only labor-intensive, but also prone to error and applicable over only small retinal regions [4,8]. Existing algorithms for cell tracking in microscopy videos require uniform illumination and small time intervals. For example, Dzyubachyk et al. [3] utilized a coupled level-set method to iteratively track cells, where overlapping regions in previous video frames were used for initialization. Padfield et al. [6] modeled cell behaviors within a bipartite graph and developed a coupled minimum-cost flow algorithm to determine the final tracking results. Longitudinal AOSLO imaging datasets contain inherent challenges due to non-uniform illumination, image distortion caused by eye motion or by montaging of overlapping images, and a time interval between subsequent imaging sessions that can be on the order of several months.
To address these unique challenges, we developed a robust graph matching approach to identify neuron correspondences across two discrete time points. The main contributions are three-fold. First, a local intensity order pattern (LIOP) feature descriptor, robust against non-uniform changes in illumination, is exploited to represent neuron regions. Second, a robust voting process based on sparse coding was developed to measure visual similarities between pairs of neurons from different visits. Third, a global graph matching method was designed to identify neuron correspondences based on both visual similarity and geometric constraints. Validation on longitudinal datasets from ten subjects demonstrated a matching accuracy over 98%, which is promising for potential clinical implementation.
2 Methodology
2.1 Longitudinal Matching of Cone Photoreceptor Neurons
Step 1: Detection of cone photoreceptor neurons
The first step is to identify neurons on images from multiple visits. A simplified version of a cell segmentation algorithm [5] was implemented, using the multi-scale Hessian matrix to detect neurons, and the convex hull algorithm to determine neuron regions (Fig. 1A).
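As an illustration of this step, the following minimal Python sketch detects neuron centers and extracts convex-hull regions. It is not the segmentation pipeline of [5]: a multiscale Laplacian-of-Gaussian blob detector stands in for the multi-scale Hessian analysis, and a convex hull of locally bright pixels approximates the neuron region; the window radius and detector thresholds are illustrative assumptions.

```python
# Illustrative sketch of Step 1 (not the pipeline of [5]): a blob detector
# substitutes for the multi-scale Hessian analysis, and a convex hull of
# locally bright pixels approximates each neuron region.
import numpy as np
from scipy.spatial import ConvexHull
from skimage import feature, filters

def detect_neurons(image, min_sigma=2, max_sigma=6, threshold=0.02):
    """Return approximate neuron centers as (row, col) coordinates."""
    blobs = feature.blob_log(image, min_sigma=min_sigma,
                             max_sigma=max_sigma, threshold=threshold)
    return blobs[:, :2]  # drop the sigma column

def neuron_region(image, center, radius=8):
    """Convex-hull vertices of bright pixels in a window around a detection."""
    r, c = int(center[0]), int(center[1])
    r0, c0 = max(r - radius, 0), max(c - radius, 0)
    patch = image[r0:r + radius, c0:c + radius]
    mask = patch > filters.threshold_otsu(patch)   # crude foreground split
    pts = np.argwhere(mask)
    if len(pts) < 3:                               # too few pixels for a hull
        return None
    hull = ConvexHull(pts)
    return pts[hull.vertices] + [r0, c0]           # back to image coordinates
```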
Step 2: Neuron-to-region matching
The next step is to find all relevant neuron pairs between visits in order to set up graph matching, which relies on robust feature descriptors for neuron regions and an image matching process.
Since longitudinal AOSLO images often have significant illumination variation, we adapted the LIOP feature descriptor [10]. The LIOP descriptor starts by sorting all pixels in a neuron region based on their intensity values, I, in increasing order, and then equally dividing the region into M ordinal bins in terms of the intensity order. For each image point p from bin B, an N-dimensional vector v = 〈I(q)〉, q ∈ N(p), is established by collecting the intensity values I(q) of its N neighboring sample points, and the indices of v are then re-ordered based on these intensity values to derive the permutation vector v̂. Let W be an N! × N matrix containing all possible permutations of {1, 2, …, N}, and I be an N! × N! identity matrix. The LIOP descriptor for point p is
LIOP(p) = Ii,  where the index i satisfies Wi = v̂        (1)
The LIOP for each ordinal bin is defined as
LIOP(B) = Σp∈B LIOP(p)        (2)
The LIOP descriptor of the entire neuron region is built by concatenating the sub-descriptors of all bins, which has a dimension of N! × M. Note that LIOP groups image points into bins by similar intensity, rather than by spatial neighborhood. Therefore, the LIOP descriptor is insensitive to global illumination changes, such as when entire neuron regions become darker or brighter, which often happens in longitudinal AOSLO images.
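The sketch below illustrates the ordinal idea of Eqs. 1 and 2, assuming N = 4 grid neighbors and M = 6 ordinal bins; the descriptor of [10] samples N points on a circle and adds weighting, so this is only a simplified illustration, not the exact descriptor used here.

```python
# Simplified LIOP sketch following Eqs. 1-2; N = 4 grid neighbors and
# M = 6 ordinal bins are illustrative assumptions.
import numpy as np
from itertools import permutations

N_NEIGHBORS, M_BINS = 4, 6
# Rows of W: all N! possible permutations, mapped to their index.
PERM_INDEX = {p: i for i, p in enumerate(permutations(range(N_NEIGHBORS)))}

def liop_descriptor(image, region_pts):
    """region_pts: (K, 2) integer pixel coordinates of one neuron region,
    assumed to lie away from the image border."""
    vals = image[region_pts[:, 0], region_pts[:, 1]]
    order = np.argsort(vals)                        # sort region pixels by intensity
    bins = np.array_split(region_pts[order], M_BINS)
    offsets = np.array([(-1, 0), (0, 1), (1, 0), (0, -1)])   # N sampling offsets
    desc = np.zeros((M_BINS, len(PERM_INDEX)))
    for b, pts in enumerate(bins):
        for p in pts:
            v = image[p[0] + offsets[:, 0], p[1] + offsets[:, 1]]  # neighbor intensities
            perm = tuple(int(i) for i in np.argsort(v))            # permutation v-hat
            desc[b, PERM_INDEX[perm]] += 1          # Eq. 1 one-hot, accumulated per bin (Eq. 2)
    desc = desc.ravel()                             # concatenate the M sub-descriptors
    return desc / (np.linalg.norm(desc) + 1e-12)
```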
We also developed a robust neuron-to-region matching strategy based on sparse coding to identify relevant neuron pairs. Suppose the LIOP descriptor for a neuron detection p (blue dot in Fig. 1B) in the first visit is an (N! × M)-dimensional vector d1. Transform p into the second visit image, and define a large image matching range Ω of size M1 × M1 > N! × M, centered at the transformed point. The LIOP descriptor is again established for each image point q ∈ Ω, and combining all descriptors over Ω leads to a basis matrix D of size (N! × M) × (M1 × M1), which fulfills the requirement of sparse coding that the basis matrix be over-complete. Therefore, the image matching problem is converted into representing the vector d1 with the basis matrix D, mathematically defined as
x̄ = argminx ‖x‖₁  subject to  d1 = Dx        (3)
where ‖x‖₁ denotes the L1 norm of the vector x. Subspace pursuit [1] was used to minimize Eq. 3, and the non-zero elements of the sparse vector x̄ are illustrated as black crosses in Fig. 1B. A voting process can thus be used to find relevant neuron candidates (cyan and yellow points in Fig. 1C) in the second visit: a neuron is a candidate if its convex hull contains image points with non-zero sparse vector elements. Most of the black crosses fall within the convex hull of the actual corresponding neuron, and only a small set of relevant neuron pairs is reported by the neuron-to-region matching strategy, which significantly simplifies the subsequent graph matching.
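A possible implementation sketch of this step is shown below. Orthogonal matching pursuit (scikit-learn) is used as a readily available greedy stand-in for the subspace pursuit solver [1]; the search half-window, the sparsity level, and the helper liop_fn (which computes the LIOP descriptor of the region around a given point, e.g. by wrapping the sketch above) are assumptions for illustration.

```python
# Sketch of the neuron-to-region matching step. Orthogonal matching pursuit
# is a greedy stand-in for subspace pursuit [1]; half_win, sparsity, and the
# helper liop_fn(image, point) -> descriptor are illustrative assumptions.
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def match_neuron_to_region(d1, image2, center2, liop_fn, half_win=12, sparsity=10):
    """d1: LIOP descriptor of a visit-1 neuron; center2: its transformed
    location in the visit-2 image."""
    rows = range(int(center2[0]) - half_win, int(center2[0]) + half_win)
    cols = range(int(center2[1]) - half_win, int(center2[1]) + half_win)
    points = [(r, c) for r in rows for c in cols]       # search range Omega
    # A 24 x 24 window yields 576 basis vectors, over-complete relative to a
    # 144-dimensional LIOP descriptor (N = 4, M = 6 in the sketch above).
    D = np.column_stack([liop_fn(image2, p) for p in points])
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=sparsity)
    omp.fit(D, d1)                                      # d1 ~ D x with x sparse (Eq. 3)
    x = omp.coef_
    support = np.flatnonzero(x)
    return [points[j] for j in support], x[support]     # voting points and their weights
```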
Step 3: Similarity assignment of neuron pairs
Using the sparse vector x̄, the similarity of a selected neuron pair can be computed as
V = Σj x̄j        (4)
Here, x̄j denotes a non-zero sparse element associated with an image point which is within the convex hull of the neuron in the second visit. Utilizing Eq. 4, we can obtain discriminative assignments for all selected neuron pairs (e.g. blue to cyan and blue to yellow pairings in Fig. 1C).
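One possible implementation of this assignment sums the sparse voting weights that fall inside a candidate neuron's convex hull; the normalization by the total voting weight below is an illustrative assumption rather than the exact form of Eq. 4.

```python
# Sketch of Eq. 4: sum the sparse voting weights that land inside a candidate
# neuron's convex hull (normalization by the total weight is an assumption).
import numpy as np
from scipy.spatial import Delaunay

def pair_similarity(vote_points, vote_weights, hull_vertices):
    """hull_vertices: convex-hull vertices of a visit-2 neuron region."""
    hull = Delaunay(np.asarray(hull_vertices, dtype=float))
    inside = hull.find_simplex(np.asarray(vote_points, dtype=float)) >= 0
    w = np.abs(np.asarray(vote_weights, dtype=float))
    return float(w[inside].sum() / (w.sum() + 1e-12))
```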
Step 4: Graph matching
We now describe the graph matching model for finding neuron correspondences on longitudinal AOSLO images. Let P1 and P2 be the sets of neuron detections in the two visits (blue and red crosses in Fig. 1D), and A ⊆ P1 × P2 be the set of candidate neuron pairs found in Step 2. A matching configuration between P1 and P2 can be represented as a binary-valued vector m ∈ {0, 1}^A. If a neuron pair α ∈ A is a true neuron correspondence, mα = 1; otherwise, mα = 0. Therefore, finding neuron correspondences is mathematically equivalent to calculating m over all candidate neuron pairs.
The first term of the matching model incorporates the visual similarity assignments Vα (Eq. 4) of the selected neuron pairs from the previous step, depicted as dashed green lines in Fig. 1D, and is given by
Ev(m) = − Σα∈A Vα mα        (5)
The second term of the matching model measures the similarity of the adjacent neuron packing of candidate pairs, defined over the set S, and is modeled as
Eg(m) = − Σ(α,β)∈S Gαβ mα mβ        (6)
S contains all adjacent pairs of candidate correspondences, defined over neighboring neurons:
S = {(α, β) ∈ A × A : α = (i, j), β = (k, l), k ∈ NK(i), l ∈ NK(j)}        (7)
NK(·) denotes the set of K nearest neighbors in the graph structure. In this paper, we set K = 6, as illustrated with white lines in Fig. 1D, motivated by the hexagonal packing arrangement observed for human cone photoreceptors. The similarity of adjacent neuron packing, Gαβ, is calculated by combining both distance and direction constraints:
Gαβ = exp(−‖(uk − ui) − (vl − vj)‖² / (2σ²)),  α = (i, j), β = (k, l)        (8)

where ui and uk denote the image coordinates of neurons i and k in the first visit, and vj and vl those of neurons j and l in the second visit.
We set σ = 2 in our experiments.
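As an illustration, the geometric term can be sketched as follows: K = 6 nearest-neighbor edges are built with a k-d tree, and the packing similarity compares the offset vectors between neighboring neurons in the two visits with a Gaussian kernel (σ = 2). The implementation details are assumptions, not the exact form used in our experiments.

```python
# Sketch of the geometric term: k-NN edges per visit and a Gaussian comparison
# of neighbor offset vectors (an assumed form of Eq. 8).
import numpy as np
from scipy.spatial import cKDTree

def knn_edges(points, k=6):
    """K-nearest-neighbor edges over one visit's neuron detections."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k + 1)       # first neighbor is the point itself
    return {(i, int(j)) for i, row in enumerate(idx) for j in row[1:]}

def packing_similarity(u_i, u_k, v_j, v_l, sigma=2.0):
    """Compare the visit-1 offset i->k with the visit-2 offset j->l."""
    d1 = np.asarray(u_k, dtype=float) - np.asarray(u_i, dtype=float)
    d2 = np.asarray(v_l, dtype=float) - np.asarray(v_j, dtype=float)
    return float(np.exp(-np.sum((d1 - d2) ** 2) / (2.0 * sigma ** 2)))
```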
The third term of our graph matching model ensures a unique one-to-one neuron correspondence, which also allows neuron appearance and disappearance to be identified:
Ep(m) = (|P1| − Σα∈A mα) + (|P2| − Σα∈A mα),  subject to  Σj:(i,j)∈A m(i,j) ≤ 1 ∀ i ∈ P1 and Σi:(i,j)∈A m(i,j) ≤ 1 ∀ j ∈ P2        (9)
|P1| and |P2| denote the number of neuron detections in the two visits, respectively.
Combining Eqs. 5, 6, and 9 leads to our graph matching model:
E(m) = λv Ev(m) + λg Eg(m) + λp Ep(m)        (10)
Here, λv, λg, and λp are weights set to 2, 1, and 10, respectively, in our experiments. Equation 10 was minimized by a dual decomposition approach [9], which leads to the final neuron correspondences for longitudinal AOSLO images.
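For illustration, the simplified sketch below assembles the terms of Eq. 10 and solves an approximation of it: a Hungarian assignment over the visual-similarity term enforces one-to-one matching (the role of Eq. 9), and a pruning pass discards matches with little visual or geometric support as a crude surrogate for the pairwise term. This is not the dual decomposition optimizer [9]; the min_support threshold and the greedy pruning are assumptions.

```python
# Simplified stand-in for the dual decomposition optimizer [9]; min_support
# is a hypothetical threshold and the pruning pass is a crude surrogate for
# the pairwise geometric term.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_neurons(pairs, visual_sim, geom_sim, n1, n2,
                  lam_v=2.0, lam_g=1.0, min_support=0.1):
    """pairs: candidate (i, j) neuron pairs from Step 2; visual_sim[(i, j)]:
    Eq. 4 similarity; geom_sim[((i, j), (k, l))]: Eq. 8 similarity for
    adjacent candidate pairs; n1, n2: detections per visit."""
    cost = np.full((n1, n2), 1e3)                  # forbid non-candidate pairs
    for (i, j) in pairs:
        cost[i, j] = -lam_v * visual_sim[(i, j)]   # reward visual similarity
    rows, cols = linear_sum_assignment(cost)       # one-to-one assignment
    matches = [(int(i), int(j)) for i, j in zip(rows, cols) if cost[i, j] < 0]
    kept = []
    for (i, j) in matches:
        support = sum(geom_sim.get(((i, j), (k, l)), 0.0) for (k, l) in matches)
        if lam_v * visual_sim[(i, j)] + lam_g * support > min_support:
            kept.append((i, j))
    return kept   # detections missing from 'kept' are flagged as appeared/disappeared
```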
2.2 Data Collection and Validation Method
To the best of our knowledge, there are no algorithms or publicly available datasets utilizing this recently developed AOSLO instrumentation [7] that could be used for comparison with our proposed method. Therefore, we acquired imaging data from ten subjects (5 male, 5 female; age: 26.3 ± 5.4 years, mean ± SD) by repeatedly imaging the same retinal regions over several months. To construct larger regions of interest, overlapping images were acquired and then montaged together. These imaging data were used to construct two types of datasets to evaluate the robustness and accuracy of the matching framework. For the first dataset (“validation dataset”), from each subject we collected multiple images of a retinal region within a time period of several hours and generated two different sets of images of the same retinal region, each with unique distortions due to eye motion (300 × 300 pixels; approximately 100 × 100 microns). Then, two different modifications were performed on these image pairs: neuron removal on one image to simulate cell loss/gain, and artificial image translation to simulate mismatches in alignment between visits. The second dataset (“test dataset”) consisted of two sets of images collected several months apart from the same retinal region of each subject (500 × 500 pixels; approximately 170 × 170 microns). The matching accuracy was estimated as:
Accuracy = (total number of correspondences − number of errors) / (total number of correspondences) × 100%        (11)
Here, the errors include two different types: type 1, incorrect pairings between two neurons visible across both visits (this type of error usually leads to at least one additional error due to the one-to-one mapping) and type 2, incorrect pairings where one neuron was only visible on one of the visits (typically due to alignment errors at the boundaries).
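Following Eq. 11, the accuracy is the fraction of reported correspondences that are neither type 1 nor type 2 errors; a minimal illustration:

```python
# Eq. 11: percentage of reported correspondences that are error-free.
def matching_accuracy(n_correspondences, n_type1_errors, n_type2_errors):
    n_correct = n_correspondences - n_type1_errors - n_type2_errors
    return 100.0 * n_correct / n_correspondences

# Example with the test-dataset totals reported in Sect. 3.2:
print(matching_accuracy(3399, 44, 34))   # ~97.7%, reported as 98%
```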
3 Experimental Results
3.1 Validation Dataset
The number of neuron correspondences for each subject varied from 48 to 137 due to subject-to-subject anatomical differences (total: 713 neuron pairs). To test whether the proposed methods could detect cases of newly appearing or disappearing neurons, 10 neurons were artificially removed from one image of each pair of images, resulting in a net increase in the number of neurons of 8.0% to 26.3% (18.0 ± 5.5%) or, conversely, a net loss of 7.3% to 21.4% (15.1 ± 3.8%) of neurons (by reversing the order of visits; all numbers in this paper are reported as mean ± SD). In the case of adding neurons, 7 of 10 subjects maintained an accuracy of 100%, while the remaining 3 subjects each had one error due to a mis-connection involving one of the erased neurons. The overall matching accuracy in the presence of appearing neurons was 99.5% over 713 neuron correspondences. In the case of neuron removal, 6 of 10 subjects maintained an accuracy of 100%, while the remaining 4 subjects each had one error, which occurred at a site of artificial neuron removal. The overall accuracy in the presence of disappearing neurons was 98.2% over 713 correspondences. In both cases, the matching accuracy for the neuron pairs that were not removed was 100%, demonstrating that the algorithm was robust to different sets of distortions due to eye motion. The average computation time for the 300 × 300 pixel images, which contained varying numbers of cells, was 90 ± 28 s (Intel i7-3770 CPU, 16 GB RAM).
The matching accuracy after artificial translation, which effectively reduces the area of overlap between the two visits, was no lower than 99.5% for the range of translations tested (from 0 up to 150 pixels, corresponding to overlaps ranging from 100% down to 50%). These validation results establish that the proposed methods performed well even in the presence of disappearing/appearing neurons, artifacts due to eye motion distortion, and alignment mismatches resulting in a significant reduction in the amount of overlap between image pairs.
3.2 Test Dataset
Across the 20 image pairs in the test dataset, the total numbers of neurons from the first and second visits were 3905 and 3900, respectively. Our matching framework determined that there were 3399 correspondences between the two visits. To evaluate accuracy, images were manually examined to detect all matching errors, including type 1 (black circle, Fig. 2K) and type 2 (black circle, Fig. 2I) errors. Across the entire test dataset, a total of 44 type 1 and 34 type 2 errors were flagged. The overall accuracy achieved was 98%.
Fig. 2.
Example matching results (each column is a subject), with neuron detections (+’s) from the first visit shown in the top row, second visit in the middle, and matching results overlaid on visit 2 in the bottom (dashed square indicates actual position of visit 1). In the bottom row, neuron correspondences are marked as green ellipses. Circles show examples of type 1 (K) and type 2 (I) errors.
Matching results for four subjects are shown in Fig. 2. In the first column, the image pair (A and E) exhibits significant illumination variation across visits, with most neurons in Fig. 2E being brighter than those in Fig. 2A. In addition, the contrast between neurons and background tissue is also higher in Fig. 2E. Overall, our matching framework was robust to the illumination changes. In the second column, the image quality was significantly lower across both visits, but our matching framework could still find neuron correspondences accurately. Large image distortions due to eye motion are visible in the third subject (Figs. 2C, G), but our matching framework was still able to identify most neuron correspondences. Finally, due to montaging of overlapping images, edge artifacts are sometimes present (Fig. 2H). Nevertheless, our matching framework was still able to accurately identify neuron correspondences. The average computation time for 500 × 500 pixel images was 430 ± 79 s.
4 Conclusion and Future Work
In this paper, we developed a robust matching framework to accurately determine cone photoreceptor neuron correspondences on longitudinal AOSLO images. The matching framework was built on three key contributions: application of the LIOP descriptor to neuron regions to tolerate illumination variation, a sparse-coding based voting process to select relevant neuron pairs with discriminative similarity values, and a robust graph matching model utilizing both visual similarity and geometrical cone packing information. On the validation dataset, the matching accuracy reached 98.2% even with approximately 15% neuron loss, and the framework tolerated alignment mismatches that reduced image overlap to as little as 50% while maintaining over 99% accuracy. The matching accuracy on the test dataset was 98% over 3399 neuron correspondences, and showed high robustness to illumination variation, low image quality, image distortion, and edge artifacts. Future work will include application of our framework to additional patient datasets and optimization of computational speed.
Acknowledgments
This research was supported by the intramural research program of the National Institutes of Health, National Eye Institute.
Footnotes
The rights of this work are transferred to the extent transferable according to title 17 § 105 U.S.C.
References
- 1. Dai W, Milenkovic O. Subspace pursuit for compressive sensing signal reconstruction. IEEE Trans Inf Theory. 2009;55(5):2230–2249.
- 2. Dubra A, Sulai Y. Reflective afocal broadband adaptive optics scanning ophthalmoscope. Biomed Opt Express. 2011;2(6):1757–1768. doi:10.1364/BOE.2.001757.
- 3. Dzyubachyk O, van Cappellen W, Essers J, et al. Advanced level-set-based cell tracking in time-lapse fluorescence microscopy. IEEE Trans Med Imaging. 2010;29(3):852–867. doi:10.1109/TMI.2009.2038693.
- 4. Langlo C, Erker L, Parker M, et al. Repeatability and longitudinal assessment of foveal cone structure in CNGB3-associated achromatopsia. Retina. doi:10.1097/IAE.0000000000001434 (Epub ahead of print).
- 5. Liu J, Dubra A, Tam J. A fully automatic framework for cell segmentation on non-confocal adaptive optics images. SPIE Medical Imaging. 2016:97852J.
- 6. Padfield D, Rittscher J, Roysam B. Coupled minimum-cost flow cell tracking for high-throughput quantitative analysis. Med Image Anal. 2011;15(4):650–668. doi:10.1016/j.media.2010.07.006.
- 7. Scoles D, Sulai Y, Langlo C, et al. In vivo imaging of human cone photoreceptor inner segments. Invest Ophthalmol Vis Sci. 2014;55(7):4244–4251. doi:10.1167/iovs.14-14542.
- 8. Talcott K, Ratnam K, Sundquist S, et al. Longitudinal study of cone photoreceptors during retinal degeneration and in response to ciliary neurotrophic factor treatment. Invest Ophthalmol Vis Sci. 2011;54(7):498–509. doi:10.1167/iovs.10-6479.
- 9. Torresani L, Kolmogorov V, Rother C. A dual decomposition approach to feature correspondence. IEEE Trans Pattern Anal Mach Intell. 2013;35(2):259–271. doi:10.1109/TPAMI.2012.105.
- 10. Wang Z, Fan B, Wang G, Wu F. Exploring local and overall ordinal information for robust feature description. IEEE Trans Pattern Anal Mach Intell. 2016;38(11):2198–2211. doi:10.1109/TPAMI.2015.2513396.


