Abstract
Optical coherence tomography (OCT) is an evolving noninvasive imaging modality that has been used to image the human larynx during surgical endoscopy. The design of a long gradient index (GRIN) lens–based probe capable of capturing images of the human larynx by use of swept-source OCT during a typical office-based laryngoscopy examination is presented. In vivo OCT imaging of the human larynx is demonstrated at a rate of 40 frames per second. Dynamic vibration of the vocal folds is recorded to provide not only high-resolution cross-sectional tissue structures but also vibration parameters, such as the vibration frequency and magnitude of the vocal cords, which provide important information for clinical diagnosis and treatment, as well as for fundamental research on the voice itself. Office-based OCT is a promising imaging modality to study the larynx for physicians in otolaryngology.
Keywords: larynx, biomedical imaging, early diagnosis, laryngeal cancer, optical coherence tomography
Introduction
Laryngeal carcinoma is one of the most common primary head and neck tumors. The cardinal symptom of early laryngeal cancer is hoarseness, but because this complaint is relatively innocuous, laryngeal cancer often goes undiagnosed for many months, and referral to an otolaryngologist may be delayed up to nine months following the onset of symptoms. Accurate clinical diagnosis of early-stage laryngeal cancer based on an initial consult is extremely difficult for a physician, even with proper imaging techniques, and a biopsy is required to differentiate between benign, premalignant, and malignant pathologies. Conventional laryngeal examination can be performed using a laryngeal mirror and flexible fiber-optic or rigid laryngoscopes (with or without videostroboscopy) to achieve a two-dimensional (2-D) view of the laryngeal structures.
Although it is difficult to differentiate among the wide spectrum of diseases ranging from chronic laryngitis to premalignant and malignant lesions, the lack of basement membrane integrity is a key feature of early invasive cancer of the vocal cords. The basement membrane is a thin layer of collagen and other proteins that the surface epithelial cells rest on, and it is the dividing line between the epithelium and the larger layers of the tissue such as the lamina propria. Therefore, early laryngeal carcinoma can be diagnosed quickly if the basement membrane can be visualized. Currently, there is no reliable noninvasive or nonoperative method available for surgeons to make the diagnosis of laryngeal cancer without a biopsy. With current flexible fiber-optic techniques, the endoscopic yield in terms of diagnostic sensitivity and specificity for visible lesions is very low. Biopsies of the vocal cord aimed at diagnosing cancer require a full-thickness excision of superficial epithelium, basement membrane, and connective tissues. These biopsies may have a detrimental effect on the patient’s vocal cord vibration and ultimately lead to a permanent change in voice. Repeated biopsies are often needed to reach a definite diagnosis, and they carry even higher risks. These difficulties demonstrate the need for improved noninvasive diagnostic technology, as well as improved abilities to determine margins and to perform safe biopsies on patients with clinically suspected laryngeal malignancies.
Optical coherence tomography (OCT) is a noninvasive medical imaging method based on the principle of low-coherence interferometry.1 In low-coherence interferometry, light from a low-coherence source is split into two beams, one for reference purposes (reference arm) and the other for the actual imaging information (sample arm). Interference is observed only when the optical path lengths of the reference and sample arms match to within the coherence length of the source, so the backscattered light from the sample provides depth-resolved information about subsurface tissue. OCT was developed for the purpose of performing in vivo cross-sectional tomographic imaging of tissue structure and composition with high imaging speed and resolution. It has become a powerful tool for medical diagnostics.2, 3, 4 Recently, Luerssen et al. reported in vivo OCT imaging of vocal folds in a contact mode with local anesthesia.5 Our group also reported an office-based laryngeal time-domain OCT imaging device.6 A rigid laryngoscope served as a platform to which a second device could be attached to perform simultaneous OCT imaging. However, the scanning mechanism was too slow, at a rate of ∼1 frame per second. A gradient index (GRIN) lens–based probe was also developed for office-based laryngeal imaging at a rate of ∼8 frames per second in conjunction with a spectral domain OCT system.7 However, one of the biggest challenges with an office-based OCT laryngeal imaging device is the movement of the patient’s head and the physician’s hand during the examination. The latter is further amplified by the cantilevered design of the probe. The relative movements between the vocal cords and the probe tip can easily exceed several millimeters, thereby shifting the images outside the OCT imaging window (or A-scan imaging depth).
Also, the working distance (in this case, the distance from the probe tip to the vocal cords) differs from one patient to another; physicians have to practice adjusting the working distance while holding the probe steady to capture an OCT image. This is no simple task and requires the full attention and skill of the physician. For this practice, the physician uses the probe on an illuminating objective slide model or on an ex vivo head model fashioned by our group with swine vocal cords, as shown in Fig. 1. In vivo and noncontact imaging of the vocal folds in awake patients without anesthesia has not been reported.
In this paper, we demonstrate an office-based laryngeal swept-source OCT imaging system. Fast laryngeal imaging at 40 frames per second is realized to greatly reduce motion artifacts caused by tremors (<1 Hz) between the patient and the probe. In vivo noninvasive and noncontact OCT imaging of the vocal folds in awake patients without the use of anesthesia is reported here. Furthermore, dynamic vibration of the vocal folds is recorded to provide not only the high-resolution cross-sectional tissue structures but also important vibration parameters, such as the vibration frequency and amplitude of the vocal cord oscillations, which may provide additional helpful information for diagnosis.
Design
The schematic diagram of the fiber-based swept-source OCT system is shown in Fig. 2. The output light from a swept light source at 1310 nm with a FWHM bandwidth of 100 nm and output power of 5 mW was split into the reference and sample arms by a 1×2 coupler (20/80 split ratio). The GRIN lens–based OCT probe was connected to the sample arm, which received 80% of the source power. The light source was operated at a sweep rate of 20,000 Hz. The reference power was attenuated by an adjustable neutral density attenuator for maximum sensitivity. The measured sensitivity of the OCT system with an ideal partial reflector as the sample was 108 dB. Two circulators, one in each of the reference and sample arms, were used to redirect the backreflected light to a 2×2 fiber coupler (50/50 split ratio) for balanced detection. Dispersion compensation is important to achieve high resolution. The dispersion can be measured with a mirror as a sample by constructing the complex representation of the spectral fringe pattern and correcting the phase as a function of the wave number.8 The measured axial resolution of 8 μm was close to the theoretical axial resolution of 7.5 μm since the spectrum of the swept light source is nearly Gaussian shaped. The lateral resolution, which is determined by the OCT probe’s focus spot, was measured to be 25 μm.
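The quoted theoretical axial resolution can be checked with the standard expression for a Gaussian spectrum, Δz = (2 ln 2/π)·λ₀²/Δλ, using the source parameters given above. The short sketch below does that and also estimates the number of A-lines available per frame. The function names are illustrative, the resolution is the free-space value, and the A-line estimate assumes that every sweep contributes one A-line with no dead time for scanner flyback, which the text does not state.

```python
import math


def gaussian_axial_resolution_um(center_wavelength_nm, fwhm_bandwidth_nm):
    """Free-space axial resolution (FWHM) for a Gaussian source spectrum, in micrometers."""
    dz_nm = (2.0 * math.log(2.0) / math.pi) * center_wavelength_nm ** 2 / fwhm_bandwidth_nm
    return dz_nm / 1000.0


def a_lines_per_frame(sweep_rate_hz, frame_rate_hz):
    """Upper bound on A-lines per frame, assuming one A-line per sweep and no dead time."""
    return sweep_rate_hz / frame_rate_hz


# Source parameters quoted in the text: 1310 nm center, 100 nm FWHM, 20 kHz sweep rate.
resolution_um = gaussian_axial_resolution_um(1310.0, 100.0)
lines = a_lines_per_frame(20_000, 40)

print(f"theoretical axial resolution: {resolution_um:.2f} um")
print(f"A-lines per frame at 40 frames/s: {lines:.0f}")
```

With these inputs the expression gives roughly 7.6 μm in air, in line with the 7.5 μm theoretical figure quoted above, and at most 500 A-lines per frame at 40 frames per second.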
In laryngeal endoscopy, the depth of the larynx varies remarkably from patient to patient. Changing the optical path length of the reference arm to match a variable working distance is difficult. The most convenient solution is to maintain a constant optical delay in the sample arm while tuning the working distance to ensure that the sample beam is always focused onto the vocal cord. The device must quickly adjust to image the vocal cords as they change position within the larynx. We use an enhanced version of a previously reported GRIN lens–based probe7 to achieve dynamic focusing with constant optical delay. The long GRIN lens used in this design can be considered as one pitch and an optical relay for visible wavelengths. However, for a 1310-nm wavelength, which is the center wavelength of the OCT light source, the GRIN lens is close to one pitch but can no longer be considered an ideal optical relay, especially when the average working distance of the probe (measured from the probe tip, where the beam exits) reaches about 65 mm for laryngeal imaging. In order to achieve an ideal optical relay, the GRIN lens is used with a group of lenses L1, L2 to form a so-called optical ballast within a 4f optical system. The composite 4f optical system has a magnification of one and can be considered as an optical relay; the optical delay of the focal point remains constant during adjustment of the working distance.
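The relay property of a 4f system can be illustrated with paraxial ray-transfer (ABCD) matrices. The sketch below models an idealized 4f relay built from two thin lenses; it is not the actual GRIN/L1/L2 prescription, which is not given here, and the focal length used is a hypothetical value chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def free_space(d):
    """Ray-transfer matrix for propagation over a distance d."""
    return [[1.0, d], [0.0, 1.0]]


def thin_lens(f):
    """Ray-transfer matrix for an ideal thin lens of focal length f."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


def four_f_relay(f1, f2):
    """System matrix of a 4f relay: f1, lens f1, f1 + f2, lens f2, f2."""
    elements = [free_space(f1), thin_lens(f1), free_space(f1 + f2),
                thin_lens(f2), free_space(f2)]
    system = [[1.0, 0.0], [0.0, 1.0]]
    for element in elements:  # elements listed in the order the light meets them
        system = matmul(element, system)
    return system


# Hypothetical focal length (mm), for illustration only.
relay = four_f_relay(50.0, 50.0)
print(relay)
```

For equal focal lengths the system matrix reduces to the negative identity: B = 0 (the input plane is imaged onto the output plane), C = 0 (a collimated beam stays collimated), and A = -1 (unit magnification in magnitude, with image inversion). This is the sense in which such a system acts as an optical relay.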
A carriage holds the OCT device and the rigid video endoscope (Carl Zeiss Ltd) together in a “double-barreled” configuration. In order to identify the scanning point (area) during an OCT examination, an aiming beam should be coupled into the system, because the OCT light source is in the nonvisible infrared spectrum. Previously, a 2×1 coupler was used in the sample arm to couple in a green aiming beam from a 532-nm solid-state laser. The 1.3-μm OCT beam must pass through the 2×1 coupler twice (back and forth) before reaching the detector. In order to achieve the best OCT image quality without sacrificing too much OCT power, the coupling efficiency for the green light had to be small. Normally, only a very small portion (<10%) of the green light was available for aiming purposes, which was often not bright enough with the background incandescent lamp on for the video endoscope imaging (Fig. 3).
In our enhanced probe, a dichroic mirror–based design was used to resolve the compromise between high OCT imaging power and bright illumination of the aiming beam. The sample beam from the OCT system is collimated, passes through a dichroic mirror and a focusing lens L3, and is then reflected 90 deg toward the fixed lens group by a scanning galvo [Fig. 2b]. The sample beam is now coupled to the probe with about 95% efficiency. Over 85% of the green beam is also coupled into the system through another channel of the dichroic mirror for aiming purposes. The fiber, the two collimators, and focusing lens L3 are assembled as one component and can be moved back and forth along the propagation direction for distance adjustment by the physician during the examination. The typical range of scanning (working) distance is about 40 mm. Two customized prisms are attached at the proximal and distal tips of the GRIN lens for beam deflection. During the examination, both the dual-channel rigid endoscope signals and the OCT signals are digitized and displayed on a single monitor (Fig. 4).
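The trade-off between the two aiming-beam schemes can be made concrete with a simple round-trip power budget. The sketch below is only an estimate under stated assumptions: the earlier fiber coupler is idealized as a lossless 90/10 splitter (90% to the OCT beam, 10% to the green beam), the 95% figure for the dichroic design is assumed to apply equally on the return path, and all other losses are ignored. None of these assumptions is stated in the text.

```python
import math


def round_trip_transmission(single_pass):
    """Fraction of OCT power surviving two passes through the same element."""
    return single_pass ** 2


def to_db(fraction):
    """Convert a power fraction to decibels."""
    return 10.0 * math.log10(fraction)


# Assumed single-pass OCT transmissions (see the hedges in the lead-in).
coupler_single_pass = 0.90   # idealized 90/10 fiber coupler, hypothetical split
dichroic_single_pass = 0.95  # "about 95% efficiency" quoted for the dichroic design

for name, single_pass in [("2x1 coupler", coupler_single_pass),
                          ("dichroic mirror", dichroic_single_pass)]:
    rt = round_trip_transmission(single_pass)
    print(f"{name}: round trip {rt:.3f} ({to_db(rt):.2f} dB)")
```

Under these assumptions the round-trip OCT loss is about 0.9 dB for the coupler and about 0.45 dB for the dichroic mirror, while the fraction of green light available for aiming rises from under 10% to over 85%.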
Method
During the examination of the vocal cords, the patient is asked to sit up straight and hold his or her tongue out and downward with gauze to clear the way for the probe. Before the probe is inserted, it is mildly heated with a small blow dryer to prevent the optical components from fogging up when exposed to internal body temperatures. The probe is then inserted through the oral cavity and centered several centimeters above the larynx. Once we obtain a clear view of the vocal cords as well as a good position of the OCT aiming beam, the patient is asked, as in the conventional stroboscopic examination, to phonate in order to produce different movements and positions of the vocal cords. This is illustrated by the endoscope image in Fig. 3b; it is evident that the vocal cords come together, which makes imaging easier for the otolaryngologist. Throughout the procedure, we record both the OCT images and the laryngoscopic images.
Figure 5 shows the cross-sectional images of vibrating vocal cords of male and female volunteers during examination. The epithelium and basement membrane can be clearly identified. The images are comparable with images obtained in anesthetized patients during surgical endoscopy. The vocal fundamental frequency in females is approximately 200 Hz, whereas in males, it is approximately 120 Hz.9 This is consistent with the measured OCT images shown in Fig. 5. Since the OCT imaging speed is 40 frames per second and 3 cycles are observed per frame, Fig. 5a corresponds to a frequency of approximately 120 Hz (40×3=120). On the other hand, Fig. 5b contains up to 5 vibration cycles per frame and corresponds to a vibration frequency of 200 Hz (40×5=200). The dynamic vibration amplitudes can also be measured from the OCT images. Since the total imaging depth in Fig. 5 is 2.6 mm, the estimated maximum vibration amplitudes in Figs. 5a, 5b are 1.2 mm and 0.59 mm, respectively.
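The frequency estimate above is a direct product of the frame rate and the number of vibration cycles visible in one frame. The sketch below reproduces that arithmetic and expresses the reported amplitudes as a fraction of the 2.6-mm imaging depth. The function names are illustrative, and the cycle counts are the ones read from Fig. 5 in the text, not values extracted from image data.

```python
def vibration_frequency_hz(frame_rate_hz, cycles_per_frame):
    """Vibration frequency implied by the number of cycles seen in one frame."""
    return frame_rate_hz * cycles_per_frame


def fraction_of_imaging_depth(amplitude_mm, imaging_depth_mm):
    """Share of the axial imaging window spanned by the vibration amplitude."""
    return amplitude_mm / imaging_depth_mm


FRAME_RATE_HZ = 40.0     # imaging speed quoted in the text
IMAGING_DEPTH_MM = 2.6   # total imaging depth quoted for Fig. 5

# (label, cycles per frame, reported maximum amplitude in mm)
cases = [("Fig. 5a", 3, 1.2), ("Fig. 5b", 5, 0.59)]

for label, cycles, amplitude_mm in cases:
    freq = vibration_frequency_hz(FRAME_RATE_HZ, cycles)
    share = fraction_of_imaging_depth(amplitude_mm, IMAGING_DEPTH_MM)
    print(f"{label}: {freq:.0f} Hz, amplitude spans {share:.0%} of the depth window")
```

Each frame lasts 25 ms at 40 frames per second, so 3 and 5 cycles per frame correspond to 120 Hz and 200 Hz, and the reported amplitudes occupy roughly 46% and 23% of the imaging depth.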
Conclusion
We have demonstrated video-rate in vivo laryngeal imaging at 40 frames per second during a typical office-based laryngoscopy examination with a swept-source OCT system. Dynamic vibration of the vocal folds is recorded to provide not only the high-resolution cross-sectional tissue structures but also important vibration parameters, such as the frequency and amplitude of vocal cord vibration, which provide important information for clinical diagnosis and treatment as well as for additional research in speech. Office-based OCT is a promising new imaging modality to image the larynx. Because it can be performed without general anesthesia or tissue removal, it adds a level of practicality and ease of use. Office-based OCT has the potential to guide surgical biopsies, direct therapy, and monitor disease. The future success of this device depends heavily on the number of clinical volunteers on whom we can use it. Therefore, it is suggested that the device be made more aesthetically pleasing, because people are often reluctant to participate in this study owing to the rudimentary appearance of the device, with fiber-optic cables protruding and a DC power supply running.
Acknowledgments
This work was supported by the National Institutes of Health (DC 006026, CA 91717, EB 00293, RR 01192, RR00827), National Science Foundation (BES-86924), Flight Attendant Medical Research Institute (32456), State of California Tobacco Related Disease Research Program (12RT-0113), Air Force Office of Scientific Research (F49620-00-1-0371), Undergraduate Research Opportunities Program, and the Henry Samueli School of Engineering Undergraduate Research Fellowship. Support from the Beckman Laser Institute Inc. Foundation is also gratefully acknowledged.
References
- Huang D., Swanson E. A., Lin C. P., Schuman J. S., Stinson W. G., Chang W., Hee M. R., Flotte T., Gregory K., Puliafito C. A., and Fujimoto J. G., “Optical coherence tomography,” Science 254, 1178–1183 (1991). doi:10.1126/science.1957169
- Wong B. J. F., Jackson R. P., Guo S., Ridgway J. M., Mahmood U., Su J., Shibuya T. Y., Crumley R. L., Gu M., Armstrong W. B., and Chen Z., “In vivo optical coherence tomography of the human larynx: normative and benign pathology in 82 patients,” Laryngoscope 115, 1904–1911 (2005). doi:10.1097/01.MLG.0000181465.17744.BE
- Sergeev A. M., Gelikonov V. M., Gelikonov G. V., Feldchtein F., Kuranov R., Gladkova N., Shakhova N., Snopova L., Shakhov A., Kuznetsova I., Denisenko A., Pochinko V., Chumakov Yu., and Streltzova O., “In vivo endoscopic OCT imaging of precancer and cancer states of human mucosa,” Opt. Express 1, 432–440 (1997). doi:10.1364/OE.1.000432
- Shakhov A. V., Terentjeva A. B., Kamensky V. A., Snopova L. B., Gelikonov V. M., Feldchtein F. I., and Sergeev A. M., “Optical coherence tomography monitoring for laser surgery of laryngeal carcinoma,” J. Surg. Oncol. 77, 253–258 (2001). doi:10.1002/jso.1105
- Luerssen K., Lubatschowski H., Gasse H., Koch R., and Ptok M., “Optical characterization of vocal folds with optical coherence tomography,” Proc. SPIE 5686, 328–332 (2005). doi:10.1117/12.592630
- Guo S., Hutchison R., Jackson R. P., Kohli A., Sharp T., Orwin E., Haskell R., Chen Z., and Wong B. J. F., “Office-based optical coherence tomographic imaging of human vocal cords,” J. Biomed. Opt. 11, 030501 (2006). doi:10.1117/1.2200371
- Guo S., Yu L., Sepehr A., Perez J., Su J., Ridgway J. M., Vokes D., Wong B. J., and Chen Z., “Gradient-index lens rod based probe for office-based optical coherence tomography of the human larynx,” J. Biomed. Opt. 14(1), 014017 (2009). doi:10.1117/1.3076198
- Wojtkowski M., Srinivasan V., Ko T., Fujimoto J., Kowalczyk A., and Duker J., “Ultrahigh-resolution, high-speed, Fourier domain optical coherence tomography and methods for dispersion compensation,” Opt. Express 12, 2404–2422 (2004). doi:10.1364/OPEX.12.002404
- Noordzij J. P. and Ossoff R. H., “Anatomy and physiology of the larynx,” Otolaryngol. Clin. North Am. 39(1), 1–10 (2006). doi:10.1016/j.otc.2005.10.004