Abstract
Purpose
The purpose of this study was to ascertain the amount of phase asymmetry of the vocal fold vibration in normophonic speakers via visualization techniques and compare findings for habitual and pressed phonations.
Method
Fifty-two normophonic speakers underwent stroboscopy and high-speed videoendoscopy (HSV). The HSV images were further processed into four visual displays: HSV playbacks, digital kymography (DKG) playbacks, mucosal wave kymography playbacks and static kymographic images of the medial line from the DKG playback. Two types of phase asymmetries, left-right and anterior-posterior, were rated on a scale from 1 to 5. Objective measures of left-right phase asymmetry were obtained.
Results
The majority of normophonic speakers (81%) were noted to display anterior-posterior asymmetry; however, 66% of those asymmetries were characterized as mild. Seventy-nine percent of participants were noted to display left-right asymmetry; however, 72% of those asymmetries were mild. A moderate relationship between the objective measures and subjective ratings was found.
Conclusions
Most normophonic speakers exhibit mild left-right and anterior-posterior asymmetries for both habitual and pressed phonations. Asymmetries were noted more often during habitual than pressed phonations, and more often when visualized by HSV and kymography than by stroboscopy. Differences between objective measures and visual judgments support the need to quantify vocal fold vibratory features.
Keywords: Voice Assessment, Stroboscopy, High-Speed Videoendoscopy, Kymography, Symmetry, Vocal Fold Vibration
Introduction
Asymmetry of vocal fold vibration is a clinically significant diagnostic indicator. Asymmetry of the glottal cycle can be regarded in several ways. Typically, asymmetry is considered in the left-right dimension, as evidenced by how commonly amplitude and phase differences between the left and right vocal folds are judged in stroboscopic clinical evaluations. A systematic categorization of asymmetry was only recently accomplished when Švec, Šram, & Schutte (2007) differentiated four aspects of asymmetry. In addition to left-right amplitude and phase asymmetries, Švec, Šram, & Schutte’s protocol for assessing glottal asymmetry also includes left-right frequency differences and axis shifts.
Most published research focuses on amplitude asymmetries, and nearly all of it addresses left-right rather than anterior-posterior vibratory differences. The anterior-posterior dimension of asymmetry holds clinical importance and has been discussed, though not well studied. The best-known instance of anterior-posterior phase asymmetry is the zipper effect during vocal fold closure (Granqvist, Hertegård, Larsson, & Sundberg, 2003). Of the four aspects of asymmetry described by Švec, Šram, & Schutte (2007), phase asymmetry, not amplitude or frequency differences, is the most clinically viable aspect to apply to the anterior-posterior dimension. Anterior-posterior phase asymmetries are the most frequent type of anterior-posterior asymmetry that we have seen clinically in our evaluation of vocal fold vibration via high-speed videoendoscopy (HSV). Left-right phase symmetry can be defined as the two vocal folds reaching maximal glottal opening at the same phase in the glottal cycle. Anterior-posterior phase symmetry can be defined as the anterior and posterior portions of one vocal fold reaching maximal glottal opening at the same time within the glottal cycle. Phase asymmetry cannot be measured when left-right vocal fold frequency differences are present. In addition, left-right frequency differences cannot be studied stroboscopically due to the inherent limitations of stroboscopy as a technique. Because stroboscopy is the “gold standard” for clinical voice evaluation, no evidence-based clinical data on frequency differences are available. Since there is a moderate volume of research on amplitude asymmetries, and less appeal for clinical applications of frequency differences and axis shifts, left-right and anterior-posterior phase asymmetries are the focus of this investigation.
Endoscopic methods for assessing asymmetrical vocal fold vibration
Stroboscopy (Bless, Hirano, & Feder, 1987; Wendler, 1992; Kitzing, 1985), high-speed videoendoscopy (HSV) (Kiritani, Honda, & Hirose, 1986; Hirose, 1988; Hertegard, Larsson, & Wittenberg, 2003; Deliyski, Petrushev, Bonilha, Gerlach, Martin-Harris, & Hillman, 2008), videokymography (VKG) (Švec, Šram, & Schutte, 1999; Švec & Schutte, 1996; Švec, Šram, & Schutte, 2007; Qui & Schutte, 2007), and HSV facilitative playbacks (Deliyski, Petrushev, Bonilha, Gerlach, Martin-Harris, & Hillman, 2008; Shaw & Deliyski, 2008) are methods currently available to study vocal fold vibration. Stroboscopy uses a strobe light synchronized to the rate of vocal fold vibration, which is extracted via a laryngeal contact microphone or electroglottography (EGG). This synchronization allows pictures of the vocal folds to be taken at specific phases across different glottal cycles and sequenced to represent artificial slow motion that appears as a complete glottal cycle (Deliyski et al., 2008). The pictures are taken across many cycles, at a different point of each cycle. Thus, stroboscopy does not sample actual points within one cycle. This limitation decreases the reliability of phase asymmetry judgments, which by definition are made when the glottis is maximally open. Sampling multiple points within a cycle of vibration is necessary for most clinically relevant aspects of vibration (Eysholdt, Tigges, Wittenberg, & Proschel, 1996). HSV and VKG can sample within a cycle and thus allow more confidence in judging the true vibratory behavior. In addition to producing only artificial slow motion, stroboscopy has pitch tracking problems that limit its usefulness for persons with more severe voice problems. Additionally, stroboscopy cannot synchronize to fast events or events with quickly changing dynamics.
HSV allows vocal fold vibration to be recorded at 2,000 to 10,000 frames per second (fps). For a male phonating at 100 Hz recorded at 2,000 fps, we would be able to view 20 pictures within each cycle of vibration. For a female phonating at 200 Hz, we would be able to view 10 samples within each cycle of vibration. HSV also samples from each cycle, thus not skipping important details regarding cycle to cycle changes. HSV does not rely on pitch tracking, as stroboscopy does, and thus is useful for persons with severe voice disorders and is useful for viewing fast, dynamic events. The high temporal resolution of HSV and the true spatial-temporal registration of the data make HSV desirable for clinical use. This is especially true for features such as symmetry that rely on intra-cycle information. However, the typical HSV playback has its disadvantages. Assessing features of vocal fold vibration relies on accurately noting changes in vibration over time, which is not easy to do when a clinician is relying on their memory of specific vibratory features over a considerable number of cycles. The use of kymography allows for the assessment of these features, which are not easily visualized in stroboscopy or HSV where the images must be compared over time in the presence of other competing features. Thus, it is advantageous to use kymography derived from HSV in addition to HSV.
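The sampling arithmetic in this paragraph can be reproduced with a short calculation. The following is a minimal sketch; the function name `frames_per_cycle` is hypothetical and introduced only for illustration, and the second example assumes the same 2,000 fps frame rate as the first.

```python
def frames_per_cycle(frame_rate_fps: float, fundamental_hz: float) -> float:
    """Number of HSV images captured within one glottal cycle."""
    if frame_rate_fps <= 0 or fundamental_hz <= 0:
        raise ValueError("frame rate and fundamental frequency must be positive")
    return frame_rate_fps / fundamental_hz


# Speaker phonating at 100 Hz, recorded at 2,000 fps
print(frames_per_cycle(2000, 100))  # 20.0
# Speaker phonating at 200 Hz, recorded at 2,000 fps
print(frames_per_cycle(2000, 200))  # 10.0
```

The same function shows why higher fundamental frequencies benefit from higher frame rates: doubling the frequency halves the number of samples per cycle.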
Kymography uses a high-speed camera to sample one cross-section of the vocal folds over multiple cycles. VKG uses a line-scan camera where only the line selected during endoscopy is recorded. Kymography can also be derived from an HSV recording. When derived from an HSV recording, it is termed digital kymography (DKG) and allows the user to choose off-line which line of the vocal folds to scan. Thus, the entire anterior to posterior length of the vocal folds is available for DKG visualization and analysis from the same sample of a cycle. This scan of the vocal folds from anterior to posterior is viewable in a movie format, termed DKG playback (Deliyski et al., 2008), allowing clinicians to quickly view kymograms along the length of the vocal folds. DKG also allows for a single static HSV-derived kymographic image, mDKG, to be produced, thus increasing confidence in the accuracy and reliability of the measure by decreasing the amount of extraneous information presented. Another similar technique, mucosal wave kymography (MKG) playback (Deliyski et al., 2008), has also been developed to highlight the mucosal wave. MKG outlines the velocity of the vocal fold edges and encodes the opening of the vocal folds in green and the closing of the vocal folds in red. This playback, like DKG, is viewable in movie format from the posterior to the anterior portion of the vocal folds. Further information regarding HSV, DKG, mDKG, and MKG can be found in Deliyski et al. (2008). The caveat for utilizing kymography is the need for assurance that the vibratory behavior of the vocal folds is measured from the same cross-section over time. Due to the motion of the endoscopist (even one with a steady hand) and client motion, endoscope motion compensation is necessary. Motion compensation algorithms have been developed to overcome this source of error (Deliyski, 2005). However, these algorithms can only be applied to kymograms from full-image HSV recordings and not to those from line-scan cameras.
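The idea of deriving a kymogram from an HSV recording can be illustrated with a small sketch. This is not the processing pipeline used in the study; it only shows the principle of stacking one image line across frames. It assumes that frames are already motion compensated, that each frame is a grid of grayscale intensities, and that image rows run along the anterior-posterior axis of the glottis. The names `kymogram` and `dkg_playback` are hypothetical.

```python
from typing import Iterator, List

Frame = List[List[int]]      # one grayscale image: rows of pixel intensities
Kymogram = List[List[int]]   # one row per video frame, so time runs downward


def kymogram(frames: List[Frame], line_index: int) -> Kymogram:
    """Stack the same image line from every frame into a time-by-position image."""
    if not frames:
        raise ValueError("at least one frame is required")
    if not 0 <= line_index < len(frames[0]):
        raise IndexError("line_index is outside the image")
    return [list(frame[line_index]) for frame in frames]


def dkg_playback(frames: List[Frame]) -> Iterator[Kymogram]:
    """Yield one kymogram per image line, sweeping along the glottis."""
    n_lines = len(frames[0]) if frames else 0
    for line_index in range(n_lines):
        yield kymogram(frames, line_index)


# Toy recording: 3 frames, each 2 lines by 3 pixels (values are made up)
toy_frames = [
    [[0, 1, 0], [0, 2, 0]],
    [[0, 3, 0], [0, 4, 0]],
    [[0, 5, 0], [0, 6, 0]],
]
print(kymogram(toy_frames, 0))              # [[0, 1, 0], [0, 3, 0], [0, 5, 0]]
print(len(list(dkg_playback(toy_frames))))  # 2
```

A static medial kymogram in this sketch would simply be `kymogram(frames, line_index)` for a single line chosen near the middle of the glottis.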
Research accomplished on asymmetry
Švec, Šram, & Schutte (2007) have published the most detailed categorization of glottal asymmetry to date. Other studies on the topic include Verdonck-de Leeuw, Festen, and Mahieu (2001), Niimi and Miyaji (2000), Hertegard, Larsson, and Wittenberg (2003), Wittenberg, Tigges, Mergell, and Eysholdt (2000), and Eysholdt, Rosanowski, and Hoppe (2003). Additionally, studies of laryngeal asymmetry, particularly of arytenoid cartilage adduction, have been accomplished by Lacina (1970), Hirano, Kurita, Yokizane, and Hibi (1998), and Lindestad, Hertegard, and Bjorck (2004). A brief review of these studies follows.
Verdonck-de Leeuw, Festen, and Mahieu (2001) used videostroboscopy and VKG with four clients. This study revealed that left-right phase differences and short-term amplitude and frequency modulation were viewable with VKG, stressing the importance of using a kymographic technique. The study discussed asymmetry as a general term and did not separately discuss phase differences. Niimi and Miyaji (2000) found asymmetry, as judged from HSV, to be positively correlated with auditory-perceptual judgments of increased hoarseness in 22 participants with known vocal fold lesions. This study found both amplitude and left-right phase asymmetries in its hoarse participants. Hertegard, Larsson, and Wittenberg (2003) quantified the closing speeds for each vocal fold before and after surgical procedures, providing an objective correlate of one aspect of vocal fold vibratory symmetry. The results demonstrated decreased asymmetry in post-operative examinations. In addition to closing speed asymmetry, amplitude asymmetry was also mentioned in this paper. Wittenberg, Tigges, Mergell, and Eysholdt (2000) reported that they found hoarseness to be the result of left-right asymmetry, frequency irregularity, and, as assessed by viewing multiple lines, anterior-posterior asymmetry in vocal fold vibration. This is the first paper to discuss anterior-posterior asymmetry in addition to left-right phase asymmetry. Eysholdt, Rosanowski, and Hoppe (2003) found horizontal (right-left) asymmetry to be related to polyps, Reinke’s edema, cysts, or unilateral paralysis, and anterior-posterior asymmetries to relate to heterogeneous distribution of muscular tension when analyzed with multi-line kymography from HSV. Additionally, arytenoid adduction asymmetries in the absence of voice complaints have been noted through laryngeal mirror, dissection, computed tomography, and endoscopy (Lacina, 1970; Hirano, Kurita, Yokizane, & Hibi, 1998; Lindestad, Hertegard, & Bjorck, 2004). It is not known how these arytenoid adduction asymmetries relate to asymmetry of vocal fold vibration patterns in vocally normal persons and clients with voice disorders.
The need for norms from HSV
Since an enhanced HSV methodology would eventually allow for greater clinical findings of vocal fold vibration abnormalities, it is important to first understand vocal fold vibration in normophonic speakers. Without knowledge of normal variations as visible via HSV, discriminating pathology-induced vocal fold vibratory behaviors is not clinically possible. Thus, important steps for the progression of the research and clinical investigation of symmetry include: determining the tools to use, acquiring normative data, and comparing normative data to pathologies.
The importance of establishing normative data is exemplified in two recent stroboscopic studies that have presented conflicting findings (Elias, Sataloff, Rosen, Heuer, & Spiegel, 1997; Heman-Ackah, Dean, & Sataloff, 2002). These publications indicate the need to investigate whether the presence of asymmetry may be common in persons without voice disorders. The first evidence for the commonness of asymmetry comes from a study of sixty-five singers without voice complaints, in which prominent asymmetrical vocal fold movement was observed in 7.5%, highlighting the possibility of a normal amount of asymmetry (Elias, Sataloff, Rosen, Heuer, & Spiegel, 1997). In a second study, of vocal fold structure and function in twenty singing teachers, vocal fold asymmetries were noted only in the singing teachers with voice complaints and not in the asymptomatic participants (Heman-Ackah, Dean, & Sataloff, 2002). The conflicting outcomes of these studies reflect the need to first study what is normal before attempting to decide what is pathological. Given that these studies used stroboscopy, a technique with a longer clinical life and lower temporal resolution than HSV, the necessity to formulate parameters of typical vocal fold vibratory features in normophonic persons visualized via HSV is heightened. Obtaining a normative HSV database would allow researchers and clinicians to more clearly differentiate normal from pathological features. This database could also guide future research and technological development by delineating the parameters associated only with voice disorders.
Habitual vs. pressed phonation
Since the limitations of stroboscopy do not allow for true intra-cycle sampling, it is less useful than HSV for answering basic research questions. One such basic research question associated with phase asymmetry concerns the differences in vocal fold vibration between habitual and pressed modes of phonation. Since pressed phonation may loosely be thought to simulate hyperfunctional voice use, an evaluation of differences in asymmetry between these modes becomes clinically significant. Such a study also helps researchers understand whether normophonic participants simulating a hyperfunctional voice disorder use the same mechanism to phonate as persons who have a hyperfunctional voice disorder.
Visual ratings vs. objective measures
In addition to concerns about studying asymmetry of vocal fold vibration that are related to recording tools, there are many concerns regarding the evaluation and analysis of these recordings. Visual-perceptual ratings of vocal fold vibration have been frequently associated with low intra- and inter-rater reliability. One study discussed the use of stroboscopy as a research tool, primarily evaluating judgment reliability difficulties (Rosen, 2005). We have found some success in attaining high intra- and inter-rater reliability using consensus training of our raters prior to initiation of the experiment (Shaw & Deliyski, 2008; Bonilha & Deliyski, 2008; Bonilha, Aikman, Hines, & Deliyski, 2008). While intra- and inter-rater reliability are improved with this technique, the accuracy of the raters is another source of difficulty when using visual-perceptual ratings. With the advent of clinically implemented HSV systems, there is now the possibility to quantify vocal fold vibratory features. Due to the limitations of stroboscopy discussed above, this has not been available in a clinically viable format in the past. Objective measures allow us to assess the accuracy of visual ratings of known clinically important features of vocal fold vibration. A discrepancy between visual-perceptual and objective measures for a given feature demonstrates the clinical need to create usable objective measures of that feature. However, prior to undertaking such a large task for vibratory features not routinely judged clinically, such as anterior-posterior phase asymmetry, we first need to establish whether the feature provides clinically significant information via visual-perceptual judgments.
Purpose
The purpose of this study was to preliminarily investigate left-right and anterior-posterior asymmetry from normophonic speakers through established and novel visual playbacks (Deliyski, Petrushev, Bonilha, Gerlach, Martin-Harris, & Hillman, 2007), as well as objective measures. The specific research questions were as follows:
What is the typical left-right and anterior-posterior asymmetry realized in normophonic speakers and does this change based on visualization technique? This was accomplished using stroboscopy, HSV, and HSV-facilitative playbacks.
Do left-right and anterior-posterior asymmetries vary with mode of phonation? This was accomplished by comparing habitual and pressed phonations via HSV.
How do visual judgments and objective measures of left-right asymmetry relate? The objective measures were accomplished via static medial kymograms derived from HSV, while visual judgments from all HSV playbacks and stroboscopy were included in the comparison.
Materials and Methods
Participants
The participants were recruited from the University of South Carolina, Columbia, SC, and from Charlotte, NC, and included students as well as friends, family, and colleagues of the research team. The research team asked persons to participate according to the exclusion/inclusion criteria of the study detailed below. Participants were not excluded based on profession. A phone or in-person conversation with one or more members of the research team was the method of recruitment. This allowed for at least one judgment of perceptual voice quality prior to asking the person if they were interested in participating. Additionally, prior to asking a person to participate in the study, questions of perceived current or past voice problems were addressed. Since persons with poor voice quality would be later excluded from the experiment, they were not recruited. Twenty-four male and twenty-eight female participants ranging from 18 to 65 years of age participated in the study. The data collection, storage, and use were in accordance with human subjects regulations. The data for this study were recorded at the Presbyterian Hospital’s specialized Voice Center in Charlotte, NC. During the informed consent process, the participants completed short medical and voice history forms, as well as a modified voice quality self-assessment. The short medical history included information on current and past medical problems, medications, operations and surgical procedures, and allergies. This was done to identify persons who may have voice-related problems and may not be aware of them. The modified Voice-Related Quality of Life (mV-RQOL) questionnaire, derived from the V-RQOL questionnaire (Hogikyan & Sethuraman, 1999), was deemed normal if no questions were answered as indicating even a mild problem. In addition, participants answered that they had no current or previous voice complaints, and were cleared via a perceptual evaluation (normal vocal quality vs. abnormal vocal quality) by a speech-language pathologist. Furthermore, if laryngeal pathology (e.g., contact lesion, paralysis, spasmodic dysphonia) had been found during endoscopy, the participant would have been excluded from the study. However, this did not occur during the data collection for this study. These exclusionary procedures ensured that there was no reason to suspect laryngeal pathology in the participants.
Instrumentation and Procedures
Data collected for this study included digital recordings utilizing stroboscopy and HSV. Including data from methods routinely used in the clinic and from those that are new allowed for a comparison between assessment methods. Data collection occurred in quiet rooms typically employed for assessment of voice clients in the hospital clinic. The pitch and loudness of all tasks were monitored for normality. Recordings of abnormal pitch and loudness, as identified by a speech-language pathologist, were discarded and re-recorded. Only one trial for each task per participant was recorded for further analysis. Participants were monitored for signs of vocal fatigue; however, no participants exhibited those signs, which would have led to exclusion from the study.
Endoscopy and Stroboscopy
Standard clinical procedures were utilized for endoscopy and stroboscopy. The locating of the vocal folds and the initial phonation were conducted with continuous halogen light. Stroboscopy was used to capture phonation at habitual pitch and intensity levels. A Digital Rhino-Laryngeal Stroboscopic System Model 9100B (KayPENTAX, Lincoln Park, NJ) coupled to a 70° rigid endoscope (KayPENTAX Model 9106) was used along with a laryngeal contact microphone to track vocal fold vibratory frequency. Pressed phonation samples were not recorded via stroboscopy.
High-Speed Videoendoscopy
KayPENTAX High-Speed Video System Model 9700 equipped with a camera that captures at 2,000 fps with 120×256 pixel spatial resolution and duration of 2.2 seconds was utilized. A 70° rigid endoscope, the same as that used in stroboscopy, and a 300 W constant Xenon light source (KayPENTAX Model 7152) were coupled with the system. Since high-speed cameras require an intense light source, the duration of light exposure was kept at a minimum to prevent tissue overheating. The recording of HSV was synchronized with an acoustic recording, captured via a head-mount condenser microphone AKG C420 (AKG Acoustics, Munich, Germany) coupled to a Computerized Speech Lab (CSL) (KayPENTAX Model 4400), to allow for perceptual judgments, acoustic measurements and comparisons between physical and acoustic events. Since acoustic analysis falls outside the scope of this particular study, the acoustic recordings are not further discussed. Participants were instructed to produce both habitual and pressed phonations of /i/. To achieve pressed phonation, participants were asked to phonate “as if lifting a heavy box”. Additionally, auditory examples of pressed phonation were provided. Habitual phonation was used to signify habitual pitch and loudness, while pressed phonation was elicited to provide a model of hyperfunctional voice use.
Image Processing
Image pre-processing of the HSV recordings included motion compensation (Deliyski, 2005; Deliyski, Petrushev, Bonilha, Gerlach, Martin-Harris, & Hillman, 2007) and removal of reflection spots. The pre-processing allowed for valid and accurate results from the kymographic playbacks. The motion compensation techniques were necessary to ensure that anatomical structures subjected to kymography were time-aligned. It has been noted that endoscope motion, if unaccounted for, may affect the validity of the data (Deliyski, 2005). Further, advanced image processing techniques, such as 3-D spline interpolation to effectively increase the temporal resolution to 8,000 fps, were utilized to convert the enhanced HSV playback into two additional facilitative movies: the digital kymography (DKG) and the mucosal wave kymography (MKG) playbacks (Deliyski, Petrushev, Bonilha, Gerlach, Martin-Harris, & Hillman, 2007). From stroboscopy, one type of image, the stroboscopy playback, was rated. Stroboscopy provided a view of vocal fold vibration without true cycle-to-cycle information. From HSV, three playbacks were rated: the HSV playback, the DKG playback (Figure 1), and the MKG playback (Figure 2). Additionally, one static image of a medial line of the DKG playback was utilized, termed medial digital kymography (mDKG). HSV playback was defined as the typical time-domain playback of the recording after motion compensation. The HSV playback provided a visualization of vocal fold vibration inclusive of true cycle-to-cycle and intra-cycle information. The DKG playback (Figure 1) presents the HSV recording as a movie in the posterior-to-anterior dimension, instead of the temporal dimension. In the MKG playback (Figure 2), presented also as a movie from posterior to anterior, the image brightness relates to the speed of motion of the glottal edges, and the color shows the phase of motion, i.e., opening is green and closing is red. The kymographic playbacks provided a visualization of the left-right asymmetry of vocal fold vibration inclusive of inter- and intra-cycle variation. MKG provided a further differentiation, in the temporal domain, between the opening, closing, and closed portions of the vibratory cycle, especially emphasizing the left-right asymmetry of the mucosal wave.
Figure 1.
A medial position frame of a digital kymography (DKG) playback, which is a movie playing from posterior to anterior. The image on the left shows the line being scanned across the glottis. On the right, the corresponding kymographic image is shown.
Figure 2.
A medial position frame of a mucosal wave kymography (MKG) playback. This type of display allows for the temporal representation of the dynamics of the vocal fold edges during glottal opening and closing in consecutive glottal cycles of sustained phonation. The MKG image brightness relates to the speed of motion, and the color shows the phase of motion (opening is green and closing is red). The mucosal wave extent appears as a double-edged or thicker curve during the closing phase.
Visual Perceptual Judgments and Objective Measures
The images obtained from the fifty-two participants were visually evaluated and rated for left-right and anterior-posterior phase asymmetry by three voice specialists. Images from fifty-two participants in five different views (stroboscopy, HSV, DKG, MKG, and mDKG) for habitual phonation amounted to 260 images that were rated. From pressed phonations, which were not recorded via stroboscopy, 208 images in four views were judged. In addition, 20% of the images were randomly repeated to obtain intra-rater reliability. Therefore, 562 images were judged for asymmetry by each of the three judges. Left-right and anterior-posterior symmetry were rated on a five-point scale, where 1 = completely asymmetrical, 2 = severely asymmetrical, 3 = moderately asymmetrical, 4 = mildly asymmetrical, and 5 = symmetrical. This scale is based on common clinical rating methods. Visual and auditory perceptual ratings were accomplished through the ALVIN program (Hillenbrand & Gayvert, 2005), as seen in Figure 3. Additionally, the recordings were assessed for frequency differences between the left and right vocal folds from the DKG playback and mDKG static image. Any recordings displaying frequency differences would have been eliminated from the study, since phase asymmetries cannot be determined in the presence of frequency differences.
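The image counts reported above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name `rating_workload` is hypothetical, and rounding the 20% repeat set to the nearest whole image is an assumption, since the text does not state how the fraction was rounded.

```python
def rating_workload(participants: int, habitual_views: int,
                    pressed_views: int, repeat_percent: float) -> dict:
    """Count the images each rater judges, including intra-rater repeats."""
    habitual = participants * habitual_views
    pressed = participants * pressed_views
    unique = habitual + pressed
    # Assumption: the repeated subset is rounded to the nearest whole image
    repeats = round(unique * repeat_percent / 100)
    return {
        "habitual": habitual,
        "pressed": pressed,
        "repeats": repeats,
        "total": unique + repeats,
    }


# 52 participants, 5 views for habitual and 4 views for pressed phonation
print(rating_workload(52, 5, 4, 20))
# {'habitual': 260, 'pressed': 208, 'repeats': 94, 'total': 562}
```

Under this rounding assumption the total matches the 562 images per judge reported in the text.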
Figure 3.
Example of anterior-posterior symmetry visual-perceptual judgments of DKG playback made via the ALVIN program.
Three visual raters participated in this study. Rater 1 was a non-clinical voice scientist with substantial experience viewing stroboscopy and HSV recordings from a methodological and biomechanical perspective. Rater 2 was a clinical voice scientist with experience in both stroboscopy and HSV as well as research experience viewing HSV. Rater 3 was a clinical voice researcher who works full-time as a voice-focused SLP with substantial experience making judgments from stroboscopy. HSV was novel for Rater 3 at the time of the recording.
Only left-right and not anterior-posterior symmetry was rated from the mDKG static images. In addition to the visual-perceptual judgments, left-right asymmetry was also quantified from the mDKG static image. Vocal fold left-right relative asymmetry A (%) was measured over three cycles by taking the time differentials Δi between the onsets of the closing phase for the right and left vocal folds and dividing by the cycle periods Ti, as seen in Figure 4. The mean of the three cycles was used to improve measurement precision.
Figure 4.
Left-right relative asymmetry A (%) measured as the ratio of the sum of phase asymmetry in pixels Δ1, Δ2 and Δ3 in three consecutive glottal cycles and the sum T of the periods T1, T2 and T3 of the same three cycles. Examples of (A) typical and (B) increased asymmetry levels are illustrated.
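The relative asymmetry measure can be written out as a short calculation. The sketch below is an illustration under stated assumptions, not the measurement software used in the study. It implements both wordings found in the text: the ratio of summed delays to summed periods given in the Figure 4 caption, and the mean of per-cycle ratios described in the Methods. Taking the magnitude of each delay is an assumption, as the text does not say whether Δ is signed. The function names and the pixel values in the example are hypothetical.

```python
from typing import Sequence


def _check(deltas: Sequence[float], periods: Sequence[float]) -> None:
    if len(deltas) != len(periods) or not deltas:
        raise ValueError("deltas and periods must be non-empty and equal in length")
    if any(t <= 0 for t in periods):
        raise ValueError("cycle periods must be positive")


def relative_asymmetry_percent(deltas: Sequence[float],
                               periods: Sequence[float]) -> float:
    """Ratio of summed left-right delays to summed cycle periods, in percent."""
    _check(deltas, periods)
    # Assumption: only the magnitude of each delay is used
    return 100 * sum(abs(d) for d in deltas) / sum(periods)


def mean_cycle_asymmetry_percent(deltas: Sequence[float],
                                 periods: Sequence[float]) -> float:
    """Mean of the per-cycle ratios delay / period, in percent."""
    _check(deltas, periods)
    return 100 * sum(abs(d) / t for d, t in zip(deltas, periods)) / len(deltas)


# Made-up measurements in pixels for three consecutive cycles
deltas_px = [2, 3, 1]
periods_px = [20, 20, 20]
print(relative_asymmetry_percent(deltas_px, periods_px))  # 10.0
print(mean_cycle_asymmetry_percent(deltas_px, periods_px))
```

When the three cycle periods are equal, as in this example, the two versions agree up to floating-point rounding; they diverge slightly when the periods differ from cycle to cycle.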
Statistical Analysis
To compare HSV, DKG, mDKG, MKG, and stroboscopy, frequency of symmetrical and asymmetrical judgments for each rater was reported. Additionally, the percent of mildly asymmetrical judgments was reported. Correlation (Pearson) was used to determine the relationship between the objective measurements and subjective ratings. Percent agreement was utilized to establish inter- and intra-rater reliability. Correlation was considered low, mild, moderate and high for r < 0.25, 0.25 ≤ r < 0.5, 0.5 ≤ r < 0.75 and r ≥ 0.75, respectively. Percent agreement was considered significant at 75%.
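The correlation analysis and its descriptive labels can be sketched as follows. This is a minimal illustration of a Pearson correlation with the thresholds stated above, not the statistical software used in the study. Applying the thresholds to the magnitude of r is an assumption: because higher ratings indicate more symmetry while higher objective values indicate more asymmetry, the sign of r may be negative. The function names are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("samples must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def correlation_label(r: float) -> str:
    """Map a correlation coefficient to the labels used in this study."""
    magnitude = abs(r)  # assumption: thresholds apply to |r|
    if magnitude < 0.25:
        return "low"
    if magnitude < 0.5:
        return "mild"
    if magnitude < 0.75:
        return "moderate"
    return "high"


print(correlation_label(0.6))   # moderate
print(correlation_label(-0.8))  # high
```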
Results
Left-Right Symmetry
Left-right phase asymmetry from habitual pitch phonations (Table 1) was realized in the majority of cases from all HSV playbacks. No left-right frequency differences were noted in the 104 recordings, thus all recordings were available for the investigation of phase asymmetries. Left-right asymmetries were rated less frequently from stroboscopy across the three visual-perceptual judges. The greatest portion of asymmetries rated via HSV playbacks were rated as mild. For pressed phonations (Table 2), left-right asymmetry was also realized in more than 50% of cases with the majority of cases demonstrating mild asymmetries. However, the incidence of left-right asymmetry was decreased for pressed relative to habitual phonations.
Table 1.
Frequency of left-right symmetry ratings from DKG, MKG, mDKG, HSV and stroboscopy for raters 1, 2, and 3 in percent for habitual phonations
Rater | Rank | DKG | MKG | mDKG | HSV | Strobe |
---|---|---|---|---|---|---|
1 | Asymmetrical (Mild) | 67 (45) | 74 (57) | 76 (43) | 48 (26) | 40 (33) |
1 | Symmetrical | 33 | 26 | 24 | 52 | 60 |
2 | Asymmetrical (Mild) | 76 (71) | 50 (48) | 60 (55) | 57 (43) | 19 (14) |
2 | Symmetrical | 24 | 50 | 40 | 43 | 81 |
3 | Asymmetrical (Mild) | 93 (69) | 88 (62) | 86 (57) | 86 (67) | 81 (60) |
3 | Symmetrical | 7 | 12 | 14 | 14 | 19 |
Table 2.
Frequency of left-right symmetry ratings from DKG, MKG, mDKG and HSV for raters 1, 2, and 3 in percent for pressed phonations
Rater | Rank | DKG | MKG | mDKG | HSV |
---|---|---|---|---|---|
1 | Asymmetrical (Mild) | 59 (39) | 57 (41) | 61 (32) | 41 (23) |
1 | Symmetrical | 41 | 43 | 39 | 59 |
2 | Asymmetrical (Mild) | 62 (52) | 59 (57) | 61 (41) | 46 (40) |
2 | Symmetrical | 38 | 41 | 39 | 54 |
3 | Asymmetrical (Mild) | 91 (50) | 98 (48) | 93 (52) | 73 (27) |
3 | Symmetrical | 9 | 2 | 7 | 27 |
Anterior-Posterior Symmetry
Due to the novelty of judging anterior-posterior symmetry of vocal fold vibration, overall anterior-posterior asymmetry, but not separate results for the right and left vocal folds, is reported. The novelty of judging this feature comes from both lack of familiarity and lack of knowledge regarding the typicality of this feature in persons with and without voice disorders. Anterior-posterior asymmetry was noted for the majority of habitual and pressed phonations across playbacks as displayed in Table 3 and Table 4. Asymmetries were less likely to be noted through stroboscopy. The majority of anterior-posterior asymmetries were judged as mild. Differences in observing anterior-posterior asymmetries between habitual and pressed phonations were insignificant.
Table 3.
Frequency of anterior-posterior symmetry ratings from DKG, MKG, HSV and stroboscopy for raters 1, 2, and 3 in percent for habitual phonations
Rater | Rank | DKG | MKG | HSV | Strobe |
---|---|---|---|---|---|
1 | Asymmetrical (Mild) | 56 (56) | 62 (61) | 70 (44) | 52 (51) |
1 | Symmetrical | 44 | 38 | 30 | 48 |
2 | Asymmetrical (Mild) | 89 (86) | 80 (80) | 83 (73) | 79 (79) |
2 | Symmetrical | 11 | 20 | 17 | 21 |
3 | Asymmetrical (Mild) | 100 (57) | 100 (56) | 79 (55) | 89 (70) |
3 | Symmetrical | 0 | 0 | 21 | 11 |
Table 4.
Frequency of anterior-posterior symmetry ratings from DKG, MKG and HSV for raters 1, 2, and 3 in percent for pressed phonations
Rater | Rank | DKG | MKG | HSV |
---|---|---|---|---|
1 | Asymmetrical (Mild) | 40 (38) | 55 (48) | 63 (47) |
1 | Symmetrical | 60 | 45 | 37 |
2 | Asymmetrical (Mild) | 92 (86) | 88 (85) | 74 (61) |
2 | Symmetrical | 8 | 12 | 26 |
3 | Asymmetrical (Mild) | 100 (43) | 100 (30) | 87 (63) |
3 | Symmetrical | 0 | 0 | 13 |
Inter- and Intra-rater Reliability
High inter- and intra-rater reliability were established for judgments across the five playbacks. For the three visual-perceptual raters, intra-rater reliability for asymmetry ranged from 90 to 100% agreement within one scalar level, with a mean of 99%. Similarly, inter-rater reliability for asymmetry ranged from 75 to 100% agreement within one scalar level, with a mean of 95%. The differences in rating asymmetry seemed to be directly related to both the duration of experience rating vocal fold vibration with HSV and the use of vocal fold visualization, predominately stroboscopy, for client care as part of daily job responsibilities.
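Agreement "within one scalar level" can be computed as the percentage of recordings on which two sets of ratings differ by at most one point. The sketch below illustrates that calculation under stated assumptions: the function name `agreement_within_one` and the example ratings are hypothetical, and the study's exact computation is not specified in this section.

```python
def agreement_within_one(ratings_a, ratings_b, tolerance=1):
    """Percent of paired ratings that differ by at most `tolerance` levels."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("rating lists must have the same length")
    if not ratings_a:
        raise ValueError("rating lists must not be empty")
    hits = sum(
        1 for a, b in zip(ratings_a, ratings_b) if abs(a - b) <= tolerance
    )
    return 100.0 * hits / len(ratings_a)


# Hypothetical ratings of the same 8 recordings by two raters (not study data).
rater_1 = [5, 4, 4, 3, 5, 2, 4, 5]
rater_2 = [4, 4, 5, 1, 5, 3, 4, 3]
print(agreement_within_one(rater_1, rater_2))  # 75.0
```

The same function applies to intra-rater reliability when the two lists hold one rater's first and repeated judgments of the same recordings.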
Objective Measures of Left-Right Asymmetry
Objective measures followed the same general trend as the visual-perceptual judgments of left-right asymmetry. That is, the majority of normophonic speakers had measurable left-right relative asymmetries ranging from 0 to 20%, as seen in Figure 5. Furthermore, left-right asymmetries tended to be smaller for pressed than for habitual phonations; however, the large majority of cases measured asymmetries below 10% for both modes of phonation.
Figure 5.
Histogram of the distribution of the left-right relative asymmetry values A (%) objectively measured from habitual and pressed phonations.
Correlations between objective measures of asymmetry and visual ratings of left-right asymmetry were low to mild for stroboscopy (0.05 to 0.32 in absolute value). Mild to moderate correlations were noted for MKG (0.35 to 0.67) and HSV (0.47 to 0.71), while mild to high correlations were noted for DKG (0.26 to 0.76) and mDKG (0.40 to 0.70), as seen in Table 5, where the coefficients are reported with their negative sign.
Table 5.
Correlations (Pearson) between objective measures of asymmetry and subjective ratings of left-right symmetry for habitual and pressed phonations
Rater | Type | DKG | MKG | mDKG | HSV | Strobe |
---|---|---|---|---|---|---|
1 | Habitual | -0.76 | -0.54 | -0.67 | -0.47 | -0.32 |
1 | Pressed | -0.55 | -0.55 | -0.70 | -0.71 | |
2 | Habitual | -0.53 | -0.67 | -0.64 | -0.47 | -0.05 |
2 | Pressed | -0.42 | -0.63 | -0.52 | -0.71 | |
3 | Habitual | -0.26 | -0.35 | -0.40 | -0.64 | -0.17 |
3 | Pressed | -0.40 | -0.43 | -0.61 | -0.69 | |
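For readers who wish to reproduce this type of analysis, a Pearson correlation between an objective asymmetry measure and a rater's symmetry ratings can be computed as sketched below. This is an illustration only: the function name `pearson_r` and the data are hypothetical, and the example assumes that higher ratings denote greater symmetry, which would explain a negative coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data (not study data): relative asymmetry in percent and
# symmetry ratings, ASSUMING 5 = symmetrical on the 1-to-5 scale.
asymmetry_pct = [2.0, 5.0, 8.0, 12.0, 18.0]
symmetry_rating = [5, 5, 4, 3, 2]
r = pearson_r(asymmetry_pct, symmetry_rating)
print(f"r = {r:.2f}")  # negative: larger asymmetry, lower symmetry rating
```

Under that scale assumption, a stronger negative coefficient indicates closer agreement between the rater and the objective measure.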
Discussion
Visual judgments
The results confirm that it is indeed typical for there to be both left-right and anterior-posterior asymmetry for normophonic speakers as assessed visually through stroboscopic, HSV, and HSV playback techniques. These results concur with findings (Elias, Sataloff, Rosen, Heuer, & Spiegel, 1997) that demonstrate asymmetry in singers without vocal complaints. The ability to judge both left-right and anterior-posterior asymmetries is consistent with previous findings (Wittenberg, Tigges, Mergell, & Eysholdt, 2000; Eysholdt, Rosanowski, & Hoppe, 2003), and validates the study of anterior-posterior asymmetries. The prevalence of asymmetry is interesting in light of previous findings (Lindestad, Hertegard, & Bjorck, 2004) that 70% of normophonic speakers exhibited laryngeal adduction asymmetries. It is possible that these typical laryngeal adduction asymmetries are associated with the vibratory asymmetries revealed in the current study.
While the majority of habitual phonations were characterized by left-right and anterior-posterior asymmetries, the asymmetries were generally mild. Major differences between the HSV-derived playbacks were not apparent. Asymmetries were less likely to be rated through stroboscopy. This is clinically significant in that stroboscopy may not reveal asymmetries that are important in clinical decision making for persons with voice disorders. The finding of increased asymmetries with HSV also indicates the need for thoughtful interpretation of vocal fold vibratory recordings when deciding whether the asymmetry is abnormal or falls within normal limits. Clinicians who are familiar with recordings from stroboscopy will likely need training and well-developed norms to apply their knowledge gained from stroboscopy to the higher temporal resolution recordings from HSV. Overall, the number of normophonic persons exhibiting asymmetries stresses the need to use caution when evaluating the normality of asymmetry in clients, to prevent misdiagnosis or over-diagnosis as a result of being able to see increased asymmetry via HSV. Furthermore, the lack of left-right frequency differences in normophonic speakers indicates that this may be a voice disorder-specific parameter. Future studies of amplitude and phase asymmetries in persons with voice disorders would need to account for the frequency differences, which preclude the measurement of amplitude and phase asymmetries.
The difference in asymmetry judgments across raters emphasizes the importance of developing objective measurements for asymmetry of vocal fold vibration. This is especially true because the rater whose judgments deviated the most from the mean ratings and the objective measures was the rater with the most experience judging vocal fold vibratory features from stroboscopy and the least experience with HSV. This further reinforces the need for additional training and familiarity with HSV recordings prior to making clinical decisions regarding phase asymmetry. Without objective measurements or visual norms for HSV, there is a tendency to judge all vibratory irregularities as abnormal. Additionally, without objective measures, it is difficult to pinpoint the specific problem in visual judgments of phase asymmetry.
Differences between habitual and pressed phonation were slight. It appeared that asymmetries in pressed phonation were of a smaller magnitude. The presence of smaller-magnitude asymmetries, without a higher incidence of asymmetry, in pressed versus habitual phonation suggests that different mechanisms drive the two modes. One hypothesis for the decreased magnitude of asymmetries in pressed phonation is that increased muscular effort limits the degree to which the system can deviate from symmetry. That is, the muscular effort places limitations on the movement of the arytenoids and therefore the vocal folds. The fewer asymmetries revealed in pressed versus habitual phonation indicate that phase asymmetries may be obscured by this type of hyperfunctional voice use. Given this finding, clinically important information regarding phase asymmetries may be revealed only after the client has undergone some treatment to decrease muscle tension. Thus, a follow-up endoscopy after decreasing muscle tension could reveal phase asymmetries and provide important information for clinical decision making.
Objective measures
Objective measures were mildly to moderately correlated with subjective ratings of left-right asymmetry, demonstrating the importance of further efforts to quantify vocal fold vibratory features from HSV recordings. Correlation was highest for ratings from DKG, MKG, and mDKG. It is likely that the moderate-to-high correlation for DKG and mDKG arises because the asymmetry measures were made from the mDKG images and thus coincide best with the ratings from those images. The moderate correlation of MKG with the objective measures is likely due to the added information from highlighting the mucosal wave. Additionally, the stronger correlations between MKG, DKG, and mDKG ratings and objective measures can be attributed to better visual presentation of the temporal information through kymography. Correlation was lowest for ratings from stroboscopy, with HSV playback ratings falling between those of the stroboscopic and kymographic playbacks and images. However, no correlations were above 0.80, emphasizing the limitations of our visual perceptual system for assessing this feature. The relationship between objective and subjective analyses of left-right asymmetry was similar for habitual and pressed phonations. This relationship is highlighted by the agreement of objective measures and visual ratings that there is decreased magnitude of left-right asymmetry in pressed phonation versus habitual phonation. Further development of techniques to quantify vocal fold vibratory features should include left-right and anterior-posterior asymmetry. Additionally, using automated methods, a larger number of cycles could be used to increase the accuracy of the quantification of asymmetry.
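The benefit of measuring more cycles can be illustrated with the standard error of the mean, which shrinks as the number of measured cycles grows. The sketch below is a general statistical illustration, not the measurement procedure used in this study: the function name `mean_and_standard_error` and the per-cycle asymmetry values are hypothetical, and the cycles are assumed to be independent measurements.

```python
import math
import statistics


def mean_and_standard_error(per_cycle_values):
    """Return the mean and its standard error for per-cycle measurements."""
    n = len(per_cycle_values)
    if n < 2:
        raise ValueError("need at least two cycles")
    mean = statistics.fmean(per_cycle_values)
    # Sample standard deviation divided by sqrt(n).
    standard_error = statistics.stdev(per_cycle_values) / math.sqrt(n)
    return mean, standard_error


# Hypothetical per-cycle relative asymmetry values in percent (not study data).
few_cycles = [4.0, 6.0, 5.0, 7.0]
many_cycles = few_cycles * 4  # same spread, four times as many cycles

mean_few, se_few = mean_and_standard_error(few_cycles)
mean_many, se_many = mean_and_standard_error(many_cycles)
print(f"4 cycles:  mean = {mean_few:.2f}, SE = {se_few:.2f}")
print(f"16 cycles: mean = {mean_many:.2f}, SE = {se_many:.2f}")
```

With the same cycle-to-cycle spread, the mean is unchanged while its standard error is smaller for the longer sample.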
Conclusions
The majority of normophonic speakers exhibit left-right and anterior-posterior asymmetries for both habitual and pressed phonations when rated and measured from laryngeal videoendoscopic recordings. Typically these asymmetries are mild. Pressed phonation demonstrated fewer asymmetries than habitual phonations both visually and objectively. Left-right asymmetries are detected more readily via HSV than stroboscopy, and are slightly more notable from kymography than HSV playback alone. Discrepancies between objective measures and visual-perceptual ratings of left-right symmetry reveal the necessity for future quantitative analysis techniques to strengthen both research and clinical applications of laryngeal visualization. Future investigations should compare these findings to those from a variety of pathologies and further study the relationship between visual-perceptual judgments and objective measures of both left-right and anterior-posterior asymmetry.
Acknowledgements
This project was supported by Research Grant No. 11560-KA01 funded by the University of South Carolina Research Foundation and Research Grant No. R01 DC007640 funded by the National Institute on Deafness and Other Communication Disorders. The authors express their appreciation to Cara Sauder for her help with data collection, and to Drs. Allen Montgomery, Hiram McDade, and Gary Allen, who provided insight to strengthen this study.
Footnotes
Portions of this study have been presented at the Voice Foundation’s 34th Annual Symposium: Care of the Professional Voice, Philadelphia, PA, June 2005.
References
- Bless DM, Hirano M, Feder RJ. Videostroboscopic evaluation of the larynx. Ear Nose Throat Journal. 1987;66(7):289–96.
- Bonilha H, Deliyski D. Period and Glottal Width Irregularities in Vocally Normal Speakers. Journal of Voice. 2008, in press. doi: 10.1016/j.jvoice.2007.03.002.
- Bonilha H, Aikman A, Hines K, Deliyski D. Vocal Fold Mucus Aggregation in Vocally Normal Speakers. Logopedics Phoniatrics Vocology. 2008, in press. doi: 10.1080/14015430701875588.
- Deliyski DD. Endoscope motion compensation for laryngeal high-speed videoendoscopy. Journal of Voice. 2005;19(3):485–496. doi: 10.1016/j.jvoice.2004.07.006.
- Deliyski DD, Petrushev PP, Bonilha HS, Gerlach TT, Martin-Harris B, Hillman RE. Clinical implementation of laryngeal high-speed videoendoscopy: challenges and evolution. Folia Phoniatrica et Logopaedica. 2008;60(1):33–44. doi: 10.1159/000111802.
- Elias ME, Sataloff RT, Rosen DC, Heuer RJ, Spiegel JR. Normal strobovideolaryngoscopy: variability in healthy singers. Journal of Voice. 1997;11(1):104–107. doi: 10.1016/s0892-1997(97)80030-6.
- Eysholdt U, Tigges M, Wittenberg T, Proschel U. Direct evaluation of high-speed recordings of vocal fold vibrations. Folia Phoniatrica et Logopaedica. 1996;48(4):163–170. doi: 10.1159/000266404.
- Eysholdt U, Rosanowski F, Hoppe U. Vocal fold vibration irregularities caused by different types of laryngeal asymmetry. European Archives of Oto-Rhino-Laryngology. 2003;260(8):412–417. doi: 10.1007/s00405-003-0606-y.
- Granqvist S, Hertegård S, Larsson H, Sundberg J. Simultaneous analysis of vocal fold vibration and transglottal airflow: exploring a new experimental setup. Journal of Voice. 2003;17(3):319–330. doi: 10.1067/s0892-1997(03)00070-5.
- Heman-Ackah YD, Dean CM, Sataloff RT. Stroboscopic findings in singing teachers. Journal of Voice. 2002;16(1):81–86. doi: 10.1016/s0892-1997(02)00075-9.
- Hertegard S, Larsson H, Wittenberg T. High-speed imaging: applications and development. Logopedics Phoniatrics Vocology. 2003;28(3):133–139. doi: 10.1080/14015430310015246.
- Hillenbrand JM, Gayvert RT. Open source software for experimental design and control. Journal of Speech, Language, and Hearing Research. 2005;48(1):45–60. doi: 10.1044/1092-4388(2005/005).
- Hirano M, Kurita S, Yokizane K, Hibi S. Asymmetry of the laryngeal framework: A morphologic study of cadaver larynges. Annals of Otology, Rhinology & Laryngology. 1989;98(2):135–140. doi: 10.1177/000348948909800210.
- Hirose H. High-speed digital imaging of vocal fold vibration. Acta Oto-laryngologica Supplement. 1988;458:151–153. doi: 10.3109/00016488809125120.
- Hogikyan ND, Sethuraman G. Validation of an instrument to measure Voice-Related Quality of Life (V-RQOL). Journal of Voice. 1999;13(4):557–569. doi: 10.1016/s0892-1997(99)80010-1.
- Kiritani S, Honda K, Hirose H. Observation of pathological vocal fold vibrations using a high-speed digital image recording-system. Folia Phoniatrica. 1986;38(56):317–318.
- Kitzing P. Stroboscopy- A pertinent laryngological examination. Journal of Otolaryngology. 1985;14(3):151–157.
- Lacina O. Die adduktionelle Asymmetrie des Kehlkopfes bei den Sängern (Adduction asymmetry of the larynx in singers). Folia Phoniatrica et Logopaedica. 1970;22(2):100–106.
- Lindestad P-A, Hertegard S, Bjorck G. Laryngeal adduction asymmetries in normal speaking subjects. Logopedics Phoniatrics Vocology. 2004;29(3):128–134. doi: 10.1080/14015430410017009.
- Niimi S, Miyaji M. Vocal fold vibration and voice quality. Folia Phoniatrica et Logopaedica. 2000;52(13):32–38. doi: 10.1159/000021510.
- Qui Q, Schutte HK. Real-time kymographic imaging for visualizing human vocal-fold vibratory function. Review of Scientific Instruments. 2007;78:024302. doi: 10.1063/1.2430622.
- Shaw H, Deliyski D. Mucosal Wave: A Normophonic Study Across Visualization Techniques. Journal of Voice. 2008;22(1):23–33. doi: 10.1016/j.jvoice.2006.08.006.
- Švec JG, Schutte HK. Videokymography: High-speed line scanning of vocal fold vibration. Journal of Voice. 1996;10(2):201–205. doi: 10.1016/s0892-1997(96)80047-6.
- Švec JG, Šram F, Schutte HK. Videokymography: a new high-speed method for the examination of vocal-fold vibrations. Otorinolaryngologie a Foniatrie. 1999;48:155–162.
- Švec JG, Šram F, Schutte HK. Videokymography in voice disorders: what to look for? Annals of Otology, Rhinology & Laryngology. 2007;116(3):172–180. doi: 10.1177/000348940711600303.
- Verdonck-de Leeuw IM, Festen JM, Mahieu HF. Deviant vocal fold vibration as observed during videokymography: the effect on voice quality. Journal of Voice. 2001;15(3):313–322. doi: 10.1016/S0892-1997(01)00033-9.
- Wendler J. Stroboscopy. Journal of Voice. 1992;6(2):149–154.
- Wittenberg T, Tigges M, Mergell P, Eysholdt U. Functional imaging of vocal fold vibration: digital multislice high-speed kymography. Journal of Voice. 2000;14(3):422–442. doi: 10.1016/s0892-1997(00)80087-9.