Abstract
For the analysis of vocal fold dynamics, sub- and supra-glottal influences must be taken into account, as recent studies have shown. In this work, we analyze the influence of changes in the epi-laryngeal area on vocal fold dynamics. We investigate two excised female larynges in a hemi-larynx set-up combined with a synthetic vocal tract consisting of hard plastic and simulating the vowel /a/. Eigenmodes, amplitudes, and velocities of the oscillations, the sub-glottal pressures and the sound pressure levels of the generated signal are investigated as a function of three distinct epi-laryngeal areas (28.4 mm², 71.0 mm² and 205.9 mm²).
The results showed that the sound-pressure level is independent of the epi-larynx cross-section and exhibits a non-linear relation to the insufflated air flow. The sub-glottal pressure decreased with an increase in the epi-laryngeal area and displayed linear relations to the air flow.
The principal empirical Eigenfunctions (EEFs) of the vocal fold dynamics exhibited lateral movement for the first Eigenfunction and rotational motion for the second Eigenfunction. In total, the first two EEFs covered a minimum of 60% of the energy, with an average of more than 50% for the first Eigenfunction. Correlations to epi-laryngeal areas were not found. Maximal values for amplitudes (up to 2.5 mm) and velocities (up to 1.57 mm/ms) changed with varying epi-laryngeal area, but did not show consistent behaviour for both larynges.
We conclude that the size of the epi-laryngeal area has significant influence on vocal fold dynamics, but does not significantly affect the resultant sound-pressure level.
1. Introduction
Linear source-filter theory is commonly used to model voice production: the primary voice signal, generated in the larynx, is filtered by the vocal tract, thus producing voice (Fant, 1971). Relying upon the assumptions of this theory, both the voice source and vocal tract may be isolated, and investigated independently to study influences on voice. However, recent studies have theorized that the coupling effect between structural dynamics, air flow and acoustics induced by sub- and supraglottal structures may be more significant than previously supposed (Titze and Story, 1997; Titze, 2008).
Indeed, the influence of these structures on the resultant air flow and acoustics has been quantified in several studies. Zañartu et al. (2007) used a numerical single-mass model, which incorporated a wave reflection analog technique, to investigate how fluid-structure and fluid-acoustic interactions depend on sub- and supra-glottal structures. The study showed that supra-glottal areas had a greater influence on the stability of vocal fold dynamics than sub-glottal areas. Additionally, the authors found fluid-acoustic interactions to be more significant than fluid-structure interactions. Drechsel and Thomson (2008) showed that the false vocal folds stabilized the air flow exiting the glottis to the center of the vocal tract. The findings of Becker et al. (2009) suggested that the supra-glottal area reduces the pressure drop from the sub- to the supra-glottal area, thus reducing the vibrational amplitudes of the vocal folds. This was confirmed in a water-flow model by Triep and Brücker (2010). Furthermore, the supra-glottal structures, specifically the ventricular folds, reduced the asymmetric features of the glottal jet and redirected it in an axial direction. Zhang et al. (2009) investigated the influence of various sub- and supra-glottal tract configurations and vocal fold parameters on phonation onset. It was shown that the sub-glottal tract had a significant influence on the phonation onset pressure and frequency. Based on these findings, the authors concluded that certain configurations and parameters facilitated vocal fold vibration.
In contrast to the foregoing studies which employed numerical or physical models, Döllinger et al. (2006) presented a study of a single human larynx with an artificial vocal tract. The results quantified the influence of the supra-glottal structures on phonation onset as well as sustained phonation. Additionally, the authors confirmed the findings by Zhang et al. (2009) regarding the dependence of vocal fold vibrations on vocal tract configuration.
Building upon this study, we investigate the effects of various epi-laryngeal configurations on the vocal fold dynamics of two female larynges. For this purpose, velocities, amplitudes and empirical Eigenfunctions were extracted from the medial surface dynamics of the vocal folds. Additionally, the sound-pressure level and sub-glottal pressure of the primary voice signal were analyzed.
2. Method
The larynges were investigated in a hemi-larynx setup, which has been described previously (Döllinger and Berry, 2006a; Döllinger et al., 2006). Briefly, one side of a larynx is removed between the epi-larynx and just below the vocal folds. This yields an unobstructed view of the medial surface of the remaining vocal fold. The trachea is mounted onto a steel tube, allowing airflow to be insufflated, similar to the breath from the human lungs. A glass plate is positioned at the glottal midline, see Fig. 1(a). An equal-sided 90° prism is positioned with its hypotenuse pressed against the glass plate, yielding stereo images of vocal fold vibration when viewed through a camera. Following the method described in Döllinger and Berry (2006a), a regular grid of suture points is sewn onto the vocal fold surface to serve as marker points. Fig. 1(b) depicts a frame extracted from a captured recording. The prism's vertical front edge is prominent in the middle of the image. In the upper third, a cube with a side length of 5 mm is glued to the prism. This cube serves as a calibration object. In the lower two-thirds, the medial vocal fold surface is visible. The regular grid of suture points spans an area of approximately 100 mm², with a distance of approximately 10 mm between the first and last suture in the top-most row of the grid. For the reconstruction of the 3D coordinates of each suture point, the equation
Figure 1.

Schematics depicting the experimental setup. In 1(a) the hemi-larynx, the glass prism and the high-speed camera are seen from above. In 1(d) the artificial vocal tract mounted onto the hemi-larynx is sketched. The configuration of the vocal tract resembles that used for phonation of the vowel /a/. The cross-sectional area of the epi-larynx was varied with hard-rubber inserts of different sizes.
$$\mathbf{v}_{\mathrm{rec}} = \mathbf{F}\,\mathbf{v}_{\mathrm{3D}} \qquad (1)$$
is solved, where $\mathbf{v}_{\mathrm{rec}} = [x_1\ y_1\ x_2\ y_2]^T$ and $\mathbf{v}_{\mathrm{3D}} = [x\ y\ z\ 1]^T$. The indices 1 and 2 denote the pixel coordinates in the left and right image halves, respectively, and $\mathbf{F}$ is the mapping matrix. The resulting coordinates are then assembled to obtain the 3D medial vocal fold surface, as depicted in Fig. 1(c). For further details see Döllinger and Berry (2006a).
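The reconstruction step can be sketched numerically. The example below assumes that Eq. (1) has the linear form v_rec = F · v_3D with a 4 × 4 mapping matrix; since this gives four pixel equations for the three unknowns x, y and z, a least-squares solution is used here. The function name `reconstruct_point` and all numerical values of `F_demo` are hypothetical and do not come from the calibration of the described setup.

```python
import numpy as np


def reconstruct_point(F, v_rec):
    """Least-squares estimate of (x, y, z) from v_rec = F @ [x, y, z, 1]."""
    F = np.asarray(F, dtype=float)
    v_rec = np.asarray(v_rec, dtype=float)
    A = F[:, :3]          # coefficients of the unknowns x, y, z
    b = v_rec - F[:, 3]   # constant column moved to the right-hand side
    xyz, *_ = np.linalg.lstsq(A, b, rcond=None)
    return xyz


# Hypothetical mapping matrix: rows are x1, y1, x2, y2 (pixel coordinates).
F_demo = np.array([
    [40.0,  0.0,  12.0, 320.0],
    [ 0.0, 40.0,   3.0, 240.0],
    [40.0,  0.0, -12.0, 960.0],
    [ 0.0, 40.0,  -3.0, 240.0],
])

# Project a known 3D point into both image halves, then recover it.
point_true = np.array([1.5, -2.0, 0.8])
pixels = F_demo @ np.append(point_true, 1.0)
point_est = reconstruct_point(F_demo, pixels)
```

With noise-free pixel coordinates the estimate reproduces the original point; with measured coordinates the least-squares solution averages over the redundancy of the two views.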
Parameters such as the air flow V̇, sub-glottal pressure Psub, sound pressure level SPL and vocal fold dynamics are recorded. The dynamics of the vocal folds are processed and several parameters extracted: empirical Eigenfunctions, amplitude, velocity and frequency of the vibration.
The vocal tract is created with hard plastic boxes. It is divided into three sections: the epi-larynx at the inferior end, the pharynx in the middle and the oral region at the superior end, as depicted in Fig. 1(d). This design resembles the configuration of the vocal tract when phonating the vowel /a/. All dimensions were taken from Montequin (2003). The artificial vocal tract is mounted approximately 1 cm above the hemi-larynx, fixed by a rubber clamp. The clamp itself is mounted flush on the horizontally cut thyroid cartilage. To seal minor gaps, vacuum grease is applied. The cross-sectional area of the epi-laryngeal tube (ET) is varied by hard rubber inserts. Three different cross-sections were chosen: ET1 at 28.4 mm², ET2 at 71.0 mm² and ET3 at 205.9 mm².
Two female larynges were investigated in this study, from donors aged 73 (L1) and 58 (L2) years. Due to limited lighting conditions and the anatomy of larynx L2, not all of the 30 mounted suture points on the vocal fold were detected and reconstructed. For the calculation of the EEFs for larynx L2, the smallest shared suture pattern that was visible in all recordings of L2 was chosen.
3. Results
3.1. Vocal Fold Dynamics
For the following investigations, the data were analyzed for constant Psub (3 kPa) across varying epi-larynx cross-sectional areas ET. The time signals of the suture points were filtered by extracting Eigenfunctions until 95% of the signal strength was reconstructed. As examples, surface plots of the averaged maximum velocity and the averaged maximum amplitude for larynges L1 and L2 are depicted in Fig. 2 and Fig. 3. The upper and lower sub-figures display the velocity and amplitude surface plots, respectively. The values are obtained by averaging over the maximum values of each oscillation cycle. The left and right columns show the epi-larynx configurations ET1 and ET3. These were chosen as they represent the minimum and maximum cross-sectional areas that were investigated. The surface plots depict the values measured for each suture point of the regular grid on the vocal fold. Higher values are coded in brighter colors, lower values in darker colors. Sutures positioned at row m and column n are denoted as suture RmCn. The grid is sewn such that row 5 lies along the vocal fold edge, see Fig. 1(b).
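The filtering step described above can be illustrated with a short sketch. It assumes that the empirical Eigenfunctions are obtained as principal components of the mean-removed trajectory matrix via a singular value decomposition, and that "signal strength" corresponds to the squared singular values; the source does not spell out either choice. The function name `eef_filter`, the frame rate and the synthetic two-mode signal are hypothetical.

```python
import numpy as np


def eef_filter(trajectories, energy=0.95):
    """Keep the leading modes that together reach the given energy share.

    trajectories: array of shape (n_frames, n_coordinates).
    Returns the filtered signal, the energy share per mode, and the
    number of modes kept.
    """
    X = np.asarray(trajectories, dtype=float)
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    share = s**2 / np.sum(s**2)
    # First mode count whose cumulative share reaches the threshold.
    k = int(np.searchsorted(np.cumsum(share), energy)) + 1
    k = min(k, len(s))
    filtered = (U[:, :k] * s[:k]) @ Vt[:k] + mean
    return filtered, share, k


# Synthetic test signal: two spatial patterns oscillating out of phase.
rng = np.random.default_rng(0)
t = np.arange(200) / 4000.0             # hypothetical 4000 frames/s
shape1 = np.ones(30)                    # uniform (lateral-like) pattern
shape2 = np.linspace(-1.0, 1.0, 30)     # tilting (rotation-like) pattern
X = (2.0 * np.sin(2 * np.pi * 160 * t)[:, None] * shape1
     + 1.5 * np.cos(2 * np.pi * 160 * t)[:, None] * shape2
     + 0.01 * rng.normal(size=(200, 30)))

filtered, share, k = eef_filter(X)
```

In this synthetic case two modes are needed to pass the 95% threshold, and the small added noise ends up in the discarded higher modes.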
Figure 2.

Exemplary surface plots of averaged maximum velocity (top) and averaged maximum amplitude (bottom) over the vocal fold surface for L1 at Psub = 3 kPa. In the left and right columns the information for ET1 (narrowest epi-larynx cross-section) and ET3 (widest epi-larynx cross-section) is plotted, respectively. Hot spots are visible in C3 and C4 for the velocity and around sutures R5C2-3 for amplitudes. Evident is also the drop in maximum velocity from ET1 → ET3 around sutures R4C3-4 at a stable amplitude.
Figure 3.

Exemplary surface plots of averaged maximum velocity (top) and averaged maximum amplitude (bottom) over the vocal fold surface for L2 at Psub = 3 kPa. In the left and right columns the information for ET1 (narrowest epi-larynx cross-section) and ET3 (widest epi-larynx cross-section) is plotted, respectively. The information is limited due to the amount of detectable suture points. Hot spots for the velocity are visible in C3-4. This area also presents the greatest changes in the maximum velocities. As is the case for L1, the amplitude remains stable when varying the cross-sectional area from ET1 → ET3.
Both larynges present the highest velocities for the sutures around the third and fourth columns, just below the vocal fold edge. For configuration L1/ET3, the suture point R2C5 displays a clear deviation from the values of the surrounding suture points. With respect to amplitude, differences exist between the two larynges. For L1, the areas of highest amplitude are concentrated around R5C2 and R5C3. For L2, the areas of highest amplitude are distributed over the entire reconstructed surface.
In a comparison of the different epi-larynx configurations (i.e. the left and right columns of Fig. 2 and Fig. 3) with respect to velocity (upper sub-figures) and amplitude (lower sub-figures), hardly any differences are observable for L1. Only a slight decrease is visible in the velocity surface plot around sutures R3-4C3-4. The amplitude at suture R5C2 decreases slightly as well. In contrast, L2 clearly displays reduced velocity values for ET3, with the amplitude approximately equal to that of ET1. For L1/ET1 and L1/ET3, the maximum velocities amount to 0.49 mm/ms and 0.48 mm/ms and the maximum amplitudes to 2.50 mm and 2.28 mm, respectively. For L2/ET1 and L2/ET3, the maximum velocities reach 1.57 mm/ms and 1.12 mm/ms at maximum amplitudes of 2.21 mm and 2.20 mm, respectively.
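The "averaged maximum" quantities reported above can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the source: cycles are cut into fixed-length windows from a known fundamental frequency, amplitude is taken as the maximum absolute displacement within a cycle, and velocity is a simple finite difference. The function name, frame rate and test signal are hypothetical.

```python
import math


def averaged_cycle_maxima(displacement_mm, frame_rate_hz, f0_hz):
    """Average the per-cycle maxima of |displacement| and |velocity|.

    Returns (mean max amplitude in mm, mean max velocity in mm/ms).
    """
    samples_per_cycle = round(frame_rate_hz / f0_hz)
    dt_ms = 1000.0 / frame_rate_hz
    # Forward finite difference as a velocity estimate.
    velocity = [(b - a) / dt_ms
                for a, b in zip(displacement_mm, displacement_mm[1:])]
    n_cycles = len(velocity) // samples_per_cycle
    if n_cycles == 0:
        raise ValueError("signal is shorter than one oscillation cycle")
    amp_max, vel_max = [], []
    for c in range(n_cycles):
        lo, hi = c * samples_per_cycle, (c + 1) * samples_per_cycle
        amp_max.append(max(abs(x) for x in displacement_mm[lo:hi]))
        vel_max.append(max(abs(v) for v in velocity[lo:hi]))
    return sum(amp_max) / n_cycles, sum(vel_max) / n_cycles


# Hypothetical test signal: 1 mm sinusoid at 160 Hz, 4000 frames/s.
frame_rate = 4000.0
f0 = 160.0
signal = [math.sin(2 * math.pi * f0 * n / frame_rate) for n in range(250)]
mean_amp, mean_vel = averaged_cycle_maxima(signal, frame_rate, f0)
```

For this pure sinusoid the result is close to the analytical values of 1 mm and 2π · 160 Hz · 1 mm ≈ 1.005 mm/ms, with small deviations caused by the finite sampling.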
For the EEF analysis, only those sutures that were visible in all recordings of a larynx were considered, to ensure comparability. For larynx L1, the vibrations of 26 sutures were always visible (Fig. 2). For L2, only the vibrations of the sutures of rows 3, 4 and 5 (i.e. 15 sutures) were always visible. When comparing recordings, it is important to consider the same number of sutures, since the percentage of each EEF changes with the number of sutures (i.e. considered trajectories). This is demonstrated in the following, using data of L1. For the three ETs at an equal sub-glottal pressure of 3 kPa, the EEFs calculated from all detected suture points are compared with those calculated when rows 1, 2 and 6 are omitted. The omission corresponds to the smallest available suture pattern for L2, as depicted in the right column of Fig. 3. The results are presented in Table 1. When the rows are omitted, the percentage of the first Eigenmode rises by 7% on average, whereas the second Eigenmode loses about 1%. Consequently, the pattern of suture points that is analysed has to be chosen such that it is visible in all investigated cases.
Table 1.
Comparison of first and second Eigenmodes of L1 between data based on all sutures and data based on rows 3, 4 and 5. The data was obtained for comparable results at Psub = 3 kPa. The values for the third EEF are only presented, when EEF1 and EEF2 combined do not cover more than 95% of the signal strength.
| ET | EEF | all sutures, single (%) | all sutures, Σ (%) | rows 3, 4, 5, single (%) | rows 3, 4, 5, Σ (%) |
|---|---|---|---|---|---|
| ET1 | EEF1 | 76 | | 82 | |
| ET1 | EEF2 | 17 | 93 | 15 | 97 |
| ET1 | EEF3 | 2 | 95 | | |
| ET2 | EEF1 | 74 | | 81 | |
| ET2 | EEF2 | 17 | 91 | 15 | 96 |
| ET2 | EEF3 | 2 | 93 | | |
| ET3 | EEF1 | 69 | | 76 | |
| ET3 | EEF2 | 19 | 89 | 19 | 95 |
| ET3 | EEF3 | 4 | 92 | | |
In Tables 2 and 3, the percentages of the first empirical Eigenfunction (EEF1), the second empirical Eigenfunction (EEF2), and the sum of EEF1 and EEF2 (EEF1+2) are listed for the two larynges L1 and L2 over the three epi-larynx configurations ET1, ET2 and ET3 and the applied flow values. EEF1 captured a minimum of 46 % and a maximum of 76 % of the variance for L1, and a minimum of 42 % and a maximum of 57 % of the variance for L2. On average, EEF1 captured 20 % more of the variance of the vibrations of L1 than of L2. The Eigenmode percentages do not exhibit any obvious correlation with the epi-larynx configuration. Additionally, the Eigenmode frequencies fEEF1 of EEF1 increase with increasing flow for L1 and L2. They always match the fundamental frequency of the vocal fold vibrations.
Table 2.
Values for V̇, Psub, the epi-larynx configuration, EEF1, EEF2, EEF1+2, EEF3 and the frequency of EEF1. The value of ET denotes the epi-larynx configuration ranging from 1 for the narrowest to 3 for the widest configuration. The highlighted values depict configurations that were used for the direct comparisons in velocity and amplitude.
| L1 | | | | | | | |
|---|---|---|---|---|---|---|---|
| ET | Flow (ml/s) | Psub (kPa) | EEF1 (%) | EEF2 (%) | EEF1+2 (%) | EEF3 (%) | fEEF1 (Hz) |
| ET1 | 150 | 1.5 | 54 | 25 | 79 | 8 | 110 |
| ET1 | 192 | 2 | 46 | 34 | 80 | 3 | 140 |
| ET1 | 346 | 3 | 76 | 17 | 93 | 2 | 170 |
| ET2 | 205 | 1.5 | 58 | 28 | 86 | 3 | 120 |
| ET2 | 240 | 2 | 55 | 32 | 87 | 4 | 150 |
| ET2 | 395 | 3 | 74 | 17 | 91 | 2 | 170 |
| ET3 | 630 | 3 | 69 | 19 | 89 | 4 | 160 |
| ET3 | 995 | 4 | 69 | 11 | 80 | 6 | 180 |
Table 3.
Values for V̇, Psub, the epi-larynx configuration, EEF1, EEF2, EEF1+2, EEF3 and the frequency of EEF1. The value of ET denotes the epi-larynx configuration ranging from 1 for the narrowest to 3 for the widest configuration. Note that rows 1,2 and 6 were omitted from the analysis when present. The highlighted values depict configurations that were used for the direct comparisons in velocity and amplitude.
| L2 | | | | | | | |
|---|---|---|---|---|---|---|---|
| ET | Flow (ml/s) | Psub (kPa) | EEF1 (%) | EEF2 (%) | EEF1+2 (%) | EEF3 (%) | fEEF1 (Hz) |
| ET1 | 491 | 2 | 56 | 18 | 74 | 5 | 140 |
| ET1 | 610 | 3 | 56 | 21 | 77 | 5 | 160 |
| ET2 | 444 | 2 | 42 | 21 | 63 | 5 | 150 |
| ET2 | 563 | 3 | 49 | 25 | 74 | 5 | 170 |
| ET2 | 720 | 4 | 43 | 17 | 60 | 7 | 200 |
| ET3 | 515 | 2 | 42 | 26 | 68 | 6 | 150 |
| ET3 | 540 | 2.5 | 56 | 23 | 79 | 4 | 160 |
| ET3 | 620 | 3 | 57 | 18 | 75 | 5 | 170 |
For the following considerations EEF3 and higher Eigenfunctions are ignored, as they do not contribute significantly to vocal fold movement. Consequently, for the following considerations, the vocal fold dynamics are described entirely by EEF1+2.
As mentioned before, Fig. 2 and Fig. 3 indicate that the highest amplitudes and velocities are to be expected in C3 and C4. Thus, the focus lies on the points of the third and fourth column of the larynges for the following considerations.
For example, Fig. 4 depicts the cross-section of the vocal fold of L2 at C3 for EEF1+2 as a solid line for seven sequential frames of one oscillation cycle. The frames were taken at even intervals of 1 ms from the movement of L2 in the ET1 (top row) and ET3 (bottom row) configurations. The arrows mark the magnitude and direction of the deflection due to EEF1+2. The circles indicate the rest positions of the points. The dash-dotted line marks the position of the vocal fold solely due to motion from EEF1. As can clearly be seen, EEF1 describes a motion that moves the vocal fold surface in the direction of the initial surface normal, i.e. to the right and slightly downward in the figure.
Figure 4.

Exemplary movement of the vocal fold surface along C3 for L2/ET1 (top) and L2/ET3 (bottom) for one oscillation cycle. The solid line marks the vocal fold surface due to EEF1+2. The dash-dotted line marks the position of the vocal fold surface solely due to EEF1. EEF1 moves the vocal fold surface in a direction perpendicular to the initial surface position at t = 0 ms, while EEF2 adds a rotation around a longitudinal axis. This leads to a wave-like motion of the medial vocal fold surface. Similar behavior was identified for the other obtained data.
EEF2 exhibits a rotational movement around a longitudinal axis, which first moves the most inferior suture on the vocal fold in a lateral direction towards the middle (t = 1 ms). Then the vocal fold surface tilts around this axis at t = 2 ms and thus moves the superior end towards the middle. Additionally, the distance between the lower points remains approximately constant, while the distance between the upper two points varies, with a minimum at t = 2 ms and a maximum at t = 5 ms. Similar results were visible for L1 and the other epi-laryngeal configurations as well.
From Tables 2 and 3 it can be seen that, at equal Psub = 3 kPa, the frequency of EEF1 was 170 Hz for L1/ET1 and L1/ET2, and 160 Hz and 170 Hz for L2/ET1 and L2/ET2, respectively. For L1/ET3 the frequency decreases slightly to 160 Hz, whereas for L2/ET3 it remains at 170 Hz, i.e. slightly above the value for L2/ET1.
In Fig. 5, the maximum amplitude values for C3 and C4 are plotted in the top row as a function of the epi-laryngeal area. L1 and L2 display a steady or slightly increasing trend in amplitude with increasing epi-larynx area. The maximum velocities for each column are plotted in the bottom row. The velocities increase for L1 from ET1 to ET2, but remain steady from ET2 to ET3. For L2 the maximum velocities decrease in general, with a distinct dip for ET2, similar to the drop in amplitudes.
Figure 5.

Exemplary maximum values of amplitude and velocity for columns three and four of larynges L1 and L2 at Psub = 3 kPa. The amplitudes remain stable over the varying cross-sectional areas, whereas the velocities decrease. Note that the L2/ET2 configuration shows a drop in amplitude for both columns. Y and Z denote the lateral and vertical directions, respectively.
3.2. Phonation Parameters
In Fig. 6, Psub (top row) and SPL (bottom row) are plotted as a function of V̇ for larynges L1 and L2. At a constant flow rate, the sub-glottal pressure decreases with increasing epi-laryngeal area (ET1 → ET3) for both larynges. When the epi-larynx area is kept constant, Psub and V̇ exhibit a good linear relation for L1. The dotted lines represent linear regression lines that were fit to the data of each ET case, following Alipour et al. (1997). This relation is visible for L2 as well; however, the variability about the regression line is higher than for L1, due to outliers.
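The linear relation between Psub and V̇ can be illustrated with an ordinary least-squares line per epi-larynx configuration. The sketch below uses only the flow and pressure values listed for L1 in Table 2; the regression lines in Fig. 6 may be based on additional measurement points, so the resulting slopes are illustrative rather than a reproduction of the figure. The helper `fit_line` is a hypothetical name.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (flow in ml/s, Psub in kPa) for larynx L1, as listed in Table 2.
l1_data = {
    "ET1": ([150, 192, 346], [1.5, 2.0, 3.0]),
    "ET2": ([205, 240, 395], [1.5, 2.0, 3.0]),
    "ET3": ([630, 995], [3.0, 4.0]),
}

fits = {et: fit_line(flow, psub) for et, (flow, psub) in l1_data.items()}
for et, (slope, intercept) in fits.items():
    print(f"{et}: Psub = {slope:.4f} kPa/(ml/s) * flow + {intercept:.2f} kPa")
```

For these tabulated values the slope for ET3 is clearly smaller than for ET1 and ET2, in line with the lower sub-glottal pressure at a given flow for the widest epi-larynx.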
Figure 6.

Psub (top row) and SPL (bottom row) vs. air flow V̇ for larynges L1 (left column) and L2 (right column) with varying epi-laryngeal tubes. In general, the sub-glottal pressure Psub shifts clearly in response to the changing epi-laryngeal structures. The data display a linear relation between Psub and V̇ for each epi-larynx configuration, especially in the case of larynx L1. In contrast to Psub, the SPL stays relatively stable across the cross-section variations and shows a non-linear behavior.
In the bottom row of Fig. 6 the SPL values are plotted as a function of V̇. The SPL reaches a level of ≈ 73 – 95 dB(A) for L1 and ≈ 91 – 106 dB(A) for L2. The measured data display a behavior that is independent of the epi-larynx configuration, which is especially visible in the ranges of 200 ml/s to 400 ml/s for L1 and 400 ml/s to 600 ml/s for L2. This is emphasized by fitting a logarithmic curve to the data across all epi-larynx configurations. This curve follows the data and displays a trend towards saturation, which is visible for both larynges.
4. Discussion
4.1. General
The effect of different supra-glottal, i.e. epi-laryngeal, structures on vocal fold dynamics and on the primary voice signal was investigated. For this purpose, an artificial vocal tract was mounted upon two human hemi-larynges. The vocal fold of the hemi-larynx was brought to oscillation by insufflating air. The cross-sectional area of the epi-larynx was varied with tube inserts. Parameters such as the sub-glottal pressure Psub and the sound pressure level SPL were measured. Additionally, 3D measurements of vocal fold vibration were obtained. Currently, in most publications, the investigation of the influence of epi-laryngeal structures is limited to the influence of the false vocal folds on airflow (Drechsel and Thomson, 2008; Zañartu et al., 2007) and on vocal fold dynamics (Luo et al., 2008) using numerical models. In contrast, the results presented here were obtained from in-vitro laboratory experiments.
4.2. Vocal Fold Dynamics
The vocal folds of both larynges displayed areas of high velocity in columns three and four. Moreover, both larynges exhibited a decrease in velocity with increasing epi-laryngeal cross-sectional area. The maximum velocities of approximately 0.5 mm/ms for L1 and 1.6 mm/ms for L2 lie within published values (Döllinger et al., 2006, 2005). The maximum amplitudes of L1 (ET1: 2.50 mm, ET3: 2.28 mm) and L2 (ET1: 2.21 mm, ET3: 2.20 mm) lie within the range of values reported by Döllinger et al. (2006), Döllinger and Berry (2006b) and Boessenecker et al. (2007). This could result from the lower pressure drop from the sub- to the supra-glottal area due to the vocal tract, as stated by Triep and Brücker (2010). Consequently, this confirms the findings of Becker et al. (2009).
L1 and L2 show consistent results with respect to the distribution of maximum velocities over the vocal fold surface (see Fig. 2 and Fig. 3). The maxima lie in columns 3 and 4, which are positioned in the medial region of the vocal fold. With respect to amplitude, L1 displays maxima around the sutures R5C2-3. For L2 the amplitude maxima are distributed over the entire surface. This suggests that the vocal fold of L2 does not move as uniformly as the vocal fold of L1. This is consistent with the fact that the extracted values of EEF1 were on average less dominant for L2 than for L1 (see Tables 2 and 3).
In general, at least 60% of the energy was captured by EEF1 and EEF2. On average, around 5% was captured by EEF3. An omission of rows 1, 2 and, if existing, row 6 resulted in an increased percentage of EEF1 (see Table 1), while EEF2 remained nearly stable. This shows that the main vibrational behaviour is still captured by EEF1 and EEF2. However, the same number of trajectories at the same positions should be considered when comparing different vibrations. An analysis of the empirical Eigenfunctions of both larynges showed that EEF1 captured the lateral movement of the vocal folds in opening and closing the glottis. EEF2 added a rotational movement which was out of phase with EEF1, so that the combined movement yielded a wave-like motion on the vocal fold surface, as displayed in Fig. 4. This phenomenon has been observed previously by Döllinger et al. (2005) and Berry et al. (2001).
The maximum velocities of C3 and C4 remain rather stable or drop from ET1 to ET3, while the amplitudes display a rising trend (see Fig. 5) for a constant sub-glottal pressure. At the same time, the frequency of EEF1 remains stable over the varying epi-laryngeal areas at constant pressure (see Table 2 and Table 3). As velocity is the time derivative of displacement, an increase in amplitude would imply an increase in velocity, and vice versa, at constant frequency and unchanged waveform shape. An explanation for the depicted trends is illustrated in Fig. 7, where the amplitudes and velocities of a sinusoidal signal (solid) and of the square root of a sinusoidal signal (dashed) are compared at equal frequency and maximum velocity. The dashed signal has a higher rate of change in the transition and thus displays the higher maximum amplitude. Applied to the vocal folds, the wider waveform would result in a glottis that displays the higher amplitude values (compare Fig. 5 for L1 and L2).
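For a purely sinusoidal displacement, the link between amplitude and velocity invoked above can be written out explicitly. This is a textbook identity added for illustration, not a result derived from the measurements; the symbols $A$ (amplitude), $f$ (frequency) and $v_{\max}$ (maximum velocity) are introduced here for that purpose only.

$$y(t) = A \sin(2\pi f t) \quad\Rightarrow\quad \dot{y}(t) = 2\pi f A \cos(2\pi f t), \qquad v_{\max} = 2\pi f A .$$

At fixed $f$, $v_{\max}$ scales linearly with $A$. A rising amplitude together with a stable or falling maximum velocity at the same frequency therefore requires a waveform that deviates from a pure sinusoid, which is the situation sketched in Fig. 7.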
Figure 7.

Comparison of amplitude and velocity of a sinusoidal signal (solid) and the square root of a sinusoidal signal (dashed). The latter displays the higher maximum amplitude. However, both signals have the same base frequency and maximum velocity. In reference to the experimental data, the solid line represents the vocal fold dynamics with ET1 and the dashed line those with ET3.
4.3. Phonation Parameters
At a constant sub-glottal pressure Psub, the flow rate increases when the epi-larynx area is increased, see the top row of Fig. 6. For a constant cross-sectional area, Psub and V̇ exhibit linear relations. This confirms earlier work by Alipour et al. (1997), where the relation was investigated as a function of human vocal fold adduction. Conversely, if the flow rate V̇ were kept constant, an increase in the cross-section would lead to a distinct drop in the sub-glottal pressure Psub. Here, the applied values for Psub lie above previously reported values by Holmberg et al. (1989), Alipour et al. (1997) and Sidlof et al. (2008). This could be due to the different epi-laryngeal configurations (e.g., cross-section or length of the pharynx) and the aerodynamically less advantageous transition from the hemi-larynx to the vocal tract.
Larynx L2 exhibits a drop with respect to Psub for a certain range of flow, and the logarithmic curve does not fit the measured SPL data as well as for L1. However, these data concur with the observed drop in oscillation amplitude depicted in Fig. 5.
In contrast to the flow V̇ or the sub-glottal pressure Psub, both larynges produced an SPL that was independent of the epi-laryngeal structure, as can be seen in the bottom row of Fig. 6. Assume that V̇ and Psub are input parameters to a voice production model, and that SPL is an output of the model. In the experiments, the air flow V̇ was regulated and Psub reacted to the increasing epi-larynx area similarly to the voltage of an electric circuit with controlled current: it dropped when the impedance of the circuit dropped. This is consistent with a linear relationship between Psub and V̇. However, the resulting SPL clearly illustrates a non-linear relationship between the input parameters (V̇, Psub) and the output parameter (SPL), as suggested by Titze (2008).
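The circuit analogy can be made concrete by treating the ratio Psub / V̇ as a lumped flow resistance, in the same way that voltage divided by current gives an electrical resistance. This lumped quantity is an assumption made here for illustration; it is not a parameter reported in the study. The sketch uses the rows at Psub = 3 kPa from Tables 2 and 3.

```python
# Flow (ml/s) at Psub = 3 kPa, taken from Table 2 (L1) and Table 3 (L2).
flow_at_3kpa = {
    "L1": {"ET1": 346, "ET2": 395, "ET3": 630},
    "L2": {"ET1": 610, "ET2": 563, "ET3": 620},
}
PSUB_KPA = 3.0

# Lumped resistance in kPa*s/ml, analogous to R = U / I.
resistance = {
    larynx: {et: PSUB_KPA / flow for et, flow in flows.items()}
    for larynx, flows in flow_at_3kpa.items()
}

for larynx, values in resistance.items():
    for et, r in values.items():
        print(f"{larynx}/{et}: {1000 * r:.2f} Pa*s/ml")
```

For L1 the lumped resistance falls monotonically from ET1 to ET3. For L2 the widest configuration again has a lower value than the narrowest one, but ET2 lies above both, which fits the higher variability noted for this larynx.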
5. Conclusion
As the results were obtained from only two larynges, they must be considered preliminary. However, both larynges displayed similar behavior as a function of changing epi-laryngeal area. The SPL remained stable as a function of epi-laryngeal cross-sectional area. In contrast, the sub-glottal pressure dropped as the epi-larynx area increased. The behavior of the SPL displayed non-linear characteristics, as previously theorized by Titze (2008). The results confirmed the findings by Zañartu et al. (2007) that the supra-glottal area may exhibit a significant influence on vocal fold dynamics. For therapeutic and clinical application, the findings suggest that the epi-laryngeal configuration should be taken into consideration, since in these experiments the same acoustic intensity was reached with less sub-glottal pressure when the epi-larynx was wider. The organic properties of laryngeal tissues limit the amount of testing that may be performed, as tissue properties tend to change with time due to dehydration and other artefacts. Hence, physical models, e.g. silicone vocal folds, may also be applied in future studies to further investigate the influence of epi-laryngeal structures on vocal fold vibration. However, a comparable investigation of in-vivo vocal fold dynamics, with methods such as those proposed by Luegmair et al. (2010), will yield the most reliable results.
Acknowledgments
This work was supported by the German Research Council (DFG), grant no. FOR894/2 Strömungsphysikalische Grundlagen der Menschlichen Stimmgebung. Dr. Berry's effort was funded by the National Institutes of Health (NIH) grant no. R01 DC03072.
References
- 1. Fant G. Acoustic Theory of Speech Production: With Calculation based on X-Ray Studies of Russian Articulations. 1. Mouton de Gruyter; 1971.
- 2. Titze IR, Story BH. Acoustic interactions of the voice source with the lower vocal tract. J Acoust Soc Am. 1997;101(4):2234–2243. doi: 10.1121/1.418246.
- 3. Titze IR. Nonlinear source–filter coupling in phonation: Theory. J Acoust Soc Am. 2008;123(5):2733–2749. doi: 10.1121/1.2832337.
- 4. Zañartu M, Mongeau L, Wodicka GR. Influence of acoustic loading on an effective single mass model of the vocal folds. J Acoust Soc Am. 2007;121(2):1119–1129. doi: 10.1121/1.2409491.
- 5. Drechsel JS, Thomson SL. Influence of supraglottal structures on the glottal jet exiting a two-layer synthetic, self-oscillating vocal fold model. J Acoust Soc Am. 2008;123(6):4434–4445. doi: 10.1121/1.2897040.
- 6. Becker S, Kniesburges S, Müller S, Delgado A, Link G, Kaltenbacher M, Döllinger M. Flow-structure-acoustic interaction in a human voice model. J Acoust Soc Am. 2009;125(3):1351–1361. doi: 10.1121/1.3068444.
- 7. Triep M, Brücker C. Three-dimensional nature of the glottal jet. J Acoust Soc Am. 2010;127(3):1537–1547. doi: 10.1121/1.3299202.
- 8. Zhang Z, Neubauer J, Berry DA. Influence of vocal fold stiffness and acoustic loading on flow-induced vibration of a single-layer vocal fold model. J Sound Vib. 2009;322(1-2):299–313. doi: 10.1016/j.jsv.2008.11.009.
- 9. Döllinger M, Berry DA, Montequin DW. The influence of epilarynx area on vocal fold dynamics. Otolaryngol Head Neck Surg. 2006;135(5):724–729. doi: 10.1016/j.otohns.2006.04.007.
- 10. Döllinger M, Berry DA. Computation of the three-dimensional medial surface dynamics of the vocal folds. J Biomech. 2006a;39(2):369–374. doi: 10.1016/j.jbiomech.2004.11.026.
- 11. Montequin DA. Developing a methodology to study the effect of the epilarynx tube on phonation threshold pressure and driving pressure. PhD thesis. University of Iowa; Iowa City, IA: 2003.
- 12. Alipour F, Scherer RC, Finnegan E. Pressure-flow relationships during phonation as a function of adduction. J Voice. 1997;11(2):187–194. doi: 10.1016/s0892-1997(97)80077-x.
- 13. Luo H, Mittal R, Zheng X, Bielamowicz SA, Walsh RJ, Hahn JK. An immersed-boundary method for flow-structure interaction in biological systems with application to phonation. J Comp Phys. 2008;227(22):9303–9332. doi: 10.1016/j.jcp.2008.05.001.
- 14. Döllinger M, Berry DA, Berke GS. Medial surface dynamics of an in vivo canine vocal fold during phonation. J Acoust Soc Am. 2005;117(5):3174–3183. doi: 10.1121/1.1871772.
- 15. Döllinger M, Berry DA. Visualization and quantification of the medial surface dynamics of an excised human vocal fold during phonation. J Voice. 2006b;20(3):401–413. doi: 10.1016/j.jvoice.2005.08.003.
- 16. Boessenecker A, Berry DA, Lohscheller J, Eysholdt U, Döellinger M. Mucosal wave properties of a human vocal fold. Acta Acustica united with Acustica. 2007 Sep-Oct;93:815–823(9).
- 17. Berry DA, Montequin DW, Tayama N. High-speed digital imaging of the medial surface of the vocal folds. J Acoust Soc Am. 2001;110(5):2539–2547. doi: 10.1121/1.1408947.
- 18. Holmberg EB, Hillman RE, Perkell JS. Glottal airflow and transglottal air pressure measurements for male and female speakers in low, normal, and high pitch. J Voice. 1989;3(4):294–305. doi: 10.1121/1.396829.
- 19. Šidlof P, Švec JG, Horáček J, Veselý J, Klepáček I, Havlík R. Geometry of human vocal folds and glottal channel for mathematical and biomechanical modeling of voice production. J Biomech. 2008;41(5):985–995. doi: 10.1016/j.jbiomech.2007.12.016.
- 20. Luegmair G, Kniesburges S, Zimmermann M, Sutor A, Eysholdt U, Döllinger M. Optical reconstruction of high-speed surface dynamics in an uncontrollable environment. Medical Imaging, IEEE Transactions on. 2010;29(12):1979–1991. doi: 10.1109/TMI.2010.2055578.
