Abstract
The method of tailored Green’s functions advocated by Doak (Proceedings of the Royal Society A254 (1960) 129 – 145.) for the solution of aeroacoustic problems is used to analyse the contribution of the mucosal wave to self-sustained modulation of air flow through the glottis during the production of voiced speech. The amplitude and phase of the aerodynamic surface force that maintains vocal fold vibration are governed by flow separation from the region of minimum cross-sectional area of the glottis, which moves back and forth along its effective length accompanying the mucosal wave peak. The correct phasing is achieved by asymmetric motion of this peak during the opening and closing phases of the glottis. Limit cycle calculations using experimental data of Berry et al. (Journal of the Acoustical Society of America 110 (2001) 2539 – 2547.) obtained using an excised canine hemilarynx indicate that the mechanism is robust enough to sustain oscillations over a wide range of voicing conditions.
Keywords: Philip Doak, phonation, vocal folds, voiced speech, vortex sound
1. Introduction
Philip Doak [1] was one of the first to propose the use of ‘tailored’ Green’s functions in applications of Lighthill’s [2] theory of aerodynamic sound to problems involving flow sources near solid boundaries. It is now well understood how structural elements comparable to or smaller in size than the characteristic wavelength of the sound can significantly increase the efficiency of sound production, and that this gain is most usefully expressed in terms of a Green’s function that incorporates both the geometrical and mechanical properties of the structure.
Doak later advocated the adoption of the total enthalpy B as the fundamental acoustic variable, instead of Lighthill’s perturbation density (see, e.g. [3]). When this is done a principal source of aerodynamic sound is identified as the vorticity [4 – 6]. More generally, the sources are then confined to those regions where the vorticity ω ≠ 0 and ∇s ≠ 0, where s is the entropy. In the important approximation where the source flow is effectively homentropic (an homogeneous medium with no combustion) Lighthill’s equation becomes
(1) |
where ρ, c, v are the fluid density, sound speed and velocity (with ω = curl v), and where ρ ≡ ρ(p) can be expressed as a function of the pressure p and
(2) |
Bernoulli’s equation implies in the absence of vorticity and moving boundaries that B = constant throughout the flow; and B may therefore be assumed to vanish in the absence of sound. When the Mach number M ~ v/c is small, local mean values of ρ and c differ from their uniform respective values ρo and co by terms of relative order M² ≪ 1. Equation (2) then takes the simplified form
(3) |
Outside the source flow the unsteady motion is entirely irrotational. It can be represented by a velocity potential ϕ(x, t) that satisfies B = −∂ϕ/∂t. In the far field the acoustic pressure p is given by
(4) |
If the mean flow is stationary at infinity p = ρoB, where ρo is the corresponding mean density.
The solutions of aeroacoustic problems in the presence of high speed moving boundaries are frequently obtained numerically by means of an appropriate extension of Kirchhoff’s surface integral representation [7]; in particular, high-speed fan and rotor noise can be expressed in terms of surface terms derived by Ffowcs Williams and Hawkings [8 – 10]. These solutions usually involve the free space Green’s function [6, 9, 11]. However, radiation from surfaces and sources at lower Mach numbers (M ~ 0.4 or less) is often treated more effectively in terms of the vortex sound equation (1), and it is then that the great utility of Doak’s recommended use of a tailored Green’s function becomes apparent.
Equation (1) is self-adjoint [12, 13] and Green’s function G(x, y, t, τ) is an ‘advanced potential’ that satisfies
(5) |
G(x, y, t, τ) is a disturbance that propagates as an ‘incoming’ wave as a function of (y, τ) towards the singularity at y = x on the right of the equation, arriving at τ = t and vanishing thereafter. It turns out to be convenient to require that ∂G(x, y, t, τ)/∂yn = 0 on any stationary or moving boundary S(τ) of the flow, where yn is a local normal coordinate on S directed into the fluid. The usual application of Green’s theorem using equations (1) and (5) then yields [7, 12, 13]
(6) |
where η is the shear coefficient of viscosity, V (τ) is the time dependent domain occupied by the fluid, and where Green’s function and source terms vanish respectively at τ = ±∞. In deriving this result the momentum equation has been used in the form
where only the shear component of viscosity is important close to S.
This equation is applied in this paper to discuss a ‘reduced complexity’ analysis of voiced speech. This is traditionally based on Fant’s equation governing the unsteady flow of air through the glottis [14 – 16]. During voicing a steady flow from the lungs is interrupted by the periodic opening and closing of the glottis by aerodynamically driven vibrations of the vocal folds. The air emerges in a succession of ‘puffs’ of volume velocity Q(t), which determines an equivalent monopole sound source radiating into the supraglottal vocal tract and thence from the mouth to listeners in free space.
The wavelengths of voiced speech mostly exceed the vocal tract diameter, and it is usual to model the supraglottal tract by a uniform, hard-walled, circular cylindrical duct of length L and cross-sectional area A in which sound propagates back and forth as plane waves, and radiates from an open ‘mouth’ (Figure 1a). The subglottal tract is also regarded here as rigid and uniform with cross-section AL < A and terminating in the lung complex, where wave energy is absorbed without reflection, or reflected in accordance with a suitable impedance condition [17 – 21]. Similarly, the glottis is modelled by a ‘necked’ duct of rectangular cross-section and streamwise length ℓg, which opens and closes at a nominal ‘free’ vocal fold frequency fo (~ 125 Hz for an adult male) [22, 23]. This is indicated in the upper part of Figure 2, which shows the sagittal section parallel to the mean flow, where the cross-hatched upper and lower glottis walls correspond to the medial surfaces of the vocal folds. The glottis has constant span ℓs ≫ ℓg (out of the plane of the paper in the figure) and a blade-like jet is formed (according to experiment [24]) by separation a short distance downstream of the point labelled A in Figure 2, where the glottis cross-sectional area takes its minimum value Ag(t). The Strouhal number foℓg/V ~ 0.05 at typical flow speeds V ~ 10 m/s. This is small enough for it to be assumed that the glottal flow passes through a continuous sequence of quasi-static states during a cycle of oscillation.
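The order of magnitude of the Strouhal number can be checked with a few lines of arithmetic. The sketch below assumes the glottal length ℓg = 3 mm listed in Table 1 together with the values of fo and V quoted above; the variable names are illustrative only.

```python
# Order-of-magnitude check of the glottal Strouhal number f_o * l_g / V.
f_o = 125.0    # nominal vocal fold frequency, Hz (quoted in the text)
l_g = 3.0e-3   # glottal length, m (assumed: the value listed in Table 1)
V = 10.0       # typical flow speed, m/s (quoted in the text)

strouhal = f_o * l_g / V
print(f"Strouhal number = {strouhal:.4f}")
```

With these inputs the result is about 0.04, the same order as the value ~ 0.05 quoted above and well below unity, which is what the quasi-static assumption requires.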
Basic reduced order equations for Ag(t) are derived from a ‘one’ or ‘two-mass’ approximation of a vocal fold [25 – 27]. The original single mass model cannot self-excite, however, except in the absence of damping or with acoustic feedback. The two-mass approximation represents the fold as two lumped masses connected by springs and dampers which are adjusted to provide the necessary asymmetry in surface pressure force to promote self-excited oscillations. Avanzini [28] has developed a quasi-one-mass model by assuming that the displacement of the ‘upper’ mass of the conventional two-mass model is proportional to that of the lower mass after a suitable time delay. This permits the two-mass model to be described by a single equation of motion for the lower mass.
The two-mass model is effective because it provides a rudimentary approximation to the influence on the surface force of the mucosal wave. This force depends on the pressure produced by lung contraction, and its modification by flow separation from the vocal folds and jet formation in the glottis outflow. Separation depends on the position within the glottis of its cross-sectional minimum Ag, which is governed by the mucosal wave propagating over the medial surfaces of the folds [29 – 37]. The separation point is sometimes expressed in terms of a ‘flow separation coefficient’ As/Ag, where As is the cross-sectional area downstream of the minimum glottis cross section at which separation occurs. Sidlof et al. [24] concluded from measurements of a model vocal fold that this point remains close to the narrowest cross-section during most of the vibration cycle, but that it can move significantly further downstream just before and after glottal closure. Because of the asymmetry provided by separation, a knowledge of As/Ag constitutes an essential input to numerical models. Thus, Deverge et al. [38] assume As/Ag = 1.2, and values of As/Ag ranging from 1.08 to 1.6 are used in [28, 39 – 42].
In this paper the time-dependent position of the separation point is obtained from measurements of Berry et al. [32] of mucosal waves on an excised canine hemilarynx. These are incorporated into the single mass-spring model of the glottis illustrated in Figure 2. The experiment involved imaging of the vocal fold medial surface along a coronal cross-section of the left vocal fold. Excised fold tissue responds passively to an applied subglottal pressure rise, but it was claimed that many aspects of the intact voice are duplicated in the experiment (chest-like, falsetto-like and fry-like vibrations), including abnormalities observed in voice disorders. Previous measurements on a hemilarynx [43] also indicate that observations over a wide range of driving pressures yield results comparable to the full larynx.
The aeroacoustic theory of voicing for this simple model is discussed in Section 2. The Berry et al. [32] data are sufficiently precise for the time-dependent position of the glottal minimum area Ag to be determined as a two-valued function of the fractional open area Ag/Amax, respectively applicable during the opening and closing phases of the glottis, where Amax is the maximum open area attained during a cycle. This is discussed in Section 3. The result is used in Section 4 to calculate the unsteady normal driving force for use in the single-spring-mass equation of motion of the vocal folds. This equation must be solved simultaneously with the Fant equation governing the glottis volume velocity Q(t). Limit cycle solutions of the equations are then investigated (Section 5), for which the moving separation point provides the necessary phase variation to overcome system damping and maintain vocal fold vibration. The influence on the glottis of the back-reaction of standing waves in the supraglottal tract is discussed in Section 5.
2. Green’s function and the Fant equation
In the vocal tract the Mach number is small and variations in the mean air density can be neglected. We can therefore set ρ = ρo in equation (6), so that
(7) |
where ν = η/ρo is the kinematic viscosity.
2.1 Green’s function
Consider the evaluation of this integral formula at x in Figure 1a, in the supraglottal tract where . Green’s function will be determined in the compact approximation (for characteristic wavelengths ), when only plane waves can propagate. To avoid unnecessary complications let us assume that the upper and lower tracts have the same cross-sectional area A = AL. Use rectangular coordinates x = (x1, x2, x3) with the origin at the centroid of the glottis and with x1 directed along the axis of symmetry. Within and close to the acoustically compact glottis Green’s function reduces to a solution of Laplace’s equation, which can be written
(8) |
where α(τ, x1, t), β(τ, x1, t) do not depend on y and ∇2Y (y, τ) = 0. Y (y, t) can be regarded as a velocity potential of an ideal flow from y1 < 0 to y1 > 0 through the glottis normalised to have unit speed in the positive y1 direction in the upper and lower tracts. In particular it is required that ∂Y/∂yn = 0 on all surfaces, including the instantaneous surface of the glottis, and Y can be normalised such that
(9) |
where ℓ̄= ℓ̄(τ) is a characteristic length ~ (A/Ag)ℓg.
It follows that the functional forms of G ≡ G(x1, y1, t, τ) when y lies within the upper and lower tracts must satisfy
(10) |
In these regions we have (see [44] for further details)
(11a) |
(11b) |
where f(τ − y1/co) is an incoming plane wave from the lungs, ko = ω/co, and
(12) |
in which H(·) is the Heaviside step function. The integrations with respect to ω pass above singularities on the real axis. Go(x1, y1, t, τ) is just the compact Green’s function for x, y in the upper tract when ∂Go/∂y1 = 0 at y1 = 0 (i.e. when flow through the glottis is blocked).
The representation (11b) satisfies ∂G/∂y1 → β as y1 → +0, and G = 0 at the ‘mouth’ y1 = L̂, where L̂ is the internal duct length L increased by an appropriate ‘open-end correction’ [45]. The conditions that G → α − ℓ̄, α respectively as y1 → ∓0, and ∂G/∂y1 → β as y1 → −0 supply the following consistency relations:
(13) |
(14) |
2.2 The Fant equation
It is not possible to simplify further the above equations to obtain an explicit representation of G, which depends implicitly on the time-dependent geometry of the glottis, and therefore on the solution of the acoustic problem! However, this coupling leads to the Fant equation in the guise of a further consistency condition that must be satisfied by the glottis volume velocity Q(t). This is derived by equating the results of two different calculations of the acoustic pressure: (i) in terms of the monopole source Q(t), and (ii) via the aerodynamic sound integral representation (7).
2.2.1 Prediction (i) via the monopole source
To do this we consider the sound field at an arbitrary point x in the upper vocal tract. We first regard the glottis opening into the tract as a monopole source distribution in the wall at x1 = 0, and then apply a special case of (7) using the Green’s function Go of equation (12) for a duct closed at x1 = 0. Only the surface integral on the right hand side of (7) is retained (with the viscous term discarded) to yield for case (i)
(15) |
This result is now expressed in terms of the coefficient β(τ, x1, t) by use of the consistency relation (14) and the third of equations (13):
(16) |
2.2.2 Prediction (ii) via the aerodynamic sound formula
Turning now to the full aeroacoustic representation (7), there are two principal sources corresponding to the two integrals on the right hand side, and we write
(17) |
where Bo and Bσ are the respective contributions from the surface and volume integrals.
Surface motions are significant in two regions. One of these is the wall region within and close to the glottis. The overall monopole contribution from this motion is small, because of tissue incompressibility, and is ignored. Other surface integral contributions from within the glottis can also be ignored [46]; in particular the viscous drag within the glottis is significant only near closure – the Reynolds number typically exceeds 10³ during most of the open phase and detailed calculation [46] indicates that the net contribution of surface friction is uniformly small over a complete cycle of oscillation.
The remaining contribution is furnished by steady contraction of the lungs, which will be assumed to give a net volume flux Qo towards the glottis. The contraction is necessarily equivalent to a compact source centred on x1 ~ −ℓq, say, (Figure 1) so that
(18) |
where use has been made of the second of equations (13).
Next, Green’s function is given by equation (8) when y is in the compact region adjacent to the glottis, and therefore
(19) |
where the volume integral may be regarded as confined to the jet just downstream of the glottis (or just upstream if reverse flow should occur). Although turbulence (vortex) sources exist elsewhere in the vocal tract, they are essentially weak quadrupoles and contribute negligibly to the sound.
Combining the results (18), (19) we conclude that the aggregate aeroacoustic prediction (ii) within the upper tract is simply
(20) |
2.2.3 Fant equation
Predictions (16) and (20) for the sound in the upper tract are evidently equivalent provided
(21) |
This is the Fant equation for the volume velocity Q(t). Its more conventional form is [14 – 16]
(22) |
where ℓ = ℓ(t) ≡ (Ag/A)ℓ̄(t) is the effective length of the slug of fluid within and near the glottis that contributes to the inertia of the unsteady glottal flow [45]. The integrated term on the left represents the influence of the unsteady jet on the glottal flow, and can be evaluated when information is available about the jet vorticity and velocity distributions. The terms in the curly brackets on the right hand side are respectively the overall unsteady pressures just upstream and downstream of the glottis, which force the glottal volume flow over its effective cross-section Ag:
(23) |
An investigation based on equation (22) is often more convenient than full numerical treatments of the structural and compressible motions [41, 47 – 51], which are computationally intensive and frequently cannot be run for more than one or two voicing cycles. They have mostly neglected the back-reaction on the glottal flow of standing acoustic waves in the supra and subglottal tracts, by permitting sound to radiate towards the mouth and lungs without reflections. Back-reactions of this kind are easily incorporated into equation (22) [17, 18, 20, 21, 25, 44, 52 – 55], in our case via the second of equations (23).
2.2.4 Vocal fold equation
Equation (22) must be solved in conjunction with equations governing Ag(t) and ℓ(t). The simplest reduced order equation for Ag(t) is based on the single mass-spring system of the vocal folds indicated in Figure 2 [25]. With the origin at the centroid of the glottis, and x1 parallel to the vocal tract, the x2 and x3 axes may be taken respectively vertically upwards and out of the plane of the paper in Figure 2. The folds are assumed to vibrate symmetrically in the x2 direction, and the displacement ζ(t) of the upper fold (Figure 3) is taken to satisfy the damped-oscillator equation
(24) |
where m is the effective mass of the fold, α ≃ 0.1 is the structural damping ratio, Ω = 2πfo is the radian natural frequency determined by muscular adjustment of the folds, and F(t) is the normal force exerted by the air on the medial surface of the fold. The two folds touch and ‘close’ the glottis when ζ = ζo, say, which corresponds to the centre-line of the glottis, so that ζ > ζo during free motion. If ζo > 0 (as in Figure 3) and the folds are in their undisturbed rest positions, they are pressed tightly together with equal and opposite forces of magnitude mΩ²ζo. When ζo < 0 the glottis remains open in the rest position, and the folds are then separated by a distance 2|ζo|.
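The behaviour of a damped oscillator of this kind can be illustrated numerically. The sketch below assumes the standard form m(ζ″ + 2αΩζ′ + Ω²ζ) = F(t), which is an assumption about the structure of equation (24) and not a transcription of it. The constant force `F0` is a hypothetical value chosen only for illustration, vocal fold contact is ignored, and the function names are not taken from the paper.

```python
import math

def rk4_step(f, t, y, dt):
    """One classical fourth-order Runge-Kutta step for the system y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

m = 0.5e-4                      # mass of one vocal fold, kg (Table 1)
alpha = 0.1                     # structural damping ratio (quoted in the text)
Omega = 2 * math.pi * 125.0     # radian natural frequency for f_o = 125 Hz
F0 = 1.0e-3                     # constant normal force, N (hypothetical value)

def rhs(t, y):
    """Assumed oscillator form: m*(z'' + 2*alpha*Omega*z' + Omega**2*z) = F0."""
    zeta, velocity = y
    accel = F0 / m - 2 * alpha * Omega * velocity - Omega ** 2 * zeta
    return [velocity, accel]

def integrate(t_end, dt=1.0e-5):
    """Integrate from rest (zeta = 0, zeta' = 0) up to time t_end."""
    t, y = 0.0, [0.0, 0.0]
    for _ in range(round(t_end / dt)):
        y = rk4_step(rhs, t, y, dt)
        t += dt
    return y

zeta_final, velocity_final = integrate(0.2)
zeta_static = F0 / (m * Omega ** 2)   # static displacement under constant F0
print(zeta_final, zeta_static)
```

Under a constant force the displacement rings at the natural frequency and settles to the static value F0/(mΩ²) at a rate set by αΩ. Sustained oscillation therefore requires a force F(t) whose phase relative to the motion supplies energy over each cycle.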
The driving force F(t) is governed by the pressure produced by lung contraction, modified by flow separation from the vocal folds and jet formation in the outflow. These events depend on the position within the glottis of the cross-sectional minimum Ag, which is influenced by the mucosal waves propagating over the medial surfaces of the folds [29 – 37]. In the case of the single mass-spring system of Figure 2, the position of Ag coincides with that of the maximum mucosal wave amplitude. Interactions with the opposite fold during collision also affect the volume velocity Q(t), and certain voice disorders have been identified with vocal fold abnormalities (nodules, polyps, etc) that can disrupt normal mucosal wave propagation [56].
3. Mucosal wave data
Berry et al. [32] measured fleshpoint motions on an excised canine hemilarynx using nine microsutures placed along the medial surface of a mid-coronal section of the left vocal fold, spaced approximately 1 mm apart. Fleshpoint motions were primarily confined to the static mid-coronal plane; the out-of-plane vibrations (parallel to the flow direction in Figure 2) were observed to be an order of magnitude smaller. The experiments were performed in the absence of a supraglottal tract over a range of subglottal pressures pI, up to a maximum of about 1600 Pa.
A periodic, chest-like vibration pattern was observed at pI ~ 800 Pa with fundamental frequency ~102 Hz. Data for this case displayed in Figure 8(a) of [32] have been digitised to provide an empirical relation between the glottis fractional minimum cross-section Ag/Amax and its distance ℓ2 from the downstream end of the glottis. For the simplified glottis of Figure 2, the overall glottal length ℓg is defined to be the observed value of ℓ2 at which Ag attains its maximum Amax.
The digitised data have been smoothed and ℓ2/ℓg plotted in Figure 4 as a two-valued function of Ag/Amax over a complete cycle, the dependence on Ag being different during the opening and closing phases. In the opening phase ℓ2 remains at or very close to the downstream end of the glottis until Ag exceeds about 80% of Amax, after which it recedes rapidly to the subglottal end. The variation when closing is relatively uniform and the folds exhibit a sharp, whip-like motion reminiscent of the chest or modal register [32]. The corresponding changes in glottal geometry at equal time intervals over a cycle are illustrated in Figure 5. The profiles are drawn by assuming that the nominally flat medial surface is displaced a distance y inwards, towards the glottal axis, according to the formula
(25) |
where xw = (distance/ℓg) along the x1 axis from the position ℓ2 of minimum glottis cross-section, h is the nominal peak amplitude of the mucosal wave (see Figure 3) and ℓw = 0.1. Thus, 2h is the mean glottis width in the upstream section in the fully closed state of Figure 5a. Although this width is arbitrary, its actual value has little or no significance for the calculations to be discussed below. The sequence of profiles in Figure 5 is similar to Figure 2.1 of Stevens [16], except that in Stevens’s closed state, corresponding to our Figure 5a, the region of contact of the vocal folds extends a distance of about 0.2ℓg from the supraglottal end of the glottis.
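The idea of a two-valued dependence of ℓ2/ℓg on Ag/Amax, with the branch selected by whether the glottis is opening or closing, can be sketched as follows. The piecewise-linear shapes used here are toy curves chosen only to mimic the qualitative description above (the 80% threshold on opening, a roughly uniform variation on closing). They are not the digitised data of [32], and the function name is hypothetical.

```python
def separation_position(area_fraction, opening):
    """Return an illustrative l2/l_g for a given Ag/Amax.

    area_fraction : Ag/Amax, clipped to the range [0, 1]
    opening       : True on the opening branch, False on the closing branch
    The curves are toy shapes, not measured data.
    """
    a = min(max(area_fraction, 0.0), 1.0)
    if opening:
        # Minimum section stays at the downstream end (l2 = 0) until
        # Ag reaches about 80% of Amax, then recedes rapidly to l2 = l_g.
        if a <= 0.8:
            return 0.0
        return (a - 0.8) / 0.2
    # Closing branch: relatively uniform variation, modelled here as linear.
    return a

for frac in (0.2, 0.5, 0.9, 1.0):
    print(frac, separation_position(frac, True), separation_position(frac, False))
```

In a time-stepping calculation the branch would be selected from the sign of dAg/dt, so that the same open area maps to different separation positions on the two halves of the cycle.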
4. Equations of motion in the quasi-static approximation
4.1 The Q-equation
The Berry et al. [32] data are applied to the simplified case of Figure 1 where the subglottal and supraglottal tracts are hard-walled and have equal uniform cross-sectional areas A ≫ Amax. Steady contraction of the lung cavity is assumed to be equivalent to a uniform driving pressure pI = ρocoQo/A in the subglottal region, resulting in a volume flux Q(t) through the glottis and the propagation of plane sound waves into x ≷ 0 on either side of the glottis.
The unsteady pressure p1 just upstream of the glottis (where fo|x1|/co ≪ 1) is
(26) |
When the Strouhal number foℓg/V ≪ 1, the flow within and near the glottis is quasi-static and the steady form of Bernoulli’s equation is applicable in local regions of irrotational flow. It is assumed that flow separation just downstream of the minimum glottis section (A in Figure 2) produces a well defined jet of volume velocity Q(t) bounded by idealised free streamlines (the Reynolds number based on jet thickness ~ 10³ typically). The unsteady pressure within the jet and in the glottis to the left of A is therefore given by
(27) |
where V (x, t) is the local flow speed. In the downstream region, outside the jet, the pressure is equal to p2.
The overall internal duct length L (~ 17 cm for an adult male) will be increased to L̄ = 20 cm (Table 1) to account for the acoustic end-correction at the mouth [45]. In the absence of dissipation in the duct and when radiation losses from the mouth are ignored, the pressure p2 is given in terms of Q by the second of equations (23). The integration path along the real ω axis passes above simple poles of the integrand at , (−∞ < n < ∞).
Table 1.
Parameter | Value |
---|---|
glottal length ℓg | 3 mm |
glottal span ℓs | 10 mm |
mucosal wave amplitude h | 1 mm |
mass of one vocal fold m | 0.5 × 10⁻⁴ kg |
nominal vibration frequency fo = Ω/2π | 125 Hz |
subglottal and supraglottal cross-sections A | 100π mm² |
perimeter of the supraglottal tract ℓp | 20π mm |
effective length of the supraglottal tract L̄ | 20 cm |
density of air ρo | 1.23 kg/m³ |
speed of sound co | 359 m/s |
In practice, however, dissipation at the walls of the supraglottal tract and radiation losses from the mouth can be important. For the hard-wall, mechanical model of Figure 1a thermo-viscous damping occurs in the duct wall boundary layers, and usually exceeds the radiation damping at lower frequencies. When these effects are taken into account the resonance frequencies become complex and are given to a first approximation by [57]
(28) |
in which ℓp is the perimeter of the upper tract, and ν ≃ 1.5 × 10⁻⁵ m²/s, χ ≃ 2 × 10⁻⁵ m²/s are respectively the kinematic viscosity and thermometric conductivity of the air in the upper tract. Dissipation produced by flexing of wall-tissue by standing waves in the upper tract is actually much larger [16], but the simple model (28) will suffice to illustrate the influence of dissipation on standing waves in the upper tract.
When this correction is incorporated into the second of equations (23) we find
(29a) |
(29b) |
The results (26) and (29) may now be substituted into the Fant equation (22).
In the quasi-static approximation the vorticity integral in equation (22) is given to first order by [44, 58]
(30) |
which is applicable also in the case of reversed flow (towards the lungs) through the glottis. This quadratic term represents the influence of the unsteady jet on the glottal motion, and this quasi-static representation is applicable provided is small, where Uσ = Q/σAg is the free jet speed and σ is the jet contraction ratio. Jet instability in the downstream region can be ignored because the integrand is only significantly different from zero close to the glottis. The contraction ratio σ ≃ 0.61 for a simple circular or rectangular aperture in a thin wall, where the streamlines turn through about 90° at the aperture edge [13], but it will be much larger for a confined jet where separation takes place just downstream of the narrowest section Ag. Experiment [59] and ideal analytical modelling [23] suggest that a value closer to σ ≃ 1 is probably more appropriate, and we shall use this in what follows.
Finally, because the glottal region is compact, we can neglect ℓ̄∂Q/∂t in (22) compared to coQ.
Thus, collecting together the results (26), (29a) and (30), the Fant equation (22) is reduced to quasi-static form
(31) |
4.2 The surface force F(t)
The net normal force F(t) applied by the airflow to the vocal fold of Figure 3 (in the direction of increasing ζ) consists of two components produced by the air pressure acting on the sections of the glottis wall upstream and downstream of the separation point, approximately of respective lengths ℓ1 and ℓ2. The pressure on the downstream section is equal to p2. Except in the immediate vicinity of the mucosal wave peak, Bernoulli’s equation (27) gives for pu, the upstream wall pressure,
(32) |
where Ac ≡ Ag + 2ℓsh is the uniform cross-sectional area of the glottis upstream of the minimum cross-sectional area. These approximations neglect small edge-transitional modifications of the pressure discussed by McGowan & Howe [60].
Using equations (26), (29a) and (32), and eliminating Q|Q| by means of the Fant equation (31), we find
(33) |
4.3 The vocal fold equation of motion
The substitution
(34) |
permits the vocal fold equation (24) to be cast in the form
(35) |
The net force on the right hand side must be positive to open the glottis during a period of closure. Solutions of the simultaneous equations (31) and (35) are obtained below for the standard set of parameter values in Table 1. The damping ratio α is believed to be about 0.1 [25, 61], although no precise data are available.
The vocal fold motion starts at t = 0 by application of a sufficiently large and constant lung overpressure pI, with the initial conditions Ag = 0, dAg/dt = 0. Two simple alternative procedures can be applied to model inelastic and elastic vocal fold collisions [25]. In the inelastic case each fold is reduced to rest on contact, and the motion is re-started with the initial values Ag = 0, dAg/dt = 0 by the pressure force pI. During elastic impact Ag remains zero for a finite time dependent on tissue resilience. This is modelled by formally permitting Ag to become negative subject to the modified equation of motion
(36) |
The contact motion will be assumed to be critically damped by taking the damping ratio ᾱ = 1. When predictions are plotted, however, the glottis area Ag(t) is set to zero during those time intervals where the equations imply that Ag < 0, i.e. we plot Ag = 0 for times when the vocal folds are in contact.
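The plotting convention for the elastic-impact case can be sketched as a small post-processing step. The helper names and the sample values below are hypothetical; the only behaviour taken from the text is that a formally negative Ag is displayed as zero and interpreted as vocal fold contact.

```python
def plotted_area(area_samples):
    """Replace formally negative glottal areas by zero for display."""
    return [max(a, 0.0) for a in area_samples]

def contact_intervals(times, area_samples):
    """Return (start, end) pairs bracketing runs of negative area samples.

    start is the time of the first negative sample in a run; end is the time
    of the first non-negative sample after it (or the last time in the record
    if the run is still in progress there).
    """
    intervals = []
    start = None
    for t, a in zip(times, area_samples):
        if a < 0.0 and start is None:
            start = t
        elif a >= 0.0 and start is not None:
            intervals.append((start, t))
            start = None
    if start is not None:
        intervals.append((start, times[-1]))
    return intervals

# Hypothetical samples of the formally computed area (arbitrary units).
times = [0, 1, 2, 3, 4, 5]
areas = [0.2, -0.1, -0.3, 0.1, 0.4, -0.2]
print(plotted_area(areas))
print(contact_intervals(times, areas))
```

The clipping affects only what is displayed; in the elastic-impact procedure the integration itself continues to follow the formally negative Ag through the modified contact equation.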
To determine Q(t) and Ag equations (31), (33), (35), (36) are augmented by equations for Zn(t), n = 1, 2, ···. These are obtained from equation (29b), which implies that Zn is also given in terms of Q(t) by the equations
(37) |
where . In practice the component of the back-pressure p2 determined by the modal coefficient Zn(t) will be small when the glottal radian frequency Ω ≪ ωn. This is satisfied when n > 2 for normal voicing, which indicates that the infinite system (37) can safely be truncated at, say, n = 5. The validity of this approximation has been verified by numerical tests. The whole set of governing equations can then be solved by Runge-Kutta integration, subject to the additional initial conditions Zn = 0, at t = 0.
5. Numerical results
Typical numerical predictions when the separation point moves in accordance with the Berry et al. [32] measurements are depicted in Figure 6. The variations of Ag(t)/A, Q(t)/Qo, are plotted during the initial time period 0 < fot < 5, where Qo = pI A/ρoco is the mean volume velocity from the lungs, corresponding to a subglottal pressure of amplitude pI. The variation of Q(t) determines the acoustic pressure radiated towards the mouth (prior to reflection and transmission). The motion is started at t = 0 with Ag = 0, dAg/dt = 0 with the folds just touching (ζo = 0) by a constant applied subglottal pressure pI. The damping ratio α = 0.1, and critical damping ᾱ = 1 is assumed in equation (36) during vocal fold impact. Other parameter values are given in Table 1.
The surface force F(t) falls almost to zero when the glottis is fully open, when the separation point is at the upstream end of the glottis, because the whole of this force is then furnished by the back pressure p2 (equation (29a)), which is very small when fo ≪ f1 = ω1/2π ~ 450 Hz. This is favourable to the maintenance of limit cycle oscillations. According to Figure 6 the asymmetry of F(t) relative to the instant at which Ag = Amax (produced by the corresponding asymmetric motion of the separation point) ensures that the surface force is always larger during opening than at the corresponding point during the closing phase of the glottis, conducive to the supply of energy to overcome vocal fold damping. The waveforms plotted in Figure 6a are for a mean subglottal pressure pI = 800 Pa, for which the peak of the limit cycle volume velocity wave profile is skewed slightly to the latter half of the cycle. The effective limit cycle fundamental frequency is about 40% larger than fo.
Figure 6b indicates how these conclusions are changed when the subglottal pressure is increased to 1600 Pa accompanied by the application of an adduction force Fa, say, to the vocal folds, causing them to be tightly pressed together in the rest position. This force corresponds to the second term (Ω²ζo) in the braces on the right of equation (35), where the parameter ζo is taken to be 0.5 mm. Because in ‘pressed voicing’ the folds are unable to execute a full cycle of the ‘free’ equation of motion (35), the effective frequency of the limit cycle is increased, in the present case by about 60% of fo.
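The normalising volume velocity Qo = pI A/ρoco can be evaluated directly from the Table 1 values. The short sketch below does this for the two subglottal pressures considered in this section; the function name is illustrative only.

```python
import math

rho_o = 1.23                  # density of air, kg/m^3 (Table 1)
c_o = 359.0                   # speed of sound, m/s (Table 1)
A = 100.0 * math.pi * 1.0e-6  # tract cross-section, m^2 (100*pi mm^2, Table 1)

def mean_volume_velocity(p_I):
    """Q_o = p_I * A / (rho_o * c_o), in m^3/s, for a pressure p_I in Pa."""
    return p_I * A / (rho_o * c_o)

for p_I in (800.0, 1600.0):
    Q_o = mean_volume_velocity(p_I)
    print(f"p_I = {p_I:6.0f} Pa  ->  Q_o = {Q_o:.3e} m^3/s")
```

For pI = 800 Pa this gives Qo of roughly 5.7 × 10⁻⁴ m³/s, and the value doubles at 1600 Pa because the relation is linear in pI.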
5.1 The ‘exposed’ glottis
Insight into the influence on the glottal flow of standing acoustic waves in the upper tract is obtained by consideration of the configuration of Figure 1b, where the glottis radiates directly into free space. The pressure p2 downstream of the glottis now vanishes to a good approximation.
The calculations of Figure 6 have been repeated for an exposed glottis (with AL = A of Table 1) and the results are displayed in Figure 7. The force and volume velocity waveforms are now much smoother, with F(t) vanishing when the glottis is fully open. The predicted volume velocity profiles Q/Qo are perfectly symmetric, and exhibit marginally increased oscillation frequencies and reduced maximum amplitudes relative to the corresponding predictions in Figure 6 in the presence of the upper tract. Upper tract back-reaction therefore appears to be responsible for the asymmetry in the volume velocity waveforms.
5.2 Resonant back-reaction
The broadly similar predictions for corresponding cases in Figures 6 and 7 suggest that the experimental data of [32] in Figure 4, measured using an excised canine hemilarynx (with the supraglottal tract removed), probably provide a good model also in the presence of the upper tract, at least when the back-reaction of standing waves is weak (when fo ≪ f1). But the data can also be used to investigate the back-reaction when fo is close to an upper tract resonance frequency, when the motion in the glottis could be strongly influenced by standing acoustic waves. This is known to be important for an exposed glottis when fo is close to a subglottal resonance [17 – 21]. According to Joliveau et al. [62] tuning of the first upper tract resonance frequency f1 to fo can be used by singers to attain increased radiated power, and Titze [30] has argued that vocal fold vibrations are enhanced when fo just exceeds f1.
To examine this effect in terms of the mechanical vocal tract of Figure 1a, the length L̄ of the upper tract must be artificially changed. Figure 8 depicts typical predictions when the unforced vocal fold resonance frequency fo = 250 Hz and the upper tract is adjusted to make f1 = 246.6 Hz (L̄ = 36.4 cm), and when the subglottal driving pressure pI = 800 Pa. No allowance is made for a possible corresponding reduction in the effective dynamic mass of the vocal folds. Equations (37) are truncated at n = 8 to ensure proper convergence.
In Figure 8 the limit cycle oscillations of the vocal folds are at frequency ~ f1. The glottis ejects two distinct puffs per cycle into the upper tract separated by a short interval of ‘silence’, during which the surface force F(t) rises to its maximum value ℓsℓgpI normally attained only when the glottis is closed. When boundary layer damping in the upper tract is ignored (Figure 8a), Q/Qo rises to a maximum of 0.165 when Ag has opened to about 0.5Amax, after which it falls to zero, and remains zero until Ag starts to decrease. During this time the supraglottal back pressure p2 ~ pI is large enough to completely block the flow, even though the glottis is fully open (cf. Lighthill [63], Section 2.5; [44]). A second larger volume velocity peak of Q/Qo ~ 0.207 occurs when Ag decreases to about 0.55Amax, at which time the surface force F(t) falls to a positive minimum. The overall profile of the volume velocity is therefore periodic with frequency fo, but with a substantial second harmonic component. The principal effect of upper tract damping (Figure 8b) is to replace the flow blockage near Ag ~ Amax with a deep, positive minimum of the volume velocity Q. The depth of this minimum will in practice be further reduced when more realistic damping associated with vocal tract wall flexing is taken into account.
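The statement that a double-puff profile carries a substantial second harmonic can be checked on a synthetic waveform. The sketch below builds one unit period from two raised-cosine pulses whose peak values (0.165 and 0.207) are those quoted above for Figure 8a; the pulse shape, width and timing are assumptions made purely for illustration, and the function names are hypothetical. Harmonic amplitudes are obtained from a direct discrete Fourier sum.

```python
import cmath
import math


def puff(t, centre, width, peak):
    """Raised-cosine pulse with the given peak, zero outside its width."""
    tau = t - centre
    if abs(tau) >= 0.5 * width:
        return 0.0
    return peak * math.cos(math.pi * tau / width) ** 2


def double_puff(t):
    # Peaks are taken from the text; centres (0.30, 0.70) and width 0.25 are assumed.
    return puff(t, 0.30, 0.25, 0.165) + puff(t, 0.70, 0.25, 0.207)


def harmonic_amplitude(signal, n, samples=1024):
    """|c_n| of a unit-period signal from a direct discrete Fourier sum."""
    total = 0j
    for k in range(samples):
        t = k / samples
        total += signal(t) * cmath.exp(-2j * math.pi * n * t)
    return abs(total) / samples


first = harmonic_amplitude(double_puff, 1)
second = harmonic_amplitude(double_puff, 2)
print(f"|c1| = {first:.4f}, |c2| = {second:.4f}, ratio = {second / first:.2f}")
```

With these assumed timings the second harmonic exceeds the fundamental, because two pulses of comparable height separated by roughly half a period nearly cancel at the fundamental and reinforce at twice that frequency. The fundamental does not vanish entirely because the two peaks are unequal.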
For the simple mechanical model of Figure 1a, these ‘double puff’ volume velocity profiles are evident in numerical predictions only when the glottis frequency fo is close to an upper tract resonance frequency. However, effects of this kind are believed to occur in the human vocal tract over much wider frequency bands centred on upper tract resonance frequencies [64], because of the diffusive influence of the relatively large contribution to overall damping produced by tissue-flexure in the tract walls.
6. Conclusion
The acoustic source responsible for voiced speech involves the periodic release of high pressure air from the lungs by the opening and closing of the glottis. Air emerges in a rapid succession of puffs of unsteady volume velocity that is usually identified as a monopole source radiating into the supraglottal tract. But this is really a nonlinear aeroacoustic problem whose solution by the tailored Green’s function method propounded by Doak [1] is formally intractable, because the time history of Green’s function depends on the solution of the direct acoustic problem. Green’s function and the glottis volume velocity Q(t) are coupled by a consistency equation, i.e. by the Fant equation, usually derived by heuristic argument, but shown here to be a rigorous consequence of Lighthill’s theory of aerodynamic sound.
The mucosal wave propagates back and forth over the medial surfaces of the separated vocal folds, and is responsible for the amplitude and phase modulation of the surface force. The state of the wave determines the position within the glottis of the minimum of the glottis cross-sectional area (Ag). Flow separation from the wall near this moving minimum is responsible for the desired phasing of the force, achieved principally because of asymmetric motion of Ag during opening and closing of the glottis. This is the significance of the experimental data in Figure 4, obtained from measurements on an excised canine hemilarynx [32]. Our calculations using these data for a simplified vocal tract model indicate that the mechanism is robust enough to maintain limit cycle oscillations over a wide range of voicing conditions. The characteristic locus of Ag displayed in Figure 4 is likely to be typical of most voicing cycles involving an interval of closure of the glottis, but confirmation of this awaits the availability of in vivo data.
Acknowledgments
This work was supported by a subaward of grant No. R01 DC009229 from the National Institute on Deafness and Other Communication Disorders to the University of California, Los Angeles.
Contributor Information
M. S. Howe, Boston University, College of Engineering, 110 Cummington Mall, Boston MA 02215.
R. S. McGowan, CReSS LLC, 1 Seaborn Place, Lexington MA 02420.
References
- 1. Doak PE. Acoustic radiation from a turbulent fluid containing foreign bodies. Proceedings of the Royal Society. 1960;A254:129–145.
- 2. Lighthill MJ. On sound generated aerodynamically. Part I: General theory. Proceedings of the Royal Society. 1952;A211:564–587.
- 3. Doak PE. Momentum potential theory of energy flux carried by momentum fluctuations. Journal of Sound and Vibration. 1989;131:67–90.
- 4. Powell A. Theory of vortex sound. Journal of the Acoustical Society of America. 1964;36:177–195.
- 5. Howe MS. Contributions to the theory of aerodynamic sound with application to excess jet noise and the theory of the flute. Journal of Fluid Mechanics. 1975;71:625–673.
- 6. Howe MS. Theory of Vortex Sound. Cambridge University Press; 2003.
- 7. Baker BB, Copson ET. The Mathematical Theory of Huygens’ Principle. 2nd ed. Oxford University Press; 1969.
- 8. Ffowcs Williams JE, Hawkings DL. Sound generation by turbulence and surfaces in arbitrary motion. Philosophical Transactions of the Royal Society. 1969;A264:321–342.
- 9. Crighton DG, Dowling AP, Ffowcs Williams JE, Heckl M, Leppington FG. Modern Methods in Analytical Acoustics (Lecture Notes). London: Springer-Verlag; 1992.
- 10. Brentner KS, Farassat F. Helicopter noise prediction: the current status and future direction. Journal of Sound and Vibration. 1994;170:79–96.
- 11. Doak PE. An introduction to sound radiation and its sources. Vol. 154. University of Southampton, Institute of Sound and Vibration Research Memorandum; 1966.
- 12. Möhring W. Modelling low Mach number noise. In: Müller E-A, editor. Mechanics of Sound Generation in Flows. Berlin: Springer-Verlag; 1980. pp. 85–96.
- 13. Howe MS. Acoustics of Fluid-Structure Interactions. Cambridge University Press; 1998.
- 14. Fant G. Acoustic Theory of Speech Production. The Hague: Mouton; 1960.
- 15. Flanagan JL. Speech Analysis Synthesis and Perception. 2nd ed. New York: Springer-Verlag; 1972.
- 16. Stevens KN. Acoustic phonetics. Cambridge, MA: MIT Press; 1998.
- 17. Titze IR, Story BH. Acoustic interactions of the voice source with the lower vocal tract. Journal of the Acoustical Society of America. 1997;101:2234–2243. doi: 10.1121/1.418246.
- 18. Austin SF, Titze IR. The effect of subglottal resonance upon the vocal fold vibration. Journal of Voice. 1997;11:391–402. doi: 10.1016/s0892-1997(97)80034-3.
- 19. Zhang Z, Neubauer J, Berry DA. The influence of subglottal acoustics on laboratory models of phonation. Journal of the Acoustical Society of America. 2006;120:1558–1569. doi: 10.1121/1.2225682.
- 20. Howe MS, McGowan RS. Analysis of flow-structure coupling in a mechanical model of the vocal folds and the subglottal system. Journal of Fluids and Structures. 2009;25:1299–1317. doi: 10.1016/j.jfluidstructs.2009.08.002.
- 21. McGowan RS, Howe MS. Source-tract interaction with prescribed vocal fold motion. Journal of the Acoustical Society of America. 2012;131:2999–3016. doi: 10.1121/1.3685824.
- 22. Barney A, Shadle CH, Davies POAL. Fluid flow in a dynamic mechanical model of the vocal folds and tract. I. Measurement and theory. Journal of the Acoustical Society of America. 1999;105:444–455.
- 23. Howe MS, McGowan RS. On the single-mass model of the vocal folds. Fluid Dynamics Research. 2010;42:015001. doi: 10.1088/0169-5983/42/1/015001.
- 24. Sidlof P, Doare O, Cadot O, Chaigne A. Measurement of flow separation in a human vocal folds model. Experiments in Fluids. 2011;51:123–136.
- 25. Flanagan JL, Landgraf LL. Self-oscillating sources for vocal-tract synthesizer. IEEE Transactions Audio and Electroacoustics. 1968;AU-16:57–64.
- 26. Ishizaka K, Flanagan JL. Synthesis of voiced sounds from a two-mass model of the vocal cords. Bell System Technical Journal. 1972;51:1233–1267.
- 27. Pelorson X, Hirschberg A, van Hassel RR, Wijnands APJ. Theoretical and experimental study of quasi-steady flow separation within the glottis during phonation. Application to a modified two-mass model. Journal of the Acoustical Society of America. 1994;96:3416–3431.
- 28. Avanzini F. Simulation of vocal fold oscillation with a pseudo-one-mass physical model. Speech Communication. 2008;50:95–108.
- 29. Timcke R, Von Leden H, Moore P. Laryngeal vibrations: measurement of the glottic wave. Part 1. The normal vibration cycle. Archives of Otolaryngology. 1958;68:1–9. doi: 10.1001/archotol.1958.00730020005001.
- 30. Titze IR. The physics of small-amplitude oscillation of the vocal folds. Journal of the Acoustical Society of America. 1988;83:1536–1552. doi: 10.1121/1.395910.
- 31. Berke GS, Gerratt BR. Laryngeal biomechanics: an overview of mucosal wave mechanics. Journal of Voice. 1993;7:123–128. doi: 10.1016/s0892-1997(05)80341-8.
- 32. Berry DA, Montequin DW, Tayama N. High-speed digital imaging of the medial surface of the vocal folds. Journal of the Acoustical Society of America. 2001;110:2539–2547. doi: 10.1121/1.1408947.
- 33. Döllinger M, Berry DA, Berke GS. Medial surface dynamics of an in vivo canine vocal fold during phonation. Journal of the Acoustical Society of America. 2005;117:3174–3183. doi: 10.1121/1.1871772.
- 34. Zhang Z. Influence of flow separation location on phonation onset. Journal of the Acoustical Society of America. 2008;124:1689–1694. doi: 10.1121/1.2957938.
- 35. Zhang Z. Characteristics of phonation onset in a two-layer vocal fold model. Journal of the Acoustical Society of America. 2009;125:1091–1102. doi: 10.1121/1.3050285.
- 36. Pickup BA, Thomson SL. Identification of geometric parameters influencing the flow-induced vibration of a two-layer self-oscillating computational vocal fold model. Journal of the Acoustical Society of America. 2011;129:2121–2132. doi: 10.1121/1.3557046.
- 37. Lucero JC, Koenig LL, Lourenco KG. A lumped mucosal wave model of the vocal folds revisited: Recent extensions and oscillation hysteresis. Journal of the Acoustical Society of America. 2011;129:1568–1579. doi: 10.1121/1.3531805.
- 38. Deverge M, Pelorson X, Vilain C, Lagree P-Y, Chentouf F, Willems J, Hirschberg A. Influence of collision on the flow through in-vitro rigid models of the vocal folds. Journal of the Acoustical Society of America. 2003;114:3354–3362. doi: 10.1121/1.1625933.
- 39. Decker GZ, Thomson SL. Computational simulations of vocal fold vibrations: Bernoulli versus Navier-Stokes. Journal of Voice. 2007;21:273–284. doi: 10.1016/j.jvoice.2005.12.002.
- 40. Cisonni J, Van Hirtum A, Pelorson X, Willems J. Theoretical simulation and experimental validation of inverse quasi-one-dimensional steady and unsteady glottal flow models. Journal of the Acoustical Society of America. 2008;124:535–545. doi: 10.1121/1.2931959.
- 41. Hofmans GCJ, Groot G, Ranucci M, Graziani G, Hirschberg A. Unsteady flow through in-vitro models of the glottis. Journal of the Acoustical Society of America. 2003;113:1658–1675. doi: 10.1121/1.1547459.
- 42. Lous NJC, Hofmans GCJ, Veldhuis RNJ, Hirschberg AA. A symmetrical two-mass vocal-fold model coupled to vocal tract trachea with application to prosthesis design. Acta Acustica (united with Acustica). 1998;84:1135–1150.
- 43. Jiang JJ, Titze IR. A methodological study of hemilaryngeal phonation. Laryngoscope. 1993;103:872–882. doi: 10.1288/00005537-199308000-00008.
- 44. Howe MS, McGowan RS. Production of sound by unsteady throttling of flow into a resonant cavity with application to voiced speech. Journal of Fluid Mechanics. 2011;672:428–450. doi: 10.1017/S0022112010006117.
- 45. Rayleigh Lord. Theory of Sound. Vol. 2. New York: Dover; 1945.
- 46. Howe MS, McGowan RS. On the role of glottis-interior sources in the production of voiced sound. Journal of the Acoustical Society of America. 2012;131:1391–1400. doi: 10.1121/1.3672655.
- 47. Zhao W, Zhang C, Frankel SH, Mongeau L. Computational aeroacoustics of phonation Part I: Computational methods and sound generation mechanisms. Journal of the Acoustical Society of America. 2002;112:2134–2146. doi: 10.1121/1.1506693.
- 48. Zhang C, Zhao W, Frankel SH, Mongeau L. Computational aeroacoustics of phonation Part II: Effects of flow parameters and ventricular folds. Journal of the Acoustical Society of America. 2002;112:2147–2154. doi: 10.1121/1.1506694.
- 49. Thomson SL, Mongeau L, Frankel SH. Aerodynamic transfer of energy to the vocal folds. Journal of the Acoustical Society of America. 2005;118:1689–1700. doi: 10.1121/1.2000787.
- 50. Duncan C, Zhai G, Scherer R. Modeling coupled aerodynamics and vocal fold dynamics using immersed boundary methods. Journal of the Acoustical Society of America. 2006;120:2859–2871. doi: 10.1121/1.2354069.
- 51. Zheng X, Mittal R, Xue Q, Bielamowicz S. Direct-numerical simulation of the glottal jet and vocal-fold dynamics in a three-dimensional laryngeal model. Journal of the Acoustical Society of America. 2011;130:404–415. doi: 10.1121/1.3592216.
- 52. Gupta V, Wilson TA, Beavers GS. A model for vocal cord excitation. Journal of the Acoustical Society of America. 1973;54:1607–1617. doi: 10.1121/1.1914457.
- 53. Zanartu M, Mongeau L, Wodicka GR. Influence of acoustic loading on an effective single mass model of the vocal folds. Journal of the Acoustical Society of America. 2007;121:1119–1129. doi: 10.1121/1.2409491.
- 54. Titze IR. Nonlinear source-filter coupling in phonation: Theory. Journal of the Acoustical Society of America. 2008;123:2733–2749. doi: 10.1121/1.2832337.
- 55. Fulcher LP, Scherer RC, Melnykov A, Gateva V, Limes ME. Negative Coulomb damping, limit cycles and self-oscillation of the vocal folds. American Journal of Physics. 2006;74:386–393.
- 56. Krausert CR, Olszewski AE, Taylor LN, McMurray JS, Dailey SH, Jiang JJ. Mucosal wave measurement and visualization techniques. Journal of Voice. 2011;25:395–405. doi: 10.1016/j.jvoice.2010.02.001.
- 57. Howe MS. Hydrodynamics and Sound. Cambridge University Press; 2007.
- 58. Howe MS, McGowan RS. Sound generated by aerodynamic sources near a deformable body with application to voiced speech. Journal of Fluid Mechanics. 2007;592:367–392. doi: 10.1017/S0022112010006117.
- 59. Park JB, Mongeau L. Instantaneous orifice discharge coefficient of a physical driven model of the human larynx. Journal of the Acoustical Society of America. 2007;121:442–455. doi: 10.1121/1.2401652.
- 60. McGowan RS, Howe MS. Comments on single-mass models of vocal fold vibration. Journal of the Acoustical Society of America. 2010;127:EL215–EL221. doi: 10.1121/1.3397283.
- 61. Titze IR. Regulating glottal airflow in phonation: application of the maximum power transfer theorem to a low dimensional phonation model. Journal of the Acoustical Society of America. 2002;111:367–376. doi: 10.1121/1.1417526.
- 62. Joliveau E, Smith J, Wolfe J. Vocal tract resonances in singing: The soprano voice. Journal of the Acoustical Society of America. 2004;116:2434–2439. doi: 10.1121/1.1791717.
- 63. Lighthill J. Waves in Fluids. Cambridge University Press; 1978.
- 64. Titze IR. The physics of small-amplitude oscillation of the vocal folds. Journal of the Acoustical Society of America. 1988;83:1536–1552. doi: 10.1121/1.395910.